AMR-UTI: Antimicrobial Resistance in Urinary Tract Infections

AMR-UTI is a freely accessible dataset derived from electronic health record (EHR) information on over 80,000 patients with urinary tract infections (UTIs) treated at Massachusetts General Hospital and Brigham & Women’s Hospital in Boston, MA, USA between 2007 and 2016. The data is released in collaboration with Mass General Brigham and PhysioNet.

Each observation in the dataset corresponds to a urine specimen sent to the clinical microbiology laboratory to be tested for antimicrobial resistance (AMR). Each observation includes:

  • Ground-truth labels for antibiotic resistance to all common antibiotics used to treat UTIs, derived from laboratory testing.
  • Antibiotic treatment decisions, which were made without knowledge of the test results.
  • Patient and specimen features useful for prediction of resistance.

Clinical context

Urinary tract infections (UTIs) are among the most common complaints faced by healthcare providers in inpatient and outpatient settings. They are a common indication for antibiotic treatment, but overuse of broad-spectrum therapies has selected for antimicrobial-resistant pathogens. With this in mind, clinicians send urine specimens to the microbiology laboratory for antibiotic susceptibility testing.

Definitive results from the microbiology laboratory, however, can take as long as 72 hours to return, and an antibiotic must be chosen in the meantime. This situation is referred to as empiric antibiotic treatment. When selecting an antibiotic therapy, providers must balance the goal of using narrow-spectrum antibiotics against the risk of inappropriate antibiotic therapy (the selection of an antibiotic to which the patient's infection is resistant).

What is this data useful for?

This dataset is designed to support the development of algorithms that guide empiric treatment decisions for uncomplicated UTIs, helping providers choose effective antibiotics while avoiding the overuse of broad-spectrum therapies.

Because antibiotic susceptibility testing provides a proxy for counterfactual outcomes under different treatments, this dataset supports the development and validation of causal inference and policy learning methods more broadly. To support the study of transfer learning, we also include a broader cohort of more complicated UTIs.
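As a rough illustration of why fully observed resistance labels help with policy evaluation, the sketch below scores two treatment policies on a handful of made-up specimens. Everything in it is an assumption for illustration: the records are synthetic, and the field names (`resistant`, `prescribed`), the antibiotic codes, and the grouping of agents as broad-spectrum are hypothetical and are not taken from the AMR-UTI schema. Because each specimen carries a resistance label for every candidate antibiotic, a policy's rate of inappropriate therapy can be computed directly, without relying only on the antibiotic that was actually prescribed.

```python
# Toy illustration only: records, antibiotic codes, and field names are
# hypothetical and are NOT taken from the AMR-UTI schema.
BROAD_SPECTRUM = {"CIP", "LVX"}  # assumed broad-spectrum agents in this toy example

# Each synthetic specimen has a resistance label for every candidate antibiotic,
# plus the antibiotic the clinician chose before the lab result was available.
specimens = [
    {"resistant": {"NIT": False, "SXT": True,  "CIP": False, "LVX": False}, "prescribed": "CIP"},
    {"resistant": {"NIT": False, "SXT": False, "CIP": False, "LVX": False}, "prescribed": "SXT"},
    {"resistant": {"NIT": True,  "SXT": False, "CIP": True,  "LVX": True},  "prescribed": "CIP"},
    {"resistant": {"NIT": False, "SXT": False, "CIP": False, "LVX": False}, "prescribed": "LVX"},
]


def evaluate(choices, specimens):
    """Return (inappropriate-therapy rate, broad-spectrum usage rate) for a policy.

    `choices[i]` is the antibiotic the policy selects for `specimens[i]`.
    """
    n = len(specimens)
    # A choice is inappropriate when the specimen is resistant to it.
    inappropriate = sum(s["resistant"][c] for s, c in zip(specimens, choices))
    broad = sum(c in BROAD_SPECTRUM for c in choices)
    return inappropriate / n, broad / n


# Policy 1: what clinicians actually prescribed.
clinician_choices = [s["prescribed"] for s in specimens]
# Policy 2: a naive alternative that always picks one narrow-spectrum agent.
always_nit = ["NIT"] * len(specimens)

clinician_rates = evaluate(clinician_choices, specimens)
always_nit_rates = evaluate(always_nit, specimens)

print("clinician:", clinician_rates)
print("always NIT:", always_nit_rates)
```

The two returned rates mirror the trade-off described in the clinical context: lowering broad-spectrum use is only worthwhile if the inappropriate-therapy rate does not rise. A learned policy could be scored with the same `evaluate` helper by passing its per-specimen choices.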

Publications using AMR-UTI

Towards Verifiable Text Generation with Symbolic References

Learning to Decode Collaboratively with Multiple Language Models

Prediction-powered Generalization of Causal Inferences

A Data-Centric Approach to Generate Faithful and High Quality Patient Summaries with Large Language Models

Benchmarking observational studies with experimental data under right-censoring

Machine learning to predict notes for chart review in the oncology setting: a proof of concept strategy for improving clinician note-writing

Joint AI-driven event prediction and longitudinal modeling in newly diagnosed and relapsed multiple myeloma

Effective Human-AI Teams via Learned Natural Language Rules and Onboarding

Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes

Large-Scale Study of Temporal Shift in Health Insurance Claims

A Deep Dive into Single-Cell RNA Sequencing Foundation Models

Conformalized Unconditional Quantile Regression

Falsification of Internal and External Validity in Observational Studies via Conditional Moment Restrictions

TabLLM: Few-shot Classification of Tabular Data with Large Language Models

Who Should Predict? Exact Algorithms For Learning to Defer to Humans

Training Subset Selection for Weak Supervision

Evaluating Robustness to Dataset Shift via Parametric Robustness Sets

Falsification before Extrapolation in Causal Effect Estimation

Sample Efficient Learning of Predictors that Complement Humans

Co-training Improves Prompt-based Learning for Large Language Models

Bias-robust Integration of Observational and Experimental Estimators

Clustering Interval-Censored Time-Series for Disease Phenotyping

ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography

Large Language Models are Few-Shot Clinical Information Extractors

Leveraging Time Irreversibility with Order-Contrastive Pre-training

Single cell characterization of myeloma and its precursor conditions reveals transcriptional signatures of early tumorigenesis

Teaching Humans When To Defer to a Classifier via Exemplars

The Potential For Bias In Machine Learning And Opportunities For Health Insurers To Address It

Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models

Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance

MedKnowts: Unified Documentation and Information Retrieval for Electronic Health Records

Graph cuts always find a global optimum for Potts models (with a catch)

Neural Pharmacodynamic State Space Modeling

Regularizing towards Causal Invariance: Linear Models with Proxies

Assessing the Impact of Automated Suggestions on Decision Making: Domain Experts Mediate Model Errors but Take Less Initiative

Automated NLP Extraction of Clinical Rationale for Treatment Discontinuation in Breast Cancer

Trajectory Inspection: A Method for Iterative Clinician-Driven Design of Reinforcement Learning Studies

Beyond perturbation stability: LP recovery guarantees for MAP inference on noisy stable instances

PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming

Deep Contextual Clinical Prediction with Reverse Distillation

Directing Human Attention in Event Localization for Clinical Timeline Creation

Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media

A decision algorithm to promote outpatient antimicrobial stewardship for uncomplicated urinary tract infection

Characterization of Overlap in Observational Studies

Consistent Estimators for Learning to Defer to an Expert

Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

Estimation of Bounds on Potential Outcomes For Decision Making

Fast, Structured Clinical Documentation via Contextual Autocomplete

Generalization Bounds and Representation Learning for Estimation of Potential Outcomes and Causal Effects

Predicting human health from biofluid-based metabolomics using machine learning

Predicting Remission Among Patients With Rheumatoid Arthritis Starting Tocilizumab Monotherapy: Model Derivation and Remission Score Development

Robust Benchmarking for Machine Learning of Clinical Entity Extraction

Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph

Treatment Policy Learning in Multiobjective Settings with Fully Observed Outcomes

Derivation and validation of a machine learning record linkage algorithm between emergency medical services and the emergency department

Block Stability for MAP Inference

Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models

Guidelines for reinforcement learning in healthcare

Improving documentation of presenting problems in the emergency department using a domain-specific ontology and machine learning-driven user interfaces

Overcomplete Independent Component Analysis via SDP

Support and Invertibility in Domain-Invariant Representations

Train and Test Tightness of LP Relaxations in Structured Prediction

Cell-specific prediction and application of drug-induced gene expression profiles

Evaluating Reinforcement Learning Algorithms in Observational Health Settings

Learning Topic Models - Provably and Efficiently

Learning Weighted Representations for Generalization Across Designs

Machine Learning Analysis of Heterogeneity in the Effect of Student Mindset Interventions

Max-margin learning with the Bayes Factor

Optimality of Approximate Inference Algorithms on Stable Instances

Recurrent Neural Networks for Multivariate Time Series with Missing Values

Semi-Amortized Variational Autoencoders

Why Is My Classifier Discriminatory?

Causal Effect Inference with Deep Latent-Variable Models

Contextual Autocomplete: A Novel User Interface Using Machine Learning to Improve Ontology Usage and Structured Data Capture for Presenting Problems in the Emergency Department

Creating an Automated Trigger for Sepsis Clinical Decision Support at Emergency Department Triage using Machine Learning

Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning

Early Identification of Patients with Acute Decompensated Heart Failure

Electronic phenotyping with APHRODITE and the Observational Health Sciences and Informatics (OHDSI) data network

Estimating individual treatment effect: generalization bounds and algorithms

Grounded Recurrent Neural Networks

Learning a Health Knowledge Graph from Electronic Medical Records

Objective Assessment of Depressive Symptoms with Machine Learning and Wearable Sensors Data

Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation

Structured Inference Networks for Nonlinear State Space Models

Using Machine Learning to Recommend Oncology Clinical Trials

Character-Aware Neural Language Models

Clinical Tagging with Joint Probabilistic Models

Comparison of approaches for heart failure case identification from electronic health record data

Electronic Medical Record Phenotyping using the Anchor & Learn Framework

Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization

Learning Low-Dimensional Representations of Medical Concepts

Learning Representations for Counterfactual Inference

Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests

Population-Level Prediction of Type 2 Diabetes using Claims Data and Analysis of Risk Factors

Tightness of LP Relaxations for Almost Balanced Models

Train and Test Tightness of LP Relaxations in Structured Prediction

A Fast Variational Approach for Learning Markov Random Field Language Models

Anchored Discrete Factor Analysis

Barrier Frank-Wolfe for Marginal Inference

Deep Kalman Filters

How Hard is Inference for Structured Prediction?

Predicting chronic comorbid conditions of type 2 diabetes in Newly-Diagnosed Diabetic Patients

Temporal Convolutional Neural Networks for Diagnosis from Lab Tests

Visual Exploration of Temporal Data in Electronic Medical Records

Instance Segmentation of Indoor Scenes using a Coverage Loss

Lifted Tree-Reweighted Variational Inference

Understanding the Bethe Approximation: When and How can it go Wrong?

Unsupervised Learning of Disease Progression Models

Using Anchors to Estimate Clinical State without Labeled Data

A Practical Algorithm for Topic Modeling with Provable Guarantees

Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests

Predicting Chief Complaints at Triage Time in the Emergency Department

SparsityBoost: A New Scoring Function for Learning Bayesian Network Structure

Unsupervised Learning of Noisy-Or Bayesian Networks

A Comparison of Dimensionality Reduction Techniques for Unstructured Clinical Text

Efficiently Searching for Frustrated Cycles in MAP Inference

Introduction to Dual Decomposition for Inference

Probabilistic models for personalizing web search

Complexity of Inference in Latent Dirichlet Allocation

Personalizing web search results by reading level

Dual Decomposition for Parsing with Non-Projective Head Automata

Learning Bayesian Network Structure using LP Relaxations

Learning Efficiently with Approximate Inference via Dual Losses

More data means less inference: A pseudo-max approach to structured learning

On Dual Decomposition and Linear Programming Relaxations for Natural Language Processing

Clusters and Coarse Partitions in LP Relaxations

Scaling All-Pairs Overlay Routing

Tree Block Coordinate Descent for MAP in Graphical Models

New Outer Bounds on the Marginal Polytope

Tightening LP Relaxations for MAP using Message-Passing

Probabilistic Modeling of Systematic Errors in Two-Hybrid Experiments

Approximate Inference for Infinite Contingent Bayesian Networks

BLOG: probabilistic models with unknown objects