“Accelerating translational research by integrating clinical data into systems biology at scale”
–Assistant Professor, Jennifer Hadlock, MD
The Hadlock lab collaborates with systems biologists, data scientists, clinicians and population health experts to understand transitions between wellness and disease. We conduct translational research to improve risk models for clinical decision support and investigate novel methods to accelerate research discovery. Our lab works directly with two types of clinical data: 1) over 19,000,000 electronic health records (EHRs) from fifty hospitals and 900 clinics across five states, and 2) high-fidelity prospective study data that combines genomics, imaging, health-related social needs and patient-reported outcomes. We collaborate closely with experts at research institutions and community hospitals across the country. Our current investigations focus on three areas:
- Supervised machine learning for biomedically interpretable models for clinical decision support
- Bridging semantic gaps to integrate real-world clinical data with scientific knowledge graphs
- Domain-agnostic machine learning approaches for detecting biases and discovering biomedical insights in electronic health record (EHR) data
Machine learning for biomedically interpretable risk models for clinical decision support
Risk scores aim to support medical personnel when they face complex, multifactorial and high-stakes clinical decisions about optimal patient care. The Hadlock Lab is developing more accurate, clinically relevant risk models to inform individual patient care for prevention, screening, treatment and follow-up. We integrate machine learning, biomedical knowledge ontologies, and clinical expertise to analyze EHR data. These approaches are designed to be extensible across many patient conditions, and we are currently investigating early risk stratification for preterm birth, serious COVID-19 outcomes and sepsis. Sepsis occurs when a person has an infection and suffers from an overwhelming inflammatory response, which can lead to organ failure and death. It affects millions of people each year and is a heterogeneous condition with diverse presentations and no clear diagnostic tests. We collaborate with the interdisciplinary Hoag Sepsis Collaborative to prioritize risk models that have the potential to directly improve patient care. Our current sepsis investigations focus on risk stratification at three different time points: primary and secondary prevention, early symptom recognition, and pre-hospital transport.
Bridging the semantic gap between real-world clinical data and knowledge graphs
Bioinformatics, clinical informatics, and scientific knowledge bases continue to advance rapidly. However, many challenges remain in integrating these disparate data for research. For our work on specific conditions, we address semantic interoperability challenges by applying existing, well-curated ontologies and, when additional accuracy is needed, expert-reviewed mappings. The Hadlock Lab is also part of the NIH NCATS Biomedical Data Translator, collaborating with researchers to connect clinical data with over twenty other categories of data, including multi-omics, pharmacology, and environmental exposures.
Accelerating biomedical discovery in electronic health records (EHRs) and longitudinal studies of health transitions
Similarity between patients is a fundamental concept underlying clinical care and biomedical research. Our lab uses distance metrics to analyze similarity across more than 300,000,000 patient encounters. We are currently collaborating with several labs at ISB to apply new domain-agnostic approaches for rapid discovery of patterns in semi-structured clinical data. The patterns that emerge can include several categories of new insights: hypotheses for biomedical research, artifacts of healthcare delivery, errors and biases. By focusing directly on clinical observations of phenotype and exposures in the real world, we can minimize the noise introduced by mapping investigations through single diagnostic labels, and increase the chance of detecting previously unobserved patterns. Once surfaced, these hypotheses can be prioritized for rigorous investigation using existing research methods. We also apply both supervised and unsupervised machine learning to the robust data from the Adolescent Brain Cognitive Development (ABCD) Study.