American Medical Informatics Association - Building Machine Learning Models to Characterize the Role of the Maternal Environment on Preeclampsia

An Obfuscation Algorithm Designed to Support Machine Learning with Real World Evidence.

Presentation Time: 09:45 AM - 09:57 AM

Abstract Keywords: Artificial Intelligence, Bioinformatics, Informatics Implementation, Machine Learning, Workflow
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics

To support HIPAA-compliant machine learning with real-world clinical data, Memorial Sloan Kettering developed a patient-centric obfuscation algorithm that deidentifies unstructured electronic health records. Using John Snow Labs’ NLP tools and expert annotation, PHI labels were validated across ~14,000 documents. The algorithm preserves document structure, consistency, and typographical quirks while ensuring high-quality obfuscation.

Speaker:
Andrew Niederhausern, BS
MSKCC

Authors:
John Philip, MS - Memorial Sloan Kettering Cancer Center; Ilya Sher, NA - Memorial Sloan Kettering; Rohan Singh, Mac - Memorial Sloan Kettering; Sribatsa Das, NA - Memorial Sloan Kettering; Mrunmayi Deshpande, NA - Memorial Sloan Kettering; Ta-Ching Chen, NA - Memorial Sloan Kettering; Nadia Bahadur, Masters of Clinical Research - Memorial Sloan Kettering Cancer Center; Brian Tran van der Stegen, MBA - Memorial Sloan Kettering; Haiyu Zheng, MS - Memorial Sloan Kettering; David Talby, Ph.D. - John Snow Labs; Veysel Kocaman;

Using machine learning models to inform personalized, cost-effective treatment recommendations

Presentation Time: 09:57 AM - 10:09 AM

Abstract Keywords: Machine Learning, Artificial Intelligence, Clinical Decision Support, Healthcare Economics/Cost of Care, Quantitative Methods
Primary Track: Foundations
Programmatic Theme: Clinical Informatics

Rapid, reliable, and affordable diagnostic tests are not always available. Consequently, clinicians often rely on patient characteristics and symptoms to guide treatment decisions. While machine learning (ML) models can predict the presence of disease, they do not account for the costs and health impact of treatment. We propose methods to integrate ML models with decision models that account for the long-term health outcomes and costs associated with available treatment options.

Speaker:
Mariana Neves, PhD
Yale

Author:
Reza Yaesoubi, PhD - Yale School of Public Health;
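
As a rough illustration of the idea in the abstract above, the sketch below feeds an ML-predicted disease probability into a simple expected net monetary benefit comparison. This is not the authors' method. The `Outcome` class, the option names, the willingness-to-pay value, and every QALY and cost figure are hypothetical placeholders chosen only to make the example run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Long-term consequences of one option in one disease state (hypothetical)."""
    qalys: float  # quality-adjusted life years
    cost: float   # lifetime cost in dollars


def net_monetary_benefit(outcome: Outcome, wtp: float) -> float:
    # NMB = willingness-to-pay per QALY * QALYs - cost
    return wtp * outcome.qalys - outcome.cost


def expected_nmb(p_disease: float, with_disease: Outcome,
                 without_disease: Outcome, wtp: float) -> float:
    # Weight each disease state by the ML model's predicted probability.
    return (p_disease * net_monetary_benefit(with_disease, wtp)
            + (1.0 - p_disease) * net_monetary_benefit(without_disease, wtp))


def recommend(p_disease: float, options: dict, wtp: float) -> str:
    # Pick the option with the highest expected net monetary benefit.
    return max(options,
               key=lambda name: expected_nmb(p_disease, *options[name], wtp))


# Hypothetical inputs: (outcome if diseased, outcome if not diseased) per option.
OPTIONS = {
    "treat": (Outcome(qalys=9.0, cost=5_000.0), Outcome(qalys=9.9, cost=5_000.0)),
    "no_treat": (Outcome(qalys=7.0, cost=20_000.0), Outcome(qalys=10.0, cost=0.0)),
}
WTP = 50_000.0  # assumed willingness-to-pay per QALY

if __name__ == "__main__":
    for p in (0.02, 0.5):
        print(p, recommend(p, OPTIONS, WTP))
```

With these placeholder numbers the recommendation switches from not treating to treating once the predicted probability is high enough, which is the kind of cost-aware threshold a classifier alone does not provide.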

Leveraging Machine Learning and Robotic Process Automation to identify and convert unstructured colonoscopy results into actionable data

Presentation Time: 10:09 AM - 10:21 AM

Abstract Keywords: Documentation Burden, Healthcare Quality, Information Extraction, Machine Learning, Natural Language Processing, Workflow
Primary Track: Applications
Programmatic Theme: Clinical Informatics

Our health system sought a more efficient way to ensure accurate documentation of colorectal cancer screening follow-up intervals from inbound colonoscopy reports. We developed an integrated workflow using machine learning and robotic process automation to extract and update follow-up dates from unstructured data. As proof of concept, we outline the process, validity, and implementation of this approach in a large academic health system.

Speaker:
Adam Szerencsy, DO
NYU Langone Health

Authors:
Elizabeth Stevens, PhD, MPH - NYU Grossman School of Medicine; Jager Hartman, BS - NYU Langone Health; Paul Testa, MD, JD, MPH - NYU Langone Health; Ajay Mansukhani, BS - NYU Langone Health; Casey Monina, RN - NYU Langone Health; Amelia Shunk, MMCi - NYU Grossman School of Medicine; David Ranson, BA - NYU Langone Health; Yana Imberg, BA - NYU Langone Health; Ann Cote, BA - NYU Langone Health; Dinesha Prabhu, BS - NYU Langone Health; Adam Szerencsy, DO - NYU Langone Health;

Building Machine Learning Models to Characterize the Role of the Maternal Environment on Preeclampsia

Presentation Time: 10:21 AM - 10:33 AM

Abstract Keywords: Environmental Health and Climate Informatics, Machine Learning, Geospatial (GIS) Data/Analysis
Primary Track: Applications
Programmatic Theme: Translational Bioinformatics

In this study, we combine individual- and area-level environmental exposure data to construct a complex representation of the maternal environment and characterize the role of prenatal exposures in the development of preeclampsia. We leverage the capabilities of machine learning to integrate high-dimensional environmental exposure data structures and develop interpretable models that elucidate the dynamics between the maternal environment and preeclampsia development. With these models, we hope to inform public health interventions for pregnant populations.

Speaker:
Chloé Paris, Bachelor of Science
University of Pennsylvania

Authors:
Chloé Paris, Bachelor of Science - University of Pennsylvania; Rachel Ledyard, MPH - Children's Hospital of Philadelphia; Heather Burris, MD, MPH - Children's Hospital of Philadelphia; University of Pennsylvania; Joseph Romano, PhD - University of Pennsylvania;

A Fair Machine Learning Model to Predict High Data-Continuity in Electronic Health Records Data

Presentation Time: 10:33 AM - 10:45 AM

Abstract Keywords: Clinical Decision Support, Fairness and elimination of bias, Machine Learning
Primary Track: Foundations
Programmatic Theme: Clinical Research Informatics

Incomplete electronic health record (EHR) data pose challenges for research and care. This study developed a fair machine learning model to identify patients with high data continuity within EHR data. We used OneFlorida+ EHR data linked to Florida Medicaid claims, and continuity was measured using the Mean Proportion of Encounters Captured (MPEC). Our model achieved an AUROC of 0.77 and an accuracy of 71% in identifying patients with high EHR data continuity.

Speaker:
Yao An Lee, Master of Science
University of Florida, Department of Pharmaceutical Outcomes & Policy

Authors:
Serena Jingchuan Guo, MD, PhD - University of Florida; Jiang Bian, PhD - Indiana University; Yu Huang, PhD - Indiana University;
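
The abstract above does not spell out how MPEC is computed. The sketch below assumes one plausible form: for each patient, the average of the share of outpatient and the share of inpatient encounters in claims that also appear in the EHR. The function names, the capping at 1.0, and the handling of patients with no claims of a given type are all assumptions made for illustration, not details taken from the study.

```python
from typing import Optional


def capture_proportion(ehr_count: int, claims_count: int) -> Optional[float]:
    """Share of claims encounters also seen in the EHR (assumed definition)."""
    if claims_count == 0:
        return None  # assumption: undefined when there are no claims encounters
    # Assumption: cap at 1.0 in case the EHR lists more encounters than claims.
    return min(ehr_count, claims_count) / claims_count


def mpec(ehr_outpatient: int, claims_outpatient: int,
         ehr_inpatient: int, claims_inpatient: int) -> Optional[float]:
    """Mean of the outpatient and inpatient capture proportions (assumed form)."""
    parts = [
        p for p in (
            capture_proportion(ehr_outpatient, claims_outpatient),
            capture_proportion(ehr_inpatient, claims_inpatient),
        )
        if p is not None
    ]
    if not parts:
        return None  # no claims encounters of either type
    return sum(parts) / len(parts)


if __name__ == "__main__":
    # Hypothetical patient: 8 of 10 outpatient and 1 of 2 inpatient encounters captured.
    print(mpec(8, 10, 1, 2))
```

A per-patient score like this could then be thresholded to label patients as having high or low data continuity, which is the target a classifier would be trained to predict.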

Leveraging data-driven analyses to improve subpopulation robustness of a schizophrenia prediction model

Presentation Time: 10:45 AM - 10:57 AM

Abstract Keywords: Deep Learning, Artificial Intelligence, Machine Learning, Fairness and elimination of bias
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics

We create a machine learning model to predict diagnostic transition from psychosis to schizophrenia and present a data-driven analysis to identify subsets of features where the model lacks robustness. We find that the model lacks robustness over features related to visit frequency (which we term healthcare utilization) and psychiatric care. We hypothesize that the model may focus more on mental health utilization, ignoring the predictive value of non-mental health utilization.

Speaker:
Aparajita Kashyap, MA
Columbia University Department of Biomedical Informatics

Authors:
Aparajita Kashyap, MA - Columbia University Department of Biomedical Informatics; Steven Kushner, MD, PhD - Columbia University Irving Medical Center; Noémie Elhadad, PhD - Columbia University; Shalmali Joshi, PhD - Columbia University;

S26: Data Alchemy: Transforming Clinical Complexity into Actionable ML Intelligence
