11/17/2025 | 9:45 AM – 11:00 AM | Room 5
S26: Data Alchemy: Transforming Clinical Complexity into Actionable ML Intelligence
Presentation Type: Oral Presentations
An Obfuscation Algorithm Designed to Support Machine Learning with Real World Evidence.
Presentation Time: 09:45 AM - 09:57 AM
Abstract Keywords: Artificial Intelligence, Bioinformatics, Informatics Implementation, Machine Learning, Workflow
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
To support HIPAA-compliant machine learning with real-world clinical data, Memorial Sloan Kettering developed a patient-centric obfuscation algorithm that deidentifies unstructured electronic health records. Using John Snow Labs’ NLP tools and expert annotation, PHI labels were validated across ~14,000 documents. The algorithm preserves document structure, consistency, and typographical quirks while ensuring high-quality obfuscation.
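As a rough illustration of what "consistency" and "typographical quirks" can mean in an obfuscation step, the sketch below replaces already-labeled PHI spans with surrogates. It is not the algorithm described in the abstract: the surrogate pool, the hash-based selection, and the function names are all hypothetical, and the PHI spans are assumed to come from an upstream NLP and annotation step.

```python
import hashlib

# Hypothetical surrogate pool; a real system would use a much larger one.
SURROGATE_NAMES = ["Alex Morgan", "Jamie Rivera", "Casey Brooks", "Taylor Quinn"]


def match_case(original: str, surrogate: str) -> str:
    """Carry the original's all-caps or all-lowercase style over to the surrogate."""
    if original.isupper():
        return surrogate.upper()
    if original.islower():
        return surrogate.lower()
    return surrogate


def pick_surrogate(patient_id: str, phi_text: str, pool: list[str]) -> str:
    """Pick the same surrogate for the same patient and PHI value every time.

    Assumption: an unkeyed hash is used only to keep the sketch short; a
    production system would need a secret key or a protected lookup table.
    """
    key = f"{patient_id}|{phi_text.lower()}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def obfuscate(text: str, spans: list[tuple[int, int]], patient_id: str,
              pool: list[str] = SURROGATE_NAMES) -> str:
    """Replace non-overlapping (start, end) PHI spans, leaving other text untouched."""
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        pieces.append(text[cursor:start])
        original = text[start:end]
        pieces.append(match_case(original, pick_surrogate(patient_id, original, pool)))
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


note = "JOHN SMITH seen today. John Smith reports pain."
print(obfuscate(note, [(0, 10), (23, 33)], patient_id="p1"))
```

Both mentions of the name map to the same surrogate, and the all-caps header stays all-caps, so the surrounding document layout is unchanged.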
Speaker:
Andrew Niederhausern, BS - MSKCC
Authors:
John Philip, MS - Memorial Sloan Kettering Cancer Center;
Ilya Sher, NA - Memorial Sloan Kettering;
Rohan Singh, Mac - Memorial Sloan Kettering;
Sribatsa Das, NA - Memorial Sloan Kettering;
Mrunmayi Deshpande, NA - Memorial Sloan Kettering;
Ta-Ching Chen, NA - Memorial Sloan Kettering;
Nadia Bahadur, Masters of Clinical Research - Memorial Sloan Kettering Cancer Center;
Brian Tran van der Stegen, MBA - Memorial Sloan Kettering;
Haiyu Zheng, MS - Memorial Sloan Kettering;
David Talby, Ph.D. - John Snow Labs;
Veysel Kocaman;
Andrew Niederhausern, BS - MSKCC
Using machine learning models to inform personalized, cost-effective treatment recommendations
Presentation Time: 09:57 AM - 10:09 AM
Abstract Keywords: Machine Learning, Artificial Intelligence, Clinical Decision Support, Healthcare Economics/Cost of Care, Quantitative Methods
Primary Track: Foundations
Programmatic Theme: Clinical Informatics
Rapid, reliable, and affordable diagnostic tests are not always available. Consequently, clinicians often rely on patient characteristics and symptoms to guide treatment decisions. While machine learning (ML) models can predict the presence of the disease, they do not account for costs and health impact of treatment. We propose methods to integrate ML models with decision models that account for the long-term health outcomes and costs associated with available treatment options.
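One generic way to combine a predicted disease probability with costs and health outcomes is to pick the option with the highest expected net monetary benefit. The sketch below shows that idea only; it is not the authors' method, and the outcome values, the willingness-to-pay figure, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    qalys: float  # long-term health outcome (quality-adjusted life years)
    cost: float   # long-term cost in the same currency as the willingness to pay


def net_monetary_benefit(outcome: Outcome, wtp: float) -> float:
    """NMB = willingness to pay per QALY * QALYs - cost."""
    return wtp * outcome.qalys - outcome.cost


def expected_nmb(p_disease: float, if_disease: Outcome, if_healthy: Outcome,
                 wtp: float) -> float:
    """Average the NMB over disease status, weighted by the ML-predicted probability."""
    return (p_disease * net_monetary_benefit(if_disease, wtp)
            + (1.0 - p_disease) * net_monetary_benefit(if_healthy, wtp))


def recommend(p_disease: float, options: dict[str, tuple[Outcome, Outcome]],
              wtp: float) -> str:
    """Return the option name with the highest expected NMB."""
    return max(options, key=lambda name: expected_nmb(p_disease, *options[name], wtp))


# Hypothetical inputs: (outcome if diseased, outcome if not diseased) per option.
OPTIONS = {
    "treat": (Outcome(qalys=9.0, cost=2000.0), Outcome(qalys=9.9, cost=2000.0)),
    "no_treat": (Outcome(qalys=7.0, cost=5000.0), Outcome(qalys=10.0, cost=0.0)),
}

for p in (0.05, 0.20):
    print(p, recommend(p, OPTIONS, wtp=50_000.0))
```

With these made-up numbers the recommendation flips from "no_treat" to "treat" at a predicted probability of roughly 0.06, which shows how a treatment threshold falls out of the costs and outcomes instead of being fixed at 0.5.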
Speaker:
Mariana Neves, PhD - Yale
Authors:
Reza Yaesoubi, PhD - Yale School of Public Health;
Mariana Neves, PhD - Yale
Leveraging Machine Learning and Robotic Process Automation to identify and convert unstructured colonoscopy results into actionable data
Presentation Time: 10:09 AM - 10:21 AM
Abstract Keywords: Documentation Burden, Healthcare Quality, Information Extraction, Machine Learning, Natural Language Processing, Workflow
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Our health system sought a more efficient way to ensure accurate documentation of colorectal cancer screening follow-up intervals from inbound colonoscopy reports. We developed an integrated workflow that uses machine learning and robotic process automation to extract follow-up dates from unstructured data and update them in the record. As a proof of concept, we outline the process, validity, and implementation of this approach in a large academic health system.
Speaker:
Adam Szerencsy, DO - NYU Langone Health
Authors:
Elizabeth Stevens, PhD, MPH - NYU Grossman School of Medicine;
Jager Hartman, BS - NYU Langone Health;
Paul Testa, MD, JD, MPH - NYU Langone Health;
Ajay Mansukhani, BS - NYU Langone Health;
Casey Monina, RN - NYU Langone Health;
Amelia Shunk, MMCi - NYU Grossman School of Medicine;
David Ranson, BA - NYU Langone Health;
Yana Imberg, BA - NYU Langone Health;
Ann Cote, BA - NYU Langone Health;
Dinesha Prabhu, BS - NYU Langone Health;
Adam Szerencsy, DO - NYU Langone Health
Building Machine Learning Models to Characterize the Role of the Maternal Environment on Preeclampsia
Presentation Time: 10:21 AM - 10:33 AM
Abstract Keywords: Environmental Health and Climate Informatics, Machine Learning, Geospatial (GIS) Data/Analysis
Primary Track: Applications
Programmatic Theme: Translational Bioinformatics
In this study we combine individual- and area-level environmental exposure data to construct a complex representation of the maternal environment and characterize the role of prenatal exposures on the development of preeclampsia. We leverage the capabilities of machine learning to integrate high-dimensional environmental exposure data structures and develop interpretable models that elucidate the dynamics between the maternal environment and preeclampsia development. With these models, we hope to inform public health interventions for pregnant populations.
Speaker:
Chloé Paris, Bachelor of Science - University of Pennsylvania
Authors:
Chloé Paris, Bachelor of Science - University of Pennsylvania;
Rachel Ledyard, MPH - Children's Hospital of Philadelphia;
Heather Burris, MD, MPH - Children's Hospital of Philadelphia; University of Pennsylvania;
Joseph Romano, PhD - University of Pennsylvania
A Fair Machine Learning Model to Predict High Data-Continuity in Electronic Health Records Data
Presentation Time: 10:33 AM - 10:45 AM
Abstract Keywords: Clinical Decision Support, Fairness and elimination of bias, Machine Learning
Primary Track: Foundations
Programmatic Theme: Clinical Research Informatics
Incomplete electronic health record (EHR) data poses challenges for research and care. This study developed a fair machine learning model to identify patients with high data continuity within EHR data. We used OneFlorida+ EHR data linked to Florida Medicaid claims, and continuity was measured through Mean Proportion of Encounters Captured (MPEC). Our model achieved an AUROC of 0.77 and an accuracy of 71% to identify patients with high EHR data continuity.
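For readers unfamiliar with MPEC, the sketch below uses one formulation found in the data-continuity literature: the mean of the proportion of inpatient encounters and the proportion of outpatient encounters that appear in the EHR, with linked claims as the reference. The abstract does not give the exact definition used in this study, so the formula, the cap at 1.0, the handling of zero claims counts, and the function names are assumptions.

```python
from typing import Optional


def proportion_captured(ehr_count: int, claims_count: int) -> Optional[float]:
    """Share of claims-recorded encounters that also appear in the EHR.

    Returns None when claims record no encounters of this type (assumption:
    such a component is skipped instead of counted as 0 or 1).
    """
    if claims_count == 0:
        return None
    # Assumption: cap at 1.0 in case the EHR holds more encounters than claims.
    return min(ehr_count / claims_count, 1.0)


def mpec(ehr_inpatient: int, claims_inpatient: int,
         ehr_outpatient: int, claims_outpatient: int) -> Optional[float]:
    """Mean of the inpatient and outpatient capture proportions for one patient."""
    parts = [
        p for p in (
            proportion_captured(ehr_inpatient, claims_inpatient),
            proportion_captured(ehr_outpatient, claims_outpatient),
        )
        if p is not None
    ]
    if not parts:
        return None
    return sum(parts) / len(parts)


# Hypothetical patient: 1 of 2 inpatient and 3 of 4 outpatient encounters captured.
print(mpec(1, 2, 3, 4))
```

A patient-level value like this can then be thresholded to label "high" versus "low" continuity, which is the kind of target a classifier such as the one in the abstract would be trained to predict.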
Speaker:
Yao An Lee, Master of Science - University of Florida, Department of Pharmaceutical Outcomes & Policy
Authors:
Serena Jingchuan Guo, MD, PhD - University of Florida;
Jiang Bian, PhD - Indiana University;
Yu Huang, PhD - Indiana University;
Yao An Lee, Master of Science - University of Florida, Department of Pharmaceutical Outcomes & Policy
Leveraging data-driven analyses to improve subpopulation robustness of a schizophrenia prediction model
Presentation Time: 10:45 AM - 10:57 AM
Abstract Keywords: Deep Learning, Artificial Intelligence, Machine Learning, Fairness and elimination of bias
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
We create a machine learning model to predict diagnostic transition from psychosis to schizophrenia and present a data-driven analysis to identify subsets of features where the model lacks robustness. We find that the model lacks robustness over features related to visit frequency (which we term healthcare utilization) and psychiatric care. We hypothesize that the model may focus more on mental health utilization, ignoring the predictive value of non-mental health utilization.
Speaker:
Aparajita Kashyap, MA - Columbia University Department of Biomedical Informatics
Authors:
Aparajita Kashyap, MA - Columbia University Department of Biomedical Informatics;
Steven Kushner, MD, PhD - Columbia University Irving Medical Center;
Noémie Elhadad, PhD - Columbia University;
Shalmali Joshi, PhD - Columbia University