3/11/2025 | 1:30 PM – 3:00 PM | Urban
S12: Modeling Complex Real World Data
Presentation Type: Podium Abstract
Unraveling Complex Temporal Patterns in EHRs via Robust Irregular Tensor Factorization
2025 Informatics Summit On Demand
Presentation Time: 01:30 PM - 01:45 PM
Abstract Keywords: EHR-based Phenotyping, Machine Learning, Generative AI, and Predictive Modeling, Data Mining and Knowledge Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Digital Health Technologies for Patient Research
Electronic health records (EHRs) contain diverse patient data with varying visit frequencies, resulting in unaligned tensors in the time mode. While PARAFAC2 has been used for extracting meaningful medical concepts from EHRs, existing methods fail to capture non-linear and complex temporal patterns and struggle with missing entries.
In this paper, we propose REPAR, an RNN Regularized Robust PARAFAC2 method to model complex temporal dependencies and enhance robustness in the presence of missing data. Our approach employs RNNs for temporal regularization and a low-rank constraint for robustness. We design a hybrid optimization framework that handles multiple regularizations and supports various data types. REPAR is evaluated on 3 real-world EHR datasets, demonstrating improved reconstruction and robustness under missing data. Two case studies further showcase REPAR's ability to extract meaningful dynamic phenotypes and enhance phenotype predictability from noisy temporal EHRs.
Speaker(s):
Linghui Zeng, MS
Emory University
Author(s):
Yifei Ren, PhD - Emory University; Linghui Zeng, MS - Emory University; Jian Lou, PhD - Zhejiang University; Xiaoqian Jiang, PhD - University of Texas Health Science Center at Houston; Li Xiong, PhD - Emory University; Joyce Ho, PhD - Emory University; Sivasubramanium Bhavani, MD - Emory University;
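As a rough illustration of the "unaligned tensors in the time mode" mentioned in the abstract above, the sketch below rebuilds per-patient visit-by-feature matrices from standard PARAFAC2 factors, X_k ≈ U_k diag(s_k) Vᵀ. It is not the REPAR method: the RNN regularizer, the low-rank constraint, and the optimization are not shown, and the PARAFAC2 requirement that U_kᵀU_k be the same for every patient is not enforced. The function name and the toy factor values are assumptions made for illustration.

```python
def reconstruct_slice(U_k, s_k, V):
    """Rebuild one patient's visits-by-features matrix as U_k diag(s_k) V^T.

    U_k: list of rows, one per visit, each of length R (temporal factors)
    s_k: list of R patient-specific weights (one per phenotype)
    V:   list of rows, one per feature, each of length R (shared phenotypes)
    """
    rank = len(s_k)
    return [
        [sum(u[r] * s_k[r] * v[r] for r in range(rank)) for v in V]
        for u in U_k
    ]

# Hypothetical factors: 4 medical features, 2 phenotypes, shared by all patients.
V = [[1, 0], [0, 1], [1, 1], [0, 0]]

# Patient A has 3 visits, patient B has 2: the time mode is irregular.
U_a, s_a = [[1, 0], [0, 1], [1, 1]], [2, 1]
U_b, s_b = [[1, 1], [0, 1]], [1, 3]

X_a = reconstruct_slice(U_a, s_a, V)  # 3 x 4 matrix
X_b = reconstruct_slice(U_b, s_b, V)  # 2 x 4 matrix
```

The feature factor V is shared, so phenotypes are comparable across patients even though each patient contributes a different number of rows.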
Systematic Exploration of Hospital Cost Variability: A Conformal Prediction-Based Outlier Detection Method for Electronic Health Records
2025 Informatics Summit On Demand
Presentation Time: 01:45 PM - 02:00 PM
Abstract Keywords: Real-World Evidence and Policy Making, Informatics Research/Biomedical Informatics Research Methods, Phenomics and Phenome-wide Association Studies
Primary Track: Clinical Research Informatics
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
Marked variability in inpatient hospitalization costs poses significant challenges to healthcare quality, resource allocation, and patient outcomes. Traditional methods like Diagnosis-Related Groups (DRGs) aid in cost management but lack practical solutions for enhancing hospital care value. We introduce a novel methodology for outlier detection in Electronic Health Records (EHRs) using Conformal Prediction. This approach identifies and prioritizes areas for optimizing high-value care processes. Unlike conventional predictive models that neglect uncertainty, our method employs Conformal Quantile Regression (CQR) to generate robust prediction intervals, offering a comprehensive view of cost variability. By integrating Conformal Prediction with machine learning models, healthcare professionals can more accurately pinpoint opportunities for quality and efficiency improvements. Our framework systematically evaluates unexplained hospital cost variations and generates interpretable hypotheses for refining clinical practices associated with atypical costs. This data-driven approach offers a systematic method to generate clinically sound hypotheses that may inform processes to enhance care quality and optimize resource utilization.
Speaker(s):
François Grolleau, MD, PhD
Stanford Center for Biomedical Informatics Research
Author(s):
François Grolleau, MD, PhD - Stanford Center for Biomedical Informatics Research; Ethan Goh, MD, MS - Stanford University; Stephen Ma, MD, PhD - Stanford University School of Medicine; Jonathan Masterson, Director, Strategic Finance - Stanford Health Care, Menlo Park, CA; Ted Ross, Senior Vice President, Finance - Stanford Health Care, Menlo Park, CA; Arnold Milstein, MD, MPH - Clinical Excellence Research Center, Stanford University School of Medicine; Jonathan Chen, MD, PhD - Stanford University Hospital;
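The calibration step of conformalized quantile regression, as it is commonly described, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the lower and upper cost quantile predictions are assumed to come from some already fitted model, and the function names, the miscoverage level, and the toy numbers are all hypothetical.

```python
import math

def cqr_margin(cal_lo, cal_hi, cal_y, alpha=0.1):
    """Conformal correction computed on a held-out calibration set.

    cal_lo, cal_hi: predicted lower/upper quantiles for each calibration case
    cal_y:          observed outcomes (e.g. hospitalization costs)
    """
    # Conformity score: how far the observation falls outside the predicted
    # band (negative when it lies inside).
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(cal_lo, cal_hi, cal_y)
    )
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]

def flag_outliers(lo, hi, y, margin):
    """True where the observed value lies outside the calibrated interval."""
    return [not (l - margin <= v <= h + margin) for l, h, v in zip(lo, hi, y)]

# Toy calibration data (costs in thousands of dollars, invented for the example).
margin = cqr_margin([10] * 7, [20] * 7, [8, 12, 15, 21, 22, 11, 25], alpha=0.25)
flags = flag_outliers([10, 10], [20, 20], [21.5, 30], margin)
```

Stays whose observed cost falls outside the widened interval are the candidates for review; how such cases are then prioritized is specific to the presented work and is not reproduced here.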
powerROC: An Interactive Web Tool for Sample Size Calculation in Assessing Models' Discriminative Abilities
2025 Informatics Summit On Demand
Presentation Time: 02:00 PM - 02:15 PM
Abstract Keywords: Informatics Research/Biomedical Informatics Research Methods, Data Literacy and Numeracy, Biomedical Informatics and Data Science Workforce Education
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Emerging Best Practices for Clinical Research Informatics Operations
Rigorous external validation is crucial for assessing the generalizability of prediction models, particularly by evaluating their discrimination (AUROC) on new data. This often involves comparing a new model's AUROC to that of an established reference model. However, many studies rely on arbitrary rules of thumb for sample size calculations, often resulting in underpowered analyses and unreliable conclusions. This paper reviews crucial concepts for accurate sample size determination in AUROC-based external validation studies, making the theory and practice more accessible to researchers and clinicians. We introduce powerROC, an open-source web tool designed to simplify these calculations, enabling both the evaluation of a single model and the comparison of two models. The tool offers guidance on selecting target precision levels and employs flexible approaches, leveraging either pilot data or user-defined probability distributions. We illustrate powerROC’s utility through a case study on hospital mortality prediction using the MIMIC database.
Speaker(s):
François Grolleau, MD, PhD
Stanford Center for Biomedical Informatics Research
Author(s):
François Grolleau, MD, PhD - Stanford Center for Biomedical Informatics Research; Robert Tibshirani, PhD - Department of Statistics and Biomedical Data Science, Stanford University; Jonathan Chen, MD, PhD - Stanford University Hospital;
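To make the idea of precision-based sample size concrete, the sketch below searches for the smallest validation cohort whose AUROC confidence interval reaches a target half-width. It uses the Hanley and McNeil (1982) variance approximation, which is an assumption chosen for illustration and not necessarily what powerROC implements; the function names and the example inputs are hypothetical.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUROC (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)

def min_sample_size(auc, prevalence, half_width, z=1.96, n_max=100_000):
    """Smallest total n whose approximate 95% CI half-width meets the target."""
    for n in range(2, n_max + 1):
        n_pos = round(n * prevalence)
        n_neg = n - n_pos
        if n_pos < 1 or n_neg < 1:
            continue  # need at least one case and one non-case
        if z * hanley_mcneil_se(auc, n_pos, n_neg) <= half_width:
            return n
    raise ValueError("target precision not reachable within n_max")

# Hypothetical planning inputs: anticipated AUROC 0.80, 10% event rate,
# and a desired 95% CI of roughly +/- 0.05.
n_required = min_sample_size(auc=0.80, prevalence=0.10, half_width=0.05)
```

Because the variance is dominated by the smaller class, lowering the event rate or tightening the target half-width increases the required sample size.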
Elastic Net Regression for Discovering Comorbidities Associated with Plural Chronic Overlapping Pain Conditions
2025 Informatics Summit On Demand
Presentation Time: 02:15 PM - 02:30 PM
Abstract Keywords: Data-Driven Research and Discovery, Outcomes Research, Clinical Epidemiology, Population Health, Secondary Use of EHR Data
Primary Track: Translational Bioinformatics/Precision Medicine
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Chronic pain disorders such as migraine and fibromyalgia co-occur in some people but not in others. The cause of this difference remains unclear. We used longitudinal EHR data and elastic net regression to discover comorbidities associated with plural versus singular chronic pain occurrence. The top comorbidities discovered in males and in females were remarkably similar. Digestive disorders were the most predictive of chronic pain plurality, followed by disorders with both well- and less-recognized chronic pain associations.
Speaker(s):
Jungwei Fan, PhD
Mayo Clinic
Author(s):
Haiquan Li, PhD - University of Arizona; William Hooten, MD - Mayo Clinic;
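The way an elastic net penalty selects predictive comorbidities can be sketched with a small logistic model fitted by proximal gradient descent. This is an illustrative sketch, not the study's analysis: the abstract does not state the exact model, so the logistic form, the omitted intercept, the +1/-1 coding of present/absent, the hyperparameters, and the toy records are all assumptions.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_elastic_net_logistic(X, y, lam=0.1, l1_ratio=0.5, step=0.5, n_iter=500):
    """Minimize mean log-loss + lam * (l1_ratio*|w|_1 + (1-l1_ratio)/2*|w|_2^2)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of plural pain
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        for j in range(d):
            # Gradient step on the smooth part (log-loss plus ridge term),
            # then soft-thresholding for the lasso term.
            smooth = grad[j] + lam * (1 - l1_ratio) * w[j]
            w[j] = soft_threshold(w[j] - step * smooth, step * lam * l1_ratio)
    return w

# Toy records: column 0 tracks the outcome, column 1 is unrelated to it.
X = [[1, 1], [1, -1], [1, 1], [1, -1],      # comorbidity present, plural pain
     [-1, 1], [-1, -1], [-1, 1], [-1, -1],  # comorbidity absent, singular pain
     [1, 1], [1, -1],                       # present, singular pain
     [-1, 1], [-1, -1]]                     # absent, plural pain
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
weights = fit_elastic_net_logistic(X, y)
```

In this toy setting the informative column keeps a positive weight while the unrelated column stays at exactly zero, which is the behaviour that makes the surviving coefficients readable as a ranked list of candidate comorbidities.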
Development and Validation of a Discrete Event Simulation Model to Stress Test Networked Emergency Department Staffing Policies
2025 Informatics Summit On Demand
Presentation Time: 02:30 PM - 02:45 PM
Abstract Keywords: Real-World Evidence and Policy Making, Data-Driven Research and Discovery, Implementation Science and Deployment
Primary Track: Clinical Research Informatics
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Emergency department (ED) staffing is chronically strained at baseline and may be further stressed by clinician absenteeism. We develop and explore a networked discrete event simulation model to stress test ED staffing policies and quantify the effect of degree and persistence of clinician absenteeism on patient Length of Stay (LOS). Local clinician absenteeism primarily affects LOS in local EDs. Future work will evaluate the effects of absenteeism in a network under external strain.
Speaker(s):
Arwen Declan, MD, PhD
Prisma Health
Author(s):
Martha Sabogal de la Pava, MS - Clemson University; Kristina Mathis, BS - University of South Carolina School of Medicine Greenville; Braxton Howell, BS - University of South Carolina School of Medicine Greenville; Erin Seawright, BSN - University of South Carolina School of Medicine Greenville; Aisha Nelson, MS - Clemson University; Emily Tucker, PhD - Clemson University;
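The effect of clinician absenteeism on length of stay can be illustrated with a very small discrete event simulation of a single ED. This is a sketch under strong assumptions and not the networked model described above: there is one site, first-come-first-served service, exponential arrival and treatment times, and every parameter value and name is hypothetical.

```python
import heapq
import random

def simulate_ed(n_clinicians, n_patients=500, mean_interarrival=10.0,
                mean_service=25.0, seed=0):
    """Return the mean patient length of stay (minutes) for one simulated ED."""
    rng = random.Random(seed)  # same seed gives the same patient stream
    # Min-heap of the times at which each clinician next becomes free.
    free_at = [0.0] * n_clinicians
    heapq.heapify(free_at)
    clock = 0.0
    total_los = 0.0
    for _ in range(n_patients):
        clock += rng.expovariate(1.0 / mean_interarrival)  # next arrival
        service = rng.expovariate(1.0 / mean_service)      # treatment time
        # The patient is seen by whichever clinician frees up first.
        start = max(clock, heapq.heappop(free_at))
        finish = start + service
        heapq.heappush(free_at, finish)
        total_los += finish - clock  # waiting time plus treatment time
    return total_los / n_patients

baseline_los = simulate_ed(n_clinicians=3)
one_absent_los = simulate_ed(n_clinicians=2)  # one clinician absent
```

Reusing the seed keeps arrivals and treatment times identical across scenarios, so the difference in mean LOS reflects the staffing change alone. A networked version would additionally need rules for how patients or staff move between sites, which the abstract does not specify.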
Category
Paper - Regular