Times are displayed in (UTC-04:00) Eastern Time (US & Canada)
3/11/2025 | 10:30 AM – 12:00 PM | Monongahela
S07: Predictive Modeling: Approaches
Presentation Type: Podium Abstract
Session Credits: 1.5
Session Chair:
Kun-Hsing Yu, MD, PhD - Harvard Medical School
Unsupervised Coverage Sampling to Enhance Clinical Chart Review Coverage for Computational Phenotype Development
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: EHR-based Phenotyping, Data Mining and Knowledge Discovery, Data Quality
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
“Gold-standard” labels from clinicians are crucial for developing computational phenotyping algorithms. Typically, charts are sampled randomly. However, random sampling may fail to capture the diverse manifestations of the patient population. To address this issue, this study introduces an unsupervised sampling approach to select medical charts for review, enhancing clinical chart review efficiency.
Speaker(s):
Zigui Wang, Bachelor
Duke University
Author(s):
Zigui Wang, Bachelor - Duke University; Jillian Hurst, Ph.D. - Duke University; Chuan Hong, Ph.D. - Duke University; Benjamin Goldstein, PhD - Duke University;
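The abstract above does not say which unsupervised sampling method the authors use. As a hedged illustration of why a coverage-oriented selection can differ from a random draw, here is a minimal sketch of greedy farthest-point (k-center) selection. The technique, the function name `farthest_point_sample`, and the toy 2-D "chart embeddings" are all assumptions made for illustration, not the study's method or data.

```python
import math

def farthest_point_sample(points, k, start=0):
    """Greedy k-center selection: repeatedly pick the point that is
    farthest from everything chosen so far. Returns indices into points."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [min(d, math.dist(p, points[nxt]))
                   for d, p in zip(nearest, points)]
    return chosen

# Hypothetical chart embeddings: one dense, common cluster near the origin
# plus two rare presentations far away from it.
charts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0), (-4.0, 3.0)]
selected = farthest_point_sample(charts, k=3)
print(selected)
```

With three charts to review, this selection takes one chart from the common cluster and both rare ones, whereas a uniform random draw would most often return charts from the dense cluster.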
MRISeqClassifier: A Deep Learning Toolkit for Precise MRI Sequence Classification
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Bioimaging Techniques and Applications, Medical Imaging, Machine Learning, Generative AI, and Predictive Modeling
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Digital Health Technologies for Patient Research
Magnetic Resonance Imaging (MRI) is a crucial diagnostic tool in medicine, widely used to detect and assess various health conditions. Different MRI sequences, such as T1-weighted, T2-weighted, and FLAIR, serve distinct roles by highlighting different tissue characteristics and contrasts. However, distinguishing them based solely on the description file is currently impossible due to confusing or incorrect annotations. Additionally, there is a notable lack of effective tools to differentiate these sequences. In response, we developed a deep learning-based toolkit tailored for small, unrefined MRI datasets. This toolkit enables precise sequence classification and delivers performance comparable to systems trained on large, meticulously curated datasets. Utilizing lightweight model architectures and incorporating a voting ensemble method, the toolkit enhances accuracy and stability. It achieves a 99% accuracy rate using only 10% of the data typically required in other research. The code is available at https://github.com/JinqianPan/MRISeqClassifier.
Speaker(s):
Jie Xu, PhD
University of Florida
Author(s):
Qi Chen, MD, MS - The 2nd Affiliated Hospital and Yuying Children's Hospital of WMU, Wenzhou, Zhejiang, China; Chengkun Sun, MS - University of Florida; Renjie Liang, MS - University of Florida; Jiang Bian, PhD - University of Florida; Jie Xu, PhD - University of Florida;
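The abstract above mentions a voting ensemble over lightweight models. The sketch below shows what plain majority ("hard") voting over per-image labels could look like. It assumes hard voting and uses made-up model outputs; the toolkit's actual voting rule and interfaces are in the linked repository and may differ. The names `majority_vote` and `ensemble_predict` are hypothetical.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label chosen by the most models.
    Ties go to the label of the earliest-listed model among the tied labels."""
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label

def ensemble_predict(per_model_predictions):
    # per_model_predictions[m][i] is model m's predicted sequence for image i.
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]

# Made-up outputs of three hypothetical classifiers on four images.
model_outputs = [
    ["T1", "T2", "FLAIR", "T1"],
    ["T1", "T2", "T2", "T2"],
    ["T1", "FLAIR", "FLAIR", "T1"],
]
final = ensemble_predict(model_outputs)
print(final)
```

Each image receives the label that at least two of the three models agree on, so a single model's mistake on an image does not change the final label.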
Leveraging SNOMED CT for patient cohort identification over heterogeneous EHR data
Presentation Time: 11:00 AM - 11:15 AM
Abstract Keywords: Ontologies, Data/System Integration, Standardization and Interoperability, Cohort Discovery
Primary Track: Clinical Research Informatics
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
SNOMED CT is extensively employed to standardize data across diverse patient datasets and support cohort identification, with studies revealing its benefits and challenges. In this work, we developed a SNOMED CT-driven cohort query system over a heterogeneous Optum® de-identified COVID-19 Electronic Health Record dataset leveraging concept mappings between ICD-9-CM/ICD-10-CM and SNOMED CT. We evaluated the benefits and challenges of using SNOMED CT to perform cohort queries based on both query code sets and actual patients retrieved from the database, leveraging the original ICD-9-CM and ICD-10-CM as baselines. Manual review of 80 random cases revealed 65 cases containing 148 true positive codes and 25 cases containing 63 false positive codes. The manual evaluation also revealed issues in code naming, mappings, and hierarchical relations. Overall, our study indicates that while the SNOMED CT-driven query system holds considerable promise for comprehensive cohort queries, careful attention must be given to the challenges of falsely included codes and patients.
Speaker(s):
Xubing Hao, Bachelor's degree
The University of Texas Health Science Center at Houston
Author(s):
Xubing Hao, Bachelor's degree - The University of Texas Health Science Center at Houston; Yan Huang, Ph.D - UT Health Science Center; Licong Cui, PhD - The University of Texas Health Science Center at Houston (UTHealth Houston); Xiaojin Li, Ph.D. - UTHealth;
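To make the idea in the abstract above concrete, the following sketch expands a query concept to its hierarchy descendants, maps those to ICD codes, and matches patients. It is an illustration under stated assumptions, not the authors' system: every identifier (`SCT:A`, `ICD10:X1`, and so on) is a placeholder rather than a real SNOMED CT or ICD code, and the function names and dictionary layout are hypothetical.

```python
def descendants(concept, children):
    """Return the concept plus everything below it in the is-a hierarchy."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen

def icd_codes_for(concept, children, snomed_to_icd):
    """Collect the ICD codes mapped to the concept or any of its descendants."""
    codes = set()
    for c in descendants(concept, children):
        codes.update(snomed_to_icd.get(c, []))
    return codes

def find_cohort(concept, children, snomed_to_icd, patient_codes):
    """Return the IDs of patients with at least one matching ICD code."""
    wanted = icd_codes_for(concept, children, snomed_to_icd)
    return sorted(pid for pid, codes in patient_codes.items()
                  if wanted & set(codes))

# Placeholder hierarchy, mappings, and patient records (not real codes).
children = {"SCT:A": ["SCT:A1", "SCT:A2"], "SCT:A1": ["SCT:A1a"]}
snomed_to_icd = {
    "SCT:A1": ["ICD10:X1"],
    "SCT:A1a": ["ICD9:Y1"],
    "SCT:A2": ["ICD10:X2"],
    "SCT:B": ["ICD10:Z9"],
}
patients = {
    "p1": ["ICD10:X1"],
    "p2": ["ICD9:Y1", "ICD10:Z9"],
    "p3": ["ICD10:Z9"],
    "p4": ["ICD10:X2"],
}
cohort = find_cohort("SCT:A", children, snomed_to_icd, patients)
print(cohort)
```

Because the query relies entirely on the hierarchy and the mapping table, one incorrect is-a relation or mapping entry would pull extra codes, and therefore extra patients, into the result, which is the kind of false inclusion the abstract warns about.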
Correcting for Case-Mix Shift when Developing Clinical Prediction Models
Presentation Time: 11:15 AM - 11:30 AM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data-Driven Research and Discovery, Clinical Decision Support for Translational/Data Science Interventions
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
When developing a clinical prediction model (CPM), a case-mix shift could occur in the development dataset where the distribution of individual predictors changes, potentially affecting the model performance. This study exploits the case-mix shift that is already observed in the development dataset to address the case-mix shift between the development and deployment phase of a CPM.
Speaker(s):
Haya Elayan, PhD
The University of Manchester
Author(s):
Haya Elayan, PhD - The University of Manchester;
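The abstract above does not describe the correction method, so the sketch below only illustrates the underlying notion of a case-mix shift: a change in the distribution of a predictor within the development data. It compares an earlier and a later slice with a standardized mean difference. The choice of metric, the 0.1 flagging threshold (a common rule of thumb, used here as an assumption), and the toy ages are illustrative and not taken from the study.

```python
import math
from statistics import mean, variance

def standardized_mean_difference(earlier, later):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(earlier) + variance(later)) / 2)
    if pooled_sd == 0:
        raise ValueError("both samples are constant; SMD is undefined")
    return (mean(later) - mean(earlier)) / pooled_sd

# Made-up patient ages from an early and a late period of a development dataset.
early_ages = [60, 62, 64, 66, 68]
late_ages = [70, 72, 74, 76, 78]

smd = standardized_mean_difference(early_ages, late_ages)
shift_flagged = abs(smd) > 0.1  # assumed rule-of-thumb threshold
print(round(smd, 3), shift_flagged)
```

A large value for a predictor signals that the later patients differ from the earlier ones on that predictor, which is the situation the abstract proposes to exploit.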
A Machine Learning Model to Predict the Use of Evidence-Based Practice in the Intensive Care Unit
Presentation Time: 11:30 AM - 11:45 AM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Machine Learning, Generative AI, and Predictive Modeling, Data-Driven Research and Discovery
Primary Track: Clinical Research Informatics
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
We examined the role of a machine learning model in predicting the use of evidence-based practice in health care. Using multi-hospital EHR data, we trained an XGBoost model to predict whether ICU patients would receive guideline-recommended care during their next hospital day. The model displayed good performance. Our results could increase the usefulness of clinical decision support by enabling customized recommendations based on individual patient need.
Speaker(s):
John Minturn, MAS
University of Pittsburgh
Author(s):
Andrew King, PhD - University of Pittsburgh; Billie Davis, PhD - University of Pittsburgh; Lu Tang, PhD - University of Pittsburgh; Leigh Bukowski, MPH - University of Pittsburgh; John Zimmerman, MDes - Carnegie Mellon University; Jeremy Kahn, MD MS - University of Pittsburgh;
A Comparative Analysis of Patient Similarity Measures for Outcome Prediction
Presentation Time: 11:45 AM - 12:00 PM
Abstract Keywords: Data-Driven Research and Discovery, Informatics Research/Biomedical Informatics Research Methods, Secondary Use of EHR Data
Primary Track: Clinical Research Informatics
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Personalized medicine aims to improve clinical outcomes by tailoring treatments to individual patients based on genetic, phenotypic, or psychosocial characteristics, leveraging insights from similar patients. This is particularly necessary for managing diseases with significant variability in their causes, progressions and prognoses. Accurate measurement of patient similarity is crucial in this context, as it enables the identification of a high-quality cohort of similar patients, thereby enhancing clinical decision making with better evidence. However, previous studies have not comprehensively compared different patient similarity measures in large-scale retrospective analyses of electronic health records (EHRs). In this study, we conducted a comparative analysis of four patient similarity measures focusing on feature weighting mechanisms, using EHR data from 46,968 hospitalized patients. For evaluation, we assessed the patient similarity measures for predicting acute kidney injury, readmission, and mortality. Our results showed that using grid-searched weights to combine features based on their types outperformed all other methods.
Speaker(s):
Deyi Li, Master of Science
University of Florida
Author(s):
Deyi Li, Master of Science - University of Florida; Alan Yu, MD - Division of Nephrology and Hypertension and the Kidney Institute, University of Kansas Medical Center, Kansas City, Kansas; Mei Liu, PhD - University of Florida;
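The abstract above reports that grid-searched weights for combining features by type performed best. A minimal sketch of that general idea follows, assuming two feature types (numeric labs and a set of diagnosis labels), a single weight between them, and leave-one-out nearest-neighbor accuracy as the selection criterion. The similarity functions, the criterion, the field names, and the four toy patients are all assumptions for illustration; the study's actual measures and evaluation are not given in the abstract.

```python
import math

def jaccard(a, b):
    """Set overlap for categorical features such as diagnosis labels."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def numeric_similarity(x, y):
    """Map Euclidean distance between numeric features into (0, 1]."""
    return 1.0 / (1.0 + math.dist(x, y))

def combined_similarity(p, q, w_labs):
    # One weight per feature type; the two weights sum to 1.
    return (w_labs * numeric_similarity(p["labs"], q["labs"])
            + (1 - w_labs) * jaccard(p["dx"], q["dx"]))

def loo_accuracy(patients, w_labs):
    """Leave-one-out: does each patient's most similar peer share the outcome?"""
    hits = 0
    for i, p in enumerate(patients):
        others = [q for j, q in enumerate(patients) if j != i]
        nearest = max(others, key=lambda q: combined_similarity(p, q, w_labs))
        hits += nearest["outcome"] == p["outcome"]
    return hits / len(patients)

def grid_search(patients, grid):
    # Ties keep the earliest weight in the grid.
    return max(grid, key=lambda w: loo_accuracy(patients, w))

# Toy patients: diagnoses track the outcome, the lab value does not.
patients = [
    {"labs": (1.0,), "dx": {"d1", "d2"}, "outcome": 1},
    {"labs": (5.0,), "dx": {"d1", "d2"}, "outcome": 1},
    {"labs": (1.2,), "dx": {"d3"}, "outcome": 0},
    {"labs": (5.2,), "dx": {"d3"}, "outcome": 0},
]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
best_w = grid_search(patients, grid)
print(best_w, loo_accuracy(patients, best_w))
```

In this toy data, weighting the lab value heavily pairs each patient with a peer who has the opposite outcome, so the search settles on a weight that favors the diagnosis features. With real EHR data the grid would span more feature types and the criterion would be a held-out prediction metric.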