Times are displayed in (UTC-08:00) Pacific Time (US & Canada)
11/12/2024 | 8:30 AM – 10:00 AM | Continental Ballroom 8-9
S63: Prediction Algorithms - Dance to the (algo)rhythm
Presentation Type: Oral
Session Chair:
Imon Banerjee, PhD - Arizona State University, Mayo Clinic
Derivation and Experimental Performance of Standard and Novel Uncertainty Calibration Techniques
Presentation Time: 08:30 AM - 08:45 AM
Abstract Keywords: Machine Learning, Deep Learning, Data Mining, Evaluation
Primary Track: Foundations
Programmatic Theme: Clinical Informatics
To aid in the transparency of state-of-the-art machine learning models, considerable research has been performed in uncertainty quantification (UQ). UQ aims to quantify what a model does not know by measuring variation of the model under stochastic conditions, and it has been demonstrated to be a potentially powerful tool for medical AI. Evaluation of UQ, however, is largely constrained to visual analysis. In this work, we expand upon the Rejection Classification Index (RC-Index) and introduce the relative RC-Index (rRC-Index) as measures of uncertainty based on rejection classification curves. We hypothesize that rejection classification curves can be used as a basis to derive a metric of how well a given arbitrary uncertainty quantification metric can identify potentially incorrect predictions by an ML model. We compare the RC-Index and rRC-Index to established measures based on lift curves.
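The rejection classification curve underlying these indices can be illustrated with a short sketch. This is not the authors' implementation: it simply discards the most-uncertain predictions first and tracks accuracy on the retained set; a summary index such as the RC-Index discussed above would then be derived from this curve.

```python
# Illustrative sketch (not the paper's code): build a rejection
# classification curve by rejecting the most-uncertain predictions first
# and measuring accuracy on what remains.

def rejection_curve(y_true, y_pred, uncertainty):
    """Return (rejection_fraction, retained_accuracy) pairs."""
    n = len(y_true)
    # Sort ascending by uncertainty, so the retained prefix is always
    # the least-uncertain subset.
    order = sorted(range(n), key=lambda i: uncertainty[i])
    curve = []
    for kept in range(n, 0, -1):
        idx = order[:kept]
        acc = sum(y_true[i] == y_pred[i] for i in idx) / kept
        curve.append(((n - kept) / n, acc))
    return curve

# Toy example: uncertainty is highest on the misclassified cases, so
# retained accuracy rises as more predictions are rejected.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
unc    = [0.1, 0.2, 0.1, 0.9, 0.8, 0.3]
curve = rejection_curve(y_true, y_pred, unc)
```

If the uncertainty metric is informative, this curve dominates a random-rejection baseline, which is the intuition a curve-based index summarizes.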
Speaker(s):
Katherine (Katie) Brown, PhD
Vanderbilt University Medical Center
Author(s):
Katherine (Katie) Brown, PhD - Vanderbilt University Medical Center; Steven Talbert, PhD - University of Central Florida; Douglas Talbert, PhD - Tennessee Tech University;
A generative foundation model for structured patient trajectory data
Presentation Time: 08:45 AM - 09:00 AM
Abstract Keywords: Bioinformatics, Clinical Decision Support, Deep Learning, Large Language Models (LLMs), Knowledge Representation and Information Modeling, Real-World Evidence Generation, Internal Medicine or Medical Subspecialty, Data Mining
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Advancements in artificial intelligence have propelled the implementation of general-purpose multitasking agents called foundation models. However, it has been challenging for foundation models to handle structured longitudinal medical data due to the mixed data types and variable timestamps in these data. Acquiring sufficiently large training data is another obstacle. This study proposes a generative foundation model to manage patient trajectory data of variable lengths with mixed data types (categorical and continuous variables). Additionally, we propose a data pipeline to supply real-world data large enough to support foundation models. We locally obtained a large clinical dataset with a reproducible data pipeline scheme that leveraged a national HL7 message standard. Our trained model acquired the ability to suggest clinically relevant medical concepts and continuous variables for general purposes. The model also synthesized a database of more than 10,000 realistic patient trajectories. Our results suggest promising future downstream clinical applications of the foundation model.
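The core difficulty named above (mixed categorical/continuous data with variable timestamps) is often handled by serializing a trajectory into a discrete token sequence. The sketch below is one hypothetical way to do that, not the paper's scheme: token names, the gap encoding, and the glucose bin edges are all illustrative.

```python
# Hypothetical serialization of a mixed-type patient trajectory
# (categorical diagnoses + continuous labs + variable time gaps) into one
# token sequence suitable for a generative model. All token vocabularies
# and bin edges here are made up for illustration.

def bin_value(value, edges):
    """Map a continuous value to a discrete bin index."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def encode_trajectory(events):
    """events: list of (days_since_previous_event, kind, payload)."""
    glucose_edges = [70, 100, 126, 200]  # illustrative mg/dL cut points
    tokens = []
    for gap_days, kind, payload in events:
        tokens.append(f"<gap:{gap_days}d>")        # variable time step
        if kind == "dx":                           # categorical event
            tokens.append(f"<dx:{payload}>")
        elif kind == "lab_glucose":                # continuous, binned
            tokens.append(f"<glu_bin:{bin_value(payload, glucose_edges)}>")
    return tokens

traj = [(0, "dx", "E11.9"), (30, "lab_glucose", 154.0), (7, "lab_glucose", 98.0)]
tokens = encode_trajectory(traj)
```

Binning continuous values trades precision for a uniform discrete vocabulary; explicit gap tokens let the model see irregular visit spacing directly.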
Speaker(s):
Yu Akagi, M.D.
The University of Tokyo
Author(s):
Yu Akagi, M.D. - The University of Tokyo; Tomohisa Seki, M.D., Ph.D. - The University of Tokyo Hospital; Yoshimasa Kawazoe, M.D., Ph.D. - The University of Tokyo; Toru Takiguchi, M.D., Ph.D. - The University of Tokyo; Kazuhiko Ohe, M.D. - The University of Tokyo Hospital;
Integrating Multi-sensor Time-series Data for ALSFRS-R Clinical Scale Predictions in an ALS Patient Case Study
Presentation Time: 09:00 AM - 09:15 AM
Abstract Keywords: Ubiquitous Computing and Sensors, Machine Learning, Personal Health Informatics, Patient / Person Generated Health Data (Patient Reported Outcomes)
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
Clinical tools for measuring disease progression in amyotrophic lateral sclerosis (ALS) rely on in-clinic assessment, thus limiting the frequency of measurement and potentially delaying needed treatments. The ALS Functional Rating Scale-Revised (ALSFRS-R) instrument, the gold standard for quantifying disease progression in ALS patients, can be subjective and does not capture day-to-day variability in function. As such, clinicians may be missing subtle yet critical shifts in patient health status, pointing to the need for more objective and continuous monitoring methods. In-home sensor technologies could supplement traditional clinical instruments with more frequent and quantitative measurements that serve as early indicators of changes in function. This study evaluates methodologies for integrating clinician-scored scales obtained at one-month intervals with daily sensor-based health parameter estimates to build predictive models using participant case study data. Using the XGBoost regressor in single base learning, we test the usability of interpolation on low-frequency monthly ALSFRS-R assessments to align them with high-frequency sensor data features. Model error rates are evaluated to determine the suitability of sensor-based features as predictors for estimating component and composite scores. We find a mean RMSE of 0.276 across 9 ALSFRS-R sub-scale predictive models and an RMSE of 2.984 for predicting the composite ALSFRS-R score. Among the 10 models, the lowest-RMSE fits used backward fill (3), exponential (3), sigmoid (1), inverse exponential (1), and cubic spline (1) interpolation.
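The alignment step described above can be sketched minimally: interpolate monthly clinic scores onto a daily grid, then score a model's daily predictions with RMSE. Linear interpolation stands in here for the several interpolation types the study compares; the visit dates, scores, and predictions are invented for illustration.

```python
# Minimal sketch, assuming linear interpolation (one of several types the
# study evaluates). All numbers below are hypothetical.
import math

def interpolate_daily(assess_days, scores, day):
    """Linearly interpolate a clinic-visit score onto an arbitrary day."""
    for (d0, s0), (d1, s1) in zip(zip(assess_days, scores),
                                  zip(assess_days[1:], scores[1:])):
        if d0 <= day <= d1:
            t = (day - d0) / (d1 - d0)
            return s0 + t * (s1 - s0)
    raise ValueError("day outside assessment range")

def rmse(truth, preds):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, preds)) / len(truth))

assess_days = [0, 30, 60]      # monthly clinic visits
scores      = [40, 37, 36]     # composite ALSFRS-R at each visit
daily_truth = [interpolate_daily(assess_days, scores, d) for d in range(0, 61, 15)]
daily_preds = [40.1, 38.2, 37.2, 36.6, 35.8]   # hypothetical model output
error = rmse(daily_truth, daily_preds)
```

The choice of interpolant is exactly what the study varies: backward fill assumes a score holds until the next visit, while spline and exponential forms assume smoother decline between visits.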
Speaker(s):
Noah Marchal, M.S.
University of Missouri
Author(s):
Noah Marchal, M.S. - University of Missouri; William Janes, OTD, MSCI, OTR/L - University of Missouri; Juliana Earwood, OTD, OTR/L - University of Missouri; Abu Mosa, PhD, MS, FAMIA - University of Missouri School of Medicine; Mihail Popescu, PhD - University of Missouri; Marjorie Skubic, PhD - University of Missouri; Xing Song, PhD - University of Missouri;
Bayesian Priors From Large Language Models Make Clinical Prediction Models More Interpretable
Presentation Time: 09:15 AM - 09:30 AM
Abstract Keywords: Machine Learning, Large Language Models (LLMs), Rule-based artificial intelligence, Natural Language Processing
Primary Track: Foundations
Programmatic Theme: Clinical Informatics
Training clinical machine learning (ML) models on thousands of features extracted from the electronic health record (EHR) without any clinical curation can often lead to models that rely on spurious and clinically irrelevant features. On the other hand, because it is infeasible to have clinical experts review thousands of features, clinically curated feature sets are often sparse, and models trained using such features usually lack predictive power. We propose to leverage large language models (LLMs) to mimic the clinician's input: we use an LLM to score the clinical relevance of EHR features and encode this information as a Bayesian prior for training a clinical ML model. In a case study training readmission risk prediction models, we show that this principled approach to integrating LLM-generated clinical priors yields models with high predictive power and far more interpretable feature sets.
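The general mechanism can be sketched as follows, without claiming to reproduce the paper's model: a per-feature relevance score (which the paper obtains from an LLM; hard-coded here) is turned into a Gaussian prior, i.e., a per-feature L2 penalty, so low-relevance features are shrunk toward zero while high-relevance features are penalized lightly.

```python
# Hedged sketch of relevance-weighted priors, not the paper's method.
# Each feature j gets its own L2 penalty derived from a (here hard-coded,
# hypothetically LLM-produced) clinical relevance score.

def fit_ridge_per_feature(X, y, penalties, lr=0.05, steps=2000):
    """Minimize sum_i (x_i . w - y_i)^2 + sum_j penalties[j] * w_j^2
    by plain gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(steps):
        # Gradient of the prior (penalty) term.
        grad = [2 * penalties[j] * w[j] for j in range(p)]
        # Gradient of the squared-error likelihood term.
        for xi, yi in zip(X, y):
            err = sum(xi[j] * w[j] for j in range(p)) - yi
            for j in range(p):
                grad[j] += 2 * err * xi[j]
        w = [w[j] - lr * grad[j] / n for j in range(p)]
    return w

# Two perfectly collinear features: only the prior can break the tie.
# Feature 0 is "clinically relevant" (weak prior), feature 1 is not.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [1.0, 2.0, 3.0, 4.0]
relevance = [0.9, 0.1]                          # hypothetical LLM scores
penalties = [10 * (1 - r) for r in relevance]   # ~1.0 vs ~9.0
w = fit_ridge_per_feature(X, y, penalties)
```

Because the two features carry identical signal, an unregularized fit could place weight on either; the relevance-derived prior concentrates the weight on the clinically plausible feature, which is the interpretability gain the abstract describes.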
Speaker(s):
Avni Kothari, MS
UCSF
Author(s):
Jean Feng, PhD; Lucas Zier, MD - UCSF; Seth Goldman, MD - UCSF; Daniel Bennett, MD - UCSF; Elizabeth Connelly, MPH - UCSF; James Marks, PhD - UCSF;
Which Roads Lead to Rome? Evaluating Methods for Incorporating AI Risk Prediction Models into Electronic Health Records
Presentation Time: 09:30 AM - 09:45 AM
Abstract Keywords: Clinical Decision Support, Informatics Implementation, Machine Learning
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Increasingly, AI predictive models are integrated into the electronic health record (EHR) for clinical decision support. Using a validated predictive model for postpartum depression (PPD) as a use case, we compared implementation strategies between Epic Nebula and other cloud platforms such as Microsoft Azure in terms of workforce needs, model types, data requirements, maintenance, deployment speed, user interface, security, and transferability. Our findings may inform infrastructure planning for future AI implementation studies.
Speaker(s):
John Gossey, MD, MS, MPH
Weill Cornell Medicine
Author(s):
John Travis Gossey, MD - Weill Cornell Medicine; Sam Kamara, N/A - NewYork-Presbyterian Hospital; Yiye Zhang, PhD - Weill Cornell Medicine;