3/11/2025 | 5:00 PM – 6:30 PM | William Penn Ballroom
Poster Session 1
Presentation Type: Poster
Leveraging Large Language Models for Named Entity Recognition of Anxiety and Nausea and Vomiting in Patients with Cancer
Poster Number: P01
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Biomarker Discovery and Development, Bioimaging Techniques and Applications, Data Mining and Knowledge Discovery, Natural Language Processing, Open Science for Biomedical Research and Translational Medicine, Ontologies, Proactive Machine Learning and Reinforcement Learning, Citizen Science and Democratization of AI and Informatics
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Cancer remains a major global health issue and is the second leading cause of death worldwide. Nausea and vomiting affect up to 40% of cancer patients, often resulting in anxiety due to interactions with treatments such as chemotherapy. The complex relationship between physical and psychological symptoms necessitates holistic care addressing both. Effective symptom management enhances patients' quality of life and treatment outcomes. While electronic health records (EHRs) contain valuable unstructured data, manual extraction of relevant information is labor-intensive. Named Entity Recognition (NER) can automate this process by transforming unstructured text into structured, analyzable data. This study examines the use of two large language models (LLMs), BERT and GPT, to identify two cancer symptoms (nausea/vomiting and anxiety) in clinical texts from 8,490 cancer patients. We developed Symptom-BERT and Symptom-GPT through additional pretraining of Bio-Clinical-BERT and GPT-2, followed by fine-tuning on a gold-standard corpus of 1,048 clinical texts. BIO tagging was applied for symptom classification, and the dataset was split 80/20 for training and testing. Symptom-BERT achieved the highest F1 scores for nausea/vomiting (0.989) and anxiety (0.912), outperforming GPT-based models. Analysis revealed that 39.31% of patients exhibited anxiety symptoms, while 28.69% showed signs of nausea/vomiting. Our results demonstrate that further pretraining of LLMs significantly enhances performance, especially for physical symptoms. However, psychological symptoms like anxiety remain more challenging to detect due to their subjective nature. This approach reduces manual review efforts for clinicians, improves healthcare responsiveness, and enhances symptom management for cancer patients.
Speaker(s):
Author(s):
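The abstract above mentions BIO tagging without showing the scheme. As a minimal sketch, assuming token-level span annotations and hypothetical label names (`NV` for nausea/vomiting, `ANX` for anxiety; neither is taken from the poster), annotated spans could be converted to BIO tags like this:

```python
def spans_to_bio(tokens, spans):
    """Convert labeled token spans to BIO tags.

    spans: list of (start, end, label) in token indices, end exclusive.
    The first token of a span gets B-<label>, the rest get I-<label>,
    and every token outside a span gets O.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical sentence and annotations, for illustration only.
tokens = ["Patient", "reports", "nausea", "and", "vomiting",
          "and", "feels", "anxious", "."]
spans = [(2, 5, "NV"), (7, 8, "ANX")]
bio_tags = spans_to_bio(tokens, spans)
print(list(zip(tokens, bio_tags)))
```

A token classifier such as a BERT-style model would then be trained to predict one of these tags per token; how the authors aligned tags with subword tokens is not stated in the abstract.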
Discrepancies in Reported Results Between Trial Registries and Journal Articles for AI Clinical Research
Poster Number: P02
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Health Information and Biomedical Data Dissemination Strategies
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
In this comprehensive analysis of 28,248 AI clinical research publications and 3,710 pre-registered trials, we found only 26 trials that reported outcomes in both a trial registry and a publication, and none of them reported fully consistent results across the two sources, with discrepancies concentrated in secondary outcomes and adverse events. These issues underscore the pressing need for stringent reporting guidelines to ensure the integrity and reliability of AI clinical research.
Speaker(s):
Zixuan He, PhD Candidate
Peking University
Author(s):
Zixuan He, PhD Candidate - Peking University; Lan Yang, BSc - Peking University; Xiaofan Li, BSc - National University of Singapore; Jian Du - Peking University;
Towards Generalizable and Explainable Disease Prediction Models Using Causal Machine Learning
Poster Number: P03
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Machine Learning, Generative AI, and Predictive Modeling, Patient-centered Research and Care, Infectious Disease Modeling
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Machine learning models show promise in disease prediction, but ensuring they generalize to unseen populations and out-of-distribution data remains a challenge for real-world clinical impact. We used structure learning algorithms to identify causal relationships between parameters and supplied this causal structure as input to a graph neural network for training. Our model, trained on a US population and tested on a Chinese population, demonstrated strong generalizability and explainability in early sepsis prediction.
Speaker(s):
Ankit Gupta, Lead Scientist
Siemens Healthcare Private Limited
Author(s):
Ruchi Chauhan, MS - Siemens Healthcare Private Limited;
Describing Reasons for Nonactionable Alerts from a Clinical Decision Support System Generated by Artificial Intelligence (AI) in a Clinical Trial
Poster Number: P04
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Machine Learning, Generative AI, and Predictive Modeling, Clinical and Research Data Collection, Curation, Preservation, or Sharing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Integrating Clinical Research and Clinical Care Workflows
The "MEnD-AKI" trial uses a neural network-based deep learning model to create pharmacist alerts for nephrotoxin stewardship interventions. However, the use of AI has led to nonactionable alerts, highlighting the need to determine and address reasons for their occurrence. Out of 843 alerts, 77.6% were nonactionable due to missing information, information changes, coding challenges, and other issues. Improvements in AI-generated alerts are possible by first acknowledging the ways that nonactionable alerts are produced.
Speaker(s):
Tiffany Tran, PharmD
University of Pittsburgh School of Pharmacy
Author(s):
Tiffany Tran, PharmD - University of Pittsburgh School of Pharmacy; Nabihah Amatullah, PharmD, RPh - Cooperman Barnabas Medical Center; Britney Stottlemyer, PharmD - University of Pittsburgh; Caiden Lukan, PharmD - University of Pittsburgh; Tezcan Ozrazgat-Baslanti, PhD - University of Florida; Azra Bihorac, MD, MS - University of Florida College of Medicine; Sandra Kane-Gill, PharmD, MS, FCCM, FCCP - University of Pittsburgh;
Machine Learning Pipeline Flags Instances of Acute Respiratory Distress Syndrome from Electronic Health Records
Poster Number: P05
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Data Mining and Knowledge Discovery, Data-Driven Research and Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
This work introduces an automated diagnostic pipeline for Acute Respiratory Distress Syndrome (ARDS) in mechanically ventilated adult patients. Leveraging natural language processing and XGBoost models, this pipeline achieves high performance in retrospectively adjudicating ARDS, with a sensitivity of 93.5% and specificity of 73.9% on a subset of MIMIC-III encounters. This reproducible, accurate pipeline presents a valuable tool to enhance ARDS recognition and treatment strategies, potentially aiding clinicians in managing both routine and complex cases.
Speaker(s):
Felix Morales, MSc.
Vizient, Inc.
Author(s):
Feihong Xu, PhD candidate - Northwestern University; Hyojun Lee, PhD - S&P Global; Heliodoro Tejedor Navarro, MSc - Katoid; Meagan Bechel, MD, PhD - Emory University; Jesse Kelso, MD - Endeavor Health; Eryn Cameron, MD - Endeavor Health; Curtis Weiss, MD - Endeavor Health; Luis Nunes Amaral, PhD - Northwestern University;
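For readers less familiar with the two metrics reported above, the following sketch shows how sensitivity and specificity are computed from binary labels. The label vectors are made-up toy data, not the poster's results, and the function name is illustrative.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    sensitivity = TP / (TP + FN): share of true cases that were flagged.
    specificity = TN / (TN + FP): share of non-cases that were not flagged.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy adjudications (1 = ARDS, 0 = not ARDS), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A pipeline with high sensitivity and lower specificity, as reported here, misses few true cases at the cost of more false alarms.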
Embedding clinical guidelines into large language model for headache diagnosis decision support
Poster Number: P06
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Natural Language Processing, Patient-centered Research and Care
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Large language models (LLMs) have demonstrated their capabilities in various clinical scenarios. LLMs enhanced with retrieval-augmented generation (RAG) have shown promise as decision support tools in oncology, medication safety, and COVID-19 management. However, many implementations have neglected to incorporate available clinical guidelines as part of the knowledge base in RAG systems. Integrating these guidelines into LLMs provides an opportunity to bridge the gap created by the specialized terminology of headache management, thereby enhancing the model's decision-support capabilities. In this study, we explore the potential of an LLM enhanced with RAG and available headache guidelines for clinical decision support in headache management.
Speaker(s):
Yanshan Wang, PhD
University of Pittsburgh
Author(s):
Xizhi Wu, Master of Science - University of Pittsburgh; Justin Rousseau, MD, MMSc - University of Texas Southwestern Medical Center; Yifan Peng, PhD - Weill Cornell Medicine; Dept of Population Health Sciences; Div of Health Informatics; Yanshan Wang, PhD - University of Pittsburgh;
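The abstract above does not describe its retrieval setup, so the following is only a minimal sketch of the general RAG idea: rank stored guideline passages against a question, then place the best match in the prompt sent to an LLM. The bag-of-words cosine scorer, the placeholder passages, and the function names are all assumptions for illustration; a real system would more likely use dense embeddings and a vector store.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(passages,
                    key=lambda p: cosine(q, Counter(tokenize(p))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Assemble a prompt that grounds the question in retrieved text."""
    context = "\n".join(retrieve(query, passages, k))
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\n"
            "Answer using only the context above.")


# Placeholder passages standing in for guideline text (not real content).
passages = [
    "Placeholder guideline passage about migraine with aura criteria.",
    "Placeholder guideline passage about cluster headache attack duration.",
    "Placeholder guideline passage about medication overuse headache.",
]
query = "What are the criteria for migraine with aura?"
top = retrieve(query, passages, k=1)
prompt = build_prompt(query, passages, k=1)
print(prompt)
```

The prompt string would then be passed to whichever LLM is in use; that call is omitted here because the poster does not name a model or API.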
Enhancing Dietary Supplement Question Answer via Retrieval-Augmented Generation (RAG) with LLM
Poster Number: P07
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data Integration, Advanced Data Visualization Tools and Techniques, Data Mining and Knowledge Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
This study developed iDISK2.0, a comprehensive dietary supplement (DS) knowledge base. A RAG system was introduced to combine the knowledge base with an LLM, significantly enhancing the accuracy of DS-related question answering. Our evaluation demonstrates that iDISK2.0-RAG significantly outperforms standalone LLMs, achieving over 95% accuracy and offering precise and reliable information to inform research and healthcare decisions.
Speaker(s):
Author(s):
Yu Hou, PhD - University of Minnesota; Rui Zhang, PhD, FAMIA - University of Minnesota, Twin Cities;
Automatic Detection of Personally Identifiable Information in Korean Medical Records Using BERT-based Korean Language Models
Poster Number: P08
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data Security and Privacy, Natural Language Processing, Secondary Use of EHR Data
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Recent advancements in language models have generated significant interest in their application to medicine and healthcare. However, for the secondary use of real-world medical data, a comprehensive de-identification process is essential. To automate this process, we trained BERT-based Korean language models to detect 14 types of personally identifiable information (PII) in medical data. Among the models, DistilKoBERT achieved the highest F1 score for most entities. Certain entities had too few cases in the training data to evaluate model performance, but this limitation could be mitigated through data augmentation in future work.
Speaker(s):
Sujeong Lee, MSc
Sungkyunkwan University
Author(s):
Sujeong Lee, MSc - Sungkyunkwan University; Wonchul Cha - Samsung Medical Center;
A Scalable Framework for Accurately Retrieving Structured EHR Data for Critically Ill Patients Using Open-Source Large Language Models
Poster Number: P09
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data/System Integration, Standardization and Interoperability, Machine Learning, Generative AI, and Predictive Modeling, Data Security and Privacy, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Emerging Best Practices for Clinical Research Informatics Operations
In the fast-paced environment of the intensive care unit (ICU), timely access to specific pieces of information is critical for making appropriate medical decisions. Unfortunately, these data are often scattered throughout multiple sections of the electronic health record (EHR), making them hard to locate and easy to overlook. Large language models (LLMs) are a form of deep learning trained on vast quantities of information, enabling them to understand context and return outputs that match user intents. These capabilities, however, are hindered by the propensity for LLMs to generate inaccurate information (hallucinations). In this analysis, we investigate the feasibility of using LLMs to query legacy EHR infrastructure and return complete and accurate results for the exemplary case of critically ill ICU patients with severe infections. In these preliminary results, the LLM was able to extract demographic, vital sign, laboratory and medication information for a subset of 250 randomly selected patients in a manner that generated identical results to human-based extractions. This novel approach demonstrates the feasibility of using LLMs to access structured EHR data in a reliable manner that could be used for augmented clinical decision support, workflow efficiency and research data extractions.
Speaker(s):
Weiwei Ma, Master of Science
Washington University in St. Louis, Michelson, Andrew
Author(s):
Weiwei Ma, Master of Science - Washington University in St. Louis, Michelson, Andrew;
Creatinine Prediction Using Deep Learning Methods
Poster Number: P10
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data-Driven Research and Discovery, Machine Learning, Generative AI, and Predictive Modeling, Secondary Use of EHR Data, Clinical Decision Support for Translational/Data Science Interventions
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Narrow-therapeutic-index drugs that undergo renal clearance necessitate careful monitoring due to risks of adverse effects. A one-compartment pharmacokinetics deep-learning-based model was developed to predict serum creatinine level as the key renal function indicator. Our cohort, from Memorial Hermann Health System, included 9,710 patients. Our gated recurrent unit-based model achieved a Root Mean Square Error (RMSE) of 0.594. As acute kidney injury is defined as an increase in serum creatinine of 0.3 mg/dL or more, this RMSE of 0.594 is promising for prediction of acute changes in renal function with further model refinement.
Speaker(s):
Merlyn Joseph, PhD Student
UTHealth Houston
Author(s):
Merlyn Joseph, PhD Student - UTHealth Houston; Ziqian Xie, PhD - The University of Texas Health Science Center at Houston (UTHealth); Laila Rasmy, PhD, MSc, MBA, RPh. - UTHealth MSBMI; Degui Zhi, Ph.D. - The University of Texas Health Science Center at Houston (UTHealth) McWilliams School of Biomedical Informatics; Masayuki Nigo, MD - Houston Methodist;
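The RMSE quoted in the abstract above is the square root of the mean squared difference between observed and predicted values. The sketch below computes it on made-up numbers; the values are illustrative only and are not drawn from the study cohort.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative serum creatinine values, not real patient data.
observed = [1.0, 1.5, 2.0, 0.5]
predicted = [1.0, 1.0, 2.0, 1.0]
error = rmse(observed, predicted)
print(f"RMSE = {error:.3f}")
```

Because RMSE is in the same units as the predicted quantity, it can be compared directly with a clinical threshold expressed in those units.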
Correcting Uncertainty Quantification in Variable-Length Sequence Models
Poster Number: P11
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data-Driven Research and Discovery, Data Literacy and Numeracy, Data Mining and Knowledge Discovery, EHR-based Phenotyping, Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing, Measuring Outcomes
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Uncertainty quantification is vital for ensuring the reliability of deep learning models. However, models such as Bag of Words, Word2Vec, and Transformers often exhibit overconfidence when handling variable-length sequences, particularly after truncation, resulting in reduced posterior variance in predictions. In this study, we empirically demonstrate that non-padding mean pooling is a simple yet effective strategy that significantly improves uncertainty quantification, aligning posterior variance with theoretical distributions on simulated datasets.
Speaker(s):
Jason Ma, Master
Duke University
Author(s):
Zhicheng Ma, Bachelor - Duke University; Boyao Li, Master - Duke University; Matthew Engelhard, PhD - Duke University; Samuel Berchuck, PhD - Duke University;
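The abstract above names non-padding mean pooling but gives no implementation details. One common reading, sketched here under that assumption, is to average token vectors over real tokens only, using a 0/1 mask to exclude padding positions. The function names and toy vectors are illustrative and not taken from the poster.

```python
def masked_mean_pool(embeddings, mask):
    """Average token vectors over positions where mask is 1.

    embeddings: list of equal-length float lists, one per position.
    mask: list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    dim = len(embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(embeddings, mask):
        if keep:
            count += 1
            for j in range(dim):
                total[j] += vec[j]
    if count == 0:
        raise ValueError("sequence has no non-padding tokens")
    return [x / count for x in total]


def naive_mean_pool(embeddings):
    """Average over every position, padding included (for comparison)."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(vec[j] for vec in embeddings) / n for j in range(dim)]


# Two real tokens followed by two zero-vector padding positions.
embeddings = [[2.0, 4.0], [4.0, 8.0], [0.0, 0.0], [0.0, 0.0]]
mask = [1, 1, 0, 0]
pooled = masked_mean_pool(embeddings, mask)
naive = naive_mean_pool(embeddings)
print(pooled, naive)
```

In this toy case the naive average is pulled toward zero by the padding, so the pooled representation depends on how much padding a sequence received rather than only on its content.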
Utilizing Retrieval-Augmented Generation (RAG) to Create Training Plans for K-Award Applicants
Poster Number: P12
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Education and Training, Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Research Career Development Awards (“K awards”) provide junior faculty resources, training, mentorship, and opportunities to conduct research projects that will help them transition to independent investigators. The “training plan” section of the application often becomes an obstacle for candidates, who struggle to format the section clearly or include training activities that support their research goals. We implemented retrieval-augmented generation (RAG) using open-source frameworks to assist in the generation of this text.
Speaker(s):
Diane Keogh, BA
Harvard Medical School
Author(s):
Niteesh Choudhry, MD, PhD - Mass General Brigham; Douglas MacFadden, MS - Harvard Medical School;
Towards Equitable Predictions: A Group Conditional Concordance Index to Quantify Fairness in Time-to-Event Prognostication
Poster Number: P13
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Fairness and Disparity Research in Health Informatics, Ethical, Legal, and Social Issues, Real-World Evidence and Policy Making
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Fairness metrics are essential tools for quantifying and mitigating unfairness in algorithms. Currently, most fairness metrics focus on binary classification, while less attention is given to time-to-event settings. In this work, we propose a new metric by conditioning the Concordance Index (CI) on group status. This group fairness metric aligns with the concept of separation and integrates naturally with the prediction of right-censored events. We showed that our metric is a weighted average of CI and constructed an unbiased estimator using Inverse Probability of Censoring Weighting. We also conducted two case studies: (1) evaluating popular survival analysis models constructed on harmonized data from the Framingham Offspring, MESA, and ARIC studies, and (2) validating existing cardiovascular disease prediction models on an Electronic Health Record (EHR) database. The results indicate that these models exhibit bias across different demographic groups. Our metric effectively captures this bias and provides a straightforward interpretation. We also investigated how different components' prediction errors affect our metric as well as calibration. Because our metric measures only differences in ranking accuracy across groups, we believe it needs to be combined with clinical knowledge to comprehensively assess fairness.
Speaker(s):
Haoyuan Wang, Bachelor
Duke
Author(s):
Matthew Engelhard, PhD, MD - Duke University School of Medicine; Chuan Hong, PhD - Duke University; Michael Pencina, PhD - Duke University School of Medicine; Ricardo Henao, Ph.D - Duke University; Riddhiman Bhattacharya, Ph.D - Duke University; Daniel Wojdyla, Master - Duke Clinical Research Institute; Haoyuan Wang, Bachelor - Duke;
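As a rough illustration of comparing ranking accuracy across groups, the sketch below computes a plain Harrell's concordance index separately within each group. This is a simplified stand-in, not the authors' metric: it does not reproduce their group-conditional definition and omits the inverse-probability-of-censoring weights mentioned in the abstract. All names and data are hypothetical.

```python
from typing import Dict, Hashable, Sequence


def harrell_c(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[i] < times[j]:  # subject i failed before j's last follow-up
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        return float("nan")
    return concordant / comparable


def c_index_by_group(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
    groups: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Compute Harrell's C within each group (no censoring weights)."""
    result = {}
    for g in set(groups):
        idx = [k for k, label in enumerate(groups) if label == g]
        result[g] = harrell_c(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
    return result


# Hypothetical cohort: the model ranks group A perfectly and group B poorly.
times = [2, 1, 4, 3, 6, 5]
events = [1, 1, 1, 1, 0, 1]
risks = [0.9, 0.2, 0.5, 0.8, 0.1, 0.4]
groups = ["A", "B", "A", "B", "A", "B"]

per_group = c_index_by_group(times, events, risks, groups)
```

A gap between the per-group values, as in this toy cohort, is the kind of difference in ranking accuracy that a group fairness metric for time-to-event models is meant to surface.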
Using EWC Loss to Borrow from External Data & Improve Model Performance
Poster Number: P15
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data Integration, Secondary Use of EHR Data
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
When constructing a clinical predictive model on data from a target population, incorporating external data may improve the model’s performance. However, data use agreements often prohibit concatenating the external data to the target data. A deep neural network using an elastic weight consolidation loss function enables borrowing without concatenation. We show that this approach improves performance compared to a model from either the target or external datasets alone.
Speaker(s):
Jonathan Hui, Master of Biostatistics
Dr. Benjamin A. Goldstein
Author(s):
Jonathan Hui, Master of Biostatistics - Dr. Benjamin A. Goldstein; Benjamin Goldstein, Ph.D. - Duke University; Matthew Engelhard, PhD, MD - Duke University School of Medicine; Meng Xia, Ph.D. - Duke University;
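The elastic weight consolidation (EWC) loss mentioned in the abstract above is commonly written as the task loss plus a quadratic penalty, (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2, where theta* are parameters fitted on the other dataset and F_i is a diagonal Fisher information estimate. The sketch below shows that standard form only; it assumes the external model contributes theta* and F, which is one reading of how borrowing could work without concatenating records, and the names and numbers are hypothetical.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Quadratic EWC penalty pulling params toward anchor_params."""
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_loss(
    target_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Loss on the target data plus the EWC penalty."""
    return target_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)


# Hypothetical values: anchor and Fisher terms come from the external model.
current = [1.0, 2.0]
anchor = [0.0, 2.5]
fisher = [2.0, 4.0]

penalty = ewc_penalty(current, anchor, fisher, lam=0.5)    # 0.75
total = ewc_loss(0.3, current, anchor, fisher, lam=0.5)    # 0.3 + 0.75
```

Parameters with larger Fisher values are held closer to the anchor, while lam controls how strongly the target model borrows overall.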
Comparing LLM-Feature Selection and Conventional Approaches for Computational Phenotyping and Clinical Risk Prediction Algorithms
Poster Number: P16
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Secondary Use of EHR Data, Patient-centered Research and Care, Data Mining and Knowledge Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Feature selection plays a crucial role in the development of phenotyping and clinical risk prediction models, impacting accuracy, interpretability, and model generalizability. This study compares three feature selection methods: 1) inclusion of all available features, 2) using large language model (LLM)-selected features, and 3) selecting features based on data co-occurrence patterns (KESER algorithm).
The study evaluated the performance of these methods using a phenotyping model for Long COVID, applying models to electronic health records (EHR) and patient-reported symptoms from the RECOVER study. Feature selection methods were compared based on their influence on model performance, particularly area under the receiver operating characteristic curve (AUC-ROC) and precision-recall curves.
Preliminary results indicated that including all features resulted in high-dimensional models with potential overfitting but provided comprehensive data coverage. LLM-based feature selection produced more compact models with competitive accuracy, leveraging pre-trained knowledge to identify relevant clinical features. The data co-occurrence approach offered a balance between model complexity and performance by reducing the feature set while preserving clinically relevant relationships.
Each method presented unique advantages and trade-offs. Comprehensive feature inclusion increased data utilization but risked overfitting, while LLM and co-occurrence-based selection methods demonstrated promise for improving performance and interpretability. Future work will focus on refining these approaches and assessing their generalizability across different clinical contexts.
Speaker(s):
Victor Castro, MS
Mass General Brigham
Author(s):
Vivian Gainer, MS - Mass General Brigham; Nich Wattanasin, MS - Mass General Brigham; Michael Mendis - Mass General Brigham; Barbara Benoit, BS - Mass General Brigham; Andy Cagan, BS - Mass General Brigham; Ana Holzbach, PhD - Mass General Brigham; Shawn Murphy, MD, Ph.D. - Massachusetts General Hospital;
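Since the abstract above compares feature selection methods by AUC-ROC, a small sketch of that metric may help. It uses the pairwise definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half). The labels and scores are hypothetical and do not come from the study.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # tied scores count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for five patients (1 = has the phenotype).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.6, 0.7, 0.3, 0.5]

auc = auc_roc(labels, scores)  # 5 of 6 positive/negative pairs ranked correctly
```

Running the same function on scores from models built with different feature sets gives directly comparable numbers.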
Use of LLMs for Crosswalk and Validation of Medical Codes
Poster Number: P17
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Data Integration, Data Transformation/ETL, Informatics Research/Biomedical Informatics Research Methods, Ontologies, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Proactive Machine Learning in Biomedical Applications: The Power of Generative AI and Reinforcement Learning
Mapping medical codes between different ontologies can be challenging. First, a single code may be linked to multiple codes with the same chapter code. Second, a single code may be linked to multiple chapter codes. We present how Large Language Models and other Natural Language Processing techniques can be used for medical code translation to enhance crosswalk mappings.
Speaker(s):
Carlos Tavarez Martinez, Master
Memorial Sloan Kettering Cancer Center
Author(s):
Nadia Bahadur, Masters of Clinical Research - Memorial Sloan Kettering Cancer Center; John Philip, MS - Memorial Sloan Kettering Cancer Center; Andrew Niederhausern - MSKCC; Gilan El Saadawi - 3M M*Modal; Gary Wallace, BS - Realyze Intelligence Inc;
Borrowing from the Future: Early-Stage Enhancement through Contrastive Regularization in a Multimodal Discrete Failure Time Model
Poster Number: P18
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, EHR-based Phenotyping, Data Mining and Knowledge Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
Risk assessments made at a later time point will perform better, since more information has been collected and the patient is approaching the potential onset of the clinical outcome. The data observed at a later time point may dominate the clinical prediction task. We develop a contrastive-learning-based model that improves risk assessment at earlier time points by leveraging information from the future. The data leakage problem is avoided because the model is only evaluated on partial data.
Speaker(s):
Minghui Sun, Master of Biostatistics
Duke University
Author(s):
Minghui Sun, Master of Biostatistics - Duke University; Matthew Engelhard, PhD, MD - Duke University School of Medicine; Benjamin Goldstein, PhD - Duke University;
Airway Stenosis Classification with Breathing Audio Data
Poster Number: P19
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Clinical Decision Support for Translational/Data Science Interventions, Biomarker Discovery and Development
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Airway stenosis (AS), a narrowing of the upper airway that impairs speech and breathing, disproportionately affects females and is life-threatening for infants. Invasive endoscopy is currently required for the diagnosis of AS, posing the risk of worsening its severity. To address this challenge, we developed an AI model for non-invasive AS detection with breathing audio data.
Speaker(s):
Lars Schimmelpfennig, BS
Institute for Informatics, Data Science & Biostatistics at WashU
Author(s):
Lars Schimmelpfennig, BS - Institute for Informatics, Data Science & Biostatistics at WashU; Ashi Keshariya, BS - Saint Louis University; Zachary Abrams, PhD - Institute for Informatics at Washington University School of Medicine in St. Louis;
Optimizing Retrieval-Augmented Generation (RAG) for Retrospective Ischemic Stroke Identification: A Comparative Study of Embedding Models and Retrieved Chunks
Poster Number: P20
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, EHR-based Phenotyping, Informatics Research/Biomedical Informatics Research Methods, Cohort Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
This research evaluates methods to optimize the Retrieval-Augmented Generation (RAG) technique for using Large Language Models (LLMs) to assist with chart review. Results show that RAG outperforms conventional ICD-10 code screening for ischemic stroke patients and that a large embedding model can effectively assist chart review. A smaller, biomedical-specific embedding model can achieve comparable performance by adjusting the number of retrieved chunks. The optimal number of chunks varies with the embedding model.
Speaker(s):
Heekyong Park, PhD
Mass General Brigham
Author(s):
Heekyong Park, PhD - Mass General Brigham; Martin Rees, BS - Mass General Brigham; Yichuan Grace Hsieh, PhD - Mass General Brigham; Nich Wattanasin, MS - Mass General Brigham; Allan J. Harris, BS - Mass General Brigham; Vivian Gainer, MS - Mass General Brigham; Thomas McShane, MS - Mass General Brigham; Kavishwar Wagholikar, MD, PhD - Harvard Medical School /MGH; Shawn Murphy, MD, Ph.D. - Massachusetts General Hospital;
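The two knobs compared in the abstract above, the embedding model and the number of retrieved chunks, both act at the retrieval step of RAG. The sketch below shows a generic version of that step: rank chunk embeddings by cosine similarity to the query embedding and keep the top k. It assumes embeddings are already computed; the vectors, function names, and the value of k are hypothetical and do not reflect the authors' pipeline.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float], chunk_vecs: Sequence[Sequence[float]], k: int
) -> List[int]:
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 2-D embeddings for a query and four note chunks.
query = [1.0, 0.0]
chunks = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]

top2 = top_k_chunks(query, chunks, k=2)  # [0, 2]
top3 = top_k_chunks(query, chunks, k=3)  # [0, 2, 1]
```

Raising k passes more context to the LLM at the cost of including less relevant chunks, which is the trade-off behind tuning the chunk count per embedding model.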
Machine Learning for Early Prediction of Alzheimer's Disease and Related Dementias Using Electronic Health Record (EHR) Data
Poster Number: P21
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Secondary Use of EHR Data, Data-Driven Research and Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
This study explores machine learning (ML) models trained on electronic health record data from the University of Missouri (MU) Healthcare for early diagnosis and prediction of Alzheimer’s Disease and Related Dementias (ADRD) within a 5-year window. The Gradient Boosting Trees model performed best, with an AUC-ROC of 0.833. SHAP analysis identified key risk factors, including depressive disorder, age, anxiety, sleep apnea, and headache, demonstrating the potential of ML in ADRD prediction.
Speaker(s):
Sonia Akter, PhD Student
University of Missouri
Author(s):
Zhandi Liu, Master - University of Missouri Columbia; Praveen Rao, PhD - University of Missouri; Eduardo Simoes - University of Missouri; Sonia Akter, PhD Student - University of Missouri;
Synthetic Personas for Human Services Fields Context: LLM Adapter Training for Conversational AI Deployments.
Poster Number: P22
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Education and Training, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
This study explores the application of Large Language Models (LLMs) in creating synthetic personas for training and simulation in human services, with a specific focus on the Child and Adolescent Needs and Strengths (CANS) assessment tool. An interdisciplinary team developed a Conversational AI agent to simulate children and youth in care, aged 5-18, for CANS certification training. The process involved collecting training vignettes, generating initial synthetic personas and conversations using LLMs, conducting expert reviews of the generated content, refining synthetic conversations, training a Low Rank Adapter (LoRA) for the child/youth persona, and finally deploying the LoRA via internal infrastructure. The team faced challenges in addressing sensitive topics while navigating LLM safety guardrails. Their methodology provides a foundation for creating specialized persona-based AI agents for training and testing in human services. Future research will focus on assessing the efficacy of synthetic personas in human services training, exploring new safety and alignment testing scenarios, and utilizing techniques such as RLHF, abliteration, and dataset synthesis. This work demonstrates the potential of LLM technologies in creating nuanced, specialized AI agents for educational and simulation purposes in fields like child welfare and behavioral health in general. The methodology developed in this project offers a potential foundation for creating more sophisticated and specialized persona-based Conversational AI agents, which could have significant implications for training and systems-testing in various human services and healthcare applications.
Speaker(s):
Dmitry Strakovsky, MFA
University of Kentucky
Author(s):
Caroline Leach, BS - University of Kentucky; Mitchell Klusty, B.S. Computer Science - University of Kentucky; John S. Lyons, PhD, MA - University of Kentucky; Mark Lardner, LCSW-C - University of Kentucky; Lauren Mergen, BS, MS - University of Kentucky; Lynn Steiner, MSW - University of Kentucky; Cody Bumgardner, PhD - University of Kentucky; Jeffery Talbert, PhD - University of Kentucky;
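For readers unfamiliar with the Low Rank Adapter (LoRA) step in the pipeline above, the sketch below shows the general idea: the frozen weight matrix W is adjusted by a low-rank product, giving W + (alpha / r) * B A, so only the small matrices A and B are trained and deployed. This is the standard LoRA formulation in toy form, not the team's training code; the matrices and names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B A, where r is the adapter rank."""
    rank = len(a)  # a is (r x d_in), b is (d_out x r)
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Hypothetical frozen 2x2 weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]   # d_out x r
a = [[3.0, 4.0]]     # r x d_in

merged = lora_effective_weight(w, b, a, alpha=2.0)  # [[7.0, 8.0], [12.0, 17.0]]
```

Because the adapter is stored separately from W, a persona-specific adapter can be loaded on top of a shared base model.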
Exploring Knowledge Augmentation Strategies for Large Language Models Applied to Natural Product Drug Interaction Research
Poster Number: P23
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Proactive Machine Learning in Biomedical Applications: The Power of Generative AI and Reinforcement Learning
Natural product-drug interactions (NPDIs) might result from the co-consumption of botanical products used for medicinal purposes and pharmaceuticals. Identifying the potential mechanisms underlying NPDIs is challenging due to gaps in scientific knowledge and data curation. Studying NPDI mechanisms often requires manual synthesis of information from the published literature of various fields. Large language models (LLMs) have been shown capable of efficiently synthesizing information from disjointed resources through training and Retrieval Augmented Generation (RAG). This pilot work explores the potential for LLMs to serve as information synthesis tools to inform hypotheses about potential NPDI mechanisms.
Speaker(s):
Israel Dilan-Pantojas, Bsc. Computer Science
University of Pittsburgh
Author(s):
Israel Dilan-Pantojas, Bsc. Computer Science - University of Pittsburgh; Sanya Taneja, MS - University of Pittsburgh; Kojo Abanyie, PharmD/Post-doctoral Scholar - University of Pittsburgh; Richard Boyce, PhD - University of Pittsburgh;
Vocal Clues: Leveraging Voice Biomarkers for Screening and Early Detection of Anxiety Disorders
Poster Number: P24
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Biomedical Informatics and Data Science Workforce Education
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Digital Health Technologies for Patient Research
Anxiety is a common mental health disorder worldwide. This study explores using voice biomarkers to identify individuals with anxiety. A total of 107 participants freely responded to three questions. Linguistic and acoustic features were extracted from these audio-recorded responses and, together with clinical data, fed into various machine learning models. The two best-performing models exhibited 77% accuracy in identifying individuals with anxiety.
Speaker(s):
Hannah Slater, MS
Vanderbilt University Department of Biomedical Informatics
Author(s):
Hannah Slater, MS - Vanderbilt University Department of Biomedical Informatics; Rosie Mugoya, Bsn - Goldfarb School of Nursing and Washington University of St. Louis; Zachary Abrams, PhD - Institute for Informatics at Washington University School of Medicine in St. Louis;
APEX: Apical View Extraction using Deep Learning for Research
Poster Number: P25
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Medical Imaging, Data Mining and Knowledge Discovery, Data-Driven Research and Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
This project introduces APEX, an algorithm that automatically identifies critical apical views in echocardiography (ECHO) images. Using the VGG-19 model [1], the algorithm enhances diagnostic efficiency, reduces manual intervention, and significantly decreases the size of ECHO DICOM data by focusing on key views. This automation is essential for accurate disease prediction and more efficient data processing, streamlining the overall workflow in echocardiographic assessments.
Speaker(s):
Akshay Arora, MS
Baylor Scott and White
Author(s):
Akshay Arora, MS - Baylor Scott and White; Elisa Priest, PhD - Baylor Scott and White; Muhammad Shahzeb Khan, MD - Baylor Scott and White;
Results of a Pilot Implementation of a Large Language Model Note Generation Software at a Tertiary Children’s Hospital
Poster Number: P26
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Implementation Science and Deployment, Collaborative Workflow Systems, Stakeholder (i.e., patients or community) Engagement
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
Documentation is frequently reported as a contributor to healthcare provider burnout. We conducted a pilot trial of a large language model (LLM) ambient listening note-generation software among 49 outpatient pediatric healthcare providers. There were significant reductions in time in notes per encounter measured from the electronic health record. Providers reported significant reductions in cognitive load from notes, documentation outside of work hours, and burnout.
Speaker(s):
Jonathan Pelletier, MD
Akron Children's Hospital
Author(s):
Jonathan Pelletier, MD - Akron Children's Hospital; Kevin Watson, MD - Akron Children's Hospital; Sarah Rush, MD - Akron Children's Hospital;
The Future of Chronic Care Management: Exploring How AI Enhances Patient Outcomes and Lowers Hospitalization Rates
Poster Number: P27
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Mobile Health, Wearable Devices and Patient-Generated Health Data, Measuring Outcomes
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Artificial Intelligence (AI)-driven Chronic Health Management aims to empower patients and care teams in handling complex health issues by anticipating and preventing health crises and by promoting patient self-care. To achieve this, the Ibis Health platform utilizes Remote Patient Monitoring (RPM) as a key source of vital signs, health symptoms, and behavioral data, enabling the prediction of patient exacerbations and the coordination of interventions to prevent acute care situations. This analysis examined the Ibis Health hypertension, diabetes, and congestive heart failure populations. Results revealed that AI-integrated chronic care management resulted in a significant reduction in the number of hospitalizations and an improvement in patient outcomes.
Speaker(s):
Hannah Dodson, DNP, FNP-C
Senscio Systems
Author(s):
Erik Johnson, B.S. - Senscio Systems; Piali De, PhD - Senscio Systems; Hannah Dodson, DNP, FNP-C - Senscio Systems;
Real-World Evaluation of an Artificial Intelligence-Augmented Outreach Program to Enhance Medicare Advantage Member Experience
Poster Number: P28
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Implementation Science and Deployment, Natural Language Processing, Collaborative Workflow Systems, Outcomes Research, Clinical Epidemiology, Population Health, Data Integration
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
We describe key components and outcomes of a payor-led, artificial intelligence-augmented customer service framework. Diverse interoperable data sources, technological capabilities, and standardized workflows are operationalized to deliver contextualized information that provides personalized and actionable support to Medicare Advantage members at scale.
Speaker(s):
Eleanor Beltz, PhD, ATC
Aetna Medical Affairs, CVS Health
Author(s):
Eleanor Beltz, PhD, ATC - Aetna Medical Affairs, CVS Health; Sofija Miljovska, MSc - CVS Health; Amanda Zaleski, PhD, MS - Aetna; Kelly Jean Craig, PhD - CVS Health; Matthew Churgin, PhD - CVS Health; Hugo Artigas, MSc - CVS Health; Adam Jacobson, MSc - CVS Health; Bryan Calderon, MSc - CVS Health; Gautier Poittevin, MSc - CVS Health; Anurag Agarwal, MS - CVS Health; Dustin Sherman, MHA - CVS Health; Shawn Smith, MBA - CVS Health; Lukas Hansen, MBA - CVS Health; Dorothea Verbrugge, MD - CVS Health; Laure Salomon, MBA - CVS Health;
Social Determinants of Health and Mental Health Outcomes in Older Adults: An Analysis of All of Us COVID Survey Data
Poster Number: P29
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Social Determinants of Health
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Digital Health Technologies for Patient Research
We explore the mental health of older adults in relation to social determinants of health (SDOH) during the COVID pandemic. Factors such as physical isolation and social disconnectedness, resulting from interventions such as home isolation and restricted visits to nursing homes, are considered as potential contributors to the decline in mental health. By leveraging data from the All of Us Research Program, we explore the relationship between mental health outcomes and SDOH among a large cohort of older participants. Male gender and reporting being African American appear to be associated with relatively better mental health outcomes, whereas being divorced, separated, or widowed is linked to poorer mental health. Income and education demonstrate inconsistent effects on mental health outcomes. The deep neural network models outperform regression models on the same outcomes by modeling more complex relationships between variables. Predictive performance remains modest for COVID-related anxiety and general well-being.
Speaker(s):
Phillip Ma, MD
George Washington University SMHS
Author(s):
Yijun Shao, PhD - George Washington University; Yan Cheng, PhD - George Washington University; Youxuan Ling, MS - GWU SMHS; Qing Zeng, PhD - George Washington University; Stuart Nelson, MD, FACP, FACMI - George Washington University;
Activities of Daily Living Recognition for Aging Adults Using Deep Learning
Poster Number: P30
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Proactive Machine Learning and Reinforcement Learning
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
The study develops a deep learning-based model to recognize Activities of Daily Living (ADLs) for aging adults using non-invasive ambient sensor data. Using LSTM neural networks, the model achieved 0.88 accuracy in predicting activities such as sleeping, eating, and washing dishes. The research offers insights for enhancing elderly care through artificial intelligence, aiding health care providers in assessing participants’ progress with clinical decision support.
Speaker(s):
Nader Abdalnabi, MSB
University of Missouri
Author(s):
Praveen Rao, PhD - University of Missouri; Knoo Lee, PhD RN - University of Missouri - Columbia;
Q-Table Initialization Using Expert Data in Reinforcement Learning
Poster Number: P31
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Reproducible Research Methods and Tools
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Proactive Machine Learning in Biomedical Applications: The Power of Generative AI and Reinforcement Learning
Using a simple Gridworld Q-learning problem, we simulate retrospective datasets that inform Q-table initialization and demonstrate the utility of incorporating retrospective data derived from expert performance. We propose a technique using relative action frequency to refine the Q-table initialization scheme to expedite convergence. The use of retrospective data from a near-optimal expert policy does hasten convergence by 85%, but down-scaling rarely observed state-action values results in slower training and no improvement in final model performance.
Speaker(s):
Tanner Wilson
University of Pittsburgh
Author(s):
Tanner Wilson - University of Pittsburgh; Harry Hochheiser, PhD - University of Pittsburgh Department of Biomedical Informatics;
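The Q-table initialization idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `scale` parameter, and the toy expert data are hypothetical, and seeding each entry with the expert's relative action frequency in that state is only one plausible reading of "relative action frequency". The update rule shown is the standard tabular Q-learning step.

```python
from collections import Counter


def init_q_table(expert_pairs, n_states, n_actions, scale=1.0):
    """Seed a Q-table from retrospective expert (state, action) pairs.

    Each entry is set to scale * (relative frequency of the action in
    that state). States the expert never visited stay at zero.
    """
    counts = Counter(expert_pairs)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for s in range(n_states):
        total = sum(counts[(s, a)] for a in range(n_actions))
        if total == 0:
            continue  # no expert data for this state
        for a in range(n_actions):
            q[s][a] = scale * counts[(s, a)] / total
    return q


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical retrospective data: (state, action) pairs from an expert.
expert_pairs = [(0, 1), (0, 1), (0, 1), (0, 0), (1, 2), (1, 2)]
q0 = init_q_table(expert_pairs, n_states=3, n_actions=3)
```

Training would then proceed from `q0` instead of an all-zero table, so early greedy action choices already lean toward what the expert did most often.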
Using Clinical Documentation to Identify Late-Talking Children and Highlighting Potential Disparities in the Timing of a Late Talking Diagnosis
Poster Number: P32
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Machine Learning, Generative AI, and Predictive Modeling, EHR-based Phenotyping, Secondary Use of EHR Data
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
Late Talking is a common developmental delay in children. Formal diagnosis is used to initiate services. This project aims to identify late-talking children using clinical notes via Natural Language Processing. Using BioClinicalBert, we are able to develop a strong classifier (AUC = 0.98). Further, we assess whether there are disparities in a formal diagnosis of late talking (via ICD codes) based on factors such as sex, race/ethnicity, spoken language and insurance type.
Speaker(s):
Jiang Shu, Student
Duke University
Author(s):
Jiang Shu, MS Student - Duke University; Danai Fannin, PhD - North Carolina Central University; Lauren Franz, MBChB, MPH - Duke University; Matthew Engelhard, PhD, MD - Duke University School of Medicine; Benjamin Goldstein, PhD - Duke University;
Extracting Echocardiogram Entities with Light-Weight, Open-Source Large Language Models
Poster Number: P33
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Informatics Research/Biomedical Informatics Research Methods, Machine Learning, Generative AI, and Predictive Modeling
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
We quantify the clinical information extraction performance of fourteen open-source large language models (LLMs) on 507 manually labeled echocardiograms. We demonstrate that Gemma2:9b-instruct has a remarkable ability to accurately extract nine unique clinical entities from narrative echocardiograms with an F1 score of 0.961. Our research highlights the potential for LLMs to produce high-quality extractions from unstructured health records ready for downstream clinical research.
Speaker(s):
Jonathan Chi, B.S.
Washington University in St. Louis
Author(s):
Jonathan Chi, B.S. - Washington University in St. Louis; Ethan Hillis, MS - Institute for Informatics at Washington University School of Medicine in St. Louis; Yazan Rouphail, BS - Washington University in St. Louis; Ningning Ma, MD - Washington University School of Medicine in St. Louis; An Nguyen, MD - Washington University School of Medicine in St. Louis; Jane Wang, MD - Washington University School of Medicine in St. Louis; Mackenzie Hofford, MD - Washington University; Aditi Gupta - Washington University in St. Louis; Patrick Lyons, MD - Oregon Health & Science University; Adam Wilcox, PhD - Washington University in St. Louis; Albert Lai, PhD, FACMI, FAMIA - Washington University; Philip Payne, PhD, FACMI - Washington University School of Medicine in St. Louis; Caitlin Dreisbach, PhD, RN - University of Rochester; Andrew Michelson, MD - Washington University School of Medicine in St. Louis;
Identifying Long COVID Patients from Primary Care Clinics and Long COVID Specialty Care Clinic with Natural Language Processing
Poster Number: P34
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Clinical Decision Support for Translational/Data Science Interventions, EHR-based Phenotyping
Working Group: Natural Language Processing Working Group
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Long COVID patients, depending on severity, are treated in primary care or in specialty Long COVID clinics. Timely referral of severe patients to specialty clinics is important for health outcomes. This study develops a machine learning approach to identify the Long COVID patients most in need of access to specialty clinics, using International Classification of Diseases (ICD) codes and patient-reported symptoms extracted from clinical notes.
Speaker(s):
Weipeng Zhou, PhD candidate
University of Washington
Author(s):
Comparing Long COVID Patients from Primary Care Clinics and Long COVID Specialty Care Clinic with Natural Language Processing
Poster Number: P35
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, EHR-based Phenotyping, Data Mining and Knowledge Discovery
Working Group: Clinical Decision Support Working Group
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
We compared Long COVID patient characteristics in the primary care and specialty care settings using symptoms gathered from both ICD codes and natural language processing of clinical notes. Patients treated in the specialty Long COVID clinic were more often White, younger, more often covered by commercial insurance, and less often Hispanic than patients treated in the primary care clinics. Specialty Long COVID clinic patients also tended to have a higher prevalence of fatigue and fever.
Speaker(s):
Weipeng Zhou, PhD
Yale University
Weipeng Zhou, PhD candidate
University of Washington
Author(s):
A Comprehensive Resource of Eponymous Diseases to Address Privacy Challenges in Clinical Natural Language Processing Systems
Poster Number: P36
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Ontologies, Data Security and Privacy, Data Sharing/Interoperability, Machine Learning, Generative AI, and Predictive Modeling
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Proactive Machine Learning in Biomedical Applications: The Power of Generative AI and Reinforcement Learning
Eponymous diseases, named after individuals, present challenges for natural language processing (NLP) tools in healthcare, as they can be misidentified as personal names, risking misinterpretation and privacy breaches. To address this, we developed a standardized resource of eponymous diseases. This resource can be integrated into entity extraction and de-identification tools to enhance accuracy and safeguard patient privacy.
Speaker(s):
Lina Sulieman, PhD
Vanderbilt University Medical Center
Author(s):
Hiral Master, PT, PhD, MPH; Cathy Shyr, PhD - Vanderbilt University Medical Center; Lina Sulieman, PhD - Vanderbilt University Medical Center; Karthik Natarajan, PhD - Columbia University Dept of Biomedical Informatics;
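As a rough illustration of how a resource like this might plug into a de-identification step, the sketch below keeps a suspected name unredacted when it is a known eponym followed by a disease word. The eponym list, the disease-word list, and the `redact` function are hypothetical stand-ins; the abstract does not describe the resource's actual contents or format.

```python
# Hypothetical excerpt of an eponymous-disease resource (assumption, not the real list).
EPONYMOUS_DISEASES = {"parkinson", "alzheimer", "crohn", "hodgkin"}

# Illustrative words that often follow an eponym in a disease mention (assumption).
DISEASE_HEADS = {"disease", "syndrome", "lymphoma", "disorder"}

PUNCT = ".,;:"


def redact(text, suspected_names):
    """Replace suspected personal names with [NAME], except eponymous disease mentions."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        word = tok.strip(PUNCT)
        # Normalize "Parkinson's" to "parkinson" before the lookup.
        base = word.lower()
        if base.endswith("'s"):
            base = base[:-2]
        nxt = tokens[i + 1].strip(PUNCT).lower() if i + 1 < len(tokens) else ""
        is_eponym = base in EPONYMOUS_DISEASES and nxt in DISEASE_HEADS
        if word in suspected_names and not is_eponym:
            out.append(tok.replace(word, "[NAME]"))
        else:
            out.append(tok)
    return " ".join(out)


note = "Dr. Parkinson reviewed the chart; patient has Parkinson's disease."
print(redact(note, {"Parkinson", "Parkinson's"}))
```

Here the clinician's surname is redacted while the disease mention survives, which is the failure mode the abstract describes. A real pipeline would take name candidates from its own entity recognizer rather than a hand-written set.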
Conversion of the International Classification of Orofacial Pain from a Terminology to an Ontology
Poster Number: P37
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Ontologies, Data Standards, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
We are converting the International Classification of Orofacial Pain (ICOP) terminology into an ontological representation of orofacial pain, as part of our work on two related ontologies, the Pain Ontology and the Oral Health and Disease Ontology (OHD). The Pain Ontology represents the different fundamental types of pain and their mechanisms. The OHD provides a framework for dental and medical health care information. Our goal is to import high-level classes from the Pain Ontology into the OHD as parents of the ontology classes we create from ICOP terms, which fit well with the overall domain of OHD. Through this work, we are enriching OHD not only with ontology classes for types of orofacial pain, but also with additional linked classes that represent elements of dental disease not previously represented in OHD. In addition, we are adding a few general pain classes to the Pain Ontology. By creating an ontology from ICOP, we can logically relate the resulting classes to classes in other ontologies, such as the Uberon anatomy ontology, allowing for enhanced integration and inferencing with data annotated to ICOP terms. This work has resulted in improvements to both OHD and the Pain Ontology.
Speaker(s):
Author(s):
Ava Cunningham, HS Diploma - University at Buffalo; William Duncan, PhD - University of Florida; Alexander Diehl, PhD - University at Buffalo;
Large Language Models (LLMs) in medical decision support: a scoping review
Poster Number: P38
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Patient-centered Research and Care, Machine Learning, Generative AI, and Predictive Modeling, Education and Training
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
The use of Large Language Models (LLMs) in medicine is growing, offering AI-driven conversational support for healthcare professionals and aiming to empower patients with personalized information. These virtual aides try to provide timely insights into current health data, support clinical decision-making, and may enhance patient education and research efforts. However, questions remain about their role in medicine, especially in predicting and assessing patients’ health status in the form of clinical decision support. This prompted a scoping review to explore where research efforts in this area have been directed and where further efforts in research and medical practice are required. The review provides an overview of established models, with a special focus on decision support, their explainability in (medical) domains, and how far their implementation has contributed to more efficient healthcare processes.
Speaker(s):
Zeineb Sassi, Master
University of Regensburg
Author(s):
Probabilistic Disease Surveillance Using Large Language Model
Poster Number: P39
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Public Health Informatics, EHR-based Phenotyping, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
This study evaluates the ability of LLaMA 3 to estimate probabilities by comparing various prompting strategies. We assessed its effectiveness in detecting influenza cases among emergency encounters using electronic health records. The findings demonstrate that performance improves when prompts include a few examples and improves further when they incorporate probabilistic estimates from a machine-learned Bayesian network model and feature importance derived from historical data, highlighting the potential of LLMs in probabilistic disease surveillance.
Speaker(s):
Ye Ye, PhD
University of Pittsburgh Department of Biomedical Informatics
Author(s):
Chenxi Song, Master of Science - University of Pittsburgh; Yuhe Gao, Master in Information Science - University of Pittsburgh; Runxue Bao, PhD - GE Healthcare; Yiming Sun, BE - University of Pittsburgh; Julianys Tirado Alicea, undergraduate - University of Puerto Rico; Ye Ye, PhD - University of Pittsburgh Department of Biomedical Informatics;
Examine disparities in the development of CVD and non-CVD conditions between renters and homeowners
Poster Number: P40
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Social Determinants of Health, Fairness and Disparity Research in Health Informatics, Secondary Use of EHR Data, Data-Driven Research and Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
This study evaluates disparities in long-term chronic health burden (both CVD and non-CVD) between patients of different housing status across different levels of CVD risk predicted by two cardiovascular risk prediction algorithms, PCE and PREVENT. Results show a consistently higher chronic health burden in renters than in homeowners across the levels of CVD risk estimated by both models, highlighting the impact of social determinants of health, such as housing, on overall health conditions.
Speaker(s):
Haoyun Hong, BA
American Heart Association
Author(s):
Zihang Gao, MS - American Heart Association; Sadiya Khan, MD, MSc - Northwestern University; Yingying Sang, MSc - New York University; Haoyuan Wang, Bachelor - Duke; Chuan Hong, PhD - Duke University; Michael Pencina, PhD - Duke University School of Medicine; Jennifer Hall, PhD - American Heart Association; Juan Zhao, PhD - American Heart Association;
Covalent Inhibitors on Cap-Dependent Translation as a Novel Targeted Therapy for Breast Cancer
Poster Number: P41
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Biomarker Discovery and Development, Informatics Research/Biomedical Informatics Research Methods, Pharmacogenomics
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
Covalent docking has emerged as a promising strategy for targeting challenging protein residues. Lysine residues in particular, which are often overlooked because they are less nucleophilic than cysteines, have shown potential in the development of covalent inhibitors of eukaryotic translation initiation factor 4E (eIF4E), a critical regulator of protein synthesis implicated in breast cancers. Using aryl sulfonyl fluorides, recent studies achieved successful docking to K162 within eIF4E and showed reduced protein activity, but the molecules suffered from significant specificity and stability problems. In this study, I improved the molecules in two respects: specificity and stability. Regarding specificity, structural analyses of eIF4E identified additional lysine residues, such as K106 and K108, as promising targets for covalent inhibition. I also designed bivalent molecules that covalently bind to both lysine residues simultaneously. This bivalent binding increases specificity because, out of a total of 15 lysine residues, there are only two regions in which two lysine residues fall within three amino acids. The differences in binding affinity and docking score between the bivalent and control groups were insignificant, suggesting that the potency of the bivalent molecule did not drop. Regarding stability, I found that replacing the aryl sulfonyl fluorides with triflate esters decreased RMSD and gave a more stable radius of gyration (Rg) in both a 1x PBS solution environment and DMSO solution. The differences in binding affinity and docking score between the triflate ester and aryl sulfonyl fluoride groups were also insignificant, suggesting that the potency did not drop.
Speaker(s):
Xingchuan Ma, High School Student
Portsmouth Abbey School
Author(s):
RNA-seq analysis revealed potential age-related biomarkers of NOTCH pathway
Poster Number: P42
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Biomarker Discovery and Development, Transcriptomics, Genomics/Omic Data Interpretation, Data-Driven Research and Discovery
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Novel Methods for Variant Detection and Interpretation from Omics Data
Increasing evidence indicates an emerging role of the Notch pathway in the regulation of aging. In this study, we retrospectively analyzed RNA sequencing data from two types of skin tissue and identified the Notch-related genes NOTCH2 and NOTCH3 as negatively associated with age, and GZMB and HES1 as positively associated with age. These biomarker findings can contribute to potential therapeutic targets for aging and age-related diseases.
Speaker(s):
Jiaxing Liu, MS
The University at Buffalo
Author(s):
Jiaxing Liu, MS - The University at Buffalo; Skyler Resendez - The University at Buffalo; Melissa Resnick, PhD - University at Buffalo School of Medicine; Peter Elkin, MD, MACP, FACMI, FNYAM, FAMIA, FIAHSI - Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York;
Validation of Pregnancy Date Estimation from EHR Outpatient Care Data
Poster Number: P43
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical and Research Data Collection, Curation, Preservation, or Sharing, Data Quality, Secondary Use of EHR Data
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
We estimated pregnancy dates from discrete fields in an outpatient electronic health record setting; estimated dates were then compared to dates from linked state birth certificates. For 47,515 pregnancies from Oregon or California, clinician-entered episodes were most reliable for estimating pregnancy start and end dates (average <1 day difference). Procedure and diagnosis outcome codes are useful for identifying pregnancies with less accurate timing.
Speaker(s):
Kristin Lyon-Scott, MPS
OCHIN, Inc
Author(s):
Teresa Schmidt, PhD - OCHIN; Janne Boone-Heinonen, PhD, MPH - Oregon Health and Science University; Jenna Donovan, MPH - OCHIN, Inc; Jenine Dankovchik, BS - OCHIN; Kimberly K. Vesco, MD, MPH - Kaiser Permanente; Jennifer Hauschildt;
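The comparison described in this abstract can be sketched as a per-pregnancy date difference averaged over the cohort. The sketch assumes the reported average is a mean absolute difference in days, which the abstract does not state; the function name and the three date pairs are made up for illustration.

```python
from datetime import date


def mean_abs_day_diff(pairs):
    """Mean absolute difference, in days, between estimated and reference dates."""
    diffs = [abs((estimated - reference).days) for estimated, reference in pairs]
    return sum(diffs) / len(diffs)


# Illustrative (EHR-estimated end date, birth-certificate date) pairs, not study data.
pairs = [
    (date(2020, 3, 1), date(2020, 3, 1)),
    (date(2021, 7, 10), date(2021, 7, 12)),
    (date(2019, 11, 5), date(2019, 11, 4)),
]
print(mean_abs_day_diff(pairs))
```

The same function could be run separately for each date source (clinician-entered episodes, procedure codes, diagnosis codes) to compare their agreement with the linked birth certificates.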
Alternative Cognitive Assessment for Annual Wellness Visit
Poster Number: P44
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Clinical Decision Support for Translational/Data Science Interventions, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Patient-centered Research and Care
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Digital Health Technologies for Patient Research
Early detection of cognitive impairment (CI) is vital, but current screening methods lack the sensitivity to detect mild cases before impairment progresses. The MyCog and MyCog Mobile apps were designed to be self-administered during the rooming process or before the clinic visit. Both apps integrate with the Epic electronic health record (EHR) system, triggering clinical decision support recommendations. MyCog is currently operational in 16 clinics. MyCog Mobile is set to pilot in clinics starting in spring 2025.
Speaker(s):
Callie Jones, BA
Northwestern University
Author(s):
Stephanie Ruth Young, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Michael Bass, MS - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Greg Byrne, MA - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Elizabeth Dworak, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Sarah Filec, MPH - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Julia Yoshino-Benavente, MPH - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Laura Curtis, MS - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Morgan Bonham, BS - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Zahra Hosseinian, MA - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Lihua Yao, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Yusuke Shono, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Maria Varela-Diaz, MS - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Andrew Cooper, MS - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Richard Gershon, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University; Michael Wolf, PhD, MPH - Center for Applied Health Research on Aging, Feinberg School of Medicine, Northwestern University; Cindy Nowinski, MD, PhD - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University;
Unraveling the Triggers of Idiopathic Inflammatory Myopathies: An Approach Using Molecular Mimicry
Poster Number: P45
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Data-Driven Research and Discovery, Data Mining and Knowledge Discovery, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
We investigated the molecular mimicry hypothesis in Idiopathic Inflammatory Myopathies (IIMs) by analyzing homology between myositis-specific autoantibodies (MSA) and infectious agent (IA) epitopes. Using BlastP, we identified ten matches involving Alphapapillomavirus 7, SARS-CoV-2, and Mycobacterium tuberculosis. The findings suggest potential infectious triggers, particularly respiratory infections, in IIM development. Future work includes integrating machine learning with patient records to further explore infection patterns linked to IIM pathogenesis, guiding clinical research for better therapeutic interventions.
Speaker(s):
Tayler Fearn, Bachelors Bioinformatics
University of Utah
Author(s):
Julio Facelli, PhD - Facelli; Ram Gouripeddi, MD - University of Utah; Pablo J Maldonado-Catala, PhD - University of Utah; Dorota Lebiedz-Odrobina, MD - University of Utah; Naomi Schlesinger, MD - University of Utah;
Novel Applications to Demonstrate the Utility of MD3F, A Multivariate Distance Drift Diffusion Framework
Poster Number: P46
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Genomics/Omic Data Interpretation, Clinical Genomics/Omics and Interventions Based on Omics Data, Secondary Use of EHR Data
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
MD3F, a multivariate distance drift diffusion framework developed for analyzing high-dimensional longitudinal data, integrates advanced statistical methodologies with computational efficiency to uncover temporal patterns in disease progression. We compiled two sets of data to demonstrate novel applications of MD3F: microbiome samples and electronic health record data from head and neck cancer patients. These new applications enhance our understanding of individualized health trajectories and underscore the potential of robust analytical frameworks in advancing precision medicine initiatives.
Speaker(s):
Jessica Zielinski, PhD Student
Medical University of South Carolina
Author(s):
A Machine Learning Prediction Model for Breast Cancer Recurrence
Poster Number: P47
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data-Driven Research and Discovery, Clinical Genomics/Omics and Interventions Based on Omics Data, Clinical Decision Support for Translational/Data Science Interventions
Primary Track: Translation Bioinformatics/Precision Medicine
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
Breast cancer is a complex and heterogeneous disease, presenting significant challenges, especially for metastatic cases. This study used machine learning models to predict recurrence in lobular and ductal breast cancer from gene expression data. XGBoost performed best, with an AUROC of 0.9182 for the lobular subtype and 0.6812 for the ductal subtype. The findings highlight the potential of ML-based models in recurrence prediction. Future work will integrate additional omics data to improve accuracy, enhancing individualized patient care.
Speaker(s):
Merih Toruner, Sc.B.
Brown University
Author(s):
Merih Toruner, Sc.B. - Brown University; Jessica Patricoski, PhD Candidate, Computational Biology - Brown University Center for Computational Molecular Biology; Ece Uzun, PhD - Lifespan/Brown University;
Mining Omics Data for CLN3 Biomarker Identification
Poster Number: P48
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Biomarker Discovery and Development, Data Mining and Knowledge Discovery, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Clinical Research Informatics
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
CLN3, also known as juvenile neuronal ceroid lipofuscinosis, is a rare and progressive neurodegenerative disorder characterized by the accumulation of lipopigments in the brain, leading to cognitive decline, seizures, and vision loss in affected individuals. Biomarkers are critical for understanding disease mechanisms, monitoring disease progression, and evaluating therapeutic responses. By integrating proteomics data and laboratory tests, this study identified biomarkers that are closely associated with CLN3 mechanisms.
Speaker(s):
Shixue Sun, PhD
NCATS/NIH
Author(s):
Examining Large Multimodal Models’ Abilities in Accurately, Completely, and Consistently Interpreting Healthcare Infographics
Poster Number: P49
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Machine Learning, Generative AI, and Predictive Modeling, Patient-centered Research and Care
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Infographics are widely used in healthcare education materials. Little is known about whether large multimodal models (LMMs) can interpret infographics. We tested four LMMs, GPT-4V, Gemini-Pro-vision, LLaVA 1.6-34B, and CogVLM-17B, using 5 infographics, on accuracy, completeness, and consistency. We found that performance was consistent but differed in accuracy and completeness across infographics of varying complexity. Post-hoc tests revealed varying patterns: GPT-4V performed significantly better on completeness than the other models, but the difference in accuracy was not significant. LLaVA outperformed CogVLM in completeness on infographics containing a medium number of key points. Differences in architecture and training process may explain the differing capabilities in interpreting healthcare infographics. These findings shed light on LMMs’ abilities in interpreting infographics and their applicability in producing healthcare education materials.
Speaker(s):
Zhimeng Luo
University of Pittsburgh
Author(s):
Ning Zou; Bo Xie, PhD - University of Texas at Austin; Daqing He, PhD - University of Pittsburgh; Robin Hilsabeck, Ph.D. - University of Texas Health Sciences Center at San Antonio; Alyssa Aguirre, LCSW-S - University of Texas Dell Medical School;
Exploring Caregivers’ Needs, Interests and Preferences in UTI Management Digital Solutions for People with Dementia
Poster Number: P50
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Patient-centered Research and Care, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Stakeholder (i.e., patients or community) Engagement
Primary Track: Clinical Research Informatics
Programmatic Theme: Digital Health Technologies for Patient Research
Urinary tract infections (UTIs) are a leading cause of preventable hospitalizations in persons living with dementia (PLWD), where timely and effective ambulatory care could mitigate these admissions. Dementia caregivers often struggle to recognize UTI symptoms and lack access to reliable medical information, highlighting the need for digital interventions tailored to their specific challenges. This study aimed to explore dementia caregivers' needs, interests, and preferences in digital UTI management solutions for caring for persons with moderate to late-stage dementia. Through semi-structured Zoom interviews with 11 caregivers, key insights were gathered, including the importance of hygiene practices and the overwhelming burden caused by communication barriers with PLWD. Caregivers expressed interest in apps and smart devices with reminders, health management tools, and UTI screening capabilities, while also emphasizing preferences for ease of use, graphic-rich interfaces, and technical support. Additionally, caregivers identified the value of tailored education and real-time communication features for early intervention and decision-making support. These findings underscore the potential for user-centered digital interventions to reduce caregiver burden and improve care quality, ultimately preventing hospitalizations. Further research should focus on iterative design and testing to ensure digital solutions effectively meet caregivers’ needs and enhance their caregiving experience.
Speaker(s):
Kuan-Ching Wu, PhD
University of Washington
Author(s):
Chi-shan Tsai, MSN - University of Washington; Oleg Zaslavsky, PhD - University of Washington;
Creatinine Prediction Using Deep Learning Methods
Category
Poster - Student