Times are displayed in (UTC-04:00) Eastern Time (US & Canada)
3/11/2025 | 3:30 PM – 5:00 PM | Monongahela
S16: LLMs: Evaluations, Applications, and Optimizations
Presentation Type: Podium Abstract
Session Credits: 1.5
Transfer Learning with Clinical Concept Embeddings from Large Language Models
Presentation Time: 03:30 PM - 03:45 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing, Public Health Informatics, Knowledge Representation, Management, or Engineering, EHR-based Phenotyping
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Knowledge sharing is crucial in healthcare, especially when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, but a major challenge is heterogeneity in clinical concepts across different sites. Large Language Models (LLMs) show significant potential for capturing the semantic meaning of clinical concepts and reducing this heterogeneity. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local, shared, and transfer learning models. Results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, while generic models such as OpenAI embeddings require fine-tuning for optimal performance. However, excessive tuning of models with biomedical embeddings may reduce effectiveness, emphasizing the need for balance. This study highlights the importance of domain-specific embeddings and careful model tuning for effective knowledge transfer in healthcare.
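A minimal sketch of the general approach the abstract describes: embed clinical concept text with a domain language model, then reuse those embeddings as features for a classifier trained at one site and applied directly at another. The model name, pooling choice, and toy data below are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): embed clinical concept
# descriptions with a domain LLM, then reuse those embeddings as features
# for a classifier trained at site A and applied at site B.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "emilyalsentzer/Bio_ClinicalBERT"  # placeholder domain model choice
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed_concepts(descriptions: list[str]) -> torch.Tensor:
    """Mean-pooled last-hidden-state embedding for each concept description."""
    batch = tokenizer(descriptions, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state            # (n, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)                # (n, dim)

def patient_features(patient_concepts: list[list[str]]) -> torch.Tensor:
    """Average the concept embeddings observed for each patient."""
    return torch.stack([embed_concepts(c).mean(0) for c in patient_concepts])

# Toy stand-ins for site-level EHR data.
site_a_concepts = [["type 2 diabetes mellitus", "metformin"], ["essential hypertension"]]
labels_a = [1, 0]
site_b_concepts = [["diabetes type II", "insulin therapy"]]

# Train locally at site A, then apply directly (zero-shot transfer) at site B.
clf = LogisticRegression(max_iter=1000).fit(patient_features(site_a_concepts).numpy(), labels_a)
print(clf.predict_proba(patient_features(site_b_concepts).numpy())[:, 1])
```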
Speaker(s):
Ye Ye, PhD
University of Pittsburgh Department of Biomedical Informatics
Author(s):
Yuhe Gao, Master in Information Science - University of Pittsburgh; Runxue Bao, PhD - GE Healthcare; Yuelyu Ji, PhD - University of Pittsburgh; Yiming Sun, BE - University of Pittsburgh; Chenxi Song, Master of Science - University of Pittsburgh; Jeffrey Ferraro, PhD - University of Utah; Ye Ye, PhD - University of Pittsburgh Department of Biomedical Informatics;
Institutional Platform for Secure Self-Service Large Language Model Exploration
Presentation Time: 03:45 PM - 04:00 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing, Reproducible Research Methods and Tools, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
This paper introduces a user-friendly platform developed by the University of Kentucky Center for Applied AI, designed to make customized large language models (LLMs) more accessible. By capitalizing on recent advancements in multi-LoRA inference, the system efficiently accommodates custom adapters for a diverse range of users and projects. The paper outlines the system’s architecture and key features, encompassing dataset curation, model training, secure inference, and text-based feature extraction. We illustrate the establishment of a tenant-aware computational network using agent-based methods, securely utilizing islands of isolated resources as a unified system. The platform strives to deliver secure, affordable LLM services, emphasizing process and data isolation, end-to-end encryption, and role-based resource authentication. This contribution aligns with the overarching goal of enabling simplified access to cutting-edge AI models and technology in support of scientific discovery and the development of biomedical informatics.
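The abstract centers on multi-LoRA inference, i.e., serving many lightweight adapters on one shared base model. Below is a minimal sketch using Hugging Face PEFT; the base model and adapter identifiers are placeholders, and the platform's actual serving stack is not described in code here.

```python
# Minimal sketch of multi-LoRA serving with Hugging Face PEFT: one frozen
# base model, one adapter per tenant/project. Model and adapter names are
# placeholders; the platform in the abstract may use a different stack.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "mistralai/Mistral-7B-Instruct-v0.2"   # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)

# Load one adapter per tenant on top of the same base weights.
model = PeftModel.from_pretrained(base_model, "org/project-a-lora", adapter_name="project_a")
model.load_adapter("org/project-b-lora", adapter_name="project_b")

def generate(prompt: str, adapter: str, max_new_tokens: int = 128) -> str:
    """Route a request to a specific tenant's adapter before generating."""
    model.set_adapter(adapter)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate("Summarize this discharge note: ...", adapter="project_a"))
```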
Speaker(s):
Mitchell Klusty, B.S. Computer Science
University of Kentucky
Author(s):
Cody Bumgardner, PhD - University of Kentucky; Mitchell Klusty, B.S. Computer Science - University of Kentucky; Vaiden Logan, B.S. in Computer Engineering - University of Kentucky; Samuel Armstrong, MS - University of Kentucky; Caroline Leach, BS - University of Kentucky; Caylin Hickey - University of Kentucky; Jeffery Talbert, PhD - University of Kentucky;
Content Analysis of Over-the-Counter Hearing Aid Reviews
Presentation Time: 04:00 PM - 04:15 PM
Abstract Keywords: Patient-centered Research and Care, Natural Language Processing, Mobile Health, Wearable Devices and Patient-Generated Health Data
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Hearing loss is a prevalent and impactful condition that affects millions globally. In 2022, the U.S. Food and Drug Administration (FDA) approved over-the-counter (OTC) hearing aids for individuals with mild to moderate hearing loss, establishing a distinct category separate from prescription hearing aids. This regulatory change may leave some patients, particularly those unfamiliar with hearing aids, without medical guidance in their decision-making process. To address this, our team developed the CLEARdashboard (Consumer Led Evidence - Amplification Resources dashboard) as an educational platform to assist users in comparing the technical specifications of various OTC hearing aids. In this study, we propose a new key feature for the CLEARdashboard that uses Natural Language Processing (NLP) methods to analyze product reviews from two prominent hearing aid online retailers. Analyzing product reviews with NLP is particularly helpful because these reviews often contain detailed, real-world insights into the performance and usability of hearing aids that may not be captured in technical specifications alone. We used NLP techniques to automatically summarize large volumes of user feedback into concise "pros and cons" lists, giving patients a clearer understanding of the strengths and limitations of each device. This approach saves patients from manually sifting through extensive reviews and helps them make informed choices based on aggregated consumer experiences. The generated summaries were validated by three human evaluators to ensure the most comprehensive and reliable method of presenting this information, enhancing the decision-making process for individuals selecting OTC hearing aids.
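As a rough illustration of the review-summarization idea (the abstract does not specify the underlying method), a single LLM prompt can distill a batch of reviews into a pros-and-cons list; the model name and reviews below are placeholders.

```python
# Illustrative sketch only: prompt a generic LLM to turn consumer reviews
# into "Pros" and "Cons" bullets. Model name and reviews are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

reviews = [
    "Battery lasts all day, but the app pairing is confusing.",
    "Comfortable fit; speech in noisy restaurants is still hard to follow.",
]

prompt = (
    "You summarize consumer reviews of an over-the-counter hearing aid.\n"
    "Return two short bulleted lists titled 'Pros' and 'Cons', citing only "
    "points that appear in the reviews.\n\nReviews:\n"
    + "\n".join(f"- {r}" for r in reviews)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model choice
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)
```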
Speaker(s):
Yanshan Wang, PhD
University of Pittsburgh
Author(s):
Alisa Stolyar; Jamie Katz, MS - University of Pittsburgh; Catherine Dymowski, BS - University of Pittsburgh; Tierney Lyons, BS - University of Pittsburgh; Aravind Parthasarathy, PhD - University of Pittsburgh; Hari Bharadwaj, PhD - University of Pittsburgh; Elaine Mormer, PhD - University of Pittsburgh; Catherine Palmer, PhD - University of Pittsburgh; Yanshan Wang, PhD - University of Pittsburgh;
BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning
Presentation Time: 04:15 PM - 04:30 PM
Abstract Keywords: Natural Language Processing, Informatics Research/Biomedical Informatics Research Methods, Machine Learning, Generative AI, and Predictive Modeling
Working Group: Natural Language Processing Working Group
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Large language models (LLMs) such as ChatGPT are fine-tuned on large and diverse instruction-following corpora, and can generalize to new tasks. However, those instruction-tuned LLMs often perform poorly on specialized medical natural language understanding (NLU) tasks that require domain knowledge, granular text comprehension, and structured data extraction. To bridge the gap, we: (1) propose a unified prompting format for 7 important NLU tasks, (2) curate an instruction-tuning dataset, MNLU-Instruct, utilizing diverse existing open-source medical NLU corpora, and (3) develop BioMistral-NLU, a generalizable medical NLU model, through fine-tuning BioMistral on MNLU-Instruct. We evaluate BioMistral-NLU in a zero-shot setting, across 6 important NLU tasks, from two widely adopted medical NLU benchmarks: BLUE and BLURB. Our experiments show that BioMistral-NLU outperforms the original BioMistral, as well as the proprietary LLMs ChatGPT and GPT-4. Our dataset-agnostic prompting strategy and instruction-tuning step over diverse NLU tasks enhance LLMs' generalizability across diverse medical NLU tasks. Our ablation experiments show that instruction tuning on a wider variety of tasks, even when the total number of training instances remains constant, enhances downstream zero-shot generalization.
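A hypothetical example of casting an existing medical NLU annotation into a single instruction/response record, in the spirit of the unified prompting format the abstract describes; the actual MNLU-Instruct schema is defined by the authors and may differ.

```python
# Illustrative guess at converting a medical NLU example (here, NER) into a
# unified instruction/response record for supervised fine-tuning; the real
# MNLU-Instruct format is the authors' and may look different.
import json

def to_instruction_record(task: str, text: str, answer: str) -> dict:
    prompt = (
        f"Task: {task}\n"
        f"Input: {text}\n"
        "Answer with the requested structured output only."
    )
    return {"instruction": prompt, "output": answer}

example = to_instruction_record(
    task="Named entity recognition (problems, treatments, tests)",
    text="Patient started metformin for type 2 diabetes.",
    answer="treatment: metformin | problem: type 2 diabetes",
)
print(json.dumps(example, indent=2))
```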
Speaker(s):
Yujuan Fu, BSE
University of Washington
Author(s):
Giridhar Kaushik Ramachandran, Student - George Mason University; Namu Park, MS - Biomedical Informatics and Medical Education, University of Washington; Kevin Lybarger, PhD - George Mason University; Fei Xia, PhD - University of Washington; Ozlem Uzuner, PhD - George Mason University; Meliha Yetisgen, PhD - University of Washington;
Automatically Identifying Event Reports of Workplace Violence and Communication Failures using Large Language Models
Presentation Time: 04:30 PM - 04:45 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Learning Healthcare System, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Safety event reporting forms a cornerstone of identifying and mitigating risks to patient and staff safety. However, variability in reporting and limited resources to analyze and classify event reports delay healthcare organizations' ability to rapidly identify safety event trends and to improve workplace safety. We demonstrated how large language models can classify safety event report narratives as workplace violence (F1: 0.80 for physical violence; F1: 0.94 for verbal abuse) and communication failures (F1: 0.94) as a first step toward enabling automated labeling of safety event reports and ultimately improving workplace safety.
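A minimal, hypothetical sketch of the kind of pipeline the abstract summarizes: prompt an LLM to label an event-report narrative, then score predictions against human labels with F1. The model name, prompt, narratives, and labels are placeholders, not the authors' data or code.

```python
# Illustrative sketch (not the authors' pipeline): LLM labels each narrative,
# and F1 is computed against human annotations, as reported in the abstract.
from openai import OpenAI
from sklearn.metrics import f1_score

client = OpenAI()  # assumes OPENAI_API_KEY is set

def label_narrative(narrative: str) -> int:
    """Return 1 if the narrative describes physical workplace violence, else 0."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{
            "role": "user",
            "content": "Does this safety event report describe physical workplace "
                       f"violence? Answer yes or no.\n\n{narrative}",
        }],
        temperature=0,
    )
    return int(resp.choices[0].message.content.strip().lower().startswith("yes"))

narratives = ["Visitor struck a nurse during triage.", "Medication order was delayed."]
gold = [1, 0]  # hypothetical human labels
preds = [label_narrative(n) for n in narratives]
print("Physical violence F1:", f1_score(gold, preds))
```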
Speaker(s):
Sa Youn Hwang, MS
University of Pennsylvania
Author(s):
Mike Becker, BSc - University of Pennsylvania; Sy Hwang, MS - University of Pennsylvania; Emily Schriver, MS - University of Pennsylvania; Caryn Douma, RN MS - University of Pennsylvania; Caoimhe Duffy, MD MSc - University of Pennsylvania; Joshua Atkins, MD PhD - University of Pennsylvania; Caitlyn McShane, MBA - University of Pennsylvania; Jason Lubken, BS - University of Pennsylvania; Asaf Hanish, MPH - University of Pennsylvania; John McGreevey, MD - University of Pennsylvania; Susan Regli, PhD - University of Pennsylvania Health System; Danielle Mowery, PhD, MS, MS, FAMIA - University of Pennsylvania;
AE2Vec: Automated electronic health record code grouping for phenotyping
Presentation Time: 04:45 PM - 05:00 PM
Abstract Keywords: EHR-based Phenotyping, Knowledge Representation, Management, or Engineering, Informatics of Cancer Immunotherapy, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Computable phenotyping has proven a powerful tool for identifying and/or characterizing patients. However, models cannot inherently understand relationships between medical concepts in electronic health records. In this study, we developed an automated method of sorting structured EHR codes into labeled, disease-relevant groupings, and evaluated their clinical relevance as well as their utility in downstream phenotyping tasks.
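The abstract does not detail the AE2Vec method itself, so the sketch below only illustrates one common way to derive code groupings automatically: learn embeddings from per-patient code sequences (a word2vec-style approach, which the "2Vec" name hints at but does not confirm) and cluster them. All codes and parameters are toy values.

```python
# Rough sketch of automatic EHR code grouping: embed codes from per-patient
# sequences, then cluster the embeddings into candidate groupings.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

# Each "sentence" is one patient's ordered sequence of diagnosis/drug codes (toy data).
patient_sequences = [
    ["E11.9", "metformin", "I10"],
    ["E11.9", "insulin", "E11.65"],
    ["I10", "lisinopril", "I25.10"],
    ["I25.10", "atorvastatin", "I10"],
]

w2v = Word2Vec(sentences=patient_sequences, vector_size=32, window=5, min_count=1, epochs=50)
codes = list(w2v.wv.index_to_key)
vectors = [w2v.wv[c] for c in codes]

groups = KMeans(n_clusters=2, n_init="auto", random_state=0).fit_predict(vectors)
for code, group in zip(codes, groups):
    print(group, code)
```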
Speaker(s):
Steven Tran, BS
Northwestern University - Feinberg School of Medicine
Author(s):
Steven Tran, BS - Northwestern University - Feinberg School of Medicine; Abel Kho, MD, FACMI - Northwestern University; Catherine Gao, MD - Northwestern; David Liebovitz, MD - Northwestern University Feinberg School of Medicine; Jeffrey Sosman, MD - Northwestern University Feinberg School of Medicine; Yuan Luo, PhD - Northwestern University; Theresa Walunas, PhD - Northwestern University;