11/17/2025 | 8:00 AM – 9:15 AM | Room 7
S18: Design, Test, Repeat: Evaluating Intelligent Systems in Clinical Practice
Presentation Type: Oral Presentations
Beyond Random Splitting: Evaluating the Impact of Data Partitioning Strategies on Ventilator-Associated Pneumonia Prediction Using Electronic Health Records
Presentation Time: 08:00 AM - 08:12 AM
Abstract Keywords: Artificial Intelligence, Critical Care, Deep Learning, Bioinformatics
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Ventilator-Associated Pneumonia (VAP) significantly impacts critical care outcomes, yet prediction models often overlook healthcare data’s hierarchical structure. Using MIMIC-III data, we developed a multi-source extraction approach integrating structured data with clinical notes, identifying 679 VAP and 3,207 non-VAP cases. We investigated how data splitting methodologies affect model performance by comparing four strategies: Vent Session-Based Split, Vent Session-Based Split on Single ICU Stays, Admission-Based Split, and Admission-Based Split on Single ICU Stays.
Evaluating Random Forest, Decision Tree, and XGBoost models revealed that conventional random splitting yielded moderately high performance (AUROC: 79-81%) and that restricting to single ICU stays surprisingly improved performance (AUROC: 86-87%), while admission-based approaches produced more realistic estimates (AUROC: 72-76%).
Feature analysis identified mechanical ventilation (MV) hours, systolic blood pressure, and urine counts as consistently important predictors in the Random Forest model. These findings demonstrate that robust VAP prediction requires evaluation frameworks that respect healthcare data's hierarchical nature.
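The splitting comparison above turns on grouping held-out data by hospital admission rather than by individual ventilation session, so that no patient's admission contributes to both training and test sets. A minimal sketch of such an admission-grouped split, assuming session records carry an admission identifier (the `hadm_id` field name and the `admission_based_split` helper are illustrative, not from the study):

```python
import random

def admission_based_split(sessions, test_frac=0.25, seed=0):
    """Split ventilation-session records into train/test sets so that all
    sessions from the same hospital admission land in the same fold,
    preventing admission-level leakage across the split."""
    admissions = sorted({s["hadm_id"] for s in sessions})
    rng = random.Random(seed)
    rng.shuffle(admissions)
    n_test = max(1, int(len(admissions) * test_frac))
    test_ids = set(admissions[:n_test])
    train = [s for s in sessions if s["hadm_id"] not in test_ids]
    test = [s for s in sessions if s["hadm_id"] in test_ids]
    return train, test

# Example: 10 admissions, 2 ventilation sessions each.
sessions = [{"hadm_id": i // 2, "session": i} for i in range(20)]
train, test = admission_based_split(sessions, test_frac=0.3, seed=42)
```

A conventional random split over sessions would, by contrast, let two sessions from the same admission straddle the train/test boundary, which is one plausible source of the optimistic AUROC figures the abstract reports for session-level splitting.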
Speaker:
Miriam Asare-Baiden, MS
Emory University
Authors:
Miriam Asare-Baiden, MS - Emory University; Wenhui Zhang, PhD - Emory University; Vicki Hertzberg, PhD - Emory University; Joyce Ho, PhD - Emory University
Towards Inpatient Discharge Summary Automation via Large Language Models: A Multidimensional Evaluation with a HIPAA-Compliant Instance of GPT-4o and Clinical Expert Assessment
Presentation Time: 08:12 AM - 08:24 AM
Abstract Keywords: Documentation Burden, Large Language Models (LLMs), Evaluation, Transitions of Care, Healthcare Quality, Patient Safety
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Large language models (LLMs) have demonstrated potential to automate clinical documentation tasks that may reduce clinician burden, such as generation of hospital discharge summaries. Prior research used older LLMs and limited data, raising concerns about fabrications and omissions. In this study, we evaluated the automatic generation of inpatient Internal Medicine discharge summaries using a HIPAA-compliant Microsoft Azure instance of OpenAI's GPT-4o. Both human-written and AI-generated discharge summaries were scored by Internal Medicine hospital faculty for quality, readability/conciseness, factuality and completeness, and the presence of hallucinations/omissions and their impact on safety. Our results showed that the AI-generated discharge summaries significantly outperformed the actual human-written summaries in both quality and readability/conciseness and were comparable in factuality and completeness, at minimal cost.
Speaker:
Tyler Osborne, PhD Student
Stony Brook University
Authors:
Tyler Osborne, PhD Student - Stony Brook University; Sadia Abbasi, M.D. - Stony Brook Medicine; Stephanie Hong, M.D. - Stony Brook Medicine; Robert Sexton, M.D. - Stony Brook Medicine; Jonathan Ambut, MD - Stony Brook University Hospital; Neil Patel, MD, MBA - Stony Brook University; Richard Rosenthal, MD; Lyncean Ung, D.O. - Stony Brook Medicine; Fusheng Wang, Ph.D. - Stony Brook University; Rachel Wong, M.D., M.P.H., M.B.A., M.S. - Stony Brook Medicine;
PHEONA: An Evaluation Framework for Large Language Model-based Approaches to Computational Phenotyping
Presentation Time: 08:24 AM - 08:36 AM
Abstract Keywords: Evaluation, Large Language Models (LLMs), Critical Care
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
Computational phenotyping is essential for biomedical research but often requires significant time and resources, especially since traditional methods typically involve extensive manual data review. While machine learning and natural language processing advancements have helped, further improvements are needed. Few studies have explored using Large Language Models (LLMs) for these tasks despite known advantages of LLMs for text-based tasks. To facilitate further research in this area, we developed an evaluation framework, Evaluation of PHEnotyping for Observational Health Data (PHEONA), that outlines context-specific considerations. We applied and demonstrated PHEONA on concept classification, a specific task within a broader phenotyping process for Acute Respiratory Failure (ARF) respiratory support therapies. From the sample concepts tested, we achieved high classification accuracy, suggesting the potential for LLM-based methods to improve computational phenotyping processes.
Speaker:
Sarah Pungitore, MS
The University of Arizona
Authors:
Vignesh Subbian, PhD - University of Arizona; Shashank Yadav, PhD Student - University of Arizona;
Supporting Patients in Managing Electronic Health Records and Biospecimens Consent for Research: Insights from a Mixed-Methods Usability Evaluation of the iAGREE Portal
Presentation Time: 08:36 AM - 08:48 AM
Abstract Keywords: Patient Engagement and Preferences, Usability, User-centered Design Methods
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
De-identified health data are frequently used in research. As AI advances heighten the risk of re-identification, it is important to respond to concerns about transparency, data privacy, and patient preferences. However, few practical and user-friendly solutions exist. We developed iAGREE, a patient-centered electronic consent management portal that allows patients to set granular preferences for sharing electronic health records and biospecimens with researchers. To refine the iAGREE portal, we conducted a mixed-methods usability evaluation with 40 participants from three U.S. health systems. Our results show that the portal received highly positive usability feedback. Moreover, participants identified areas for improvement, suggested actionable enhancements, and proposed additional features to better support informed granular consent while reducing patient burden. Insights from this study may inform further improvements to iAGREE and provide practical guidance for designing patient-centered consent management tools.
Speaker:
Di Hu, Master of Science in Information Systems
University of California - Irvine
Authors:
Di Hu, Master of Science in Information Systems - University of California - Irvine; Xi Lu, PhD - University at Buffalo; Yunan Chen, PhD - University of California, Irvine; Michelle Keller, PhD, MPH - University of Southern California, Leonard School of Gerontology; An Nguyen, OTD, OTR/L - Cedars-Sinai Medical Center; Vu Le, Master of Science - University of California, Irvine; Tsung-Ting Kuo, PhD - Yale University; Lucila Ohno-Machado, MD, PhD - Yale School of Medicine; Kai Zheng, PhD - University of California, Irvine;
Design and Evaluation of EMPATHICA: A Chatbot for Enhancing Medication Literacy
Presentation Time: 08:48 AM - 09:00 AM
Abstract Keywords: Artificial Intelligence, Human-computer Interaction, Patient Engagement and Preferences, Patient Safety, Evaluation, Natural Language Processing, Public Health, Personal Health Informatics
Primary Track: Applications
Programmatic Theme: Consumer Health Informatics
Health literacy significantly impacts patient outcomes, yet many patients struggle to understand complex medication information. Medication non-adherence often results from poor comprehension of drug instructions and contributes to preventable hospitalizations and poor treatment outcomes. With the increasing use of digital health interventions, AI-powered chatbots present an opportunity to improve patient access to understandable and personalized medication information. This study evaluates the usability, accessibility, and effectiveness of EMPATHICA, an AI-powered chatbot designed to provide patient-centric medication information, and assesses whether chatbot-generated responses improve patient comprehension and engagement. We observed participants interacting with the web-based application and asking the chatbot questions, and measured the chatbot's usability and accuracy using qualitative and quantitative measures together with expert physician evaluation of its responses. AI-driven chatbots have the potential to bridge health literacy gaps by providing clear and accessible medication information. By evaluating EMPATHICA, this study contributes to the growing field of AI applications supporting patient-informed medication use.
Speaker:
Yuri Quintana, PhD
Beth Israel Deaconess Medical Center
Authors:
Yuri Quintana, PhD - Beth Israel Deaconess Medical Center; Katherine Bloom, BA - Beth Israel Deaconess Medical Center; Gyana Srivastava, BA - Beth Israel Deaconess Medical Center; Ava Homiar, HBSc - Harvard Medical School; Annlouise Assaf, PhD; Glenda Thomas, BSc - Nceptive; Elizabeth Lowe, BA - Beth Israel Deaconess Medical Center; David Hampton, BSc - Takeda Pharmaceutical Company Limited; Viola Wontor, BA - Pfizer Inc.;
Tracking Timely Diagnostic Resolution After Abnormal Screening Mammograms for Early Breast Cancer Detection: Multi-Site Evaluation of a Novel Electronic Clinical Quality Measure (eCQM)
Presentation Time: 09:00 AM - 09:12 AM
Abstract Keywords: Healthcare Quality, Patient Safety, Informatics Implementation, Population Health
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Brigham and Women’s Hospital developed a novel eCQM, “Rate of Timely Follow-up on Abnormal Screening Mammograms for Breast Cancer Detection.” The objective of this study was to pilot this measure at two health systems with different EHR vendors. The overall eCQM rates were 91.9% and 88.8%, although rates varied by year. The measure demonstrated feasibility, validity, and reliability, and highlighted the importance of tracking performance over time even in systems that consistently performed well.
Speaker:
Ania Syrowatka, PhD
Brigham and Women's Hospital/Harvard Medical School
Authors:
Ania Syrowatka, PhD - Brigham and Women's Hospital/Harvard Medical School; Marlika Marceau, Senior Research Assistant/Bachelor's - Mass General Brigham; Omar Nafaa, MS - Mass General Brigham; Michael Sainlaire - Brigham and Women's Health; Shadi Hijjawi, MD - Penn State Health; Richard Schreiber, MD, FACP, FAMIA - Penn State Health; Raghavendra Davuluri, MS - Penn State Health; Tim Nye, MS - Oracle Health; Catherine Yoon, MS - Brigham and Women's Hospital; Tien Thai - Brigham and Women's Health; Lipika Samal, MD - Brigham and Women's Hospital; Li Zhou, MD, PhD, FACMI, FIAHSI, FAMIA - Brigham and Women's Hospital, Harvard Medical School; Stuart Lipsitz - Brigham and Women's Hospital; David Bates, MD - Mass General Brigham/Harvard University; Patricia C. Dykes, PhD, MA, RN - Brigham and Women's Hospital/Harvard Medical School
Design and Evaluation of EMPATHICA: A Chatbot for Enhancing Medication Literacy
Category: Paper - Regular