11/17/2025 | 3:30 PM – 4:45 PM | Room 9
S51: Artificial Intelligence in Health Professions Education: Personalization, Prediction, and Pedagogy
Presentation Type: LIEAF
GenPACKS: Validation of AI-Generated Clinical Questions with Remedial Explanations for Physician Assistant Education
Presentation Time: 03:30 PM - 03:42 PM
Abstract Keywords: Education and Training, Large Language Models (LLMs), Evaluation
Primary Track: Applications
Programmatic Theme: Academic Informatics / LIEAF
Fluctuating PANCE first-time pass rates for the Physician Assistant (PA) program at the University of Pittsburgh prompted the development of GenPACKS, an AI-driven clinical knowledge assessment system. Using retrieval-augmented generation with large language models, we generated 160 synthetic Question-Answer-Explanation (QAE) triplets across three medical domains. An ensemble of LLMs validated 130 QAEs, with subsequent faculty review showing 50% unanimous and 93.8% majority acceptance. The research provides a foundational methodology for AI-powered remediation tools for PA education assessment.
Speaker:
I Made Agus Setiawan, PhD
University of Pittsburgh
Authors:
Maria Yuliana, Bachelor - University of Pittsburgh; Bayu Aryoyudanta, Master - University of Pittsburgh; Bob Hua, Master - University of Pittsburgh; Haomin Hu, PhD in Rehabilitation Science - University of Pittsburgh; Dipu Patel, DMSc - University of Pittsburgh; David C. Beck, EdD - University of Pittsburgh; Andi Saptono, PhD - University of Pittsburgh School of Health and Rehabilitation Sciences; Bambang Parmanto, PhD - University of Pittsburgh School of Health and Rehabilitation Sciences;
Assessment of GPT-based Conversational Agents Aimed to Reduce Healthcare Provider Stigma
Presentation Time: 03:42 PM - 03:54 PM
Abstract Keywords: Artificial Intelligence, Education and Training, Large Language Models (LLMs)
Primary Track: Applications
Programmatic Theme: Academic Informatics / LIEAF
This study evaluated GPT-based conversational agents in tasks related to healthcare provider stigma. The main finding was that GPT-4o models, using Role-Playing (RP) and Chain of Thought (CoT) techniques, outperformed other models in tasks such as defining healthcare provider stigma, identifying types of stigma, and explaining its consequences. These results suggest that advanced prompting techniques significantly enhance the agent’s ability to deliver complex and nuanced information about healthcare provider stigma.
Speaker:
David Villarreal-Zegarra, MPH
Department of Biomedical Informatics, University of Utah
Authors:
David Villarreal-Zegarra, MPH - Department of Biomedical Informatics, University of Utah; Mahony Reategui Rivera, MD - University of Utah; Yscenia Paredes-Gonzales, MSc - Digital Health Research Center, Lima, Peru; Gianfranco Centeno-Terrazas, MPH - Digital Health Research Center, Instituto Peruano de Orientación Psicológica, Lima, Peru; Joseph Finkelstein, MD, PhD - University of Utah;
Optimizing Open-Source LLM Performance for Detecting Diagnosis of Heart Failure
Presentation Time: 03:54 PM - 04:06 PM
Abstract Keywords: Large Language Models (LLMs), Machine Learning, Artificial Intelligence
Primary Track: Applications
Programmatic Theme: Academic Informatics / LIEAF
A key challenge in improving healthcare is the rapid identification of patients with specific conditions from vast amounts of unstructured clinical notes. We hypothesize that detection of a Heart Failure diagnosis in clinical notes using Large Language Models (LLMs) can be improved by enhancing the RAG query, context-aware chunking, medical domain adaptation, and instruction-tuning. Overall, instruction fine-tuning and domain adaptation significantly improved performance, and context-aware chunking led to a moderate improvement in performance.
Speaker:
Ayush Khaneja, MS
Massachusetts General Hospital, Boston
Authors:
Prabin Shakya, PhD (running) - Massachusetts General Hospital; Ozan Unlu, MD - Mass General Brigham; David Zelle; Shahzad Hassan, MBBS, MD - Brigham and Women's Hospital; Benjamin M. Scirica, MD - Brigham and Women's Hospital / Harvard Medical School; Alexander Blood, SMD - Brigham and Women's Hospital / Harvard Medical School; Kavishwar Wagholikar, MD, PhD - Harvard Medical School / MGH;
To what Degree can LLMs Support Medical Informatics Research? Examining the Interplay of Research Support LLMs with LLM Critics
Presentation Time: 04:06 PM - 04:18 PM
Abstract Keywords: Controlled Terminologies, Ontologies, and Vocabularies, Large Language Models (LLMs), Standards
Primary Track: Foundations
Programmatic Theme: Academic Informatics / LIEAF
The rapid development of Large Language Models (LLMs) has opened up new possibilities for their role in supporting research. This study assesses whether LLMs can generate “thoughtful” research plans in the domain of Medical Informatics and whether LLM-generated critiques can improve such plans. Using an LLM pipeline, we prompt four LLMs to generate primary research plans. Subsequently, these plans are mutually critiqued and then the LLMs are prompted to refine their outputs based on these critiques. These original and improved responses are then reviewed by human evaluators for errors, hallucinations, etc. We employ ROUGE scores, cosine similarity, and length differences to quantify similarity across responses. Our findings reveal variations in outputs among four LLMs, the impact of critiques, and differences between primary and secondary outputs. All LLMs produce cogent outputs and critiques, integrating feedback when generating improved outputs. Human evaluators can distinguish between primary and secondary responses in most cases.
Speaker:
Naren Khatwani, PhD Student
New Jersey Institute of Technology
Authors:
Lijing Wang, PhD - New Jersey Institute of Technology; James Geller, PhD - NJIT;
Predicting Amyotrophic Lateral Sclerosis Progression: A Multi-task Screener-Learner Approach
Presentation Time: 04:18 PM - 04:30 PM
Abstract Keywords: Deep Learning, Artificial Intelligence, Machine Learning, Diagnostic Systems
Working Group: Knowledge Discovery and Data Mining Working Group
Primary Track: Applications
Programmatic Theme: Academic Informatics / LIEAF
Amyotrophic lateral sclerosis (ALS) is a rare progressive motor neuron disease with complex heterogeneity. This paper presents a multi-task learning model to simultaneously predict ALS Functional Rating Scale (ALSFRS) reduction rate and Vital Capacity (VC) measurements using the PRO-ACT dataset. We developed a multi-task screener-learner model that leverages the shared representation between prediction tasks, which showed superior performance compared to single-task models.
Speaker:
Hamza Turabieh, PhD
University of Missouri
Authors:
Xing Song, PhD - University of Missouri; Jeffrey Statland, M.D. - University of Kansas Medical Center in Kansas City;
A Nurse Innovation Led Whole-Person Health App for Student Learning and Research Opportunities in an Informatics Graduate Program
Presentation Time: 04:30 PM - 04:42 PM
Abstract Keywords: Controlled Terminologies, Ontologies, and Vocabularies, Data Sharing, Education and Training, Curriculum Development, Nursing Informatics, Mobile Health, Teaching Innovation
Primary Track: Applications
Programmatic Theme: Academic Informatics / LIEAF
The MyStrengths+MyHealth (MSMH) app was developed in 2017 by nursing informatics faculty at the University of Minnesota to facilitate whole-person health assessments, incorporating social and behavioral determinants of health. Using the Omaha System as a standardized terminology framework, the app collects data on strengths, challenges, and needs across 42 concepts within four domains. Since its inception, MSMH has been utilized by over 2,000 adults in various studies and has been recognized for its innovation, notably earning an American Nurses Association Innovation Award in 2023. This presentation explores the integration of MSMH into nursing informatics education as both a learning and research tool. Faculty piloted the app in informatics practicums, particularly in population health informatics and standards and interoperability courses. Student involvement extended beyond coursework to participation in research phases, including data collection, management, analysis, and dissemination. The breadth and depth of data collected have provided rich opportunities for real-world learning through data visualization, dashboard creation, and community-level analytics. Since 2020, MSMH has been incorporated into informatics education at the University of Minnesota, engaging 13 students in practical applications and scholarly research. The app has supported studies involving diverse populations, leading to student co-authorship in nursing, public health, and health informatics publications. Findings indicate that MSMH enhances student learning in structured data, interoperability, and person-generated health data. Its continued use and expansion within academic programs can further advance competencies in health informatics, data standards, and patient-centered care analytics.
Speaker:
Robin Austin, PhD, DNP, DC, RN, NI-BC, FAMIA, FAAN
University of Minnesota, School of Nursing
Author:
Sripriya Rajamani, MBBS, MPH, PhD, FAMIA - University of Minnesota;
Assessment of GPT-based Conversational Agents Aimed to Reduce Healthcare Provider Stigma
Category
Podium Abstract
11/17/2025 04:45 PM (Eastern Time (US & Canada))