Assessing the Quality of Synthetic Data Produced by LLaMA and ChatGPT in the Context of Obstetric Stigmatizing Language
Poster Number: P115
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Large Language Models (LLMs), Natural Language Processing, Nursing Informatics
Primary Track: Foundations
Programmatic Theme: Clinical Research Informatics
This exploratory study assessed the use of LLaMA and ChatGPT to generate synthetic data containing stigmatizing language in obstetric care. Three prompting approaches were used to generate six sets of synthetic sentences, which nurse researchers rated for similarity to clinicians’ notes and for how stigmatizing the language was. LLaMA's few-shot approach yielded the highest similarity and stigmatizing ratings, significantly outperforming the other approaches. ChatGPT's results varied across approaches, with no significant differences among them.
Speaker(s):
Jihye Scroggins, PhD
Columbia University School of Nursing
Author(s):
Jihye Scroggins, PhD - Columbia University School of Nursing; Veronica Barcelona, PhD - Columbia University School of Nursing; Ismael Hulchafo, MS - Columbia University School of Nursing; Sarah Harkins, BSN, RN - Columbia University School of Nursing; Danielle Scharp, MSN, BSN; Hans Moen, PhD - Aalto University; Anahita Davoudi; Max Topaz, PhD, RN, MA, FAAN, FIAHSI, FACMI - Columbia University School of Nursing
Category
Poster - Student
Description
Date: Monday (11/11)
Time: 05:00 PM to 06:30 PM
Room: Grand Ballroom (Posters)