Using Large Language Models to Detect Stigmatizing Language in Clinical Documentation
Presentation Time: 10:00 AM - 10:15 AM
Abstract Keywords: Fairness and Elimination of Bias, Natural Language Processing, Large Language Models (LLMs)
Working Group: Natural Language Processing Working Group
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
Stigmatizing language (SL) in electronic health records (EHRs) is widespread and has been measured as a marker of implicit bias in clinical notes. The use of SL significantly influences outcomes for individuals experiencing substance use disorders (SUDs) and various other chronic conditions.1 Previous research on understanding these biases relied on qualitative interviews of patients and healthcare providers.1 Recent advances in natural language processing (NLP) have made it possible to analyze SL in clinical notes from EHRs.2 The National Institute on Drug Abuse's (NIDA) "Words Matter"3 initiative provides a list of terms and phrases to avoid in clinical documentation and patient-physician communication. However, many of these terms appear in clinical notes in contexts that do not necessarily constitute SL. It is therefore imperative to understand the contextual nuances of these terms in order to accurately identify and characterize SL in clinical notes. Large language models (LLMs) have gained popularity for their efficacy in language understanding, marking significant milestones in NLP; these models excel at tasks such as text generation and question answering. We developed an automated method to detect and characterize SL in clinical notes using an advanced fine-tuned LLM, Fine-tuned LAnguage Net-T5 (FLAN-T5).4 We used FLAN-T5 in a "question-answering" fashion to extract SL from clinical text.
Speaker(s):
Braja Gopal Patra, PhD
Weill Cornell Medicine; Dept of Population Health Sciences; Div of Health Informatics
Author(s):
Prakash Adekkanattu, PhD - Weill Cornell Medicine; Veer Vekaria; Marianne Sharko, MD, MS - Weill Cornell Medical College Health Policy and Research; Jessica Ancker, MPH, PhD, FACMI - Vanderbilt University Medical Center; Jyotishman Pathak, PhD - Weill Cornell Medical College, Cornell University;
Category: Podium Abstract