Two-layer Retrieval Augmented Generation Framework for Low-resource Medical Question-answering Using Reddit Data: Proof of Concept
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: Large Language Models (LLMs), Public Health, Social Media and Connected Health
Primary Track: Applications
Programmatic Theme: Public Health Informatics
This study develops a two-layer retrieval-augmented generation framework for medical question-answering using social media data on novel psychoactive substances. In an evaluation on Reddit data about xylazine and ketamine, GPT-4 and a quantized large language model (Nous-Hermes-2-7B-DPO) performed comparably across multiple metrics, indicating that smaller large language models can be effective for medical question-answering.
Speaker(s):
Sudeshna Das, PhD
Emory University
Author(s):
Yao Ge, Master - Emory University; Yuting Guo, MS - Emory University; Swati Rajwal, PhD - Emory University; JaMor Hairston, MSHI, MS - Emory University; Jeanne Powell, PhD - Emory University; Drew Walker, PhD - Emory University; Snigdha Peddireddy, MPH - Emory University; Sahithi Lakamana, Systems Software Engineer - Emory University; Selen Bozkurt Watson, PhD, MS - Emory University; Matthew Reyna, PhD - Emory University; Reza Sameni, PhD - Emory University; Yunyu Xiao, PhD - Weill Cornell Medicine, Population Health Sciences; Sangmi Kim, PhD, MPH, RN - Nell Hodgson Woodruff School of Nursing, Emory University; Rasheeta Chandler, PhD - Emory University; Natalie Hernandez, PhD - Morehouse School of Medicine; Danielle Mowery, PhD, MS, MS, FAMIA - University of Pennsylvania; Jeanmarie Perrone, MD - Perelman School of Medicine at the University of Pennsylvania; Abeed Sarker, PhD - Emory University School of Medicine
Category
Podium Abstract