Times are displayed in (UTC-04:00) Eastern Time (US & Canada)
11/18/2025 | 9:45 AM – 11:00 AM | Room 8
S71: From Threads to Therapy: Social Media as Public Health Evidence
Presentation Type: Oral Presentations
RAG vs Reddit: Decoding Autism Conversations on Reddit with LLMs and Topic Modeling
Presentation Time: 09:45 AM - 10:00 AM
Abstract Keywords: Social Media and Connected Health, Large Language Models (LLMs), Natural Language Processing, Data Mining, Information Extraction
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Social media platforms like Reddit have become vital spaces for autistic individuals and caregivers to seek advice, share experiences, and discuss challenges. Simultaneously, Large Language Models (LLMs) are increasingly used to provide medical guidance. This study examines autism-related discussions on Reddit, comparing them with clinician-patient discussions and evaluating the effectiveness of an autism-specific Retrieval-Augmented Generation (RAG) system. We applied BERTopic to identify key discussion themes in r/autism and r/autism_parenting, revealing significant discussions around behavioral challenges and practical support. Comparing these themes with clinical messages from the University of Missouri Thompson Center for Autism and Neurodevelopment, we found that caregivers in clinical settings focused more on medication management, whereas online discussions emphasized non-traditional therapies. We then assessed LLM-generated responses against Reddit peer advice, discussing the differences in accuracy, relevance, empathy, and helpfulness. This work underscores the potential of RAG systems in enhancing autism-related guidance while emphasizing the importance of community-driven insights in healthcare conversations.
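As a rough illustration of how topic keywords can be surfaced from clustered posts, the sketch below implements a class-based TF-IDF ranking of the kind BERTopic uses to describe topics. It covers only the term-ranking step: the cluster assignments are supplied by hand rather than produced by embedding and clustering, the sample posts are invented for illustration, and the function names `class_tfidf` and `top_terms` are hypothetical. It is not the authors' pipeline.

```python
import math
from collections import Counter


def class_tfidf(clusters):
    """Weight terms per cluster: tf(t, c) * log(1 + A / f(t)).

    clusters: dict mapping a topic id to a list of post strings.
    A is the average word count per cluster, f(t) the count of t overall.
    """
    # Treat each cluster as one long document and count its terms.
    term_counts = {
        c: Counter(" ".join(docs).lower().split()) for c, docs in clusters.items()
    }
    # Frequency of each term across all clusters.
    total = Counter()
    for counts in term_counts.values():
        total.update(counts)
    avg_words = sum(total.values()) / len(term_counts)

    weights = {}
    for c, counts in term_counts.items():
        n = sum(counts.values())
        weights[c] = {
            t: (tf / n) * math.log(1 + avg_words / total[t])
            for t, tf in counts.items()
        }
    return weights


def top_terms(weights, k=3):
    """Return the k highest-weighted terms per cluster (ties broken A-Z)."""
    return {
        c: [t for t, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for c, w in weights.items()
    }


# Invented posts, grouped by hand into two hypothetical topics.
clusters = {
    0: ["meltdown at school again", "school meltdown after noise"],
    1: ["sensory toys help sleep", "weighted blanket help sleep"],
}
print(top_terms(class_tfidf(clusters), k=2))
```

In a real analysis the cluster assignments would come from the topic model itself, and stop words would normally be filtered before ranking.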
Speaker:
Deshan Wattegama, MS
University of Missouri
Authors:
Deshan Wattegama, MS - University of Missouri; Benjamin Black, MD - University of Missouri; Marcus Moen, MS - Thompson Foundation for Autism & Neurodevelopment; Chi-Ren Shyu, PhD, FACMI, FAMIA - University of Missouri-Columbia;
Mining Social Media Data for Influenza Vaccine Effectiveness Using a Large Language Model and Chain-of-Thought Prompting
Presentation Time: 10:00 AM - 10:15 AM
Abstract Keywords: Natural Language Processing, Large Language Models (LLMs), Data Mining, Artificial Intelligence, Patient / Person Generated Health Data (Patient Reported Outcomes), Infectious Diseases and Epidemiology, Social Media and Connected Health
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Influenza vaccine effectiveness (VE) estimation plays a critical role in public health decision-making by quantifying the real-world impact of vaccination campaigns and guiding policy adjustments. Current approaches to VE estimation are constrained by limited population representation, selection bias, and delayed reporting. To address some of these gaps, we propose leveraging large language models (LLMs) with few-shot chain-of-thought (CoT) prompting to mine social media data for real-time influenza VE estimation. We annotated over 4,000 tweets from the 2020–2021 flu season using structured guidelines, achieving high inter-annotator agreement. Our best prompting strategy achieves F1 scores above 87% for identifying influenza vaccination statuses and test outcomes, outperforming traditional supervised fine-tuning methods by large margins. These findings indicate that LLM-based prompting approaches effectively identify relevant social media information for influenza VE estimation, offering a valuable real-time surveillance tool that complements traditional epidemiological methods.
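To make the few-shot chain-of-thought idea concrete, here is a minimal sketch of how such a prompt might be assembled and how a model's completion might be parsed. Everything specific in it is an assumption: the instruction wording, the example tweets and their reasoning, the label scheme (`vaccinated=...; test=...`), and the helper names `build_prompt` and `parse_answer` are invented for illustration and are not taken from the study's prompts or annotation guidelines.

```python
from dataclasses import dataclass


@dataclass
class Example:
    tweet: str
    reasoning: str
    label: str


INSTRUCTION = (
    "Classify the author's flu vaccination status and flu test outcome. "
    "Think step by step, then give the answer."
)

# Invented demonstrations; not drawn from the annotated dataset.
FEW_SHOT = [
    Example(
        "Got my flu shot last month and still tested positive today.",
        "The author says they received a flu shot and reports a positive test.",
        "vaccinated=yes; test=positive",
    ),
    Example(
        "Never bothered with the vaccine. Flu swab came back negative.",
        "The author says they did not get vaccinated and reports a negative test.",
        "vaccinated=no; test=negative",
    ),
]


def build_prompt(tweet, examples):
    """Assemble instruction, worked examples, and the new tweet."""
    lines = [INSTRUCTION, ""]
    for ex in examples:
        lines += [
            f"Tweet: {ex.tweet}",
            f"Reasoning: {ex.reasoning}",
            f"Answer: {ex.label}",
            "",
        ]
    # End on an open "Reasoning:" so the model writes its rationale first.
    lines += [f"Tweet: {tweet}", "Reasoning:"]
    return "\n".join(lines)


def parse_answer(completion):
    """Read key=value pairs from the last 'Answer:' line of a completion."""
    answer_lines = [
        ln for ln in completion.splitlines() if ln.strip().startswith("Answer:")
    ]
    if not answer_lines:
        return {}
    fields = answer_lines[-1].split("Answer:", 1)[1]
    result = {}
    for part in fields.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            result[key.strip()] = value.strip()
    return result


print(build_prompt("Had the jab in October, swab was negative.", FEW_SHOT))
```

The prompt text would be sent to whichever LLM is being evaluated; the parsed labels can then be compared with human annotations to compute per-label F1.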
Speaker:
Dongfang Xu, PhD
Cedars-Sinai Medical Center
Authors:
Dongfang Xu, PhD - Cedars-Sinai Medical Center; Guillermo Lopez Garcia, PhD - Cedars-Sinai Medical Center; Karen O’Connor, MSc - University of Pennsylvania; Haily Holston, BSc - Cedars-Sinai Medical Center; Ari Klein - University of Pennsylvania; Ivan Flores Amaro, BSc - Cedars-Sinai Medical Center; Matthew Scotch, PhD, MPH - Arizona State University; Graciela Gonzalez-Hernandez, PhD - Cedars-Sinai Medical Center;
Healthy Lifestyles and Self-Improvement Videos on YouTube: A Thematic Analysis of Teen-Targeted Social Media Content
Presentation Time: 10:15 AM - 10:30 AM
Abstract Keywords: Human-computer Interaction, Public Health, Qualitative Methods
Primary Track: Applications
Programmatic Theme: Consumer Health Informatics
As teenagers increasingly turn to social media for health-related information, understanding the values of teen-targeted content has become important. Although videos on healthy lifestyles and self-improvement are gaining popularity on social media platforms like YouTube, little is known about how these videos engage and benefit teenage viewers. To address this, we conducted a thematic analysis of 44 YouTube videos and 66,901 comments. We found that these videos provide various advice on teenagers’ common challenges, use engaging narratives for authenticity, and foster teen-centered communities through comments. However, a few videos also gave misleading advice to adolescents that can be potentially harmful. Based on our findings, we discuss design implications for creating relatable and intriguing social media content for adolescents. Additionally, we suggest ways for social media platforms to promote healthier and safer experiences for teenagers.
Speaker:
Kyuha Jung, MA
University of California, Irvine
Authors:
Kyuha Jung, MA - University of California, Irvine; Tyler Kim, High School - South High School; Yunan Chen, PhD - University of California, Irvine;
Two-layer Retrieval Augmented Generation Framework for Low-resource Medical Question-answering Using Reddit Data: Proof of Concept
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: Large Language Models (LLMs), Public Health, Social Media and Connected Health
Primary Track: Applications
Programmatic Theme: Public Health Informatics
This study develops a two-layer retrieval-augmented generation framework for medical question-answering using social media data on novel psychoactive substances. When evaluated on Reddit data about xylazine and ketamine, GPT-4 and a quantized large language model (Nous-Hermes-2-7B-DPO) performed comparably across multiple metrics, demonstrating that smaller large language models can be effective for medical question-answering.
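The abstract does not spell out what the two layers are, so the sketch below rests on an assumption: the first layer summarizes each retrieved post with respect to the question, and the second layer aggregates those summaries into a single answer. Retrieval here is a simple word-overlap ranking standing in for whatever retriever the framework actually uses, `llm` is a hypothetical callable that maps a prompt string to a completion string, and the sample posts are invented.

```python
from collections import Counter


def overlap_score(query, post):
    """Count words shared between query and post (a stand-in retriever)."""
    q = Counter(query.lower().split())
    p = Counter(post.lower().split())
    return sum((q & p).values())


def retrieve(query, posts, k=3):
    """Return the k posts with the highest word overlap with the query."""
    ranked = sorted(posts, key=lambda post: overlap_score(query, post), reverse=True)
    return ranked[:k]


def two_layer_answer(query, posts, llm, k=3):
    """Assumed two-layer flow: per-post summaries, then one aggregated answer."""
    retrieved = retrieve(query, posts, k)

    # Layer 1: one short, question-focused summary per retrieved post.
    summaries = [
        llm(
            f"Question: {query}\nPost: {post}\n"
            "Summarize what this post says about the question."
        )
        for post in retrieved
    ]

    # Layer 2: a single answer generated from the summaries only.
    joined = "\n".join(f"- {s}" for s in summaries)
    return llm(
        f"Question: {query}\nSummaries:\n{joined}\n"
        "Write one answer based only on these summaries."
    )


# Invented posts and a placeholder model that just reports prompt length.
posts = [
    "xylazine wounds heal slowly",
    "ketamine dose question",
    "my cat is cute",
    "xylazine withdrawal symptoms",
]
print(two_layer_answer("xylazine wounds", posts, lambda p: f"[{len(p)} chars]", k=2))
```

Because `llm` is just a callable, the same flow can be run against a hosted model or a locally served quantized model for side-by-side comparison.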
Speaker:
Sudeshna Das, PhD
Emory University
Authors:
Yao Ge, Master - Emory University; Yuting Guo, MS - Emory University; Swati Rajwal, PhD - Emory University; JaMor Hairston, MSHI, MS - Emory University; Jeanne Powell, PhD - Emory University; Drew Walker, PhD - Emory University; Snigdha Peddireddy, MPH - Emory University; Sahithi Lakamana, Systems Software Engineer - Emory University; Selen Bozkurt Watson, PhD, MS - Emory University; Matthew Reyna, PhD - Emory University; Reza Sameni, PhD - Emory University; Yunyu Xiao, PhD - Weill Cornell Medicine, Population Health Sciences; Sangmi Kim, PhD, MPH, RN - Nell Hodgson Woodruff School of Nursing, Emory University; Rasheeta Chandler, PhD - Emory University; Natalie Hernandez, PhD - Morehouse School of Medicine; Danielle Mowery, PhD, MS, MS, FAMIA - University of Pennsylvania; Jeanmarie Perrone, MD - Perelman School of Medicine at the University of Pennsylvania; Abeed Sarker, PhD - Emory University School of Medicine;
Understanding Negative Health Outcomes of Vaping by Mining Millions of Posts and Comments in Reddit
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Data Mining, Social Media and Connected Health, Public Health, Natural Language Processing, Patient / Person Generated Health Data (Patient Reported Outcomes)
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Electronic cigarette (vaping) usage in the U.S. has steadily increased, raising significant public health concerns. Extensive research demonstrates various negative health outcomes associated with vaping. However, many potential harms remain understudied, especially those directly reported by users. Social media platforms such as Reddit offer rich, real-time sources of unfiltered personal accounts, presenting a unique opportunity to explore health outcomes beyond traditional clinical research. In this study, we systematically investigated potential negative health outcomes (NHOs) by analyzing millions of posts and comments from 15 active vaping-related subreddits in 2019. Employing robust data-driven methodologies, including advanced natural language processing (NLP) techniques such as sentiment analysis, UMLS tagging, and topic modeling, we identified distinct patterns of vaping-related health concerns. Our findings highlight the value of user-generated content for early detection of emerging risks, guiding clinicians, policymakers, and public health initiatives aimed at mitigating vaping-related harms, particularly among younger populations.
Speaker:
Dian Hu, PhD
University of Maryland
Authors:
Dian Hu, PhD - University of Maryland; Dezhi Wu, PhD - University of South Carolina; Erin Kasson, MS, MSW - Washington University in St. Louis; Patricia Cavazos-Rehg, Ph.D. - Department of Psychiatry, Washington University School of Medicine; Hongfang Liu, PhD - University of Texas Health Science Center at Houston; Ming Huang, PhD - UTHealth Houston;