Understanding Negative Health Outcomes of Vaping by Mining Millions of Posts and Comments in Reddit
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Data Mining, Social Media and Connected Health, Public Health, Natural Language Processing, Patient / Person Generated Health Data (Patient Reported Outcomes)
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Electronic cigarette (vaping) usage in the U.S. has steadily increased, raising significant public health concerns. Extensive research demonstrates various negative health outcomes associated with vaping. However, many potential harms remain understudied, especially those directly reported by users. Social media platforms such as Reddit offer rich, real-time sources of unfiltered personal accounts, presenting a unique opportunity to explore health outcomes beyond traditional clinical research. In this study, we systematically investigated potential negative health outcomes (NHOs) by analyzing millions of posts and comments from 15 active vaping-related subreddits in 2019. Employing robust data-driven methodologies, including advanced natural language processing (NLP) techniques such as sentiment analysis, UMLS tagging, and topic modeling, we identified distinct patterns of vaping-related health concerns. Our findings highlight the value of user-generated content for early detection of emerging risks, guiding clinicians, policymakers, and public health initiatives aimed at mitigating vaping-related harms, particularly among younger populations.
Speaker(s):
DIAN HU, PhD
University of Maryland
Author(s):
DIAN HU, PhD - University of Maryland; Dezhi Wu, PhD - University of South Carolina; Erin Kasson, MS, MSW - Washington University in St. Louis; Patricia Cavazos-Rehg, PhD - Department of Psychiatry, Washington University School of Medicine; Hongfang Liu, PhD - University of Texas Health Science Center at Houston; Ming Huang, PhD - UTHealth Houston
Category
Paper - Student