Times are displayed in (UTC-07:00) Pacific Time (US & Canada)
11/11/2024 | 1:45 PM – 3:15 PM | Golden Gate 1-2
S43: Mental Health and Social Media - Likes, Shares, and Scares
Presentation Type: Oral
Session Chair:
Li Zhou, MD, PhD, FACMI, FIAHSI, FAMIA - Brigham and Women's Hospital, Harvard Medical School
Description
An onsite recording of this session will be included in the Symposium OnDemand offering.
Large-scale Text Mining of Suicide Attempt improves Identification of Distinct Suicidal Events in Electronic Health Records
Presentation Time: 01:45 PM - 02:00 PM
Abstract Keywords: Natural Language Processing, Information Extraction, Evaluation
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
In this study, we explore a natural language processing (NLP) algorithm’s capacity to identify proximal but distinct suicide attempt (SA) events compared to diagnostic code-based approaches. This study used an NLP algorithm with high precision in identifying SA events, which processes clinical notes for suicide-related text expressions and generates SA outcome relevance scores on mentioned dates. We chart-reviewed all SA visit pairs less than 15 days apart. Despite sample size limitations, our NLP method's point estimate exceeded the code-based model's (0.85 [95% CI: 0.74 - 0.92] vs. 0.78 [95% CI: 0.56 - 0.92]), although the difference was not statistically significant (p = 0.71). More importantly, NLP detected about three times as many SA visit pairs less than 15 days apart as the code-based approach (71 vs. 23), with only 3 overlaps. This study demonstrates NLP's efficacy in identifying distinct SA visit pairs. Given the minimal overlap, we suggest leveraging both clinical notes and diagnostic codes for comprehensive SA event detection.
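The pair comparison described in this abstract can be illustrated with a minimal sketch. It assumes each detection method yields a list of (patient_id, visit_date) SA visits; the function name `close_pairs`, the record layout, and the toy dates are hypothetical and do not represent the authors' pipeline or data.

```python
from datetime import date
from itertools import combinations


def close_pairs(visits, max_gap_days=15):
    """Return same-patient visit pairs that fall on different dates
    fewer than max_gap_days apart.

    `visits` is an iterable of (patient_id, visit_date) tuples
    (a hypothetical layout chosen for this sketch).
    """
    by_patient = {}
    for patient_id, visit_date in visits:
        by_patient.setdefault(patient_id, set()).add(visit_date)

    pairs = set()
    for patient_id, dates in by_patient.items():
        for first, second in combinations(sorted(dates), 2):
            if 0 < (second - first).days < max_gap_days:
                pairs.add((patient_id, first, second))
    return pairs


# Toy inputs for illustration only, not study data.
nlp_visits = [("p1", date(2020, 1, 1)), ("p1", date(2020, 1, 9)),
              ("p2", date(2020, 3, 1)), ("p2", date(2020, 3, 20))]
code_visits = [("p1", date(2020, 1, 1)), ("p1", date(2020, 1, 9)),
               ("p3", date(2020, 5, 2)), ("p3", date(2020, 5, 4))]

nlp_pairs = close_pairs(nlp_visits)
code_pairs = close_pairs(code_visits)
overlap = nlp_pairs & code_pairs    # pairs found by both methods
combined = nlp_pairs | code_pairs   # pairs found by either method
```

Applying the same set arithmetic to the counts reported in the abstract (71 NLP pairs, 23 code-based pairs, 3 shared) gives 71 + 23 - 3 = 91 distinct pairs when both sources are combined.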
Speaker(s):
Hyunjoon Lee, MS
Vanderbilt University Department of Biomedical Informatics
Author(s):
Hyunjoon Lee, MS - Vanderbilt University Department of Biomedical Informatics; Cosmin Bejan, PhD - Vanderbilt University Medical Center; Colin Walsh - Department of Biomedical Informatics, Vanderbilt University;
Using Large Language Models for sentiment analysis of health-related social media data: empirical evaluation and practical tips
Presentation Time: 02:00 PM - 02:15 PM
Abstract Keywords: Social Media and Connected Health, Natural Language Processing, Large Language Models (LLMs), Evaluation
Primary Track: Applications
Programmatic Theme: Consumer Health Informatics
Health-related social media data generated by patients and the public provide valuable insights into patient experiences and opinions toward health issues such as vaccination and medical treatments. Using Natural Language Processing (NLP) methods to analyze such data, however, often requires high-quality annotations that are difficult to obtain. The recent emergence of Large Language Models (LLMs) such as the Generative Pre-trained Transformers (GPTs) has shown promising performance on a variety of NLP tasks in the health domain with little to no annotated data. However, their potential in analyzing health-related social media data remains underexplored. In this paper, we report empirical evaluations of LLMs (GPT-3.5-Turbo, FLAN-T5, and BERT-based models) on a common NLP task for health-related social media data: sentiment analysis for identifying opinions toward health issues. We explored how different prompting and fine-tuning strategies affect the performance of LLMs on social media datasets across diverse health topics, including healthcare reform, vaccination, mask wearing, and healthcare service quality. We found that LLMs outperformed VADER, a widely used off-the-shelf sentiment analysis tool, but are far from being able to produce accurate sentiment labels. However, their performance can be improved by data-specific prompts with information about the context, task, and targets. The highest performing LLMs are BERT-based models that were fine-tuned on aggregated data. We provide practical tips for researchers to use LLMs on health-related social media for optimal outcomes. We also discuss future work needed to continue to improve the performance of LLMs for analyzing health-related social media data with minimal annotations.
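The idea of a data-specific prompt carrying context, task, and target information can be sketched as follows. This is an illustration only: the template wording, the label set, and the helper names `build_prompt` and `parse_label` are assumptions made for this example and are not the prompts evaluated in the paper. No model is called here.

```python
DEFAULT_LABELS = ("positive", "negative", "neutral")


def build_prompt(post, context, task, target, labels=DEFAULT_LABELS):
    """Assemble a prompt that states the data context, the task,
    and the sentiment target before showing the post."""
    label_text = ", ".join(labels)
    return (
        f"Context: {context}\n"
        f"Task: {task}\n"
        f"Target: {target}\n"
        f"Allowed labels: {label_text}\n"
        f"Post: {post}\n"
        "Label:"
    )


def parse_label(response, labels=DEFAULT_LABELS):
    """Map a free-text model response to exactly one allowed label.

    Returns None when the response names zero or several labels,
    so ambiguous outputs can be reviewed instead of guessed.
    """
    text = response.strip().lower()
    matches = [label for label in labels if label in text]
    return matches[0] if len(matches) == 1 else None


# Hypothetical example post and wording.
prompt = build_prompt(
    post="Got my flu shot today, barely felt it.",
    context="Posts are public social media messages about vaccination.",
    task="Classify the author's sentiment toward the target.",
    target="vaccination",
)
```

Keeping the parser strict is a deliberate choice in this sketch: a response that mentions more than one label is returned as `None` rather than silently mapped to the first match.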
Speaker(s):
Lu He, PhD
University of Wisconsin-Milwaukee
Author(s):
Lu He, PhD - University of Wisconsin-Milwaukee; Sammie Omranian - University of Wisconsin-Milwaukee; Susan McRoy, PhD - University of Wisconsin-Milwaukee; Kai Zheng, PhD - University of California, Irvine;
Visualization of COVID-19-associated Mucormycosis Web on 'X': A Tale of social media, Government and Media responses
Presentation Time: 02:15 PM - 02:30 PM
Abstract Keywords: Delivering Health Information and Knowledge to the Public, Social Media and Connected Health, Global Health, Biosurveillance, Population Health, Self-care/Management/Monitoring, Data Mining, Mobile Health
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Amidst the COVID-19 pandemic, India faced an additional challenge with COVID-19-associated Mucormycosis (CAM), declared a notifiable disease on May 20, 2021. Analyzing Twitter (X) data with NodeXL for May 15 to 22, 2021, revealed extensive engagement (14,626 retweets, 8,733 mentions) but limited mutual interactions. Notable entities like @INCIndia and @MoHFW_India shaped discussions. This research underscores the rapid escalation of CAM and the pivotal role of social media, government and media in shaping narratives during health crises.
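One common way to quantify "limited mutual interactions" in a retweet or mention network is edge reciprocity: the share of directed edges whose reverse edge also exists. The abstract does not say which measure was used, so the sketch below is an assumption for illustration, with a hypothetical function name and toy accounts.

```python
def reciprocity(edges):
    """Fraction of directed edges (source, target) whose reverse
    edge is also present. Duplicates and self-loops are ignored."""
    unique = {(s, t) for s, t in edges if s != t}
    if not unique:
        return 0.0
    mutual = sum(1 for s, t in unique if (t, s) in unique)
    return mutual / len(unique)


# Toy mention network, not study data: only a and b mention each other.
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("d", "a"), ("d", "a")]
score = reciprocity(edges)
```

A value near 0 indicates mostly one-way broadcasting toward a few prominent accounts, while a value near 1 indicates conversation-like exchange.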
Speaker(s):
Nishant Jain, MS, MHA, CLSSGB, CGSHI
University of Missouri
Author(s):
Suzanne Boren, PhD, MHA, FACMI, FAMIA - University of Missouri; Iris Zachary, PhD - University of Missouri, Health Management and Informatics;
The 20% among Suicide Deaths died Without Warning Signs: Trend Analysis based on Suicide Decedents in the US from 2003-2020
Presentation Time: 02:30 PM - 02:45 PM
Abstract Keywords: Machine Learning, Population Health, Diversity, Equity, Inclusion, Accessibility, and Health Equity
Primary Track: Policy
Programmatic Theme: Public Health Informatics
Understanding who is less likely to reveal suicidal intentions is crucial for developing effective prevention strategies, as suicide rates increase in the US, notably among marginalized groups. Existing studies, limited by small, homogeneous samples, fail to thoroughly analyze various demographics and contexts.
This study investigates trends in disclosed and non-disclosed suicide deaths in the US from 2003 to 2020, considering age, gender, race, ethnicity, methods of suicide, intended recipients of disclosures, and drug/substance categories. It utilizes cross-sectional data from 500,072 suicide decedents across 49 states, Puerto Rico, and the District of Columbia, sourced from the National Violent Death Reporting System's Restricted Access Database, with statistical analyses conducted between October 2023 and January 2024.
The main outcomes measured were disclosures of suicidal intent within one month prior to death and the presence of suicide notes. Results show a consistent 80:20 ratio of non-disclosed to disclosed suicides. Specific groups, including older adults, both genders, certain racial groups, and those who died by specific methods or substances, displayed significantly lower odds of disclosing suicidal intentions. Notably, Black decedents disclosed at markedly lower rates than White decedents, with disparities more pronounced among females of these racial groups.
The study emphasizes the need for targeted suicide prevention strategies, especially for racial minorities, older adults, males, and individuals utilizing certain suicide methods or substances. It highlights the necessity for increased public health efforts to normalize mental distress and enhance access to mental health services.
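The phrase "lower odds of disclosing" can be made concrete with an unadjusted odds ratio from a 2x2 table. The abstract does not describe the statistical model, so this is only a sketch of the basic quantity under stated assumptions: the function name is hypothetical, the interval is a simple Wald interval, and the counts are toy values rather than study results.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Group 1: a disclosed, b did not disclose.
    Group 2 (reference): c disclosed, d did not disclose.
    All four counts must be positive.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Toy counts for illustration only.
estimate, lower, upper = odds_ratio(20, 80, 40, 60)
```

An odds ratio below 1 whose interval excludes 1 would indicate lower odds of disclosure in group 1 than in the reference group; an adjusted analysis would typically use regression instead.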
Speaker(s):
Yunyu Xiao, PhD
Weill Cornell Medicine, Population Health Sciences
Author(s):
Ziyuan Zhang, Bachelor - Department of Epidemiology, Harvard T.H. Chan School of Public Health; Timothy Brown, PhD - University of California, Berkeley; Katherine M. Keyes, PhD - Columbia University; Abeed Sarker, PhD - Emory University School of Medicine; Paul Yip, PhD - HKU; Julie Cerel, PhD - University of Kentucky; John Mann, MD - Columbia University;
Assessing demographic differences in psychological pain, hopelessness, connectedness, and capacity for suicide based on terminology-driven natural language processing of VHA clinical progress notes
Presentation Time: 02:45 PM - 03:00 PM
Abstract Keywords: Patient / Person Generated Health Data (Patient Reported Outcomes), Natural Language Processing, Information Extraction
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Psychological pain, hopelessness, connectedness, and capacity for suicide are among the most important drivers of suicidal behavior. Scores based on terminology-driven natural language processing (NLP) of Veterans Health Administration (VHA) clinical progress notes showed meaningful change in these four factors before patients attempted or died by suicide. It is as yet unknown if these changes depend on sex, age group, race/ethnicity, and being involved in the criminal legal system (e.g., in court or incarcerated). We will present results of a repeated measures analysis of psychological pain, hopelessness, connectedness, and capacity for suicide with a between-subjects component for the listed demographic subgroups.
Clinical progress notes entered between 2014 and September 2022 during the eight weeks before patients attempted or died by suicide (n=43,581, female 7,089, male 36,492) were pulled from the VHA corporate data warehouse. These notes were tagged and labelled for the four factors using a vocabulary of terms available from BioPortal. Using these labels, patient mean scores for the four factors were computed across the last four weeks (1-4) and across the four weeks (5-8) before that. Repeated measures analysis of variance was used to test the effect of Time, patient subgroup, and the Time by subgroup interaction.
Results support the hypothesis that our terminology-driven NLP pipeline to determine psychological pain, hopelessness, connectedness, and capacity for suicide captures meaningful change in demographic subgroups prior to a suicidal event. Scores for these four factors may support clinical decision-making regarding suicide prevention.
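The two-window scoring described in this abstract (weeks 5-8 versus weeks 1-4 before the event) can be sketched for a single patient and a single factor. The abstract does not specify how note-level labels are aggregated, so pooling note-level scores within each window is an assumption, as are the function name and the toy values.

```python
from statistics import mean


def window_means(weekly_scores):
    """Return (mean for weeks 5-8, mean for weeks 1-4).

    `weekly_scores` maps weeks-before-event (1..8) to a list of
    note-level scores for one factor. A window without notes
    yields None.
    """
    def pooled(weeks):
        values = [s for w in weeks for s in weekly_scores.get(w, [])]
        return mean(values) if values else None

    return pooled(range(5, 9)), pooled(range(1, 5))


# Toy scores for one hypothetical patient and one factor.
patient = {8: [0.0], 6: [1.0], 3: [1.0, 2.0], 1: [3.0]}
earlier, later = window_means(patient)
change = later - earlier  # positive means the score rose closer to the event
```

These per-patient window means are the kind of paired values a repeated measures analysis would take as its within-subject (Time) factor, with the demographic subgroup as the between-subjects factor.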
Speaker(s):
Esther Meerwijk, PhD, MSN
Author(s):
Asqar Shotqara, MS - VA Palo Alto Health Care System; Suzanne Tamang, PhD - Stanford University; Ruth Reeves - Tennessee Valley Health Care System, US Veterans' Affairs; Mark Ilgen, PhD - University of Michigan, Ann Arbor, MI; Andrea Finlay, PhD - VA Palo Alto Health Care System; Alex Harris, PhD, MS - VA Palo Alto Health Care System;
Identifying and characterizing suicide decedent subtypes using deep embedded clustering
Presentation Time: 03:00 PM - 03:15 PM
Abstract Keywords: Population Health, Machine Learning, Data Mining
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Suicide is a multi-determined and low-frequency event, which makes phenotyping decedents a challenging task. Given the difficulty of collecting detailed information on suicide decedents, most prior efforts focused on subtyping suicidal behavior using samples that also included ideators and attempters. However, differences within decedents may not emerge when assessing heterogeneous groups.
The purpose of this work is to identify and characterize suicide decedent subtypes using health records data of suicide decedents in the state of Maryland. We developed an unsupervised learning approach based on deep embedded clustering (DEC) to derive stable typologies that can be generalized to unseen data.
This is a retrospective study using Maryland’s Statewide Suicide Data Warehouse (MSDW). The analyses included 848 individuals who died by suicide from 2016 to 2020 in Maryland that have both electronic health records (EHR) and hospital discharge data available in MSDW.
We identified 4 distinct profiles by comparing different learning approaches. The best approach was the DEC with 4 clusters with a cross-validated PS score of 0.94 (SD = 0.029). We found no differences in race, ethnicity, or marital status between the 4 groups; however, there were statistically significant differences between profiles in age at death, % of females, comorbidities, psychiatric illnesses, number of encounters, and SDH challenges. A time-based analysis showed stability of the clusters up to 6 months before death.
For future work, we plan to assess the value of these profiles in predicting suicide. We will incorporate claims data for the decedents and plan to further validate these profiles.
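For readers unfamiliar with deep embedded clustering, the sketch below shows the clustering step as it is commonly formulated: soft assignments from a Student's t kernel over embedded points, and a sharpened target distribution used for refinement. The abstract does not give the architecture or settings used in this study, so this is an illustration under that assumption; the encoder network and the KL-divergence training loop are omitted, and the function names and toy embeddings are hypothetical.

```python
def soft_assign(embeddings, centroids):
    """Soft cluster assignments q[i][j] using a Student's t kernel
    with one degree of freedom over squared Euclidean distance."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append(1.0 / (1.0 + sq_dist))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j] proportional to q[i][j]**2 / f[j],
    where f[j] is the soft cluster frequency sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        raw = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(raw)
        p.append([r / total for r in raw])
    return p


# Toy 2-D embeddings and centroids, not study data.
embeddings = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = [row.index(max(row)) for row in q]  # hard label = most likely cluster
```

In a full implementation the encoder and centroids would be updated to pull `q` toward `p`, which gradually sharpens cluster boundaries in the embedded space.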
Speaker(s):
Anas Belouali, MEng, MS
Johns Hopkins
Author(s):
Christopher Kitchen, MS; Ayah Zirikly, PhD - JHU; Paul Nestadt, MD - Johns Hopkins; Holly Wilcox, PhD - Johns Hopkins; Hadi Kharrazi, MD, PhD, FAMIA, FACMI - Johns Hopkins University;