Analyzing Dementia Caregivers’ Experiences on Twitter: A Term-Weighted Topic Modeling Approach
Presentation Time: 08:45 AM - 09:00 AM
Abstract Keywords: Social Media and Connected Health, Natural Language Processing, Bioinformatics, Data Mining, Machine Learning, Large Language Models (LLMs)
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Dementia significantly affects individuals and their families, so understanding the experiences and concerns of family caregivers is essential to improving support and care. This study introduces a novel approach for analyzing tweets from individuals who have family members with dementia. We collected data from Twitter (now called X) and preprocessed it with natural language processing techniques, including spam removal, lemmatization, stopword removal, compound word segmentation, and spell checking. We enhanced a conventional topic model, the Gibbs Sampling Dirichlet Multinomial Mixture Model (GSDMM), with two term-weighting strategies, “Log” and “BDC”, which mitigate the impact of common stopwords and topic-specific stopwords, respectively. The enhanced model identified key topics among dementia-affected families, and these topics were semantically rich and contextually coherent. In our comparison, the method outperformed the state-of-the-art BERTopic model in clarity and consistency. We further used ChatGPT 4, alongside two human experts, to interpret the topics. Our findings illuminate the multifaceted challenges faced by dementia caregivers. This work aims to provide healthcare professionals, researchers, and support organizations with a tool to better understand and address the needs of caregivers affected by dementia.
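The abstract names the two term-weighting strategies but does not give their formulas. As a rough illustration of the idea only, the sketch below assumes an IDF-style log weight to down-weight terms that are common everywhere, and an entropy-based concentration weight to down-weight terms spread evenly across clusters. Both formulas, and the function names `log_weights` and `concentration_weights`, are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def log_weights(docs):
    """Assumed IDF-style log weight: terms found in many documents score low."""
    n_docs = len(docs)
    # Count each term once per document it appears in.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def concentration_weights(docs, labels):
    """Assumed entropy-based weight in [0, 1].

    A term confined to one cluster scores 1; a term spread evenly
    across all clusters scores 0.
    """
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Term counts per cluster and total token count per cluster.
    counts = {c: Counter() for c in clusters}
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    totals = {c: sum(counts[c].values()) for c in clusters}

    vocab = set().union(*counts.values())
    max_entropy = math.log(len(clusters))
    weights = {}
    for term in vocab:
        # Relative frequency of the term inside each non-empty cluster.
        rel = [counts[c][term] / totals[c] for c in clusters if totals[c] > 0]
        norm = sum(rel)
        entropy = -sum((p / norm) * math.log(p / norm) for p in rel if p > 0)
        weights[term] = 1.0 - entropy / max_entropy
    return weights
```

In a weighted topic model, weights like these would typically scale each term's contribution to the sampler's counts. How the authors combine the weights with GSDMM is not stated in the abstract.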
Speaker(s):
Bojian Hou, PhD
University of Pennsylvania
Category
Paper - Regular