Times are displayed in (UTC-04:00) Eastern Time (US & Canada)
3/10/2025 | 3:30 PM – 5:00 PM | Urban
S02: Alzheimer's Informatics
Presentation Type: Podium Abstract
2025 Informatics Summit On Demand
Session Credits: 1.5
Session Chair:
Mary Regina Boland
Understanding the Clinical Modalities Important in NeuroDegenerative Disorders and Risk of Patient Injury Using Machine Learning and Survival Analysis
Presentation Time: 03:30 PM - 03:45 PM
Abstract Keywords: Data Mining and Knowledge Discovery, Fairness and Disparity Research in Health Informatics, Secondary Use of EHR Data
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Falls among the elderly, especially those with neurodegenerative disorders (NDD), reduce life expectancy. The purpose of this study is to explore the use of machine learning on Electronic Health Records (EHR) data for time-to-event survival analysis prediction of injuries, and the role of sensitive attributes (e.g., race, ethnicity, sex) in these models. We applied multiple survival analysis methods to a cohort of 29,045 patients aged 65 years and older treated at Penn Medicine for NDD, Mild Cognitive Impairment (MCI), or another disease. We compared the algorithms and explored how multiple modalities, specifically medications and laboratory tests, improve the prediction of injuries among NDD patients. Overall, medication features were associated with either increased or reduced Hazard Ratios (HR), depending on the NDD type. Being of Black race significantly increased the risk of fall/injury in the models that included only medication and sensitive-attribute features. The combined model that used both modalities (medications and laboratory information) removed this association between Black race and increased fall/injury risk. Combining modalities in these survival models for predicting fall/injury risk among NDD and MCI individuals therefore yields findings that are robust across racial and ethnic groups, with no biases apparent in our final combined-modality results. Furthermore, combining modalities (both medications and laboratory values) improved performance across multiple survival analysis methods, as measured by the C-index.
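The abstract compares survival models by C-index. As a rough illustration only (this is not the authors' code), Harrell's concordance index can be sketched in plain Python. The function name and the toy follow-up times, event flags, and risk scores below are hypothetical, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (was not censored). Higher risk score should go
    with the shorter time. Simplification: pairs with equal times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier patient was censored
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: time to fall/injury (months), event flag, model risk.
toy_times = [5, 10, 12, 20]
toy_events = [1, 0, 1, 1]  # 0 = censored
toy_risk = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the metric is convenient for comparing single-modality and combined-modality models on the same cohort.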
Speaker(s):
Mary Regina Boland, MA, MPhil, PhD, FAMIA
Saint Vincent College
Author(s):
Kazi Noshin, BS - University of Virginia; Mary Regina Boland, PhD, FAMIA - Saint Vincent College; Bojian Hou, PhD - University of Pennsylvania; Weiqing He, High School Diploma - University of Pennsylvania; Victoria Lu, BS - University of Virginia; Carol Manning, PhD, ABPP-CN - University of Virginia; Li Shen, Ph.D. - University of Pennsylvania; Aidong Zhang, PhD - University of Virginia;
SLR: A Modified Logistic Regression Model with Sinkhorn Divergence for Alzheimer’s Disease Classification
Presentation Time: 03:45 PM - 04:00 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data-Driven Research and Discovery, Medical Imaging
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
Logistic regression is a widely used model in machine learning, particularly as a baseline for binary classification tasks due to its simplicity, effectiveness, and interpretability. It is especially powerful when dealing with categorical features. Despite its advantages, standard logistic regression fails to capture the distributional and geometric structure of data, especially when features are derived from structured spaces like brain imaging. For instance, in Voxel-Based Morphometry (VBM), measurements from distinct brain regions follow a clear spatial organization, which standard logistic regression cannot fully leverage. In this paper, we propose Sinkhorn Logistic Regression (SLR), a variant of logistic regression that incorporates the Sinkhorn divergence as a loss function. This adaptation enables the model to leverage geometric information about the data distribution, enhancing its performance on structured datasets.
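For readers unfamiliar with the Sinkhorn divergence mentioned above, the following is a minimal sketch of the divergence itself, not of the SLR model, whose exact loss formulation is not given in the abstract. It assumes two strictly positive histograms on a shared set of points, a squared-distance cost, and the debiased form OT(a,b) - 0.5·OT(a,a) - 0.5·OT(b,b) computed with log-domain Sinkhorn iterations. The function names, the 1-D grid standing in for region positions, and the values of `eps` and `n_iters` are all illustrative assumptions.

```python
import math


def _softmin(eps, cost_rows, weights, potential):
    """Per row i: -eps * log(sum_j w_j * exp((h_j - C_ij) / eps)), computed stably."""
    out = []
    for row in cost_rows:
        z = [math.log(w) + (h - c) / eps for w, h, c in zip(weights, potential, row)]
        m = max(z)
        out.append(-eps * (m + math.log(sum(math.exp(v - m) for v in z))))
    return out


def ot_eps(a, b, C, eps=0.1, n_iters=300):
    """Entropy-regularized optimal transport value between histograms a and b.

    Uses alternating updates of the dual potentials f and g in the log domain.
    Requires all weights in a and b to be strictly positive.
    """
    C_t = [list(col) for col in zip(*C)]  # transpose of the cost matrix
    f = [0.0] * len(a)
    g = [0.0] * len(b)
    for _ in range(n_iters):
        f = _softmin(eps, C, b, g)
        g = _softmin(eps, C_t, a, f)
    return sum(ai * fi for ai, fi in zip(a, f)) + sum(bj * gj for bj, gj in zip(b, g))


def sinkhorn_divergence(a, b, C, eps=0.1, n_iters=300):
    """Debiased Sinkhorn divergence for histograms on a shared support."""
    return (ot_eps(a, b, C, eps, n_iters)
            - 0.5 * ot_eps(a, a, C, eps, n_iters)
            - 0.5 * ot_eps(b, b, C, eps, n_iters))


# Hypothetical 1-D positions standing in for spatially organized regions.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
cost = [[(x - y) ** 2 for y in grid] for x in grid]
hist_a = [0.6, 0.2, 0.1, 0.05, 0.05]
hist_b = [0.05, 0.05, 0.1, 0.2, 0.6]

print(sinkhorn_divergence(hist_a, hist_b, cost))
print(sinkhorn_divergence(hist_a, hist_a, cost))
```

Because the cost matrix encodes distances between positions, moving mass to a nearby location is penalized less than moving it far away. That geometric sensitivity is what a purely coordinate-wise loss lacks.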
Speaker(s):
Li Shen, Ph.D.
University of Pennsylvania
Author(s):
Qipeng Zhan, MS - University of Pennsylvania; Zhuoping Zhou, Master of Art - University of Pennsylvania; Zixuan Wen; Zexuan Wang, MA - University of Pennsylvania; Boning Tong, MSE - University of Pennsylvania; Heng Huang, PhD - University of Maryland; Andrew J. Saykin, PsyD - Indiana University; Paul M. Thompson, PhD - University of Southern California; Christos Davatzikos, PhD - University of Pennsylvania; Li Shen, Ph.D. - University of Pennsylvania;
Sex-Based Differences in the Association of Epigenetic Age Acceleration with Alzheimer’s Disease Biomarkers and Cognitive Measures
Presentation Time: 04:00 PM - 04:15 PM
Abstract Keywords: Data-Driven Research and Discovery, Epigenomics, Real-World Evidence and Policy Making
Primary Track: Translational Bioinformatics/Precision Medicine
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
Alzheimer’s Disease (AD) is a neurodegenerative disorder marked by cognitive and functional decline. Biological sex has been linked to differences in lifetime AD risk, AD-related neuropathology, and the rate of cognitive decline, although the underlying biological mechanisms driving these disparities remain unclear. While aging clocks are increasingly used to estimate biological aging, their associations with aging-related conditions and the connection between systemic aging and brain aging are not well understood. In this study, we used data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) to examine how sex influences the relationship between age acceleration and cognitive performance as well as brain volume. Our findings indicate that second- and third-generation aging clocks more effectively capture sex-based differences in these relationships compared to first-generation clocks. Future research should focus on validating these findings in an external cohort and exploring them longitudinally.
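The abstract does not state how age acceleration was computed. One widely used convention, shown here purely as an assumption, defines it as the residual from regressing clock-estimated (epigenetic) age on chronological age. The function name and toy ages below are hypothetical.

```python
def age_acceleration(chron_age, epi_age):
    """Residuals of epigenetic age regressed on chronological age (least squares).

    A positive residual means the clock estimate is older than expected
    for that chronological age; a negative residual means younger.
    """
    n = len(chron_age)
    mean_x = sum(chron_age) / n
    mean_y = sum(epi_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chron_age)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chron_age, epi_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(chron_age, epi_age)]


# Hypothetical toy values, not study data.
toy_chron = [65.0, 70.0, 75.0, 80.0]
toy_epi = [66.0, 72.0, 74.0, 84.0]

toy_accel = age_acceleration(toy_chron, toy_epi)
print(toy_accel)
```

Residuals computed this way are uncorrelated with chronological age by construction, so they can be related to cognitive scores or brain volume, separately by sex or with a sex interaction term, without chronological age driving the association.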
Speaker(s):
Travyse Edwards, PhD
University of Pennsylvania
Author(s):
Travyse Edwards, BA - University of Pennsylvania; Li Shen, Ph.D. - University of Pennsylvania; Qi Long, Ph.D. - University of Pennsylvania; Tianhua Zhai, PhD - University of Pennsylvania; Andrew Saykin, PsyD - Indiana University; Kwangsik Nho, PhD - Indiana University;
Phenotyping Cognitive Presentations in Alzheimer’s Disease: A Deep Clustering Approach
Presentation Time: 04:15 PM - 04:30 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data-Driven Research and Discovery, Data Mining and Knowledge Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Health Data Science and Artificial Intelligence Innovation: From Single-Center to Multi-Site
This study applied the Deep Fusion Clustering Network (DFCN) to phenotype patients with clinician-diagnosed early-stage Alzheimer's disease (AD). When evaluated on data from the GERAS-US study, DFCN outperformed K-Prototype clustering in identifying patient subgroups with distinct baseline cognitive profiles and differing risks of cognitive decline within three years. These findings suggest that deep clustering techniques such as DFCN can enhance our understanding of the heterogeneity of disease progression in early AD.
Speaker(s):
Jinying Chen, PhD
Boston University
Author(s):
Jiayu Lu, Master - Boston University; Vijetha Balakundi, Master of Science - Boston University; Ting Fang Alvin Ang, Dr. - Boston University; Rhoda Au, PhD - Boston University; Jinying Chen, PhD - Boston University;
Leveraging Social Determinants of Health in Alzheimer’s Research Using LLM-Augmented Literature Mining and Knowledge Graphs
Presentation Time: 04:30 PM - 04:45 PM
Abstract Keywords: Social Determinants of Health, Data Mining and Knowledge Discovery, Machine Learning, Generative AI, and Predictive Modeling, Informatics Research/Biomedical Informatics Research Methods, Natural Language Processing
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Growing evidence suggests that social determinants of health (SDoH), a set of nonmedical factors, affect individuals’ risks of developing Alzheimer’s disease (AD) and related dementias. Nevertheless, the etiological mechanisms underlying such relationships remain largely unclear, mainly due to difficulties in collecting relevant information. This study presents a novel, automated framework that leverages recent advances in large language models (LLMs) as well as classic natural language processing techniques to mine SDoH knowledge from extensive literature and to integrate it with AD-related biological entities extracted from the general-purpose knowledge graph PrimeKG. Using graph neural networks, we performed link prediction tasks to evaluate the resulting SDoH-augmented knowledge graph. Our framework shows promise for enhancing knowledge discovery in AD and can be generalized to other SDoH-related research areas, offering a new tool for exploring the impact of social determinants on health outcomes. Our code is available at: https://github.com/hwq0726/SDoHenPKG
Speaker(s):
Tianqi Shang, MS
University of Pennsylvania
Shu Yang, PhD
University of Pennsylvania
Author(s):
Tianqi Shang, Master of Engineer in Computer Science - University of Pennsylvania; Shu Yang, PhD - University of Pennsylvania; Weiqing He, bachelor - University of Pennsylvania; Tianhua Zhai, PhD - University of Pennsylvania; Dawei Li, MS - Arizona State University; Bojian Hou, PhD - University of Pennsylvania; Tianlong Chen, PhD - University of North Carolina at Chapel Hill; Jason Moore, PhD, FACMI - Cedars-Sinai; Marylyn Ritchie, PhD - University of Pennsylvania, Perelman School of Medicine; Li Shen, Ph.D. - University of Pennsylvania;
Early Alzheimer's Detection Through Voice Analysis: Harnessing Locally Deployable LLMs via ADetectoLocum, a privacy-preserving diagnostic system
Presentation Time: 04:45 PM - 05:00 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Data Security and Privacy, Clinical Decision Support for Translational/Data Science Interventions
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Diagnosing Alzheimer's Disease (AD) early and cost-effectively is crucial. Recent advancements in Large Language Models (LLMs) like ChatGPT have made accurate, affordable AD detection feasible. Yet HIPAA compliance and the challenge of integrating these models into hospital systems limit their use. To address these constraints, we introduce ADetectoLocum, an open-source, LLM-equipped model designed for AD risk detection within hospital environments. The model evaluates AD risk from spontaneous patient speech, enhancing diagnostic processes without external data exchange. Our approach supports secure local deployment and significantly surpasses previous models in predictive accuracy for AD detection, especially in early-stage identification. ADetectoLocum therefore offers a reliable solution for AD diagnostics in healthcare institutions.
Speaker(s):
Genevieve Mortensen, B.S.
Indiana University
Author(s):
Genevieve Mortensen, B.S. - Indiana University; Rui Zhu, Ph.D - Yale University;