Times are displayed in (UTC-04:00) Eastern Time (US & Canada)
3/11/2025 | 10:30 AM – 12:00 PM | Frick
S09: Literature to Knowledge
Presentation Type: Podium Abstract
Session Credits: 1.5
Using Topic Modeling to Understand Implementation Science in the Biomedical Literature
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: Implementation Science and Deployment, Natural Language Processing, Data Mining and Knowledge Discovery
Primary Track: Clinical Research Informatics
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
To understand the scope and trends within Implementation Science (IS), we incorporated informatics approaches and methods into an IS literature review. We analyzed 9,098 IS articles from PubMed using natural language processing techniques, namely topic modeling, and identified 15 distinct themes. These themes represent two categories of IS: Metascience (e.g., Theories, Models, and Frameworks) and IS Topics (e.g., Healthcare and Chronic Disease). Our findings demonstrate an expanding scope of IS over time, providing valuable insights into areas within the field that may require further exploration and discussion.
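The abstract does not name a specific topic-modeling toolkit; a minimal sketch of the general approach, assuming scikit-learn's LDA with 15 components to mirror the 15 reported themes, could look like the following (the tiny corpus is only a placeholder for the 9,098 PubMed abstracts):

```python
# Minimal topic-modeling sketch (assumption: scikit-learn LDA; the abstract
# does not name the implementation). n_components=15 mirrors the 15 themes.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

abstracts = [
    "implementation science framework for chronic disease management",
    "barriers and facilitators to adopting evidence-based practice in hospitals",
]  # placeholder; the study analyzed 9,098 PubMed IS abstracts

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(abstracts)  # document-term matrix

lda = LatentDirichletAllocation(n_components=15, random_state=0)
lda.fit(dtm)

# Top words per topic serve as rough labels for the themes
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:10]]
    print(f"Topic {k}: {', '.join(top)}")
```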
Speaker(s):
LauraEllen Ashcraft, PhD
University of Pennsylvania
Author(s):
Hayoung Donnelly, Ph.D. - University of Pennsylvania; LauraEllen Ashcraft, PhD - University of Pennsylvania; Joseph Romano, PhD - University of Pennsylvania; Sarah Daniels, PhD - Center for Innovation to Implementation (Ci2i); Danielle Mowery, PhD, MS, MS, FAMIA - University of Pennsylvania;
The Emergence of Large Language Models in Literature Reviews: An Automated Systematic Review
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing, Knowledge Representation, Management, or Engineering
Working Group: Natural Language Processing Working Group
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Objective: This automated review aims to summarize the use of Large Language Models (LLMs) in the process of creating a scientific review. We look at the range of stages in a review that can be automated and assess the current state-of-the-art projects in the field.
Methods: The search was conducted in June 2024 in the PubMed, Scopus, Dimensions, and Google Scholar databases by human reviewers. The screening and extraction process took place in Covidence with the help of an LLM add-on that uses the OpenAI GPT-4o model. ChatGPT was used to clean the extracted data and generate code for the figures in this review; ChatGPT and Scite.ai were used to draft the manuscript.
Results: From the 3,788 articles retrieved, 172 were deemed eligible by the LLM add-on for Covidence for the final review. GPT-based LLMs emerged as the dominant architecture for review automation (n=126, 73.2%). A significant number of automation projects were found, but only a limited number of papers (n=26, 15.1%) were actual reviews that used an LLM during their creation. Most citations focused on automating a particular stage of the review, such as searching for publications (n=60, 34.9%) and data extraction (n=54, 31.4%). When comparing the pooled performance of GPT-based and BERT-based models, the former performed better in data extraction, with a mean precision of 83.0% and recall of 86.0%.
Conclusion: Our LLM-assisted systematic review revealed a significant number of research projects related to review automation using LLMs. The results look promising, and we anticipate that LLMs will soon change the way scientific reviews are conducted.
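Covidence's GPT-4o add-on is proprietary and its prompts are not described here; a generic sketch of the kind of title/abstract screening it automates, assuming the standard OpenAI Python client and a hypothetical eligibility criterion, might look like this:

```python
# Generic sketch of LLM-assisted title/abstract screening. Assumptions: the
# standard OpenAI Python client and a hypothetical eligibility criterion;
# Covidence's GPT-4o add-on is proprietary and its prompts are not described
# in the abstract.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CRITERIA = ("Include only papers that use a large language model to automate "
            "any stage of a systematic or scoping review.")  # hypothetical

def screen(title: str, abstract: str) -> str:
    """Return 'include' or 'exclude' for one citation."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "You are screening citations for a systematic review."},
            {"role": "user",
             "content": (f"Criteria: {CRITERIA}\n\nTitle: {title}\n"
                         f"Abstract: {abstract}\n\n"
                         "Answer with exactly one word: include or exclude.")},
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower()

print(screen("Automating data extraction with GPT-4",
             "We evaluate GPT-4 for extracting study characteristics ..."))
```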
Speaker(s):
Dmitry Scherbakov, PhD
Medical University of South Carolina
Author(s):
Nina Hubig, PhD - Clemson University, School of Computing; Vinita Jansari, PhD - Clemson University, School of Computing; Alexander Bakumenko, MSc - Clemson University, School of Computing; Jihad Obeid, MD - Medical University of South Carolina; Leslie Lenert, MD - Medical University of South Carolina;
Performance of Large Language Models in Answering Critical Care Medicine Questions
Presentation Time: 11:00 AM - 11:15 AM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Education and Training, Real-World Evidence and Policy Making
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Large Language Models have been tested on medical student-level questions, but their performance in specialized fields such as Critical Care Medicine (CCM) is less explored. This study evaluated Meta-Llama 3.1 models (8B and 70B parameters) on 871 CCM questions. Llama3.1:70B outperformed the 8B model by 30%, with 60% average accuracy. Performance varied across domains, highest in Research (68.4%) and lowest in Renal (47.9%), highlighting the need for broader future work to improve models across various subspecialty domains.
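The evaluation harness is not described in the abstract; a rough sketch of multiple-choice scoring against a locally served model, assuming the Ollama Python client (suggested by the "Llama3.1:70B" tag) and an illustrative question, could look like this:

```python
# Sketch of multiple-choice evaluation against a locally served Llama 3.1
# model. Assumptions: the Ollama Python client and an illustrative question;
# the study's harness and its 871 CCM questions are not reproduced here.
import ollama

questions = [
    {
        "stem": "Which vasopressor is first-line for septic shock?",
        "options": {"A": "Norepinephrine", "B": "Dopamine",
                    "C": "Phenylephrine", "D": "Epinephrine"},
        "answer": "A",
    },
]

def ask(model: str, q: dict) -> str:
    opts = "\n".join(f"{k}. {v}" for k, v in q["options"].items())
    prompt = f"{q['stem']}\n{opts}\nRespond with only the letter of the best answer."
    resp = ollama.chat(model=model, messages=[{"role": "user", "content": prompt}])
    return resp["message"]["content"].strip()[:1].upper()

correct = sum(ask("llama3.1:70b", q) == q["answer"] for q in questions)
print(f"accuracy = {correct / len(questions):.1%}")
```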
Speaker(s):
Mahmoud Alwakeel, MD
Duke University
Author(s):
Mahmoud Alwakeel, MD - Duke University; Aditya Nagori, Ph.D - Duke University School of Medicine; Neal Chaisson, MD - Cleveland Clinic; An-Kwok Ian Wong, MD, Ph.D - Duke University Health System; Vijay Krishnamoorthy, MD, Ph.D - Duke University; Rishikesan Kamaleswaran, Ph.D - Duke University School of Medicine;
Leveraging GPT-4o for Automated Extraction of Neural Projections from Scientific Literature
Presentation Time: 11:15 AM - 11:30 AM
Abstract Keywords: Natural Language Processing, Machine Learning, Generative AI, and Predictive Modeling
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Sudden Unexpected Death in Epilepsy (SUDEP) is a major cause of death for epilepsy patients with uncontrolled seizures. Understanding the complex neural circuits within the central nervous system is crucial for elucidating the mechanisms underlying cardiorespiratory regulation, particularly in the context of SUDEP. This study explores the potential of GPT-4o, a cutting-edge language model, to automate the extraction of neural projections from the scientific literature. We developed prompts to extract neuroscientific structures, extract projections, and perform synonym harmonization. Applied to four neuroscientific articles, the approach extracted 205 projections. A random sample of 100 of the identified projections was reviewed by a domain expert, who found 95 to be correct. GPT-4o was therefore determined to be accurate at parsing complex scientific texts to extract neural projections. Future work will involve extracting additional entities, such as techniques and species information, for the identified projections.
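The study's actual prompts are not reproduced in the abstract; a hedged sketch of the projection-extraction step, assuming the standard OpenAI Python client and JSON-mode output, might look like this:

```python
# Hedged sketch of extracting neural projections (source -> target structures)
# with GPT-4o. Assumptions: standard OpenAI client and JSON mode; the study's
# actual prompts and synonym-harmonization step are not reproduced.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Extract every neural projection stated in the text. Return only a JSON "
    'object of the form {"projections": [{"source": "<structure>", '
    '"target": "<structure>"}]}.'
)

def extract_projections(passage: str) -> list[dict]:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": passage},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content).get("projections", [])

text = ("Neurons in the periaqueductal gray project to the nucleus "
        "retroambiguus, which in turn innervates respiratory motoneurons.")
print(extract_projections(text))
```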
Speaker(s):
Rashmie Abeysinghe, PhD
The University of Texas Health Science Center at Houston
Author(s):
Gorbachev Jowah, MD - The University of Texas Health Science Center at Houston; Licong Cui, PhD - The University of Texas Health Science Center at Houston (UTHealth Houston); Samden Lhatoo, MD - The University of Texas Health Science Center at Houston; GQ Zhang, PhD - The University of Texas Health Science Center at Houston;
Assessing Geographic Diversity in Systematic Reviews: A 3D Interactive Approach Using Cochrane SRs in IPF
Presentation Time: 11:30 AM - 11:45 AM
Abstract Keywords: Advanced Data Visualization Tools and Techniques, Clinical Decision Support for Translational/Data Science Interventions, Clinical and Research Data Collection, Curation, Preservation, or Sharing, Machine Learning, Generative AI, and Predictive Modeling
Working Group: Clinical Informatics Systems Working Group
Primary Track: Clinical Research Informatics
Programmatic Theme: Real-World Evidence in Informatics: Bridging the Gap between Research and Practice
The inclusion of authors from diverse geographic backgrounds in systematic reviews (SRs) is crucial to ensure that the questions addressed and the conclusions drawn represent the global community. This study aims to assess the geographic diversity of authors involved in systematic reviews of interventions for Idiopathic Pulmonary Fibrosis (IPF), comparing Cochrane SRs with non-Cochrane SRs. Using a novel 3D interactive visualization tool based on Three.js, we map author affiliations, visualize geographic connections, and calculate diversity using a weighting algorithm that accounts for journal impact factors and author distributions. The objective is to uncover potential biases by analyzing the concentration of authors in specific regions, enabling deeper insights into geographic representation and its impact on review outcomes.
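The abstract does not give the exact weighting formula; one plausible sketch, assuming an impact-factor-weighted Shannon entropy over author countries (an assumption for illustration, not the authors' algorithm), is:

```python
# Hedged sketch of a geographic-diversity score: impact-factor-weighted
# Shannon entropy over author countries. This formula is an assumption for
# illustration; the abstract does not specify the authors' weighting algorithm.
import math
from collections import Counter

def geographic_diversity(reviews: list[dict]) -> float:
    """Each review: {"impact_factor": float, "author_countries": [str, ...]}."""
    weighted = Counter()
    for r in reviews:
        for country in r["author_countries"]:
            weighted[country] += r["impact_factor"]  # weight authors by journal IF
    total = sum(weighted.values())
    # Shannon entropy of the weighted country distribution (higher = more diverse)
    return -sum((w / total) * math.log(w / total) for w in weighted.values())

cochrane_srs = [  # hypothetical example records
    {"impact_factor": 8.4, "author_countries": ["UK", "UK", "Germany", "India"]},
    {"impact_factor": 8.4, "author_countries": ["Canada", "UK"]},
]
print(f"diversity = {geographic_diversity(cochrane_srs):.3f}")
```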
Speaker(s):
Hui Li, PhD
University of Texas Health Science Center at Houston
Author(s):
A Large-Language Model Framework for Relative Timeline Extraction from PubMed Case Reports
Presentation Time: 11:45 AM - 12:00 PM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Natural Language Processing, Informatics Research/Biomedical Informatics Research Methods
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Harnessing the Power of Large Language Models in Health Data Science
Timing of clinical events is central to the characterization of patient trajectories, enabling analyses such as process tracing, forecasting, and causal reasoning. However, structured electronic health records capture few data elements critical to these tasks, while clinical reports lack temporal localization of events in structured form. We present a system that transforms case reports into textual time series: structured pairs of textual events and timestamps. We contrast manual and large language model (LLM) annotations (n=320 and n=115, respectively) of ten randomly sampled PubMed open-access (PMOA) case reports (N=152,974) and assess inter-LLM agreement (n=1,260, N=93). We find that the LLMs have low event recall (O1-preview: 0.35) but high temporal concordance among identified events (O1-preview: 0.93). By establishing the task, annotation, and assessment systems, and by demonstrating high concordance, this work may serve as a benchmark for leveraging the PMOA corpus for temporal analytics.
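The paper's prompts and O1-preview configuration are not included in the abstract; a minimal sketch of extracting a textual time series, assuming a generic GPT-style chat-completions call and a hypothetical JSON schema, might look like this:

```python
# Minimal sketch of turning a case report into a textual time series of
# (event, relative time in hours) pairs. Assumptions: a generic GPT-style
# chat-completions call and a hypothetical JSON schema; the paper's prompts
# and O1-preview settings are not reproduced here.
import json
from openai import OpenAI

client = OpenAI()

INSTRUCTIONS = (
    "Extract clinical events from the case report and return only a JSON "
    'object of the form {"events": [{"event": "<text>", "hours": <float>}]}, '
    "where hours is time relative to presentation (negative = before)."
)

def extract_timeline(case_report: str) -> list[dict]:
    resp = client.chat.completions.create(
        model="gpt-4o",  # stand-in; the paper reports results for O1-preview
        messages=[
            {"role": "system", "content": INSTRUCTIONS},
            {"role": "user", "content": case_report},
        ],
        temperature=0,
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content).get("events", [])

report = ("A 62-year-old man presented with chest pain that began 6 hours "
          "earlier; troponin was elevated on arrival and PCI was performed "
          "2 hours after admission.")
for e in extract_timeline(report):
    print(e["hours"], e["event"])
```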
Speaker(s):
Jeremy Weiss, MD PhD
National Library of Medicine
Author(s):