Times are displayed in (UTC-07:00) Pacific Time (US & Canada)
11/13/2024 | 9:45 AM – 11:00 AM | Franciscan B
S117: Biomedical Data Integration - Data Jigsaw
Presentation Type: Oral
Session Chair:
Andrey Soares, PhD - University of Colorado School of Medicine
GLOWH-kb - A Curated, Web-based Catalog of Published Data on the Impact of Environment on Human Health
Presentation Time: 09:45 AM - 10:00 AM
Abstract Keywords: Environmental Health and Climate Informatics, Knowledge Representation and Information Modeling, Information Retrieval, Informatics Implementation
Primary Track: Applications
Modeling the risk of climate change on human health involves complex and dynamic interactions. These include shifting weather patterns and fluctuating environmental pollutants against a backdrop of socio-economic conditions, population demographics, advances in medicine, and other factors. It is vital to understand how the interplay between the environment, human health, and other health determinants changes across space and time. Knowing the impact on a specific demographic in a single location is not enough; that information must be supplemented with data showing that the same variables can produce a different health outcome in another population demographic in a different geographic location. The objective of this paper is to describe the design, implementation, and utility of a unique application, GLOWH-kb (GLObal knoWledgebase to catalog the impact of environmental exposure on human Health). The development of GLOWH-kb encompasses: 1) data extraction and curation from identified publications; 2) design and implementation of the backend database; and 3) a user-friendly web-based interface. Because different diseases are associated with distinct air pollution parameters and exhibit diverse patterns of association, as a case study we focused on asthma paired separately with the air pollution parameters PM2.5, PM10, NO2, SO2, CO, and ozone, and retrieved publications whose population samples were drawn from emergency department visits or hospital admission counts. GLOWH-kb offers map, graph, and tabular views of the data. These visual representations enable easy comparison of studies, allowing researchers to identify patterns, trends, and potential correlations.
Speaker(s):
Haseena Rajeevan, PhD
Biomedical Informatics and Data Science, Yale University
Author(s):
BioNLPOrium – A Unified, AI-ready Collection of Biomedical Corpora for Advancing Natural Language Processing Research
Presentation Time: 10:00 AM - 10:15 AM
Abstract Keywords: Data Sharing, Natural Language Processing, Knowledge Representation and Information Modeling
Primary Track: Applications
Motivated by the FAIR data principles, we are developing a unified, AI-ready collection of biomedical corpora for advancing biomedical NLP research. We aim to create a unified format for easy access, aggregation, interoperability, and reuse of datasets, reducing engineering effort. Additionally, we developed a toolkit with several scripts for dataset format conversion and preprocessing. We envision BioNLPOrium as a valuable and continuously expanding platform for the biomedical NLP community.
Speaker(s):
Vipina K. Keloth, PhD
Yale University
Author(s):
DataMed 2.0: A Discovery Index for Finding Biomedical Datasets
Presentation Time: 10:15 AM - 10:30 AM
Abstract Keywords: Data Mining, Data Sharing, Information Visualization
Primary Track: Applications
The rapid growth of data science continues to add data to numerous repositories across the biomedical domain. This growth makes it challenging for researchers to find relevant datasets across all repositories. DataMed, a data discovery index, indexes data from various repositories. In the current study, we developed a new web interface and added advanced features such as a visualization interface. We release DataMed 2.0 for finding biomedical datasets from NIH-endorsed repositories.
Speaker(s):
Kalpana Raja, PhD, MRSB, CSci
School of Medicine, Yale University
Author(s):
Huan He, PhD - Yale University; Ryan Denlinger, PhD - Section for Biomedical Informatics and Data Science, School of Medicine, Yale University; Maxwell Wibert, BA - Section for Biomedical Informatics and Data Science, School of Medicine, Yale University; Xueqing Peng, PhD - Yale University; Jeffrey Zhang, PhD - Yale University; Christopher Gilman, BS - Section for Biomedical Informatics and Data Science, School of Medicine, Yale University; Kijana Richmond, BA - Section for Biomedical Informatics and Data Science, School of Medicine, Yale University; Hua Xu, PhD - Yale University
RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: Large Language Models (LLMs), Natural Language Processing, Machine Learning, Clinical Decision Support, Clinical Guidelines, Bioinformatics, Information Retrieval, Deep Learning
Primary Track: Foundations
Programmatic Theme: Clinical Informatics
Clinical question answering systems have the potential to provide clinicians with relevant and timely answers to their questions. Nonetheless, despite the advances that have been made, adoption of these systems in clinical settings has been slow. One issue is a lack of question-answering datasets that reflect the real-world needs of health professionals. In this work, we present RealMedQA, a dataset of realistic clinical questions generated by humans and an LLM. We describe the process for generating and verifying the QA pairs and evaluate several QA models on BioASQ and RealMedQA to assess the relative difficulty of matching answers to questions. We show that the LLM is more cost-efficient for generating "perfect" QA pairs. Additionally, RealMedQA has a lower lexical similarity between questions and answers than BioASQ, which, as our results show, poses an additional challenge to the top two QA models. We release our code and our dataset publicly to encourage further research.
Speaker(s):
Gregory Kell, MPhil
King's College London
Author(s):
Angus Roberts, PhD - King's College London; Serge Umansky, PhD - Metadvice Ltd.; Yuti Khare, MBBS - Maidstone and Tunbridge Wells NHS Trust; Najma Ahmed, MBBS - King's College London; Nikhil Patel, MBBS (in progress) - King's College London; Chloe Simela, MD - King's College London; Jack Coumbe, BSc (in progress) - King's College London; Julian Rozario, BSc (in progress) - King's College London; Ryan-Rhys Griffiths, PhD - University of Cambridge; Iain Marshall, PhD - King's College London
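The RealMedQA abstract above compares lexical similarity between questions and answers across datasets but does not name the metric used. As a rough illustration only, the sketch below uses Jaccard token overlap, which is an assumption and not necessarily the authors' measure; the function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(question, answer):
    """Jaccard overlap between question and answer token sets (0.0 to 1.0).

    Assumed stand-in for "lexical similarity"; the abstract does not
    specify which measure the authors used.
    """
    q, a = tokens(question), tokens(answer)
    if not q and not a:
        return 0.0  # convention chosen here for two empty strings
    return len(q & a) / len(q | a)


def mean_similarity(pairs):
    """Average Jaccard overlap over a non-empty list of (question, answer) pairs."""
    return sum(jaccard(q, a) for q, a in pairs) / len(pairs)
```

Under this measure, a dataset with a lower mean score shares fewer surface words between questions and answers, so models that rely on word overlap would be expected to find it harder.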
Observational study of travel distance in U.S. telemedicine sessions with estimates of emissions savings
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Telemedicine, Environmental Health and Climate Informatics, Healthcare Economics/Cost of Care
Primary Track: Applications
Programmatic Theme: Public Health Informatics
Robust estimates of the emissions savings attributable to telemedicine are currently lacking. Here, we determined the travel distance between participants in U.S. telemedicine sessions and, on that basis, estimated the associated annual CO2 emissions savings for U.S. telemedicine in 2021-2022 at nearly 1.5 million tons. We also estimated emissions savings per telemedicine session and per minute of telemedicine care delivery.
Speaker(s):
Mollie Cummins, PhD, RN, FAAN, FACMI
University of Utah
Author(s):
Sukrut Shishupal, MSc - University of Utah; Bob Wong, PhD - University of Utah; Neng Wan, PhD - University of Utah; Jace Johnny, DNP - University of Utah; Amy Mhatre-Owens, MS - University of Utah; Ram Gouripeddi, MD - University of Utah; Julia Ivanova, PhD - Doxy.me; Triton Ong, PhD - Doxy.me; Hiral Soni, PhD - Doxy.me; Brandon Welch, PhD - MUSC; Brian Bunnell, PhD - University of South Florida
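The telemedicine abstract above derives emissions savings from the travel distance between session participants, but it does not give the distance method or the emission factor. The sketch below shows one way such arithmetic could be set up. The great-circle distance, the round-trip assumption, the placeholder factor of 400 g CO2 per mile, and all function names are illustrative assumptions, not the authors' method.

```python
import math

# Assumed placeholder emission factor (grams of CO2 per vehicle mile).
# The abstract does not state the factor the authors used.
ASSUMED_G_CO2_PER_MILE = 400.0
EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def avoided_kg_co2(distance_miles, g_per_mile=ASSUMED_G_CO2_PER_MILE,
                   round_trip=True):
    """Kilograms of CO2 avoided when a trip of this one-way distance is skipped."""
    trips = 2 if round_trip else 1
    return distance_miles * trips * g_per_mile / 1000.0


def savings_per_session_and_minute(sessions):
    """Return (kg CO2 per session, kg CO2 per minute of care).

    sessions: non-empty list of (one_way_distance_miles, duration_minutes).
    """
    total_kg = sum(avoided_kg_co2(d) for d, _ in sessions)
    total_minutes = sum(m for _, m in sessions)
    return total_kg / len(sessions), total_kg / total_minutes
```

Straight-line distance understates road distance, so a real analysis would likely use routing data or a correction factor; the sketch only shows how per-session and per-minute figures follow from distances and durations.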