Times are displayed in (UTC-05:00) Eastern Time (US & Canada)
3/11/2025 | 10:30 AM – 12:00 PM | Urban
S08: Translational Bioinformatics
Presentation Type: Podium Abstract
Linking mutation burden and somatic mutations signatures to gene expression patterns in the developing human brain
Presentation Time: 10:30 AM - 10:45 AM
Abstract Keywords: Genotype-phenotype Association Studies (including GWAS), Genomics/Omic Data Interpretation
Primary Track: Translational Bioinformatics/Precision Medicine
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
While germline mutations are inherited, somatic mutations can arise during prenatal brain development and cause neurological disease and neurodevelopmental disorders. The rates and patterns of somatic mutations, and their relation to gene expression across healthy developing brain regions, are poorly studied. In this study, we present a framework to quantify the effects of overall mutation burden and of somatic mutation signatures on gene expression profiles across regions of the developing human brain.
Speaker(s):
Judith Somekh, PhD
University of Haifa
Author(s):
Judith Somekh, PhD - University of Haifa; Isana Veksler-Lublinsky, Dr. - Ben-Gurion University of the Negev; Or Amar, Mr. - University of Haifa; Isaac Kohane, MD, PhD - Harvard Medical School;
Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models
Presentation Time: 10:45 AM - 11:00 AM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Genomics/Omic Data Interpretation, Genotype-phenotype Association Studies (including GWAS), Informatics Research/Biomedical Informatics Research Methods
Primary Track: Translational Bioinformatics/Precision Medicine
Programmatic Theme: Novel Methods for Variant Detection and Interpretation from Omics Data
Predicting phenotypes with complex genetic bases from a small, interpretable set of variant features remains a challenging task. Conventionally, data-driven approaches are used for this task, yet the high-dimensional nature of genotype data makes analysis and prediction difficult. Motivated by the extensive knowledge encoded in pre-trained LLMs and their success in processing complex biomedical concepts, we set out to examine the ability of LLMs to perform feature selection and engineering for tabular genotype data, using a novel knowledge-driven framework. We develop FREEFORM, Free-flow Reasoning and Ensembling for Enhanced Feature Output and Robust Modeling, designed with chain-of-thought and ensembling principles, to select and engineer features with the intrinsic knowledge of LLMs. Evaluated on two distinct genotype-phenotype datasets, genetic ancestry and hereditary hearing loss, this framework outperforms several data-driven methods, particularly in low-shot regimes. FREEFORM is available as an open-source framework on GitHub: https://github.com/PennShenLab/FREEFORM
Speaker(s):
Joseph Lee, Bachelor's of Science in Networked and Social Systems Engineering
University of Pennsylvania
Author(s):
Joseph Lee, MS - University of Pennsylvania; Shu Yang, PhD - University of Pennsylvania; Jae Young Baik, BA - University of Pennsylvania; Xiaoxi Liu, PhD - RIKEN Center for Integrative Medical Sciences; Zhen Tan, MS - Arizona State University; Dawei Li, MS - Arizona State University; Zixuan Wen, MA - University of Pennsylvania; Bojian Hou, PhD - University of Pennsylvania; Duy Duong-Tran, PhD - United States Naval Academy; Tianlong Chen, PhD - University of North Carolina at Chapel Hill; Li Shen, Ph.D. - University of Pennsylvania;
Inter-tissue coordination patterns of metabolic transcriptomes
Presentation Time: 11:00 AM - 11:15 AM
Abstract Keywords: Systems Biology and Network Analysis, Genomics/Omic Data Interpretation, Data-Driven Research and Discovery, Transcriptomics
Primary Track: Translational Bioinformatics/Precision Medicine
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
Understanding inter-organ communication across the entire body is crucial for comprehending health and disease. We present a computational approach that defines inter-tissue communication and a general coordination pattern of metabolic transcriptomes at a whole-body scale, applied to 19 human tissues and validated using external datasets. We reveal known and novel inter-tissue metabolic links and a significant global coregulation pattern. Our framework may apply to other types of transcriptomes and be used to detect changes across different conditions.
Speaker(s):
Judith Somekh, PhD
University of Haifa
Author(s):
Judith Somekh, PhD - University of Haifa; Irit Hochberg, MD - Technion;
IntelliGenes: Multi-modal AI/ML platform for novel biomarker discovery and predictive medicine
Presentation Time: 11:15 AM - 11:30 AM
Abstract Keywords: Machine Learning, Generative AI, and Predictive Modeling, Biomarker Discovery and Development, Clinical Genomics/Omics and Interventions Based on Omics Data, Data Integration, EHR-based Phenotyping, Open Science for Biomedical Research and Translational Medicine, Advanced Data Visualization Tools and Techniques, Data-Driven Research and Discovery
Primary Track: Data Science/Artificial Intelligence
Programmatic Theme: Translational Bioinformatics Using Multi-Modal Patient Data and AI
In this study, we present IntelliGenes, a novel, interactive, customizable, cross-platform, and user-friendly AI/ML application for multi-omics data exploration to discover novel biomarkers and predict rare, common, and complex diseases. The implemented methodology is based on a nexus of conventional statistical techniques and cutting-edge ML algorithms. We have designed and implemented it so that users with or without a computational background can apply AI/ML approaches to discover novel biomarkers and predict diseases.
Speaker(s):
William DeGroat, BS
Rutgers Institute for Health, Health Care Policy and Aging Research
Author(s):
William DeGroat, BS - Rutgers Institute for Health, Health Care Policy and Aging Research; Rishabh Narayanan, BS - Rutgers Institute for Health, Health Care Policy and Aging Research; Dinesh Mendhe, MS - Rutgers Institute for Health, Health Care Policy and Aging Research; Elizabeth Peker, BS - Rutgers Institute for Health, Health Care Policy and Aging Research; Saman Zeeshan, PhD - Department of Biomedical and Health Informatics, UMKC School of Medicine; Zeeshan Ahmed, PhD - Rutgers Institute for Health, Health Care Policy and Aging Research; Department of Medicine, Rutgers Robert Wood Johnson Medical School;
Evolution of Genomic Indicators for Pharmacogenomics: Retrospective Analysis and Implications for Knowledge Management
Presentation Time: 11:30 AM - 11:45 AM
Abstract Keywords: Pharmacogenomics, Knowledge Representation, Management, or Engineering, Clinical Decision Support for Translational/Data Science Interventions, Data Mining and Knowledge Discovery
Primary Track: Clinical Research Informatics
Programmatic Theme: Implementation Science and Deployment in Informatics: Enabling Clinical and Translational Research
Pharmacogenomics (PGx) incorporates patient genetic data into pharmacotherapy guidelines to improve patient outcomes. Clinical decision support (CDS) systems rely on underlying knowledge bases, information models, and encoded rule logic to implement clinical guidelines. However, changes in PGx knowledge and result reporting standards necessitate continual maintenance of CDS rule logic and data reporting in electronic health records (EHRs).
We reviewed over 12 years of PGx CDS implementation at Mayo Clinic, identifying three different methods of recording patient PGx data in multiple EHRs. Prior to enterprise-wide EHR convergence, each Mayo Clinic site followed task force-developed gene-drug guidelines to develop rules for annotating gene-phenotype data within patient allergy and problem lists. These annotations frequently lacked discrete genotype or provenance data, precluding detailed tracking of changes in each system. After EHR convergence, all Mayo Clinic sites used Genomic Indicator (GI) profiles (N=158) within an EHR module specifically designed to capture gene-phenotype information. Several post-implementation modification events incorporated new PGx knowledge, including adding new gene-drug indicator sets, updating genotype-phenotype specifications, and assigning haplotype enzyme activity score data for quantitative phenotypes. The incorporation of phenotype results from a large multi-gene panel resulted in the creation of 29 test-specific indicators, 12 of which were later removed or merged with previously established GIs due to the use of non-standardized nomenclature and classifications.
Our results demonstrate limitations of using pre-coordinated terms for complex and evolving knowledge and suggest the need for a robust knowledge model and standardized nomenclature to provide adequate data provenance and support genomic medicine at scale.
Speaker(s):
Sarah Senum, MS
Mayo Clinic
Author(s):
Robert Freimuth, PhD - Mayo Clinic; Salem Bajjali, Master of Science - Mayo Clinic; Aly Khalifa, PhD - Mayo Clinic; Jessica Wright, Pharm.D., R.Ph. - Mayo Clinic;