A Knowledge Graph Driven Approach to Extend BioNLP Annotations to Facilitate the Generation of Clinical Code Sets
Poster Number: P117
Presentation Time: 05:00 PM - 06:30 PM
Abstract Keywords: Natural Language Processing, Information Extraction, Controlled Terminologies, Ontologies, and Vocabularies, Knowledge Representation and Information Modeling
Primary Track: Applications
Programmatic Theme: Clinical Research Informatics
The extraction of clinical entities and code sets from textual data is fundamental to biomedical research that uses real-world data sources. However, generating clinically valid code sets from unstructured text remains a challenge in Biomedical Natural Language Processing (BioNLP). Current BioNLP methods rely on a single controlled terminology for entity linking, which limits the breadth of code set generation and increases the likelihood of false positives. To address this limitation, we propose a novel approach that uses a large-scale Knowledge Graph (KG) within a Knowledge Management System to extend BioNLP annotations and generate semantically meaningful code sets. We implemented and applied semantic queries designed to systematically traverse the KG across multiple semantic relationships, thereby identifying a comprehensive range of clinical entities and relevant codes from multiple terminologies. Pilot evaluations on disease entity annotations from clinical trial eligibility criteria produced code sets with an average of 58 codes per set. Evaluation against curated code sets from the Value Set Authority Center revealed moderate overlap. The findings suggest that leveraging KGs can facilitate the generation of clinically relevant code sets in a semi-automated manner.
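The traversal and evaluation steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy graph, the relationship names (has_subtype, same_as, treated_by), the placeholder codes (TERM_A, TERM_B, TERM_C), the depth limit, and the use of Jaccard similarity as the overlap measure. The abstract does not specify the authors' queries, graph schema, or overlap metric, so this is not their implementation.

```python
from collections import deque

# Hypothetical toy knowledge graph: node -> list of (relationship, neighbor).
EDGES = {
    "disease:X": [
        ("has_subtype", "disease:X1"),
        ("has_subtype", "disease:X2"),
        ("treated_by", "drug:D"),
    ],
    "disease:X1": [("same_as", "disease:X1_alt")],
    "disease:X2": [],
    "disease:X1_alt": [],
    "drug:D": [],
}

# Hypothetical placeholder codes from several made-up terminologies.
CODES = {
    "disease:X": {"TERM_A:100"},
    "disease:X1": {"TERM_A:101"},
    "disease:X2": {"TERM_A:102", "TERM_B:9"},
    "disease:X1_alt": {"TERM_B:7"},
    "drug:D": {"TERM_C:55"},
}


def expand_code_set(start, allowed_relations, max_depth=3):
    """Breadth-first traversal from an annotated entity, following only the
    allowed relationship types, collecting codes attached to visited nodes."""
    seen = {start}
    queue = deque([(start, 0)])
    codes = set()
    while queue:
        node, depth = queue.popleft()
        codes |= CODES.get(node, set())
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for relation, neighbor in EDGES.get(node, []):
            if relation in allowed_relations and neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return codes


def jaccard(a, b):
    """Overlap between two code sets (assumed metric, not from the abstract)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Expand a disease annotation along subtype and equivalence relationships only.
generated = expand_code_set("disease:X", {"has_subtype", "same_as"})

# Hypothetical curated code set used as the comparison target.
curated = {"TERM_A:100", "TERM_A:101", "TERM_A:102", "TERM_B:8"}
overlap = jaccard(generated, curated)
```

Restricting the traversal to selected relationship types is what keeps unrelated entities (here, the drug node) out of the resulting code set, while following equivalence links is what pulls in codes from additional terminologies.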
Speaker(s):
Ali Daowd, MD, PhD
Semedy, Inc.
Author(s):
Marcelo Fiszman, MD, PhD - Semedy, Inc.; Charles Lagor, MD, PhD, MBA - Semedy, Inc.; Saverio Maviglia, MD - Semedy, Inc.; Roberto Rocha, MD, PhD, FACMI - Semedy, Inc.
Category
Poster - Regular
Description
Date: Monday (11/11)
Time: 05:00 PM to 06:30 PM
Room: Grand Ballroom (Posters)