Primary Track: Data Science/Artificial Intelligence
PheBee is an ontology-native system for scalable and auditable phenotyping that combines graph databases with modern data lake architecture. It links patients, encounters, and notes to HPO and MONDO terms with detailed provenance, including NLP-derived annotations mapped to text spans. The demonstration will showcase hierarchical querying, auditable extraction pipelines, and a Python client for ingestion and cohort retrieval. PheBee is deployed in NICU genomics workflows, providing phenotypic profiles that inform predictive models for genomic sequencing.
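The hierarchical querying mentioned above can be illustrated with a minimal Python sketch. It is not PheBee's client or API: the term IDs, the `IS_A` edges, the patient annotations, and the function names `descendants` and `cohort` are all hypothetical placeholders. The idea it shows is that a query for a broad term also returns patients annotated with any of its more specific descendant terms.

```python
from collections import defaultdict

# Toy is-a edges (child -> parent). These IDs are placeholders, not real HPO terms.
IS_A = {
    "EX:seizure": "EX:neuro_abnormality",
    "EX:focal_seizure": "EX:seizure",
    "EX:hypotonia": "EX:neuro_abnormality",
}

# Toy patient annotations (hypothetical data).
ANNOTATIONS = {
    "patient-1": {"EX:focal_seizure"},
    "patient-2": {"EX:hypotonia"},
    "patient-3": {"EX:seizure"},
}


def descendants(term):
    """Return the term together with every term below it in the hierarchy."""
    children = defaultdict(set)
    for child, parent in IS_A.items():
        children[parent].add(child)
    found, stack = {term}, [term]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def cohort(term):
    """Patients annotated with the term or any of its descendants."""
    wanted = descendants(term)
    return sorted(p for p, terms in ANNOTATIONS.items() if terms & wanted)


print(cohort("EX:seizure"))
print(cohort("EX:neuro_abnormality"))
```

A production system would resolve descendants inside the graph database rather than in application code; the sketch only shows the query semantics.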
Speaker(s): David Gordon, BS Nationwide Children's Hospital
Working Group: Clinical Research Informatics Working Group
Primary Track: Clinical Research Informatics
The Greater Plains Collaborative (GPC) and the Research Action for Health Network (REACHnet) jointly redeveloped the PCORnet Empirical Data Curation (EDC) process by migrating the traditional SAS program to a modular Python and Streamlit application that runs natively within Snowflake’s secure cloud environment. This approach enables faster, parallel execution of data quality checks, visualizations that are easier to navigate than static reports, and on-demand reruns of specific checks. GPC and REACHnet have implemented 16 of the 46 checks so far, and development of the remaining PCORnet checks is ongoing. The solution reduces runtime and lowers computational demands. By making data curation more accessible and efficient, it helps improve research-ready data quality across the network.
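The modular, parallel, rerunnable design described above can be sketched in plain Python. This is an illustration under stated assumptions, not the GPC/REACHnet implementation: the sample rows, the three check functions, and the `run_checks` helper are hypothetical, the rules are toy rules rather than actual PCORnet checks, and an in-memory list stands in for tables that would live in Snowflake.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy rows standing in for a demographics table (hypothetical data).
ROWS = [
    {"patid": "p1", "birth_date": "2001-04-02", "sex": "F"},
    {"patid": "p2", "birth_date": None, "sex": "M"},
    {"patid": "p2", "birth_date": "1999-12-30", "sex": "X"},
]


def check_missing_birth_date(rows):
    """Count rows with no birth date."""
    return sum(1 for r in rows if r["birth_date"] is None)


def check_duplicate_patid(rows):
    """Count rows whose patient ID repeats an earlier row."""
    ids = [r["patid"] for r in rows]
    return len(ids) - len(set(ids))


def check_invalid_sex(rows):
    """Count rows whose sex code is outside a toy value set."""
    return sum(1 for r in rows if r["sex"] not in {"F", "M"})


# Each check is an independent module-level function registered by name.
CHECKS = {
    "missing_birth_date": check_missing_birth_date,
    "duplicate_patid": check_duplicate_patid,
    "invalid_sex": check_invalid_sex,
}


def run_checks(names=None):
    """Run the named checks (all by default) concurrently and collect results."""
    selected = {n: CHECKS[n] for n in (names or CHECKS)}
    with ThreadPoolExecutor() as pool:
        futures = {n: pool.submit(fn, ROWS) for n, fn in selected.items()}
        return {n: f.result() for n, f in futures.items()}


print(run_checks())                     # full run
print(run_checks(["duplicate_patid"]))  # on-demand rerun of one check
```

Because each check is registered independently, a single check can be rerun without repeating the full suite, which is the property the abstract highlights.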
Speaker(s): Vasanthi Mandhadi, Masters University of Missouri
Abu Mosa, PhD, MS, FAMIA University of Alabama at Birmingham
Sylvester Tumusiime
Author(s): Vasanthi Mandhadi, Masters - University of Missouri;
Md Soliman Islam, M.Sc;
Xing Song, PhD - University of Missouri;
Abu Mosa, PhD, MS, FAMIA - University of Alabama at Birmingham;
James McClay, MD - University of Missouri;
Md Kamruz Zaman Rana, MSHI - University of Missouri - Columbia;
Sylvester Tumusiime;
Kristina Larson, PhD - Louisiana Public Health Institute;
Kyle Bradford, MPH - Louisiana Public Health Institute;
Tom Carton, PhD - Louisiana Public Health Institute;
Bradley Taylor, M.B.A. - Medical College of Wisconsin;
EXACT: LLMs for Structured Eligibility Extraction for Clinical Trial Matching
Primary Track: Data Science/Artificial Intelligence
Patient enrollment in clinical trials remains below 3%, leading to frequent delays and underpowered studies. A major barrier is that patients cannot efficiently identify trials for which they are truly eligible. While large language models (LLMs) show promise, most current open-source systems attempt to match patients to trials by asking the LLM to select from a list. These methods are evaluated by “top-n” overlap with clinician gold standards, but they do not achieve precise eligibility matching across all trial attributes.
The EXtracting Attributes from Clinical Trials (EXACT) system takes a different approach. Instead of treating trials as black boxes, EXACT uses attribute-specific prompts to extract granular eligibility criteria directly from each trial’s unstructured text, with real-time accuracy monitoring for each attribute. Patient information—augmented when needed through a chatbot-led interview—is then matched against these structured criteria to identify trials for which the patient is truly eligible.
Eligible trials are further prioritized using a Multi-Criteria Decision Analysis framework that incorporates patient-defined values for risk, benefit, burden, and distance. This produces a ranked list of trials that are both eligible and aligned with the patient’s preferences.
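One common form of Multi-Criteria Decision Analysis is a weighted sum, sketched below in Python. This is an assumption about the general technique, not a description of EXACT's actual scoring: the weights, the trial scores, the 0-to-1 scale, and the names `score` and `rank` are hypothetical. Risk, burden, and distance are treated as "lower is better" and inverted before weighting.

```python
# Criteria where a smaller value is preferable (assumed for this sketch).
LOWER_IS_BETTER = {"risk", "burden", "distance"}

# Toy per-trial scores on a 0-to-1 scale (hypothetical data).
TRIALS = {
    "trial-A": {"risk": 0.8, "benefit": 0.9, "burden": 0.5, "distance": 0.2},
    "trial-B": {"risk": 0.2, "benefit": 0.6, "burden": 0.3, "distance": 0.9},
}


def score(attrs, weights):
    """Weighted sum of criteria, inverting those where lower is better."""
    total = 0.0
    for criterion, weight in weights.items():
        value = attrs[criterion]
        if criterion in LOWER_IS_BETTER:
            value = 1.0 - value
        total += weight * value
    return total


def rank(trials, weights):
    """Return trial IDs ordered from best to worst for the given weights."""
    return sorted(trials, key=lambda t: score(trials[t], weights), reverse=True)


# Two hypothetical patients with different priorities.
benefit_focused = {"risk": 0.2, "benefit": 0.5, "burden": 0.2, "distance": 0.1}
risk_averse = {"risk": 0.6, "benefit": 0.2, "burden": 0.1, "distance": 0.1}

print(rank(TRIALS, benefit_focused))
print(rank(TRIALS, risk_averse))
```

The same set of eligible trials is ordered differently for the two weight profiles, which is how patient-defined values can change the ranking without changing eligibility.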
EXACT powers CancerBot and has been deployed within the Harvard DCI Network and other nonprofit foundations, supporting patients through a nurse-navigator persona that integrates smoothly with human navigators. This approach demonstrates how structured eligibility extraction, interoperable data pipelines, and patient-centered design can transform trial matching from approximate recommendation into precise, values-aligned navigation.
Speaker(s): Adam Blum, MSCS CancerBot
Author(s): Adam Blum, MSCS - CancerBot;
Steven Labkoff, MD, FACP, FACMI, FAMIA - The Division of Clinical Informatics, Beth Israel Deaconess Medical Center;
TRI40: System Demonstrations 2
Date: Thursday (05/21) Time: 9:45 AM to 11:00 AM Room: Pikes Peak - 555 Building, 2nd Floor