American Medical Informatics Association - Zero-Copy by Design: A Governance Framework for Cross-Hospital Secondary Use of Clinical Data

FHIR Starters: Opportunities and Challenges of Implementing Clinical Data Interoperability Services for REDCap

Presentation Type: Podium Abstract

Presentation Time: 03:15 PM - 03:27 PM

Primary Track: Translational Bioinformatics/Precision Medicine

Fast Healthcare Interoperability Resources (FHIR) offer a standardized framework for exchanging clinical data, yet adoption for research remains challenging. This study evaluated the implementation of Clinical Data Interoperability Services in REDCap for clinical research. Through training and workflow support, researchers achieved high data extraction accuracy and reduced manual chart review time. Despite operational barriers, coordinated expertise and user education enabled successful adoption, demonstrating the potential of FHIR to enhance clinical research and improve data workflow efficiency.

Speaker(s):
Izabelle Humes, PT, DPT, MS
Oregon Health and Science University

Author(s):
Izabelle Humes, PT, DPT, MS - Oregon Health and Science University; Daniel Persson, BS - Oregon Health & Science University; Gina DeNoble, MS - Oregon Health & Science University; Emma Young, MS - Oregon Health & Science University; Ashley Herrick, BA - Oregon Health & Science University; Erik Benton, BA - Oregon Health & Science University; Nicole Weiskopf, PhD - Oregon Health & Science University; David Dorr, MD, MS, FACMI, FAMIA, FIAHSI - Oregon Health & Science University;

Zero-Copy by Design: A Governance Framework for Cross-Hospital Secondary Use of Clinical Data

Presentation Type: Podium Abstract

Presentation Time: 03:27 PM - 03:39 PM

Primary Track: Clinical Research Informatics

Zero-copy data sharing is proposed as a governance principle that minimizes new persistent copies of patient-level data in multi-hospital collaborations. We derive a five-part framework (source-of-truth designation, controlled analytic environments, copy accounting, execution- and output-focused access control, and network-wide policy alignment) and demonstrate its application in an OSU–Nationwide Children’s Hospital collaboration, showing how zero-copy simplifies negotiations and reduces perceived and actual data-sharing risk.

Speaker(s):
Christopher Bartlett, PhD, MHA
Nationwide Children's Hospital

Author(s):
Tim Huerta, PhD, MS - The Ohio State University; Christopher Bartlett, PhD, MHA - Nationwide Children's Hospital;

Establishing a scalable imaging repository and de-identification pipeline

Presentation Type: Podium Abstract

Presentation Time: 03:39 PM - 03:51 PM

Primary Track: Clinical Research Informatics

We established a medical image de-identification pipeline integrated with an open-source imaging repository and validated it on DICOM images, including images with burned-in PHI. Optical character recognition (OCR) based pixel PHI detection using the PaddleOCR tool, together with an iterative non-PHI identification process, achieves 100% sensitivity and over 97% specificity. This enables efficient, secure preparation of large imaging datasets and reduces the manual burden of PHI review. This sustainable research data infrastructure can be readily adapted to a cloud or hybrid environment.
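For readers less familiar with the two metrics quoted above, the following is a minimal sketch of how sensitivity and specificity are derived from a confusion matrix of PHI detections. The function name and the counts are hypothetical illustrations chosen to produce similar figures; they are not the study's data or code.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: images containing PHI that were flagged
    fn: images containing PHI that were missed
    tn: images without PHI that were passed
    fp: images without PHI that were flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of PHI images that were caught
    specificity = tn / (tn + fp)  # share of clean images that were passed
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the abstract).
sens, spec = sensitivity_specificity(tp=50, fn=0, tn=970, fp=30)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In a de-identification setting, sensitivity is usually the metric to protect first, since a missed PHI image is a disclosure risk while a false flag only costs reviewer time.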

Speaker(s):
Dale Chen-Song, Masters
Nationwide Children's Hospital

Author(s):
Yungui Huang, PhD, MBA, FAMIA - Abigail Wexner Research Institute at Nationwide Children's Hospital; Judd Storrs, PhD - Nationwide Children's Hospital; Dale Chen-Song, Ms - Nationwide Children's Hospital; Sergio Corrales-Guerrero, PhD - Nationwide Children's Hospital; Jennifer Muszynski, MD, MPH - Nationwide Children's Hospital; Farah Brink, MD - Nationwide Children's Hospital; John Burian, MS - Nationwide Children's Hospital; Christopher Bartlett, PhD, MHA - Nationwide Children's Hospital;

LLMetaMap: Refining Clinical Concept Mapping Through an LLM-Augmented MetaMap Framework

Presentation Type: Podium Abstract

Presentation Time: 03:51 PM - 04:03 PM

Primary Track: Data Science/Artificial Intelligence

Accurate clinical concept mapping is essential for transforming unstructured clinical text into reliable features for downstream analytics. MetaMap, one of the most widely used tools for mapping free-text to the UMLS Metathesaurus, is known to misinterpret abbreviations, polysemous terms, and negated expressions, resulting in substantial false-positive noise. Recent advances in large language models (LLMs) offer promising contextual understanding that may address these limitations. We propose LLMetaMap, an LLM-augmented framework designed to automatically validate and refine MetaMap-generated concepts. In this study, we evaluated whether LLMs can accurately adjudicate MetaMap mappings as true or false positives using gold-standard labels. MetaMap was applied to a MIMIC-III discharge summary, yielding 604 candidate concepts across 60 semantic categories. Human annotators judged only 37.2% of mappings as correct, and MetaMap’s confidence score showed poor discriminative ability (AUC = 0.61). Four LLMs—GPT-5.1, Llama-3.1-8B, GPT-OSS-20B, and Qwen-3-32B—were prompted with source context, extracted terms, normalized concepts, and semantic metadata to classify each mapping. GPT-5.1 achieved the strongest performance (97.2% accuracy), with consistently high precision and recall across all major categories. Open-source models demonstrated mixed but promising category-specific performance, though they were more susceptible to false positives. These findings demonstrate that LLMs can substantially reduce MetaMap noise and enhance the accuracy of clinical concept extraction. LLMetaMap offers a scalable approach to improving UMLS mapping reliability, and future work will focus on optimizing lightweight, locally deployable LLMs to support privacy-preserving clinical NLP workflows.
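The two evaluation quantities in this abstract, the AUC of a confidence score and the accuracy of an adjudicator against gold labels, can be sketched as follows. This is an illustrative sketch only: the function names, scores, and labels are hypothetical, and nothing here reproduces the authors' pipeline or data.

```python
def roc_auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the probability that a randomly chosen correct mapping
    (label 1) receives a higher score than a randomly chosen incorrect
    mapping (label 0); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect mapping")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(predicted: list[int], gold: list[int]) -> float:
    """Fraction of mappings where the adjudicator agrees with the gold label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical values for five candidate mappings (illustration only).
gold = [1, 0, 1, 0, 0]                     # human judgment: 1 = correct mapping
confidence = [0.90, 0.85, 0.70, 0.65, 0.60]  # stand-in for a tool's confidence score
llm_verdict = [1, 0, 1, 1, 0]              # stand-in for an LLM's true/false call

auc = roc_auc(confidence, gold)
acc = accuracy(llm_verdict, gold)
print(f"AUC={auc:.3f}, accuracy={acc:.3f}")
```

An AUC near 0.5 means the score ranks correct and incorrect mappings no better than chance, which is the sense in which a value of 0.61 indicates poor discriminative ability.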

Speaker(s):
Yikuan Li, Ph.D.
George Mason University

Author(s):
Rediet Woldeselassie, MS - George Mason University; Lu He, PhD - University of Wisconsin-Milwaukee; Yuan Luo, PhD - Northwestern University; Ozlem Uzuner, PhD - George Mason University;

Is Poor Reporting a Barrier to Phenotype Reuse? A PhenoFit Reality Check

Presentation Type: Podium Abstract

Presentation Time: 04:03 PM - 04:15 PM

Primary Track: Clinical Research Informatics

Computational phenotyping uses algorithms to identify patient cohorts from electronic health records (EHRs); however, a fragmented approach to development has led to a proliferation of algorithms for the same conditions. To address this, we developed the PhenoFit framework to assess phenotype algorithm fitness for purpose and fitness for reuse, and we assessed the quality of reporting in the literature through a systematic review of published computable phenotype algorithms. The final dataset contains 1,900 study-phenotype pairs, but only 13.7% (n=260) reported basic information about the algorithm detail, the exact cohort sought, and the validation approach and performance. This study shows that publications describing phenotype algorithms provide insufficient detail to begin evaluating fitness for purpose or reuse, demonstrating a need to improve reporting standards for phenotype algorithms.

Speaker(s):
Laura Wiley, PhD
Washington University in St. Louis

Author(s):
Luke Rasmussen, MS, FAMIA - Northwestern University; Jennifer Malinowski, PhD - Write InScite LLC; Rebecca Levinson, PhD - University of Heidelberg; Sheila Manemann, MPH - Mayo Clinic; Melissa Wilson; Martin Chapman, PhD - King's College London; Jennifer Pacheco, MS - University of Arizona, Center for Biostatistics & Biomedical Informatics; Suzette Bielinski, MEd, PhD - Mayo Clinic; Laura Wiley, PhD - Washington University in St. Louis;

Enhancing Medicare Coverage Analysis in Clinical Trials: Evaluating the Utility of Large Language Models for Clinical Service Entity Extraction

Presentation Type: Podium Abstract

Presentation Time: 04:15 PM - 04:27 PM

Primary Track: Data Science/Artificial Intelligence

This study evaluates the feasibility of using large language models (LLMs) to streamline the Medicare Coverage Analysis (MCA) workflow, with a specific focus on extracting clinical procedures and laboratory tests from clinical trial protocols. Our findings indicate that LLMs, particularly the GPT model, show strong potential to improve the efficiency and accuracy of the MCA process, thereby reducing the manual effort and costs associated with clinical trial billing compliance.

Speaker(s):
Rixin Wang, PhD
School of Medicine, Yale University

Author(s):
Rixin Wang, PhD - School of Medicine, Yale University; Eric Borchardt, PhD - School of Medicine, Yale University; Kei-Hoi Cheung, PhD - Yale University; Brian Sevier, PhD - School of Medicine, Yale University; Daniella Meeker, PhD - Yale School of Medicine;

TRI12: Data Plumbing & Interoperability (Oral Presentations)
