Semantic Clinical Artificial Intelligence (SCAI) Improves LLM Performance on the USMLE Step 1, 2, and 3 Examinations
Presentation Time: 04:15 PM - 04:30 PM
Abstract Keywords: Controlled Terminologies, Ontologies, and Vocabularies, Large Language Models (LLMs), Natural Language Processing, Clinical Decision Support
Primary Track: Applications
Programmatic Theme: Clinical Informatics
Introduction – Large Language Models (LLMs) such as ChatGPT and GPT-4 have been shown to perform well on the United States Medical Licensing Examination (USMLE), achieving over 60% accuracy. Current LLMs predict the next word given a string of words; although this turns out to be quite powerful, it lacks formal semantics, and such models therefore cannot reason. We developed a semantically augmented LLM, Semantic Clinical Artificial Intelligence (SCAI), built on a Generative Pre-trained Transformer. Our hypothesis was that adding semantics to an LLM would improve its accuracy. We tested this by comparing USMLE performance of the LLM alone against the SCAI model. A minimal sketch of the general augmentation pattern follows.
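The abstract does not describe SCAI's implementation in detail; the sketch below illustrates only the generic retrieval-augmented generation (RAG) pattern named in the conclusion, in which retrieved clinical knowledge is prepended to the prompt before the LLM answers. All identifiers here (retrieve_concepts, build_prompt, the toy word-overlap retriever) are hypothetical illustrations, not the SCAI code.

```python
# Minimal sketch of the retrieval-augmented generation (RAG) pattern.
# The retriever and all names below are hypothetical; SCAI's actual
# retrieval over formal clinical semantics is not described in this abstract.

def retrieve_concepts(question: str, knowledge_base: dict[str, str], k: int = 3) -> list[str]:
    """Toy retriever: rank knowledge-base entries by word overlap with the question."""
    q_tokens = set(question.lower().split())
    ranked = sorted(
        knowledge_base.items(),
        key=lambda item: len(q_tokens & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]

def build_prompt(question: str, knowledge_base: dict[str, str]) -> str:
    """Prepend retrieved facts to the exam question before sending it to the LLM."""
    context = "\n".join(retrieve_concepts(question, knowledge_base))
    return (
        f"Relevant clinical knowledge:\n{context}\n\n"
        f"Question: {question}\nAnswer with the single best option."
    )
```

In a real system the word-overlap ranking would be replaced by retrieval over a curated terminology or ontology, but the prompt-assembly step is the same.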
Results: There were 87 text-based questions in the Step 1 examination. The native 13B-parameter Llama LLM answered 29 (33.3%) correctly; the SCAI version of the same Llama 13B model answered 48 of the 87 questions (55.2%) correctly, p < 0.0001.
There were 101 text-based questions in the Step 2 examination. The native LLM answered 35 (34.7%) correctly and SCAI answered 49 (48.5%) correctly, p = 0.005.
There were 123 text-based questions in the Step 3 examination. The native LLM answered 45 (36.6%) correctly and SCAI answered 68 (55.3%) correctly, p < 0.0001.
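The abstract reports p-values but does not name the statistical test. Because both models answered the same questions, a paired McNemar test is one plausible choice; the sketch below shows how such a comparison could be run. The discordant counts are hypothetical placeholders chosen only so the marginals match the reported Step 1 totals (29/87 native, 48/87 SCAI); the study's actual per-question agreement data are not given.

```python
# Hypothetical McNemar comparison of paired per-question correctness.
# The abstract does not state which test produced its p-values; this is
# one plausible analysis, not the study's actual method.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 agreement table (HYPOTHETICAL discordant split; only the marginals
# match the reported Step 1 counts: 29 native correct, 48 SCAI correct, n = 87):
#                     SCAI correct   SCAI wrong
# native correct           27             2
# native wrong             21            37
table = np.array([[27, 2], [21, 37]])

result = mcnemar(table, exact=True)  # exact binomial test on the discordant pairs
print(f"McNemar statistic: {result.statistic}, p-value: {result.pvalue:.2g}")
```

With this assumed split (2 vs. 21 discordant questions), the exact test gives p < 0.0001, consistent in magnitude with the reported Step 1 result.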
Semantic augmentation using retrieval-augmented generation (RAG), as implemented in SCAI, led to significantly improved scores on the USMLE Step 1, 2, and 3 examinations. However, neither model achieved a passing score on any of the Step exams.
Speaker(s):
Peter Elkin, MD, MACP, FACMI, FNYAM, FAMIA, FIAHSI
Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, State University of New York
Category
Podium Abstract