Amit Gupta, Swarndeep Singh, Hema Malhotra, Himanshu Pruthi, Aparna Sharma, Amit K Garg, Mukesh Yadav, Devasenathipathy Kandasamy, Atul Batra, Krithika Rangarajan
Provision of Radiology Reports Simplified With Large Language Models to Patients With Cancer: Impact on Patient Satisfaction
JCO Clinical Cancer Informatics, volume 9, e2400166. Published 2025-01-01 (Epub 2025-01-29). Publication type: Journal Article.
DOI: 10.1200/CCI-24-00166 (https://doi.org/10.1200/CCI-24-00166)
Citation count: 0
Abstract
Purpose: To explore the perceived utility of simplified radiology reports and their effect on oncology patients' knowledge, and to assess the feasibility of using large language models (LLMs) to generate such reports.
Materials and methods: This study was approved by the Institute Ethics Committee. In phase I, five state-of-the-art LLMs (Generative Pre-Trained Transformer-4o [GPT-4o], Google Gemini, Claude Opus, Llama-3.1-8B, and Phi-3.5-mini) were tested on simplifying 50 oncology computed tomography (CT) report impressions, using five distinct prompts with each LLM. The outputs were evaluated quantitatively using readability indices. The five LLM-prompt combinations with the best average readability scores were also assessed qualitatively, and the best LLM-prompt combination was selected. In phase II, 100 consecutive oncology patients were randomly assigned to two groups: original report (received the original report impression) and simplified report (received an LLM-generated simplified version of their CT report impression, produced under the supervision of a radiologist). A questionnaire assessed the impact of these reports on patients' knowledge and perceived utility.
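The abstract does not name the readability indices used in phase I. As a minimal sketch of what such a quantitative check can look like, the snippet below computes two widely used indices, Flesch Reading Ease and Flesch-Kincaid Grade Level, with a rough vowel-group syllable heuristic. The choice of indices, the function names, and both sample sentences are assumptions made for illustration; they are not taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return {
        # Higher score = easier to read.
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Approximate US school grade needed to understand the text.
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
    }


# Invented example texts (hypothetical, not from the study's reports).
original_text = (
    "Heterogeneously enhancing mass in the right upper lobe with "
    "mediastinal lymphadenopathy, suspicious for malignancy."
)
simplified_text = "There is a lump in the right lung. Some nearby glands are swollen."

original_scores = readability(original_text)
simplified_scores = readability(simplified_text)
```

Under these assumptions, a simplified impression should score higher on reading ease and lower on grade level than the original. The syllable counter is only approximate, so absolute values will differ from those of dedicated readability tools.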
Results: In phase I, the Claude Opus-Prompt 3 combination (explain to a 15-year-old) performed slightly better than the other LLMs, although scores for GPT-4o, Gemini, Claude Opus, and Llama-3.1 did not differ significantly (P > .0033, Wilcoxon signed-rank test with Bonferroni correction). In phase II, patients in the simplified report group demonstrated significantly better knowledge of the primary site and extent of their disease and showed significantly higher confidence in and understanding of the report (P < .05 for all). Only three of the 50 simplified reports required correction by the radiologist.
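The P > .0033 cutoff reflects the Bonferroni correction, which divides the overall significance level by the number of comparisons. The abstract states neither value; the sketch below assumes an overall level of .05 and 15 comparisons only because that pair reproduces the reported threshold. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """A comparison is significant only if its P value is below the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


# Assumed values: alpha = .05 and 15 comparisons (not stated in the abstract).
threshold = bonferroni_threshold(0.05, 15)
```

With these assumed inputs the threshold is about .0033, so a pairwise P value of, say, .01 would count as significant at the uncorrected .05 level but not after correction.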
Conclusion: Simplified radiology reports significantly improved patients' understanding of their medical condition and their confidence in comprehending it. LLMs performed very well at this simplification task and can potentially be used for this purpose, although human oversight remains necessary.