Large Language Models can Help with Biostatistics and Coding Needed in Radiology Research

Academic Radiology · IF 3.8 · CAS Zone 2 (Medicine) · JCR Q1, Radiology, Nuclear Medicine & Medical Imaging · Pub Date: 2025-02-01 · DOI: 10.1016/j.acra.2024.09.042
Adarsh Ghosh MD , Hailong Li PhD , Andrew T. Trout MD
Citations: 0

Abstract

Large Language Models can Help with Biostatistics and Coding Needed in Radiology Research

Introduction

Original research in radiology often involves handling large datasets, data manipulation, statistical tests, and coding. Recent studies show that large language models (LLMs) can solve bioinformatics tasks, suggesting their potential in radiology research. This study evaluates the ability of LLMs to provide statistical and deep learning solutions and code for radiology research.

Materials and Methods

We used the web-based chat interfaces of ChatGPT-4o, ChatGPT-3.5, and Google Gemini.

Experiment 1: Biostatistics and Data Visualization

We assessed each LLM's ability to suggest biostatistical tests and to generate corresponding R code using a Cancer Imaging Archive dataset. Prompts were based on statistical analyses from a peer-reviewed manuscript. The generated code was tested in RStudio for correctness, runtime errors, and the ability to produce the requested visualization.
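To illustrate the kind of task the LLMs were prompted with, the sketch below shows a typical test-selection workflow. The study used R; this is an equivalent Python illustration, and the data values are synthetic, not from the Cancer Imaging Archive.

```python
# Hypothetical example of LLM-suggested biostatistics: pick a two-group test
# after checking normality. Synthetic tumor-size data, for illustration only.
from scipy import stats

tumor_size_group_a = [2.1, 2.5, 3.0, 2.8, 3.2, 2.6]
tumor_size_group_b = [3.5, 3.9, 4.1, 3.7, 4.4, 3.8]

# Shapiro-Wilk normality check before choosing a parametric test
_, p_a = stats.shapiro(tumor_size_group_a)
_, p_b = stats.shapiro(tumor_size_group_b)

if p_a > 0.05 and p_b > 0.05:
    # Welch's t-test (does not assume equal variances)
    statistic, p_value = stats.ttest_ind(
        tumor_size_group_a, tumor_size_group_b, equal_var=False
    )
    test_name = "Welch t-test"
else:
    # Nonparametric fallback when normality is doubtful
    statistic, p_value = stats.mannwhitneyu(tumor_size_group_a, tumor_size_group_b)
    test_name = "Mann-Whitney U"

print(f"{test_name}: statistic={statistic:.3f}, p={p_value:.4f}")
```

Code of this shape is easy to verify in an interactive session, which matches the study's finding that LLM output is best treated as a starting point to run and check rather than a finished analysis.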

Experiment 2: Deep Learning

We used the RSNA-STR Pneumonia Detection Challenge dataset to evaluate ChatGPT-4o and Gemini's ability to generate Python code for transformer-based image classification models (Vision Transformer ViT-B/16). The generated code was tested in a Jupyter Notebook for functionality and runtime errors.

Results

Of the 8 statistical questions posed, correct statistical answers were suggested for 7 (ChatGPT-4o), 6 (ChatGPT-3.5), and 5 (Gemini) scenarios. The R code output by ChatGPT-4o had fewer runtime errors (6 of the 7 generated scripts ran without error) than ChatGPT-3.5 (5/7) and Gemini (5/7). Both ChatGPT-4o and Gemini generated the requested visualizations with a few runtime errors. Iteratively pasting runtime errors from ChatGPT-4o's generated code back into the chat helped resolve them. Gemini initially hallucinated during code generation but provided accurate code when the experiment was restarted.
ChatGPT-4o and Gemini successfully generated initial Python code for the deep learning tasks. Errors encountered during implementation were resolved through iteration in the chat interface, demonstrating the utility of LLMs in providing baseline code for further refinement and in resolving runtime errors.

Conclusion

LLMs can assist with coding tasks in radiology research, providing initial code for data visualization, statistical tests, and deep learning models, and helping researchers who have foundational biostatistical knowledge. While LLMs can offer a useful starting point, they require users to refine and validate the code, and caution is necessary given potential errors, the risk of hallucinations, and data privacy regulations.

Summary statement

LLMs can help with coding and statistical problems in radiology research. This can help primary authors troubleshoot the coding their studies require.