Harnessing AI for Comprehensive Reporting of Medical AI Research

International Journal of Imaging Systems and Technology · Published: 2025-02-11 · DOI: 10.1002/ima.70047
Mohamed L. Seghier
{"title":"Harnessing AI for Comprehensive Reporting of Medical AI Research","authors":"Mohamed L. Seghier","doi":"10.1002/ima.70047","DOIUrl":null,"url":null,"abstract":"<p>In this editorial, I would like to succinctly discuss the potential of using AI to improve reporting medical AI research. There are already several published guidelines and checklists in the current literature but how they are interpreted and implemented varies with publishers, editors, reviewers and authors. Here, I discuss the possibility of harnessing generative AI tools in order to assist authors to comprehensively report their AI work and meet current guidelines, with the ultimate aim to improve transparency and replicability in medical AI research. The succinct discussion below reckons two key issues: (1) AI has a seductive allure that might affect how AI-generated evidence is scrutinized and disseminated, hence the need for comprehensive and transparent reporting, and (2) authors sometimes feel uncertain about what to report in the light of so many existing guidelines about reporting AI research and the lack of consensus in the field.</p><p>It has been argued that extraneous or irrelevant information with a seductive allure can improve the ratings of scientific explanations [<span>1</span>]. AI, with its overhyped knowledgeability, can convey biases and false information that readers might judge believable [<span>2</span>]. AI can write highly convincing text that can impress or deceive readers, even in the presence of errors and false information [<span>3, 4</span>]. Likewise, merely mentioning “AI” in the title of a research paper seems to increase its citation potential [<span>5</span>]. The latter might incentivise scientists to use AI purely to boost their work citability, regardless of whether AI improved their work quality. In this context, one might speculate that some publications that used AI but with flawed methodologies or wrong conclusions might have slipped through the cracks of peer review, with many already being indexed and citable [<span>6</span>]. Overall, emerging evidence suggests that AI has an intrinsic seductive allure that is shaping the medical research landscape and impacting how readers appraise research articles that employ AI. This is why improving the reporting and evaluation of AI work is of paramount importance, and in this editorial, I underscore the potential role of generative AI for that purpose.</p><p>Consider this: readers might find a paper entitled “<i>Association between condition X and biomarker Y demonstrated with deep learning</i>” novel and worth reading. Now, imagine if the same finding was evidenced with a traditional analysis method and entitled “<i>Association between condition X and biomarker Y demonstrated with a correlation analysis</i>”, though it is unlikely that the authors of the latter will consider correlation analysis worth mentioning in the article title. Although both pieces of work report the same finding, they may not enjoy the same buzz and high citability in the field. This is because AI-based methods and traditional analysis methods operate at different maturity levels. Readers (and reviewers) are quite familiar with the scope and limitations of a correlation analysis, but the same cannot be said about AI. 
Having clear guidelines on how to comprehensively report and rigorously evaluate medical AI research is thus extremely important.</p><p>No one denies AI's huge potential in medical research, such as automating the analysis of complex medical data and accelerating the discovery of useful markers. However, AI may discover new data-driven features and disease-markers relationships that do not always align with prior medical knowledge, raising the question of how to reconcile common medical knowledge with AI-generated evidence. Likewise, there is a risk that AI's seductive allure might diminish the critical analysis and scrutiny of AI-generated evidence, thus weakening the rigour of the peer review process in evaluating AI papers. Therefore, when AI is used to enhance the process of scientific discovery, the core principles of scientific methodology, including falsifiability, must be upheld. However, when it comes to falsifiability, independently testing and disproving AI-generated evidence remains difficult. For example, does a 2% reduction in accuracy or another performance metric disprove a particular AI method?</p><p>Indeed, there is no consensus about the conceptual and methodological frameworks by which AI-generated evidence can be securitized and falsified. This is because deploying AI to study a particular question involves several aspects that create multiple sources of error or bias that are not always easy to gauge. This includes how data is curated, cleaned, imputed, augmented, divided or aggregated, how relevant features are identified, reduced or combined, and how AI architecture is built, trained or validated. As AI can generate fabricated articles including articles with empirical results (see discussion in [<span>4</span>]), frameworks that uphold falsification are paramount in AI research [<span>7</span>]. The recent example of the AI-Generated Science (AIGS) system [<span>8</span>], with AI agents that can independently and autonomously create knowledge, poses significant questions to AI research at many ethical, legal and scientific levels. This is why the authors of AIGS identified falsification as a core agent of that system to verify and scrutinise AI-generated scientific discoveries.</p><p>To minimise the risk of proliferation of flawed or fabricated AI research that could harm clinical practice, many guidelines and checklists for improving the reporting of medical AI research have been proposed. Such AI reporting guidelines are very useful to support authors to comprehensively present their AI work and to enhance the rigorous evaluation of their work during the peer review process. 
Some of the existing guidelines include MAIC-10 (Must AI Criteria-10), CLAIM (Checklist for Artificial Intelligence in Medical Imaging), STARD-AI (Standards for Reporting of Diagnostic Accuracy Study-AI), MI-CLAIM (Minimum Information about Clinical Artificial Intelligence Modeling), MINIMAR (Minimum Information for Medical AI Reporting), RQS (Radiomics Quality Score), QAMAI (Quality Analysis of Medical Artificial Intelligence), TRIPOD+AI (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis), CONSORT-AI (Consolidated Standards of Reporting Trials–AI), SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials-AI), FUTURE-AI (Fairness Universality Traceability Usability Robustness Explainability-AI), CAIR (Clinical AI Research), DECIDE-AI (Developmental and Exploratory Clinical Investigations of DEcision support systems driven by Artificial Intelligence), CLEAR (CheckList for EvaluAtion of Radiomics research), DOME (Data, Optimization, Model and Evaluation); see discussion in [<span>9-11</span>]. The relevance of each checklist depends on the specific topic and scope of the AI research.</p><p>However, AI researchers feel overwhelmed (and sometimes confused) by so many guidelines and checklists that are not implemented or interpreted in the same way by reviewers, editors and publishers. Hence, to maximise their impact and usefulness, publishers should consider offering easy-to-follow article templates that explicitly specify what one must report in each section of a manuscript in order to meet their guidelines and checklists about AI research. Likewise, similar to existing AI-powered tools for plagiarism detection, image manipulation, and language editing, publishers should join force with AI developers to create AI-powered tools that can automatically flag up submissions that do not conform to specific guidelines and provide constructive feedback to authors on how to improve the reporting of their AI research. Such AI tools can be made accessible to authors before submission to guide them through the process of improving their manuscripts. These tools should be fine-tuned and updated regularly to meet the ever-changing challenges and trends of AI research, thereby ensuring comprehensive and accurate reporting of medical AI research and ultimately improving transparency and replicability in the field.</p><p>The author declares no conflicts of interest.</p>","PeriodicalId":14027,"journal":{"name":"International Journal of Imaging Systems and Technology","volume":"35 2","pages":""},"PeriodicalIF":3.0000,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ima.70047","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Imaging Systems and Technology","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ima.70047","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

In this editorial, I would like to succinctly discuss the potential of using AI to improve the reporting of medical AI research. Several guidelines and checklists have already been published, but how they are interpreted and implemented varies across publishers, editors, reviewers and authors. Here, I discuss the possibility of harnessing generative AI tools to assist authors in comprehensively reporting their AI work and meeting current guidelines, with the ultimate aim of improving transparency and replicability in medical AI research. The succinct discussion below addresses two key issues: (1) AI has a seductive allure that might affect how AI-generated evidence is scrutinised and disseminated, hence the need for comprehensive and transparent reporting, and (2) authors sometimes feel uncertain about what to report in light of the many existing guidelines on reporting AI research and the lack of consensus in the field.

It has been argued that extraneous or irrelevant information with a seductive allure can improve the ratings of scientific explanations [1]. AI, with its overhyped knowledgeability, can convey biases and false information that readers might judge believable [2]. AI can write highly convincing text that can impress or deceive readers, even when it contains errors and false information [3, 4]. Likewise, merely mentioning "AI" in the title of a research paper seems to increase its citation potential [5]. The latter might incentivise scientists to use AI purely to boost the citability of their work, regardless of whether AI improves its quality. In this context, one might speculate that some publications that used AI with flawed methodologies or wrong conclusions have slipped through the cracks of peer review, with many already indexed and citable [6]. Overall, emerging evidence suggests that AI has an intrinsic seductive allure that is shaping the medical research landscape and influencing how readers appraise research articles that employ AI. This is why improving the reporting and evaluation of AI work is of paramount importance, and in this editorial I underscore the potential role of generative AI for that purpose.

Consider this: readers might find a paper entitled "Association between condition X and biomarker Y demonstrated with deep learning" novel and worth reading. Now imagine the same finding evidenced with a traditional analysis method under the title "Association between condition X and biomarker Y demonstrated with a correlation analysis", though it is unlikely that those authors would consider a correlation analysis worth mentioning in the title at all. Although both pieces of work report the same finding, they may not enjoy the same buzz and citability in the field. This is because AI-based methods and traditional analysis methods operate at different maturity levels: readers (and reviewers) are quite familiar with the scope and limitations of a correlation analysis, but the same cannot be said about AI. Having clear guidelines on how to comprehensively report and rigorously evaluate medical AI research is thus extremely important.

No one denies AI's huge potential in medical research, for example in automating the analysis of complex medical data and accelerating the discovery of useful markers. However, AI may discover new data-driven features and disease–marker relationships that do not always align with prior medical knowledge, raising the question of how to reconcile established medical knowledge with AI-generated evidence. Likewise, there is a risk that AI's seductive allure might diminish the critical analysis and scrutiny of AI-generated evidence, thus weakening the rigour of the peer review process in evaluating AI papers. Therefore, when AI is used to enhance the process of scientific discovery, the core principles of scientific methodology, including falsifiability, must be upheld. Yet independently testing and disproving AI-generated evidence remains difficult. For example, does a 2% reduction in accuracy or another performance metric disprove a particular AI method?
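
To make that question concrete, consider how one might quantify whether such a gap is even distinguishable from sampling noise. The sketch below is a minimal, hypothetical example (synthetic per-case outcomes, an assumed 500-case test set) of a paired bootstrap over a shared test set; it is one possible check, not a prescribed standard.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-case correctness (1 = correct) for two models evaluated
# on the SAME 500-case test set; in practice these come from real predictions.
n = 500
model_a = rng.binomial(1, 0.90, size=n)   # ~90% accuracy
model_b = rng.binomial(1, 0.88, size=n)   # ~88% accuracy

observed_gap = model_a.mean() - model_b.mean()

# Paired bootstrap: resample test cases with replacement and recompute the
# accuracy gap, to see how much the gap varies under sampling alone.
gaps = []
for _ in range(10_000):
    idx = rng.integers(0, n, size=n)
    gaps.append(model_a[idx].mean() - model_b[idx].mean())
gaps = np.array(gaps)

ci_low, ci_high = np.percentile(gaps, [2.5, 97.5])
print(f"observed gap: {observed_gap:.3f}")
print(f"95% bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
# If the interval includes 0, a 2% gap alone neither proves nor
# disproves either method on this test set.
```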

Indeed, there is no consensus about the conceptual and methodological frameworks by which AI-generated evidence can be scrutinised and falsified. This is because deploying AI to study a particular question involves several steps that create multiple sources of error or bias that are not always easy to gauge: how data are curated, cleaned, imputed, augmented, divided or aggregated; how relevant features are identified, reduced or combined; and how the AI architecture is built, trained and validated. As AI can generate fabricated articles, including articles with empirical results (see discussion in [4]), frameworks that uphold falsification are paramount in AI research [7]. The recent example of the AI-Generated Science (AIGS) system [8], with AI agents that can independently and autonomously create knowledge, poses significant questions for AI research at many ethical, legal and scientific levels. This is why the authors of AIGS made falsification a core agent of that system, tasked with verifying and scrutinising AI-generated scientific discoveries.
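
One pragmatic way to make those choices auditable is to record every one of them in a machine-readable provenance file published alongside the results. The sketch below is purely illustrative: the field names, dataset identifier and values are assumptions, not an established schema.

```python
import hashlib
import json
from datetime import date

# An illustrative provenance record: each choice listed above (curation,
# imputation, augmentation, splits, features, architecture, training) is
# written down explicitly so reviewers can audit or attempt to re-run it.
pipeline_record = {
    "data": {
        "source": "institutional_mri_cohort_v3",  # hypothetical dataset id
        "n_cases": 1240,
        "exclusions": "motion artefact above threshold; 37 cases removed",
        "imputation": "median, per feature",
        "augmentation": "none",
        "split": {"train": 0.7, "val": 0.15, "test": 0.15, "seed": 1234},
    },
    "features": {"selection": "LASSO, alpha=0.01", "n_selected": 18},
    "model": {"architecture": "ResNet-18", "training_seed": 1234,
              "epochs": 50, "optimizer": "Adam, lr=1e-4"},
    "date": date.today().isoformat(),
}

# A short fingerprint lets readers verify the record was not edited later.
blob = json.dumps(pipeline_record, sort_keys=True).encode()
pipeline_record["fingerprint"] = hashlib.sha256(blob).hexdigest()[:12]

with open("pipeline_record.json", "w") as f:
    json.dump(pipeline_record, f, indent=2)
```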

To minimise the risk of a proliferation of flawed or fabricated AI research that could harm clinical practice, many guidelines and checklists for improving the reporting of medical AI research have been proposed. Such reporting guidelines are very useful for supporting authors in comprehensively presenting their AI work and for enhancing the rigorous evaluation of that work during peer review. Existing guidelines include (see discussion in [9-11]):

- MAIC-10 (Must AI Criteria-10)
- CLAIM (Checklist for Artificial Intelligence in Medical Imaging)
- STARD-AI (Standards for Reporting of Diagnostic Accuracy Study-AI)
- MI-CLAIM (Minimum Information about Clinical Artificial Intelligence Modeling)
- MINIMAR (Minimum Information for Medical AI Reporting)
- RQS (Radiomics Quality Score)
- QAMAI (Quality Analysis of Medical Artificial Intelligence)
- TRIPOD+AI (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis)
- CONSORT-AI (Consolidated Standards of Reporting Trials-AI)
- SPIRIT-AI (Standard Protocol Items: Recommendations for Interventional Trials-AI)
- FUTURE-AI (Fairness, Universality, Traceability, Usability, Robustness, Explainability-AI)
- CAIR (Clinical AI Research)
- DECIDE-AI (Developmental and Exploratory Clinical Investigations of DEcision support systems driven by Artificial Intelligence)
- CLEAR (CheckList for EvaluAtion of Radiomics research)
- DOME (Data, Optimization, Model and Evaluation)

The relevance of each checklist depends on the specific topic and scope of the AI research.
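
To illustrate how such checklists could be made machine-actionable rather than left as prose, the toy sketch below encodes a few generic reporting themes and reports which items a manuscript has addressed. The item identifiers and wording are mine, paraphrased from themes common to guidelines such as CLAIM and MINIMAR; they are not official checklist items.

```python
# A toy machine-readable rendering of a reporting checklist.
CHECKLIST = {
    "data_source": "Describe data provenance and inclusion/exclusion criteria",
    "ground_truth": "State how reference labels were established",
    "partitions": "Report train/validation/test splits and leakage checks",
    "metrics": "Justify performance metrics and report uncertainty (e.g., CIs)",
    "code_availability": "State whether code and model weights are available",
}

def report_completeness(addressed: set[str]) -> None:
    """Print which checklist items a manuscript has and has not addressed."""
    for item_id, description in CHECKLIST.items():
        status = "OK     " if item_id in addressed else "MISSING"
        print(f"[{status}] {item_id}: {description}")

# Example: a manuscript that documented its data and metrics but not its splits.
report_completeness({"data_source", "ground_truth", "metrics"})
```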

However, AI researchers can feel overwhelmed (and sometimes confused) by so many guidelines and checklists, which are not implemented or interpreted in the same way by reviewers, editors and publishers. Hence, to maximise their impact and usefulness, publishers should consider offering easy-to-follow article templates that explicitly specify what one must report in each section of a manuscript to meet their guidelines and checklists for AI research. Likewise, similar to existing AI-powered tools for plagiarism detection, image-manipulation detection and language editing, publishers should join forces with AI developers to create AI-powered tools that automatically flag submissions that do not conform to specific guidelines and provide constructive feedback to authors on how to improve the reporting of their AI research; a toy sketch of such a screening heuristic follows below. Such tools could be made accessible to authors before submission to guide them through the process of improving their manuscripts. They should be fine-tuned and updated regularly to meet the ever-changing challenges and trends of AI research, thereby ensuring comprehensive and accurate reporting of medical AI research and ultimately improving transparency and replicability in the field.
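
As a rough illustration of what such a pre-submission screening tool might look like at its very simplest, the sketch below applies keyword heuristics to a manuscript's text and flags reporting elements it cannot find. A real tool would need far more sophisticated language understanding; the patterns and required elements here are assumptions for illustration only.

```python
import re

# Regex heuristics for a few reporting elements; illustrative, not exhaustive.
REQUIRED_PATTERNS = {
    "sample size": r"\b(n\s*=\s*\d+|\d+\s+(patients|subjects|cases))\b",
    "data split": r"\b(train(ing)?[/ ](validation|test)|cross-?validation)\b",
    "random seed": r"\b(random\s+seed|seed\s*=\s*\d+)\b",
    "confidence interval": r"\b(95%\s*CI|confidence\s+interval)\b",
}

def screen_manuscript(text: str) -> list[str]:
    """Return the reporting elements the text does not appear to mention."""
    return [name for name, pattern in REQUIRED_PATTERNS.items()
            if not re.search(pattern, text, flags=re.IGNORECASE)]

sample = ("We trained a CNN on 840 patients (n = 840) with 5-fold "
          "cross-validation and report accuracy with 95% CI.")
print(screen_manuscript(sample))   # -> ['random seed']
```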

The author declares no conflicts of interest.
