Addressing the reporting chasm of artificial intelligence research: the DECIDE-AI reporting guidelines.

BMJ Surgery Interventions Health Technologies (IF 1.6, Q2, Surgery). Pub Date: 2022-07-29. eCollection Date: 2022-01-01. DOI: 10.1136/bmjsit-2022-000154

John Gerrard Hanrahan, Danyal Zaman Khan, Hani J Marcus

Abstract

© Author(s) (or their employer(s)) 2022. Reuse permitted under CC BY-NC. No commercial reuse. See rights and permissions. Published by BMJ.

Editorial. The meteoric rise of artificial intelligence (AI) to the forefront of healthcare innovation has opened an array of avenues for surgical researchers to pursue. With applications found throughout the surgical patient pathway, AI offers new support systems for clinical decision-making. Indeed, a growing number of technologies are entering clinical practice, and a recent review of randomised controlled trials of diagnostic prediction tools suggests the potential benefits of AI that contemporary healthcare stands to realise. However, the pathway by which these technologies are translated to the bedside is variable. As a recent editorial aptly captured, there are clear examples of AI technologies already approved for clinical use in the USA, both with and without evaluation through randomised controlled trials. This speaks to a wider problem of evaluation in AI innovation, where insufficient reporting in randomised controlled trials prompted the development of several reporting guidelines. Examples include the Consolidated Standards of Reporting Trials-AI (CONSORT-AI) and Standard Protocol Items: Recommendations for Interventional Trials-AI (SPIRIT-AI) guidelines, which advise the minimum reporting standards for clinical trials and protocols, respectively. Similarly, guidance has been developed for the initial stages of AI development, namely the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD-AI) guidelines for machine learning (ML) prediction models.

Yet, when one looks at the process of AI translation, from in silico to clinical trial, an evaluation chasm becomes obvious: guidance is lacking on studies reflecting stages 2a and 2b of the IDEAL (Idea, Development, Exploration, Assessment, Long-term study) collaborative. These stages reflect the refinement of an intervention and preparation for larger clinical studies, and are influenced by operator factors, including learning curves and training; by the health system the technologies enter; and by organisational factors such as integration into clinical workflows. Study design features such as patient selection for both training and testing an intervention, and even the AI model itself, are crucial factors to consider prior to large-scale testing. Vasey and colleagues identified a gap in the reporting guidelines for evaluating AI-driven decision support systems and produced reporting guidelines to support the early stages of their evaluation. This was achieved through an international, two-round modified Delphi consensus process, which produced a reporting guideline of 17 AI-specific items and 10 generic items (DECIDE-AI) to inform the reporting of early-stage clinical studies of AI-based decision support systems in healthcare.

The systems perspective taken by Vasey et al frames AI decision-support systems as complex interventions. This perspective makes clear the importance of understanding the workflow or clinical process that interventions are intended to enter, alongside the setting in which the AI is evaluated. Reporting these details shows how setting-specific, or even system-specific, the evaluation in a given trial is, which may be important when judging the efficacy of an intervention applied to the same clinical problem in other health systems or settings. Furthermore, the emulation of human factors appraisal from aviation and the military is another strength of the DECIDE-AI guidelines, particularly as the augmentative nature of AI decision-support systems relies on human-computer interactions. It is evident in surgery, for example, that surgeons' learning curves influence clinical outcomes, meaning that complex interventions, including AI-based tools, must account for this during ...
