Monitoring performance of clinical artificial intelligence in health care: a scoping review.

JBI evidence synthesis (IF 1.5, JCR Q3, Health Care Sciences & Services). Pub Date: 2024-12-01. DOI: 10.11124/JBIES-24-00042
Eline Sandvig Andersen, Johan Baden Birk-Korch, Rasmus Søgaard Hansen, Line Haugaard Fly, Richard Röttger, Diana Maria Cespedes Arcani, Claus Lohman Brasen, Ivan Brandslund, Jonna Skov Madsen
JBI evidence synthesis 22(12):2423-2446. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630661/pdf/

Abstract

Monitoring performance of clinical artificial intelligence in health care: a scoping review.

Objective: The objective of this review was to provide an overview of the diverse methods described, tested, or implemented for monitoring performance of clinical artificial intelligence (AI) systems, while also summarizing the arguments given for or against these methods.

Introduction: The integration of AI in clinical decision-making is steadily growing. Performances of AI systems evolve over time, necessitating ongoing performance monitoring. However, the evidence on specific monitoring methods is sparse and heterogeneous. Thus, an overview of the evidence on this topic is warranted to guide further research on clinical AI monitoring.

Inclusion criteria: We included publications detailing metrics or statistical processes employed in systematic, continuous, or repeated initiatives aimed at evaluating or predicting the clinical performance of AI models with direct implications for patient management in health care. No limitations on language or publication date were enforced.

Methods: We performed systematic database searches in MEDLINE (Ovid), Embase (Ovid), Scopus, and ProQuest Dissertations and Theses Global, supplemented by backward and forward citation searches and gray literature searches. Two or more independent reviewers conducted title and abstract screening, full-text evaluation, and data extraction using a tool developed by the authors. During extraction, the methods identified were divided into subcategories. The results are presented narratively and summarized in tables and graphs.

Results: Thirty-nine sources of evidence were included in the review, with the most abundant source types being opinion papers/narrative reviews (33%) and simulation studies (33%). One guideline on the topic was identified, offering limited guidance on specific metrics and statistical methods. The number of sources included increased year by year, with almost 4 times as many sources included in 2023 compared with 2019. The most commonly reported performance metrics were traditional metrics from the medical literature, including area under the receiver operating characteristics curve (AUROC), sensitivity, specificity, and predictive values, although few arguments were given supporting these choices. Some studies reported on metrics and statistical processing specifically designed to monitor clinical AI.
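As a rough illustration of the traditional metrics named in the Results, the sketch below computes sensitivity, specificity, predictive values, and AUROC separately for each monitoring period, so that a drop from one period to the next becomes visible. It is not a method taken from the review. The function names (`auroc`, `period_metrics`, `monitor`), the 0.5 decision threshold, and the grouping of cases into named periods are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass
class PeriodMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    auroc: float


def _ratio(num: int, den: int) -> float:
    # A metric is undefined when its denominator is empty; report NaN.
    return num / den if den else float("nan")


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    # Probability that a random positive case scores higher than a random
    # negative case, with ties counted as one half (Mann-Whitney form).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def period_metrics(labels: Sequence[int], scores: Sequence[float],
                   threshold: float = 0.5) -> PeriodMetrics:
    # Hypothetical rule: a score at or above the threshold is a positive call.
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    return PeriodMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
        auroc=auroc(labels, scores),
    )


def monitor(periods: Dict[str, Tuple[Sequence[int], Sequence[float]]],
            threshold: float = 0.5) -> Dict[str, PeriodMetrics]:
    # One metrics record per monitoring period (e.g. per month).
    return {name: period_metrics(y, s, threshold)
            for name, (y, s) in periods.items()}


if __name__ == "__main__":
    # Made-up toy data, for illustration only.
    toy = {"period-1": ([1, 1, 0, 0, 1, 0], [0.9, 0.4, 0.6, 0.2, 0.8, 0.1])}
    for name, m in monitor(toy).items():
        print(name, m)
```

A design note on the sketch: computing metrics per period rather than cumulatively keeps a late decline from being diluted by earlier good performance. In practice it requires enough labeled cases in each period for the estimates to be stable.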

Conclusion: This review provides a summary of the methods described for monitoring AI in health care. It reveals a relative scarcity of evidence and guidance for specific practical implementation of performance monitoring of clinical AI. This underscores the imperative for further research, discussion, and guidance regarding the specifics of implementing monitoring for clinical AI. The steady increase in the number of relevant sources published per year suggests that this area of research is gaining increased focus, and the amount of evidence and guidance available will likely increase significantly over the coming years.

Review registration: Open Science Framework https://osf.io/afkrn.

Source journal: JBI evidence synthesis
Subject area: Nursing-Nursing (all)
CiteScore: 4.50
Self-citation rate: 3.70%
Number of articles published: 218