Anders Lenskjold, Janus Uhd Nybing, Charlotte Trampedach, Astrid Galsgaard, Mathias Willadsen Brejnebøl, Henriette Raaschou, Martin Høyer Rose, Mikael Boesen
BJR Open 5(1): 20220053 (2023). DOI: 10.1259/bjro.20220053. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10301708/pdf/
Should artificial intelligence have lower acceptable error rates than humans?
In a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark, the very first patient was misclassified in the diagnostic conclusion according to a local clinical expert's opinion. In preparation for the evaluation of the AI algorithm, the implementation team had collaborated with internal and external partners to plan workflows, and the algorithm had been externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed a significantly lower acceptable error rate for AI (6.8%) than for humans (11.3%). A general mistrust of AI could explain this discrepancy in acceptable errors. Compared with human co-workers, AI may suffer from limited social capital and likeability, and therefore less potential for forgiveness. Future AI development and implementation will require further investigation of the fear of AI's unknown errors if AI is to be trusted as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations and to ensure acceptable performance.
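To make the threshold comparison concrete, the sketch below shows one way a monitoring team might check an algorithm's observed error rate against acceptability thresholds like those from the survey. It uses a one-sided exact binomial test; the case counts are hypothetical and the 5% significance level is an assumption, not something specified in the article.

```python
from math import comb

def binom_tail_p(errors: int, n: int, p0: float) -> float:
    """One-sided exact binomial test: P(X >= errors | n, p0).

    A small p-value means the observed number of errors would be
    unlikely if the true error rate were p0 or lower, i.e. evidence
    that the algorithm's error rate exceeds the threshold p0.
    """
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(errors, n + 1))

# Hypothetical monitoring data: 9 expert-disagreed diagnoses in 100 cases.
errors, n = 9, 100

# Thresholds taken from the survey in the text: 6.8% for AI, 11.3% for humans.
for label, threshold in [("AI (6.8%)", 0.068), ("human (11.3%)", 0.113)]:
    p = binom_tail_p(errors, n, threshold)
    verdict = "exceeds" if p < 0.05 else "is consistent with"
    print(f"{label}: P(X >= {errors}) = {p:.3f} -> observed rate {verdict} threshold")
```

Note the asymmetry this produces in practice: the same 9% observed error rate can be statistically compatible with the human threshold yet sit uncomfortably close to the stricter AI threshold, which is exactly the double standard the survey result suggests.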