Improving the Real-Time Classification of Disease Severity in Ulcerative Colitis: Artificial Intelligence as the Trigger for a Second Opinion.

American Journal of Gastroenterology (IF 7.6; JCR Q1, Gastroenterology & Hepatology; Zone 1, Medicine)
Pub Date: 2026-02-01; Epub Date: 2025-02-28; Pages: 396-403; DOI: 10.14309/ajg.0000000000003382

Bobby Lo, Bjørn Møller, Christian Igel, Signe Wildt, Ida Vind, Flemming Bendtsen, Johan Burisch, Bulat Ibragimov

Abstract

Introduction: Endoscopic classification of ulcerative colitis (UC) shows high interobserver variation. Previous research demonstrated that artificial intelligence (AI) can match the accuracy of central reading in scoring still images. We now extend this assessment to longer colon segments and integrate AI into clinical workflows, evaluating its use for real-time, video-based classification of disease severity, and as a support system for physicians.

Methods: We trained a convolutional neural network with the Mayo Endoscopic Subscores (MESs) of 2,561 images and 53 videos from 645 patients. The model differentiated scorable from unscorable endoscopy sections through open-set recognition. Validation involved 140 video clips from 44 patients with UC. Six inflammatory bowel disease (IBD) experts and 16 nonexperts rated these videos, with expert scores as the gold standard. We assessed the model's performance and the value as a supporting system. Last, the model underwent an alpha test on a real-world patient as a real-time endoscopic support.
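The abstract does not say how the open-set recognition step was implemented. As a rough illustration only, one common way to separate scorable from unscorable frames is to reject any frame whose top class probability falls below a confidence threshold. The function names, the four-class MES layout, and the threshold of 0.6 below are all assumptions made for this sketch, not details from the study.

```python
import math

# Hypothetical label set: Mayo Endoscopic Subscore 0 to 3.
MES_CLASSES = (0, 1, 2, 3)


def softmax(logits):
    """Convert raw network outputs into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_frame(logits, threshold=0.6):
    """Return an MES class, or None when the frame is treated as unscorable.

    The threshold is an illustrative value, not one reported in the paper.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # low confidence: treat the frame as outside the known classes
    return MES_CLASSES[best]
```

A frame with one clearly dominant logit receives a score, while a frame with nearly flat logits is rejected rather than forced into one of the four classes.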

Results: The model achieved an accuracy of 82%, with no significant differences between the experts and the AI. When used as a supporting system, it improved non-IBD experts' performance by 12% and disagreed with the primary physician in 20%-39% of cases. During the alpha test, it was successfully integrated into clinical practice, accurately distinguishing between MES 0 and MES 1, consistent with endoscopists' assessments.
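The idea of using the AI as a trigger for a second opinion can be sketched as a simple comparison between the physician's score and the model's score. This is a minimal sketch under stated assumptions: the function names are hypothetical, unscorable segments are represented as None, and the abstract does not describe how disagreements were actually handled or counted in the study.

```python
def needs_second_opinion(physician_mes, model_mes):
    """Flag a case for review when the model's MES differs from the physician's.

    A model score of None (segment treated as unscorable) never raises a flag.
    """
    if model_mes is None:
        return False
    return physician_mes != model_mes


def disagreement_rate(physician_scores, model_scores):
    """Share of scorable cases in which physician and model disagree."""
    pairs = [(p, m) for p, m in zip(physician_scores, model_scores) if m is not None]
    if not pairs:
        return 0.0
    return sum(p != m for p, m in pairs) / len(pairs)
```

For example, with made-up scores `[0, 1, 2, 3, 1]` from a physician and `[0, 1, 3, 3, None]` from the model, one of the four scorable cases disagrees, so a second opinion would be requested for that case only.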

Discussion: Our innovative AI model shows significant potential for enhancing the accuracy of UC severity classification and improving the proficiency of non-IBD experts. It is designed for clinical use and has proven feasible in real-world testing.

Source journal

American Journal of Gastroenterology (Medicine: Gastroenterology & Hepatology)

CiteScore: 11.40
Self-citation rate: 5.10%
Articles published: 458
Review time: 12 months

Journal description: Published on behalf of the American College of Gastroenterology (ACG), The American Journal of Gastroenterology (AJG) stands as the foremost clinical journal in the fields of gastroenterology and hepatology. AJG offers practical and professional support to clinicians addressing the most prevalent gastroenterological disorders in patients.