Unmasking colorectal cancer: A high-performance semantic network for polyp and surgical instrument segmentation
Journal: Engineering Applications of Artificial Intelligence (Impact Factor 7.5; JCR Q1, Automation & Control Systems)
Publication date: 2024-09-13
Publication type: Journal Article
DOI: 10.1016/j.engappai.2024.109292
URL: https://www.sciencedirect.com/science/article/pii/S0952197624014507
Citation count: 0
Abstract
Colorectal cancer (CRC) remains a significant health concern, and colonoscopy is the gold standard for its diagnosis. Accurate segmentation of polyps in colonoscopy images is crucial for detecting polyps and preventing CRC. However, varying polyp sizes, blurred edges, and uneven brightness hinder segmentation accuracy. Artificial intelligence (AI) and robot-assisted surgery mechanisms can aid surgeons and physicians in detecting and treating polyps. To address these challenges, we propose the Colorectal Network (CR-Net), an AI-based encoder-decoder network for precise polyp and surgical instrument segmentation. CR-Net incorporates a pre-trained Visual Geometry Group model with 16 convolution layers (VGG16), attention mechanisms, redesigned skip connections, and horizontal dense connections within a U-Net architecture. The VGG16 encoder captures robust visual features, while the redesigned skip connections accommodate complex data dimensions, leading to enhanced segmentation outcomes. Horizontal dense connections transfer overlooked features from the encoder to subsequent layers, further improving segmentation accuracy. Additionally, a spatial attention block enhances spatial features and ensures compatibility during upsampling. CR-Net was evaluated on four datasets: the Kvasir segmentation (Kvasir-SEG) dataset, the Computer Vision Center Clinic Database (CVC-ClinicDB), the Kvasir-Instrument dataset, and the University of Washington Sinus Surgery Live (UW-Sinus-Surgery-Live) dataset. It achieved Dice Similarity Coefficients of 96.21%, 96.54%, 96.32%, and 92.84%, respectively, surpassing previous methods. These results highlight CR-Net's potential to support healthcare professionals through advanced AI-driven engineering applications. By bridging AI techniques with engineering innovations, CR-Net represents a significant advancement in CRC diagnosis and treatment.
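For readers unfamiliar with the metric quoted in the abstract, the Dice Similarity Coefficient between a predicted mask A and a ground-truth mask B is commonly defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition for binary masks, not the authors' evaluation code. The function name `dice_coefficient`, the `eps` smoothing term, and the toy masks are illustrative assumptions that do not come from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]],
                     eps: float = 1e-7) -> float:
    """Dice Similarity Coefficient between two binary masks given as rows of 0/1.

    Illustrative helper (hypothetical name); eps is an assumed smoothing term.
    """
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")

    # Flatten both masks and coerce every pixel to 0 or 1.
    p_flat = [int(bool(v)) for row in pred for v in row]
    t_flat = [int(bool(v)) for row in target for v in row]

    # |A ∩ B|: pixels marked as foreground in both masks.
    intersection = sum(p * t for p, t in zip(p_flat, t_flat))
    # |A| + |B|: total foreground pixels across the two masks.
    total = sum(p_flat) + sum(t_flat)

    # eps keeps the ratio defined when both masks are empty (result is 1.0).
    return (2.0 * intersection + eps) / (total + eps)


# Toy 2x3 masks: 2 overlapping foreground pixels, 3 foreground pixels in each mask.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
score = dice_coefficient(predicted, ground_truth)
print(f"Dice: {score:.4f}")
```

For the toy masks the overlap is 2 pixels and each mask has 3 foreground pixels, so the score is 2·2 / (3 + 3) ≈ 0.667. A reported value such as 96.21% corresponds to a score of about 0.962 on this scale, typically averaged over the test images.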
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, with remarkable advances across a range of machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research on the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI applied to real-world engineering problems, validated on publicly available datasets so that the results can be replicated. Join us in exploring the transformative potential of AI in engineering.