Combining the Variational and Deep Learning Techniques for Classification of Video Capsule Endoscopic Images.

Journal of imaging informatics in medicine Pub Date : 2025-01-03 DOI:10.1007/s10278-024-01352-y

Bhavana Singh, Pushpendra Kumar, Shailendra Kumar Jain

Abstract

Gastrointestinal tract-related cancers pose a significant health burden, with high mortality rates. Video capsule endoscopy is used to detect anomalies of the gastrointestinal tract that may progress to cancer. The number of video capsule endoscopic (VCE) images produced per examination is enormous and requires hours of analysis by clinicians. There is therefore a pressing need for automated computer-aided lesion classification techniques. Computer-aided systems use deep learning (DL) techniques because they can potentially improve anomaly detection rates. However, most DL techniques in the literature classify static frames, which exploits only the spatial information of the image. In addition, they perform only binary classification. The presented work therefore proposes a framework for multi-class classification of VCE images that uses the dynamic information of the images. The proposed algorithm combines a fractional order variational model with a DL model. The fractional order variational model captures the dynamic information of VCE images by estimating optical flow color maps, which are then fed to the DL model for training. The DL model performs the multi-class classification task and localizes the region of interest with the maximum class score. The DL model is inspired by the Faster RCNN approach, and its backbone architecture is EfficientNet B0. The proposed framework achieves an average AUC of 0.98, an mAP of 0.93, and a balanced accuracy of 0.878. Hence, the proposed model is effective for VCE image classification and for detecting the region of interest.
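
The abstract does not say how a flow field is turned into a color map, so the following is only a minimal sketch of one common convention: flow direction is mapped to hue and flow magnitude to saturation, so that still pixels appear white. It assumes the flow components have already been estimated by some method and does not implement the fractional order variational model. The names `flow_to_color` and `max_mag` are hypothetical and do not come from the paper.

```python
import colorsys
import math


def flow_to_color(u, v, max_mag=None):
    """Encode a dense flow field as an RGB image.

    u, v: 2-D nested lists with the horizontal and vertical displacement
    of each pixel (assumed to be precomputed by a flow estimator).
    Returns a nested list of (R, G, B) tuples with values in 0..255.
    Hue encodes direction; saturation encodes magnitude (white = no motion).
    """
    rows, cols = len(u), len(u[0])
    mags = [[math.hypot(u[r][c], v[r][c]) for c in range(cols)]
            for r in range(rows)]
    if max_mag is None:
        max_mag = max(max(row) for row in mags)
    # Guard against division by zero when the whole field is static.
    scale = max_mag if max_mag > 0 else 1.0

    image = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            angle = math.atan2(v[r][c], u[r][c])  # radians in [-pi, pi]
            hue = ((angle + math.pi) / (2 * math.pi)) % 1.0
            sat = min(mags[r][c] / scale, 1.0)
            red, green, blue = colorsys.hsv_to_rgb(hue, sat, 1.0)
            out_row.append((round(red * 255),
                            round(green * 255),
                            round(blue * 255)))
        image.append(out_row)
    return image


# Tiny 2x2 example: one static pixel, one moving right, one moving left.
u = [[0.0, 1.0], [-1.0, 0.0]]
v = [[0.0, 0.0], [0.0, 0.0]]
color_map = flow_to_color(u, v)
```

In a pipeline like the one described, an image of this kind, rather than the raw frame, would be the input to the detector. Passing a fixed `max_mag` keeps the color scale comparable across frames, which is a design choice of this sketch and not something the abstract specifies.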
