
International Journal of Imaging Systems and Technology: Latest Articles

FDT-Net: Frequency-Aware Dual-Branch Transformer-Based Optic Cup and Optic Disk Segmentation With Parallel Contour Information Mining and Uncertainty-Guided Refinement
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-21 | DOI: 10.1002/ima.23199
Jierui Gan, Hongqing Zhu, Tianwei Qian, Jiahao Liu, Ning Chen, Ziying Wang

Accurate segmentation of the optic cup and optic disc in fundus images is crucial for the prevention and diagnosis of glaucoma. However, structures such as blood vessels interfere with segmentation, and mainstream networks often have limited capacity to extract contour information. In this paper, we propose a segmentation framework named FDT-Net, which is based on a frequency-aware dual-branch Transformer (FDBT) architecture with parallel contour information mining and uncertainty-guided refinement. Specifically, we design an FDBT that operates in the frequency domain. This module leverages the inherent contextual awareness of Transformers and uses the discrete cosine transform (DCT) to mitigate the impact of certain interference factors on segmentation. The FDBT comprises a global branch and a local branch that extract global and local information independently, thereby improving segmentation results. To mine additional contour information, we develop parallel contour information mining (PCIM) modules that run in parallel. These modules capture more detail from the edges of the optic cup and disc while avoiding mutual interference, which improves segmentation in contour regions. Furthermore, we propose an uncertainty-guided refinement (UGR) module, which, based on subjective logic theory, generates and quantifies an uncertainty mass and uses it to improve model performance. Experimental results on two publicly available datasets demonstrate the superior performance of the proposed FDT-Net. The code for this project is available at https://github.com/Rookie49144/FDT-Net.
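
The uncertainty mass mentioned for the UGR module can be illustrated with the standard subjective-logic formulation commonly used in evidential deep learning. The sketch below is not taken from the FDT-Net repository. It assumes non-negative per-class evidence e_k, Dirichlet parameters alpha_k = e_k + 1, belief b_k = e_k / S, and uncertainty u = K / S with S = sum(alpha_k); the function name `opinion_from_evidence` is hypothetical.

```python
import numpy as np

def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence to subjective-logic belief and uncertainty.

    evidence: array of shape (..., K), e.g. one K-vector per pixel (assumed layout).
    Returns (belief, uncertainty) such that belief.sum(-1) + uncertainty == 1.
    """
    evidence = np.asarray(evidence, dtype=float)
    num_classes = evidence.shape[-1]
    alpha = evidence + 1.0                         # Dirichlet parameters
    strength = alpha.sum(axis=-1, keepdims=True)   # Dirichlet strength S
    belief = evidence / strength                   # b_k = e_k / S
    uncertainty = num_classes / strength           # u = K / S
    return belief, uncertainty[..., 0]

# Strong evidence for one class gives low uncertainty; no evidence gives u = 1.
confident_belief, confident_u = opinion_from_evidence([18.0, 0.0])
unsure_belief, unsure_u = opinion_from_evidence([0.0, 0.0])
```

With these inputs, `confident_u` is 0.1 and `unsure_u` is 1.0. A refinement stage could, for example, focus on pixels whose uncertainty is high; how FDT-Net actually uses the mass is described only in the paper.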

Volume 34, Issue 6 | Citations: 0
M-Net: A Skin Cancer Classification With Improved Convolutional Neural Network Based on the Enhanced Gray Wolf Optimization Algorithm
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-19 | DOI: 10.1002/ima.23202
Zhinan Xu, Xiaoxia Zhang, Luzhou Liu

Skin cancer is a common malignant tumor that causes tens of thousands of deaths each year, making early detection essential for better treatment outcomes. However, the similar visual characteristics of skin lesions make it difficult to differentiate accurately between lesion types. With advances in deep learning, researchers have increasingly turned to convolutional neural networks for skin cancer detection and classification. This article proposes M-Net, an improved skin cancer classification model, and combines it with an enhanced gray wolf optimization algorithm to improve classification performance. The gray wolf optimization algorithm guides the wolf pack toward the prey through a multileader structure and converges gradually through an encirclement and pursuit mechanism, so that the search becomes more fine-grained in later stages. To further improve gray wolf optimization, this study introduces simulated annealing to avoid becoming trapped in local optima and expands the search range by improving the search mechanism, thus strengthening the algorithm's global optimization ability. M-Net extracts features of skin lesions and tunes its parameters with the enhanced gray wolf optimization algorithm, which significantly improves classification accuracy. Experiments on the ISIC 2018 dataset show that the model's feature extraction network is significantly more accurate than the baseline model. M-Net reaches an accuracy of 0.891, a precision of 0.857, a recall of 0.895, and an F1 score of 0.872. In addition, its modular design allows the feature extraction and classification modules to be adjusted for different classification tasks, which makes the model scalable and broadly applicable. Overall, the proposed model performs well in classifying skin lesions, has broad prospects for clinical application, and can support the diagnosis of skin diseases.
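
The baseline gray wolf optimization loop described in the abstract can be sketched as follows. This is a minimal, generic version of the standard algorithm, not the enhanced variant from the paper: the simulated annealing step and the improved search mechanism are not reproduced, and the toy `sphere` objective stands in for whatever loss M-Net actually tunes. All function and parameter names are hypothetical.

```python
import random

def sphere(x):
    """Toy objective standing in for a validation loss: sum of squares."""
    return sum(v * v for v in x)

def gray_wolf_optimize(objective, dim, bounds, n_wolves=20, n_iters=200, seed=0):
    """Plain gray wolf optimization; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for it in range(n_iters):
        # The three best wolves (alpha, beta, delta) lead the pack this iteration.
        leaders = [list(w) for w in sorted(wolves, key=objective)[:3]]
        a = 2.0 * (1.0 - it / n_iters)  # decreases linearly from 2 toward 0
        for w in wolves:
            for d in range(dim):
                pulled = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a   # |A| > 1 explores, |A| < 1 encircles
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - w[d])
                    pulled += leader[d] - A * distance
                # Move to the average of the three leader-guided positions, kept in bounds.
                w[d] = min(hi, max(lo, pulled / 3.0))
    best = min(wolves, key=objective)
    return best, objective(best)

best_position, best_value = gray_wolf_optimize(sphere, dim=5, bounds=(-10.0, 10.0))
```

Because the coefficient `a` shrinks over time, early iterations explore widely and later ones search more finely around the leaders, which matches the "more detailed search in the later stage" behavior the abstract describes.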

Volume 34, Issue 6 | Citations: 0
Medical Image Fusion for Multiple Diseases Features Enhancement
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-17 | DOI: 10.1002/ima.23197
Sajid Ullah Khan, Meshal Alharbi, Sajid Shah, Mohammed ELAffendi

Over the past 20 years, medical imaging has been used extensively in clinical diagnosis. Doctors may find it difficult to diagnose diseases using only one imaging modality. The main objective of multimodal medical image fusion (MMIF) is to improve the accuracy and quality of clinical assessments by extracting structural and spectral information from source images. This study proposes a novel MMIF method to assist doctors and to support subsequent operations such as image segmentation, classification, and further surgical procedures. First, the intensity-hue-saturation (IHS) model is used to decompose the positron emission tomography (PET)/single photon emission computed tomography (SPECT) image, followed by a hue-angle mapping method that discriminates high- and low-activity regions in the PET images. A proposed structure feature adjustment (SFA) mechanism then serves as the fusion strategy for high- and low-activity regions, recovering structural and anatomical details with minimal color distortion. Second, a new multi-discriminator generative adversarial network (MDcGAN) approach is proposed to obtain the final fused image. Qualitative and quantitative results demonstrate that the proposed method is superior to existing MMIF methods in preserving the structural, anatomical, and functional details of the PET/SPECT images. Our assessment, which combines visual analysis with subsequent verification using statistical metrics, shows that color changes contribute substantial visual information to the fusion of PET and MR images. Quantitatively, the proposed algorithm outperformed other methods in the majority of cases and achieved the second-highest results in the remaining few. The validity of the proposed method was confirmed using diverse modalities, encompassing a total of 1012 image pairs.
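
To make the IHS decomposition step concrete, here is a minimal sketch of classic intensity-substitution fusion, in which the intensity of a pseudo-color functional image is replaced by a registered grayscale anatomical image. It is a generic textbook technique, not the paper's hue-angle mapping, SFA mechanism, or MDcGAN. It assumes the linear intensity definition I = (R + G + B) / 3 and inputs scaled to [0, 1]; the function name is hypothetical.

```python
import numpy as np

def ihs_substitution_fusion(functional_rgb, anatomical_gray):
    """Swap the intensity of a pseudo-color image for a grayscale image.

    functional_rgb: float array (H, W, 3) in [0, 1], e.g. a PET/SPECT image.
    anatomical_gray: float array (H, W) in [0, 1], e.g. a registered MR slice.
    """
    rgb = np.asarray(functional_rgb, dtype=float)
    gray = np.asarray(anatomical_gray, dtype=float)
    intensity = rgb.mean(axis=-1)                   # I = (R + G + B) / 3
    # Adding the same offset to every channel changes intensity only;
    # differences between channels (the color information) are untouched.
    fused = rgb + (gray - intensity)[..., None]
    return np.clip(fused, 0.0, 1.0)

pet = np.array([[[0.6, 0.3, 0.3], [0.2, 0.4, 0.6]]])   # shape (1, 2, 3)
mr = np.array([[0.5, 0.3]])                             # shape (1, 2)
fused = ihs_substitution_fusion(pet, mr)
```

In this example no value is clipped, so the fused image has exactly the MR intensity at each pixel while keeping the PET channel differences. Plain substitution like this is known to distort color when the two intensities differ strongly, which is the kind of distortion the abstract says the SFA mechanism is designed to minimize.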

Volume 34, Issue 6 | Citations: 0
Optimization and Application Analysis of Phase Correction Method Based on Improved Image Registration in Ultrasonic Image Detection
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-11 | DOI: 10.1002/ima.23185
Nannan Lu, Hongyan Shu

Ultrasound technology is essential for assessing physiological data and tissue morphology in order to prevent and detect a wide range of disorders, including those of the brain, thoracic, digestive, urogenital, and cardiovascular systems. Its capacity to deliver real-time, high-frequency scans makes it a convenient, non-invasive diagnostic tool. However, patient movement and probe jitter caused by human error can introduce a large amount of interference, resulting in inaccurate test findings. Image registration techniques can help locate and eliminate unwanted interference while preserving crucial data. Although there has been research on improving these techniques in MATLAB, there are no specialized systems for interference removal, and the procedure is still time-consuming, particularly when working with large numbers of ultrasound images. The phase correlation technique, which converts images into the frequency domain and makes noise suppression easier, is one of the most efficient algorithms now in use because it is robust to noise. Nevertheless, little research has been done on using this technique to identify displacement in ultrasound images of blood vessel walls. To address these gaps, this work presents an image registration system that uses the phase correlation algorithm. In addition to interference removal, the system provides rotation and zoom registration, image translation, and displacement detection of the vessel wall. Batch processing is also included to make registering multiple ultrasound images more efficient. Through efficient interference management and streamlined registration, this method offers a workable way to improve the precision and efficacy of ultrasonic diagnostics.
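
The core of phase correlation, estimating a translation between two frames from the phase of their cross-power spectrum, can be sketched in a few lines. This is a generic NumPy illustration rather than the authors' system: it handles integer, circular translations only, and the rotation and zoom registration mentioned in the abstract are not covered. The function name is hypothetical.

```python
import numpy as np

def phase_correlation_shift(reference, moved):
    """Estimate the integer (row, col) translation that maps `reference` onto `moved`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12    # discard magnitude, keep phase only
    correlation = np.fft.ifft2(cross_power).real  # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative (wrapped-around) shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, correlation.shape))

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
shifted = np.roll(frame, shift=(5, -3), axis=(0, 1))   # simulated probe displacement
estimated_shift = phase_correlation_shift(frame, shifted)
```

Here `estimated_shift` is `(5, -3)`, the displacement applied by `np.roll`. Normalizing the cross-power spectrum to unit magnitude is what makes the peak sharp and comparatively insensitive to intensity changes and noise.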

Volume 34, Issue 6 | Open access: https://onlinelibrary.wiley.com/doi/epdf/10.1002/ima.23185 | Citations: 0
Feature Pyramid Network Based Spatial Attention and Cross-Level Semantic Similarity for Diseases Segmentation From Capsule Endoscopy Images
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-07 | DOI: 10.1002/ima.23194
Said Charfi, Mohamed EL Ansari, Lahcen Koutti, Ilyas ELjaafari, Ayoub ELLahyani

Wireless capsule endoscopy (WCE) is an emerging technology that uses a pill-sized camera to capture images of the digestive tract. It has several advantages over standard endoscopy: it is far less invasive, does not require sedation, and carries fewer possible complications. Hence, it might be used as an alternative to the standard procedure. WCE is used to diagnose a variety of gastrointestinal diseases such as polyps, ulcers, Crohn's disease, and hemorrhages. Nevertheless, the WCE video produced by a single examination may consist of thousands of frames per patient that must be reviewed by medical specialists. In addition, the capsule's free movement and technological limits result in low-quality images. An automatic tool based on artificial intelligence might therefore be very helpful. Moreover, most state-of-the-art works target image classification (normal/abnormal) and ignore disease segmentation. This work therefore presents a novel method, based on the Feature Pyramid Network model, for disease segmentation from WCE images. The model employs modules that optimize and combine features. Specifically, semantic and spatial features are mutually compensated by spatial attention and cross-level global feature fusion modules. On the MICCAI 2017 dataset, the proposed method achieves a testing F1-score of 94.149% and a mean intersection over union of 89.414%. On the KID Atlas dataset, it achieves a testing F1-score of 94.557% and a mean intersection over union of 90.416%. On the MICCAI 2017 dataset, the mean intersection over union is 20.414%, 18.484%, 11.444%, and 8.794% higher than that of existing approaches. On the KID Atlas dataset, the proposed scheme surpasses the methods used for comparison by 29.986% and 9.416% in mean intersection over union. These results indicate that the proposed approach is promising for disease segmentation from WCE images.
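
For readers unfamiliar with the two metrics reported above, the following sketch computes a class-averaged F1-score and mean intersection over union from a predicted and a ground-truth label mask. The averaging convention (unweighted mean over classes, skipping classes absent from both masks) is an assumption; the abstract does not state how the paper averages its scores. The function name is hypothetical.

```python
import numpy as np

def f1_and_miou(pred, target, num_classes=2):
    """Return (mean F1, mean IoU) over classes for two integer label masks."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious, f1s = [], []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (target == c))   # pixels correctly labeled c
        fp = np.sum((pred == c) & (target != c))   # pixels wrongly labeled c
        fn = np.sum((pred != c) & (target == c))   # pixels of class c that were missed
        if tp + fp + fn == 0:
            continue                               # class absent from both masks
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return float(np.mean(f1s)), float(np.mean(ious))

predicted_mask = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
mean_f1, mean_iou = f1_and_miou(predicted_mask, ground_truth)
```

For this 2x2 example the background class has IoU 2/3 and F1 0.8, the lesion class has IoU 0.5 and F1 2/3, so `mean_iou` is 7/12 and `mean_f1` is 11/15. F1 is never lower than IoU for the same class, which is consistent with the reported F1-scores exceeding the mean IoU values.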

Volume 34, Issue 6 | Citations: 0
A Multispectral Blood Smear Background Images Reconstruction for Malaria Unstained Images Normalization
IF 3 | Zone 4 (Computer Science) | Q2 Engineering, Electrical & Electronic | Pub Date: 2024-10-04 | DOI: 10.1002/ima.23182
Solange Doumun OULAI, Sophie Dabo-Niang, Jérémie Zoueu

Multispectral and multimodal unstained blood smear images are acquired and evaluated to provide computer-assisted, automated diagnostic evidence for malaria. However, these images suffer from uneven lighting, contrast variability, and local luminosity effects caused by the acquisition system. This limitation significantly affects the diagnostic process and its overall outcomes. To overcome it, the acquired multispectral images must be normalized as a preprocessing step for malaria parasite detection. In this study, we propose a novel normalization method that aims to improve the accuracy and reliability of the diagnostic process. The method estimates the Bright reference image, which captures the luminosity, and the contrast variability function from the background region of the image. This is achieved through two distinct resampling methodologies: Gaussian random field simulation by variogram analysis, and Bootstrap resampling. A method based on outlier imputation is also proposed to handle intensity saturation in certain pixels. Both proposed normalization approaches are shown to outperform existing methods on multispectral and multimodal unstained blood smear images, as measured by the Structural Similarity Index Measure (SSIM), Mean Squared Error (MSE), Zero-mean Sum of Absolute Differences (ZSAD), Peak Signal-to-Noise Ratio (PSNR), and Absolute Mean Brightness Error (AMBE). These methods not only improve image contrast but also preserve the spectral footprint and natural appearance of the image more accurately. The normalization technique based on Bootstrap resampling reduces the acquisition time for multimodal and multispectral images by 66%. Moreover, the processing time for Bootstrap resampling is less than 4% of that required for Gaussian random field simulation.
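
The idea of estimating a bright reference from background pixels by Bootstrap resampling can be sketched in a heavily simplified form. The paper reconstructs a full Bright reference image and a contrast variability function; the sketch below assumes only a single scalar bright level per spectral band and a plain division for normalization, and it uses synthetic background intensities. All names and numbers are hypothetical.

```python
import numpy as np

def bootstrap_bright_level(background_pixels, n_resamples=200, seed=0):
    """Estimate a scalar bright level as the mean of bootstrap-resampled means."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(background_pixels, dtype=float).ravel()
    resampled_means = [
        rng.choice(pixels, size=pixels.size, replace=True).mean()  # sample with replacement
        for _ in range(n_resamples)
    ]
    return float(np.mean(resampled_means))

def normalize_by_bright_level(image, bright_level):
    """Divide by the bright level so that background maps to roughly 1.0."""
    return np.asarray(image, dtype=float) / bright_level

# Synthetic background intensities standing in for cell-free regions of a smear.
data_rng = np.random.default_rng(1)
background = data_rng.normal(loc=200.0, scale=5.0, size=1000)
bright_level = bootstrap_bright_level(background)
normalized_background = normalize_by_bright_level(background, bright_level)
```

The spread of the resampled means also gives an uncertainty estimate for the bright level at no extra acquisition cost, which is one plausible reason a resampling approach can replace additional reference captures.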

多光谱和多模态未染色血涂片图像的获取和评估可为疟疾提供计算机辅助自动诊断证据。然而,由于采集系统的原因,这些图像存在光照不均、对比度变化和局部亮度等问题。这一局限性严重影响了诊断过程及其整体结果。为了克服这一局限性,必须对获取的多光谱图像进行归一化处理,作为检测疟原虫的预处理步骤。在本研究中,我们提出了一种实现归一化的新方法,旨在提高诊断过程的准确性和可靠性。这种方法的基础是估算明亮参考图像,它能捕捉图像背景区域的亮度和对比度变化函数。这是通过两种不同的重采样方法实现的,即通过变异图分析进行高斯随机场模拟和 Bootstrap 重采样。此外,还提出了一种处理某些像素强度饱和问题的方法,其中涉及离群值估算。在多光谱和多模态未染色血涂片图像上,这两种拟议的图像归一化方法都证明优于现有方法,具体测量指标包括结构相似性指数测量(SSIM)、平均平方误差(MSE)、绝对差值零均值和(ZSAD)、峰值信噪比(PSNR)和绝对平均亮度误差(AMBE)。这些方法不仅能提高图像对比度,还能更准确地保留图像的光谱足迹和自然外观。采用 Bootstrap 重采样的归一化技术可将多模态和多光谱图像的采集时间大幅缩短 66%。此外,Bootstrap 重采样的处理时间不到高斯随机场模拟处理时间的 4%。
Citations: 0
Enhanced Lung Cancer Diagnosis and Staging With HRNeT: A Deep Learning Approach
IF 3 | Zone 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-04 | DOI: 10.1002/ima.23193
N. Rathan, S. Lokesh

The healthcare industry has been significantly impacted by the widespread adoption of advanced technologies such as deep learning (DL) and artificial intelligence (AI). Among various applications, computer-aided diagnosis has become a critical tool for enhancing medical practice. In this research, we introduce a hybrid approach that combines a deep neural model, data collection, and classification methods for CT scans. This approach aims to detect and classify the severity of pulmonary disease and the stages of lung cancer. Our proposed lung cancer detector and stage classifier (LCDSC) demonstrates improved performance, achieving higher accuracy, sensitivity, specificity, recall, and precision. We employ an active contour model for lung cancer segmentation and a high-resolution network (HRNet) for stage classification. The methodology is validated on the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) benchmark dataset. The results show an accuracy of 98.4% in classifying lung cancer stages. Our approach presents a promising solution for early lung cancer diagnosis, potentially leading to improved patient outcomes.
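The abstract lists accuracy, sensitivity, specificity, recall, and precision. A minimal sketch of how these follow from a binary confusion matrix is given below, mainly to show that sensitivity and recall are the same quantity. The function name and the counts are hypothetical; stage classification is multi-class, so in practice such values would be computed per class (for example one-vs-rest), which the abstract does not detail.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts for one positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # identical to recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Hypothetical counts, chosen only for illustration.
m = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```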

Citations: 0
MFH-Net: A Hybrid CNN-Transformer Network Based Multi-Scale Fusion for Medical Image Segmentation
IF 3 | Zone 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-10-02 | DOI: 10.1002/ima.23192
Ying Wang, Meng Zhang, Jian'an Liang, Meiyan Liang

In recent years, U-Net and its variants have gained widespread use in medical image segmentation. One key aspect of U-Net's design is the skip connection, which helps retain detailed information and leads to finer segmentation results. However, existing research often concentrates on enhancing either the encoder or the decoder while neglecting the semantic gap between them, which results in suboptimal model performance. In response, we introduce a Multi-Scale Fusion module aimed at enhancing the original skip connections and addressing the semantic gap. Our approach fully incorporates the correlation between outputs from adjacent encoder layers and facilitates bidirectional information exchange across multiple layers. Additionally, we introduce a Channel Relation Perception module to guide the fused multi-scale information toward an efficient connection with decoder features. Together, these two modules bridge the semantic gap by capturing spatial and channel dependencies in the features, contributing to accurate medical image segmentation. Building upon these innovations, we propose a novel network called MFH-Net. We perform a comprehensive evaluation of the network on three publicly available datasets: ISIC2016, ISIC2017, and Kvasir-SEG. The experimental results show that MFH-Net achieves higher segmentation accuracy than competing methods. Importantly, the modules we have devised can be seamlessly incorporated into various networks, such as U-Net and its variants, offering a potential avenue for further improving model performance.

Citations: 0
RetNet30: A Novel Stacked Convolution Neural Network Model for Automated Retinal Disease Diagnosis
IF 3 | Zone 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-09-25 | DOI: 10.1002/ima.23187
Krishnakumar Subramaniam, Archana Naganathan

Automated diagnosis of retinal diseases holds significant promise in enhancing healthcare efficiency and patient outcomes. However, existing methods often lack the accuracy and efficiency required for timely disease detection. To address this gap, we introduce RetNet30, a novel stacked convolutional neural network (CNN) designed to revolutionize automated retinal disease diagnosis. RetNet30 combines a custom-built 30-layer CNN with a fine-tuned Inception V3 model, integrating these sub-models through logistic regression to achieve superior classification performance. Extensive evaluations on retinal image datasets such as DRIVE, STARE, CHASE_DB1, and HRF demonstrate significant improvements in accuracy, sensitivity, specificity, and area under the ROC curve (AUROC) when compared to conventional approaches. By leveraging advanced deep learning architectures, RetNet30 not only enhances diagnostic precision but also generalizes effectively across diverse datasets, establishing a new benchmark in retinal disease classification. This novel approach offers a highly efficient and reliable solution for early disease detection and patient management, addressing the limitations of manual examination methods. Through rigorous quantitative and qualitative assessments, our proposed method demonstrates its potential to significantly impact medical image analysis and improve healthcare outcomes. RetNet30 marks a major step forward in automated retinal disease diagnosis, showcasing the future of AI-driven advancements in ophthalmology.

Citations: 0
Cross-Layer Connection SegFormer Attention U-Net for Efficient TRUS Image Segmentation
IF 3 | Zone 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-09-24 | DOI: 10.1002/ima.23178
Yongtao Shi, Wei Du, Chao Gao, Xinzhi Li

Accurately and rapidly segmenting the prostate in transrectal ultrasound (TRUS) images remains challenging due to the complex semantic information in ultrasound images. This paper presents a cross-layer connection SegFormer attention U-Net for efficient TRUS image segmentation. The SegFormer framework is enhanced by reducing model parameters and complexity without sacrificing accuracy. We introduce layer-skipping connections for precise positioning and combine local context with global dependency for improved feature recognition. The decoder is improved with a Multi-layer Perceptual Convolutional Block Attention Module (MCBAM) for better upsampling and reduced information loss, leading to increased accuracy. The experimental results show that, compared with classic and popular deep learning methods, this method achieves better segmentation performance, with a Dice similarity coefficient (DSC) of 97.55% and an intersection over union (IoU) of 95.23%. The approach balances encoder efficiency, multi-layer information flow, and parameter reduction.
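The two scores reported in the abstract, DSC and IoU, can be sketched for a pair of binary masks as follows. This is an illustrative example under stated assumptions rather than the authors' evaluation code: the function name, the flat 0/1 list representation, the convention for two empty masks, and the toy masks are all hypothetical.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized flat lists of 0/1 mask labels."""
    intersection = sum(p & t for p, t in zip(pred, target))
    pred_area, target_area = sum(pred), sum(target)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: treated here as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou

# Toy masks (hypothetical): two of three foreground pixels overlap.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU). Plugging in the reported IoU of 95.23% gives roughly 97.56%, close to the reported DSC of 97.55%; the match need not be exact when scores are averaged over many images.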

Citations: 0