
International Journal of Imaging Systems and Technology: Latest Publications

Comprehensive Experimentation of Pretrained Models on Slice-Based Classification of Interstitial Lung Disease Patterns
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-30 | DOI: 10.1002/ima.70232
Hakan Buyukpatpat, Ebru Akcapinar Sezer, Mehmet Serdar Guzel

Interstitial Lung Diseases (ILD) are typically progressive diseases characterized by poor prognosis due to the inflammation and fibrosis affecting lung tissue. ILD is diagnosed through the identification of specific patterns, or combinations of patterns, that occur in various regions of the lung. This study employs High-Resolution Computed Tomography (HRCT) scans from the MedGIFT database to classify the patterns causing ILD on a slice basis. To achieve this, pretrained models and a base Convolutional Neural Network (CNN) are used to provide a slice-based classification of ILD patterns into five, six, and seven classes. Four pretrained models, namely VGG, DenseNet, MobileNet, and EfficientNet, are employed, and the performance impact of two training strategies, transfer learning and fine-tuning, is also evaluated. The effects of four different input resolution types on classification performance are investigated. The features extracted from the pretrained models and the base CNN are classified using a fully connected Artificial Neural Network classifier. Classification performance is further examined using two data augmentation methods for the most successful model and input resolution types. With the EfficientNetB0 pretrained model, five-, six-, and seven-class F-scores of 98.070%, 90.819%, and 87.781% are obtained, respectively. Additionally, the computational costs and time complexity of all model combinations are analyzed, and their characteristics are comparatively discussed. This study contributes to the limited body of research on slice-based classification and advances clinical practice by facilitating the automatic detection of patterns on HRCT slices as a preprocessing step. Furthermore, the MedGIFT database is systematically analyzed in terms of slice and Region of Interest counts across pattern types, offering meaningful insights to support and guide its use in future research.
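A minimal sketch of the transfer-learning setup this abstract describes: a frozen EfficientNetB0 backbone feeding a fully connected classifier head. The input resolution, head width, and optimizer are illustrative assumptions, not values reported by the paper.

```python
# Hedged sketch: slice-based ILD pattern classification with a frozen
# EfficientNetB0 feature extractor and a fully connected classifier head.
# NUM_CLASSES=5 matches the five-class setting; input size is assumed.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5               # six- and seven-class settings are also reported
INPUT_SHAPE = (224, 224, 3)   # assumed input resolution

base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=INPUT_SHAPE)
base.trainable = False        # transfer learning; set True to fine-tune instead

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),  # ANN classifier head (assumed width)
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Switching `base.trainable` to `True` (optionally with a lower learning rate) reproduces the fine-tuning strategy the abstract contrasts with plain transfer learning.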

Citations: 0
Non-Invasive Diabetes Detection Through Human Breath Using Hybrid Octave-CenterNet Neural Network With DenseNet-77 Model
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-29 | DOI: 10.1002/ima.70237
R. Meena, S. Vinu, J. Omana

Diabetes Mellitus (DM), including Type 1 and Type 2, is a metabolic disorder caused by defects in insulin secretion or action. Non-invasive detection is increasingly important because invasive methods often yield limited data and reduced accuracy, leading to poorer machine learning performance. This research proposes a new Octave-CenterNet with DenseNet-77 framework for efficient detection and classification of diabetes from Volatile Organic Compounds (VOCs). The method combines a rapid discrete curvelet transform with wrapping to capture prominent features quickly, uses octave convolution to preserve high- and low-frequency patterns and enrich representations, employs CenterNet to detect acetone as a major biomarker, and leverages DenseNet-77 for gradient-efficient classification. Willow sled catkin optimization adaptively fine-tunes hyperparameters to further enhance performance. The model effectively distinguishes healthy individuals from diabetic patients and differentiates between Type 1 and Type 2 diabetes. Experimental results demonstrate excellent performance with 98.7% accuracy, 98% precision, 99.7% recall, and a 99.34% F1 score, validating its robustness. Overall, this end-to-end, noise-resistant, and computationally efficient framework offers a technically advanced and practical solution for non-invasive diabetes detection.
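Octave convolution has a well-known generic form: feature maps are split into a full-resolution high-frequency branch and a half-resolution low-frequency branch that exchange information at each layer. The sketch below is a standard minimal version of that idea, not the paper's exact Octave-CenterNet block; `alpha` and the kernel size are assumptions.

```python
# Hedged sketch of a generic octave convolution: high- and low-frequency
# feature maps are convolved and cross-exchanged; alpha sets the fraction
# of channels assigned to the low-frequency (half-resolution) branch.
import tensorflow as tf
from tensorflow.keras import layers

def octave_conv(x_high, x_low, filters, alpha=0.5, kernel=3):
    f_low = int(filters * alpha)
    f_high = filters - f_low
    # high -> high, and high -> low via average-pool downsampling
    h2h = layers.Conv2D(f_high, kernel, padding="same")(x_high)
    h2l = layers.Conv2D(f_low, kernel, padding="same")(
        layers.AveragePooling2D(2)(x_high))
    # low -> low, and low -> high via nearest-neighbor upsampling
    l2l = layers.Conv2D(f_low, kernel, padding="same")(x_low)
    l2h = layers.UpSampling2D(2)(
        layers.Conv2D(f_high, kernel, padding="same")(x_low))
    return h2h + l2h, h2l + l2l

xh = tf.random.normal((1, 32, 32, 16))  # high-frequency map (placeholder)
xl = tf.random.normal((1, 16, 16, 16))  # low-frequency map at half resolution
yh, yl = octave_conv(xh, xl, filters=32)
print(yh.shape, yl.shape)               # (1, 32, 32, 16) (1, 16, 16, 16)
```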

Citations: 0
Multimodal Radiomics and Deep Learning Integration for Bone Health Assessment in Postmenopausal Women via Dental Radiographs: Development of an Interpretable Nomogram
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70239
Zhengxia Hu, Xiaodong Wang, Hai Lan

This study develops and validates a multimodal machine learning model for opportunistic osteoporosis screening in postmenopausal women using dental periapical radiographs. This retrospective multicenter study analyzed 3885 periapical radiographs paired with DEXA-derived T-scores from postmenopausal women. Clinical, handcrafted radiomic, and deep features were extracted, resulting in a fused feature set. Radiomic features (n = 215) followed Image Biomarker Standardization Initiative (IBSI) guidelines, and deep features (n = 128) were derived from a novel attention-based autoencoder. Feature harmonization used ComBat adjustment; reliability was ensured by intra-class correlation coefficient (ICC) filtering (ICC ≥ 0.80). Dimensionality was reduced via Pearson correlation and LASSO regression. Four classifiers (logistic regression, random forest, multilayer perceptron, and XGBoost) were trained and evaluated across stratified training, internal, and external test sets. A logistic regression model was selected for clinical translation and nomogram development. Decision curve analysis assessed clinical utility. XGBoost achieved the highest classification performance using the fused feature set, with an internal AUC of 94.6% and an external AUC of 93.7%. Logistic regression maintained strong performance (external AUC = 91.3%) and facilitated nomogram construction. Deep and radiomic features independently outperformed clinical-only models, confirming their predictive strength. SHAP analysis identified DEXA T-score, age, vitamin D, and selected radiomic/deep features as key contributors. Calibration curves and the Hosmer–Lemeshow test (p = 0.492) confirmed model reliability. Decision curve analysis showed meaningful net clinical benefit across decision thresholds. Dental periapical radiographs can be leveraged for accurate, non-invasive osteoporosis screening in postmenopausal women. The proposed model demonstrates high accuracy, generalizability, and interpretability, offering a scalable solution for integration into dental practice.
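A hedged sketch of the correlation-pruning, LASSO-selection, and logistic regression steps described above, on synthetic placeholder data. The correlation cutoff (|r| > 0.95), the data shapes, and the synthetic label construction are assumptions; ComBat harmonization and ICC filtering are omitted for brevity.

```python
# Hedged sketch: Pearson-correlation pruning, LASSO feature selection,
# then a logistic regression classifier on the surviving features.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 343))  # 215 radiomic + 128 deep features (placeholder)
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # synthetic label

# Drop the later feature of any highly correlated pair (|r| > 0.95, assumed)
corr = np.triu(np.abs(np.corrcoef(X, rowvar=False)), k=1)
keep = [j for j in range(X.shape[1]) if not (corr[:, j] > 0.95).any()]
X = X[:, keep]

# LASSO keeps a sparse subset: features with nonzero coefficients survive
lasso = make_pipeline(StandardScaler(), LassoCV(cv=5)).fit(X, y)
selected = np.flatnonzero(lasso[-1].coef_)

clf = LogisticRegression(max_iter=1000).fit(X[:, selected], y)
print(f"{selected.size} features selected; "
      f"train acc = {clf.score(X[:, selected], y):.3f}")
```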

Citations: 0
Feature Reconstruction-Guided Multi-Scale Attention Network for Non-Significant Lung Nodule Detection
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70235
Huiqing Xu, Wei Li, Junfang Tu, Lvchen Cao

Lung cancer remains the leading cause of cancer incidence and mortality worldwide. Early detection of lung nodules is crucial for significantly reducing the risk of lung cancer. However, due to the high similarity in CT image features between lung nodules and surrounding normal tissue, nodules are often missed or misidentified during detection. Moreover, the diverse types and morphologies of nodules further complicate the development of a unified detection approach. To address these challenges, this study proposes a novel Feature Reconstruction-guided Multi-Scale Attention Network (FRMANet). Specifically, a Refined Feature Reconstruction Module is designed to effectively suppress redundant information while preserving essential feature representations of nodules, ensuring high sensitivity and enhanced representation capability for nodule regions during feature extraction. Additionally, a Multi-scale Feature Enhancement Attention mechanism is introduced, which utilizes an attention-based fusion strategy across multiple scales to fully capture discriminative features of nodules with varying sizes and shapes. Experimental results on the LUNA16 dataset demonstrate that the proposed FRMANet achieves superior detection performance, with a mAP of 0.894 and an F1 score of 0.923, outperforming existing state-of-the-art methods.
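The multi-scale attention-fusion idea, combining feature maps of different scales under a learned attention gate, can be sketched generically as below. This is a squeeze-and-excitation-style approximation under stated assumptions, not the actual FRMANet module, whose design is not detailed in this listing.

```python
# Hedged sketch: fuse feature maps from different scales by resizing them to
# a common spatial size, concatenating, and reweighting channels with a
# learned sigmoid attention gate (squeeze-and-excitation style).
import tensorflow as tf
from tensorflow.keras import layers

def multi_scale_attention_fusion(features):
    target = tf.shape(features[0])[1:3]                 # spatial size of map 0
    resized = [tf.image.resize(f, target) for f in features]
    x = layers.Concatenate()(resized)                   # stack along channels
    w = layers.GlobalAveragePooling2D()(x)              # squeeze
    w = layers.Dense(x.shape[-1] // 4, activation="relu")(w)
    w = layers.Dense(x.shape[-1], activation="sigmoid")(w)  # excite
    return x * w[:, None, None, :]                      # channel reweighting

f1 = tf.random.normal((1, 64, 64, 32))  # fine-scale feature map (placeholder)
f2 = tf.random.normal((1, 32, 32, 64))  # coarse-scale feature map (placeholder)
print(multi_scale_attention_fusion([f1, f2]).shape)  # (1, 64, 64, 96)
```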

Citations: 0
Radiomic Feature-Based Prediction of Primary Cancer Origins in Brain Metastases Using Machine Learning
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-28 | DOI: 10.1002/ima.70234
Dilek Betül Sarıdede, Sevim Cengiz

Identifying the primary tumor origin is a critical factor in determining treatment strategies for brain metastases, which remain a major challenge in clinical practice. Traditional diagnostic methods rely on invasive procedures, which may be limited by sampling errors. In this study, a dataset of 200 patients with brain metastases originating from six different cancer types (breast, gastrointestinal, small cell lung, melanoma, non-small cell lung, and renal cell carcinoma) was included. Radiomic features were extracted from different magnetic resonance imaging (MRI) sequences and selected using the Kruskal–Wallis test, correlation analysis, and ElasticNet regression. Machine learning models, including support vector machine, logistic regression, and random forest, were trained and evaluated using cross-validation and unseen test sets to predict the primary origins of metastatic brain tumors. Our results demonstrate that radiomic features can significantly enhance classification accuracy, with AUC values reaching 0.98 in distinguishing between specific cancer types. Additionally, survival analysis revealed significant differences in survival probabilities across primary tumor types. This study utilizes a larger, single-center cohort and a standardized MRI protocol, applying rigorous feature selection and multiple machine learning classifiers to enhance the robustness and clinical relevance of radiomic predictions. Our findings support the potential of radiomics as a non-invasive tool for metastatic tumor prediction and prognostic assessment, paving the way for improved personalized treatment strategies. Radiomic features extracted from MR images can significantly enhance prediction of the primary origin of metastatic tumor types in the brain, thereby informing treatment decisions and prognostic assessments.
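A compact sketch of the screening-then-classify steps named above: Kruskal–Wallis filtering followed by a cross-validated random forest. The p-value cutoff, placeholder data, and forest size are assumptions; the correlation-analysis and ElasticNet steps are omitted for brevity.

```python
# Hedged sketch: Kruskal-Wallis feature screening across the six primary-tumor
# classes, then a random-forest classifier evaluated with 5-fold CV.
import numpy as np
from scipy.stats import kruskal
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 100))   # radiomic features (placeholder)
y = rng.integers(0, 6, size=200)  # six primary-cancer classes (placeholder)

# Keep features whose distributions differ across classes (p < 0.05, assumed)
pvals = np.array([kruskal(*[X[y == c, j] for c in range(6)]).pvalue
                  for j in range(X.shape[1])])
X_sel = X[:, pvals < 0.05]

scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0), X_sel, y, cv=5)
print(f"{X_sel.shape[1]} features kept; CV accuracy = {scores.mean():.3f}")
```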

Citations: 0
ViTCXRResNet: Harnessing Explainable Artificial Intelligence in Medical Imaging—Chest X-Ray-Based Patients Demographic Prediction
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-27 | DOI: 10.1002/ima.70233
Sugirdha Ranganathan, Kirubhasini Srinivasan, Sriramakrishnan Pathmanaban, Kalaiselvi Thiruvenkadam

Patient demographic prediction involves estimating age, gender, ethnicity, and other personal characteristics from X-rays. This can support personalized medicine and improve healthcare outcomes. It can assist in automated diagnosis of diseases that exhibit age- and gender-specific prevalence, and it can help forensic science identify individuals when demographic information is missing. Insights from deep learning can verify the gender and age of self-reported individuals through chest X-rays (CXRs). In this work, we deploy an artificial intelligence (AI) enabled model that focuses on two tasks: gender classification and age prediction from CXRs. For gender classification, the model combines ResNet-50 (CNN) and a Vision Transformer (ViT) to leverage both local feature extraction and global contextual understanding, and is called ViTCXRResNet. The model was trained and validated on the Amazon Web Services (SPR) dataset of 10,702 images, split in an 80:20 ratio, and evaluated with standard classification metrics. For age prediction, features extracted from ResNet-50 were reduced via principal component analysis (PCA), and a fully connected feedforward neural network was trained on the reduced feature set to predict age. The classification and regression models achieve 93.46% accuracy for gender classification and an R² score of 0.86 for age prediction on the SPR dataset. For visual interpretation, explainable AI (Gradient-weighted Class Activation Mapping) was used to visualize which parts of the image the model prioritizes when classifying gender. The proposed model yields high classification accuracy in gender detection and strong accuracy in age prediction, competitive with existing methods. Further, the model's demographic prediction stability was demonstrated on two datasets from different populations, the Japanese Society of Radiological Technology (JSRT) and Montgomery (USA) datasets.
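The age-prediction branch, ResNet-50 features reduced with PCA and regressed by a small feedforward network, can be sketched with scikit-learn as below. Feature dimensionality, the PCA component count, and layer widths are assumptions, and the features here are synthetic placeholders rather than real backbone outputs.

```python
# Hedged sketch: PCA dimensionality reduction over pooled ResNet-50 features,
# then a fully connected feedforward regressor for age prediction.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
feats = rng.normal(size=(500, 2048))   # pooled ResNet-50 features (placeholder)
ages = 40 + feats[:, 0] * 5 + rng.normal(size=500)  # synthetic age target

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=128),             # assumed component count
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0),
)
model.fit(feats, ages)
print(f"train R^2 = {model.score(feats, ages):.3f}")
```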

Citations: 0
DenseNet201SA++: Enhanced Melanoma Recognition in Dermoscopy Images via Soft Attention Guided Feature Learning
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-24 | DOI: 10.1002/ima.70236
Shuangshuang Hu, Xiaomei Xu

As the first line of defense in the human immune system, the skin is highly susceptible to environmental toxins. Melanoma, the most lethal type of skin cancer, is characterized by high mortality and a strong tendency for metastasis. It can sometimes originate from pre-existing nevi, particularly dysplastic nevi. Early identification is crucial for improving patient survival rates. However, traditional skin lesion detection faces challenges due to image quality limitations, dataset imperfections, and the complexity of lesion features. This study proposes the DenseNet201SA++ model, which uses image augmentation techniques and a soft attention mechanism to optimize dermoscopy image quality and automatically capture critical features. Experiments on the HAM10000 dataset of 10,015 dermoscopic images, focusing on binary classification (melanoma vs. nevus), show that the DenseNet201SA++ model achieves significant performance gains, with improvements in precision, recall, F1-score, and accuracy of at least 7.2%, 14.7%, 12.7%, and 14.7% over baseline networks. The proposed soft attention-guided feature fusion in DenseNet201SA++ addresses feature redundancy in traditional attention mechanisms, achieving superior performance in distinguishing melanoma from nevus, while the DenseNet201 backbone shows distinct advantages. Ablation studies confirm the significant role of data augmentation. The integrated DenseNet201SA++ model achieves robust results, with precision, recall, F1-score, and accuracy all reaching 0.983, complemented by an AUC of 0.993. These metrics demonstrate the model's exceptional balance between discriminative power and generalization capability, validating the effectiveness of the proposed architecture.
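A minimal soft-attention block in the spirit described above: spatial attention maps are softmax-normalized over image positions and used to reweight, and residually boost, the backbone features. The head count, the residual scale `gamma`, and the DenseNet201 top-feature shape are assumptions, not the paper's exact configuration.

```python
# Hedged sketch of a soft-attention block: a 1x1 convolution produces several
# spatial attention maps, softmax over all positions normalizes each map,
# and the aggregated map reweights the feature tensor with a residual add.
import tensorflow as tf
from tensorflow.keras import layers

def soft_attention(x, heads=4, gamma=0.5):
    h, w = x.shape[1], x.shape[2]
    maps = layers.Conv2D(heads, 1)(x)                   # (B, H, W, heads)
    maps = tf.reshape(maps, (-1, h * w, heads))
    maps = tf.nn.softmax(maps, axis=1)                  # normalize over space
    maps = tf.reshape(maps, (-1, h, w, heads))
    attn = tf.reduce_sum(maps, axis=-1, keepdims=True)  # aggregate heads
    return x + gamma * (x * attn)                       # residual soft attention

feat = tf.random.normal((2, 7, 7, 1920))  # DenseNet201 top features (assumed)
print(soft_attention(feat).shape)          # (2, 7, 7, 1920)
```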

Citations: 0
Machine Learning Framework for Classification of COVID-19 Variants Using K-mer Based DNA Sequencing
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-22 | DOI: 10.1002/ima.70231
Sunil Kumar, Sanjay Raju, Biswajit Bhowmik

Accurate classification of viral DNA sequences is essential for tracking mutations, understanding viral evolution, and enabling timely public health responses. Traditional alignment-based methods are often computationally intensive and less effective for highly mutating viruses. This article presents a machine learning framework for classifying DNA sequences of COVID-19 variants using K-mer-based tokenization and vectorization techniques inspired by Natural Language Processing (NLP). DNA sequences corresponding to Alpha, Beta, Gamma, and Omicron variants are obtained from the Global Initiative on Sharing All Influenza Data (GISAID) database and encoded into feature vectors. Multiple classifiers, including Extra Trees, Random Forest, Support Vector Classifier (SVC), Decision Tree, Logistic Regression, Naive Bayes, K-Nearest Neighbor (KNN), Ridge Classifier, Stochastic Gradient Descent (SGD), and XGBoost, are evaluated based on accuracy, precision, recall, and F1-score. The Extra Trees model achieved the highest accuracy of 93.10% ± 0.42, followed by Random Forest with 92.60% ± 0.38, both demonstrating robust and balanced performance. Statistical significance tests confirmed the robustness of the results. The results validate the effectiveness of K-mer-based encoding combined with traditional machine learning models in classifying COVID-19 variants, offering a scalable and efficient solution for genomic surveillance.
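The K-mer tokenization plus bag-of-words vectorization step translates directly into a few lines with scikit-learn; the sketch below pairs it with an Extra Trees classifier, the study's best-performing model. The value of k and the toy sequences and labels are assumptions for illustration only.

```python
# Hedged sketch: K-mer tokenization of DNA sequences, bag-of-words counts,
# and an Extra Trees classifier over the resulting feature vectors.
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import CountVectorizer

def kmers(seq, k=6):
    """Slide a window of length k over the sequence and join the tokens."""
    return " ".join(seq[i:i + k] for i in range(len(seq) - k + 1))

seqs = ["ATGCGTACGTTAGC", "ATGCGTACGAAGGC",
        "TTGCCTACGTTAGC", "TTGCCTACGAAGGC"]      # toy sequences (placeholder)
labels = ["Alpha", "Alpha", "Omicron", "Omicron"]  # toy variant labels

X = CountVectorizer().fit_transform(kmers(s) for s in seqs)
clf = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, labels)
print(clf.predict(X))
```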

Citations: 0
M3IF-(SWT-TVC): Multi-Modal Medical Image Fusion via Weighted Energy, Contrast in the SWT Domain, and Total Variation Minimization With Chambolle's Algorithm
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-18 | DOI: 10.1002/ima.70222
Prabhishek Singh, Manoj Diwakar

Multi-modal medical image fusion (M3IF) combines the required and important information from different medical imaging modalities (computed tomography [CT], magnetic resonance imaging [MRI], positron emission tomography [PET], and single photon emission computed tomography [SPECT]) into a single informative image, supporting enhanced patient diagnosis and precise treatment planning. This paper proposes a hybrid M3IF method in which input medical images are decomposed using the stationary wavelet transform (SWT) into low-frequency components (LFCs) and high-frequency components (HFCs). The LFCs and HFCs are fused using energy- and contrast-based metrics, and reconstruction is then performed using the inverse SWT (ISWT). Total variation minimization (TVM) using Chambolle's algorithm is applied as a post-refinement operation to reduce noise while preserving fine details. The proposed methodology is termed M3IF-(SWT-TVC), where the acronym TVC denotes TVM with Chambolle's algorithm. The TVM refinement process is iterative, with the fusion outcomes of M3IF-(SWT-TVC) assessed over a predefined 100 iterations. TVM and SWT are blended to balance smoothness and structural detail. The final fusion results obtained through M3IF-(SWT-TVC) are evaluated against several prominent non-traditional methods. Based on both visual quality and quantitative metric analysis, M3IF-(SWT-TVC) outperforms all the methods used for comparison.
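A hedged sketch of the decompose-fuse-reconstruct-refine pipeline using PyWavelets and scikit-image. The fusion rules here (per-pixel max energy for LFCs, max absolute value for HFCs) are simplified stand-ins for the paper's weighted energy and contrast metrics, and the wavelet choice and TV weight are assumptions.

```python
# Hedged sketch: SWT decomposition of two modalities, coefficient-level
# fusion, inverse SWT reconstruction, and Chambolle TV post-refinement.
import numpy as np
import pywt
from skimage.restoration import denoise_tv_chambolle

def fuse_swt_tv(img_a, img_b, wavelet="db1", level=1, tv_weight=0.05):
    ca = pywt.swt2(img_a, wavelet, level=level)
    cb = pywt.swt2(img_b, wavelet, level=level)
    fused = []
    for (aA, aH), (bA, bH) in zip(ca, cb):
        # LFCs: keep the approximation coefficient with higher energy
        A = np.where(aA**2 >= bA**2, aA, bA)
        # HFCs: keep the larger-magnitude detail coefficient per sub-band
        H = tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                  for x, y in zip(aH, bH))
        fused.append((A, H))
    out = pywt.iswt2(fused, wavelet)
    return denoise_tv_chambolle(out, weight=tv_weight)  # TVM refinement

ct = np.random.rand(64, 64)   # placeholder CT slice
mri = np.random.rand(64, 64)  # placeholder MRI slice
print(fuse_swt_tv(ct, mri).shape)  # (64, 64)
```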

Citations: 0
Optimizing Skin Cancer Classification With ResNet-18: A Scalable Approach With 3D Total Body Photography (3D-TBP)
IF 2.5 | CAS Q4 Computer Science | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-16 | DOI: 10.1002/ima.70224
Javed Rashid, Turke Althobaiti, Alina Shabbir, Muhammad Shoaib Saleem, Muhammad Faheem

Skin cancer, particularly melanoma, remains a major public health challenge because of its rising incidence and mortality rates. Traditional methods of diagnosis, like dermoscopy and biopsies, are invasive, time-consuming, and highly dependent on clinical experience. Furthermore, previous research has predominantly focused on 2D dermoscopic images, which do not capture important volumetric information required for proper evaluation of the lesion. This work introduces a new deep learning architecture based on the ResNet-18 model, augmented by transfer learning, for binary classification of malignant and benign skin lesions. The model is trained on the ISIC 2024 3D Total Body Photography dataset and uses pre-trained ImageNet weights to enable effective feature extraction. To counter the dataset's natural class imbalance and minimize overfitting, the model uses sophisticated data augmentation and oversampling methods. The proposed model achieves a classification accuracy of 99.82%, surpassing many 2D-based alternatives. The use of 3D-TBP offers a strong diagnostic benefit by allowing volumetric lesion analysis, retaining spatial and depth features usually lost in conventional 2D images. The findings validate the clinical feasibility of the method, presenting a scalable, noninvasive, and highly accurate approach to early detection and diagnosis of melanoma using 3D skin imaging.
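A minimal PyTorch sketch of the described transfer-learning setup: ImageNet-pretrained ResNet-18 with its final layer replaced by a binary malignant/benign head. The freezing policy and input size are assumptions rather than details confirmed by the abstract.

```python
# Hedged sketch: ResNet-18 with ImageNet weights adapted for binary
# malignant-vs-benign classification via a replaced final layer.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                       # freeze pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)     # new trainable binary head

x = torch.randn(4, 3, 224, 224)  # a batch of lesion crops (placeholder size)
logits = model(x)
print(logits.shape)               # torch.Size([4, 2])
```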

Citations: 0