International Journal of Imaging Systems and Technology: Latest Articles, Page 4

Integrating VGG 19 U-Net for Breast Thermogram Segmentation and Hybrid Enhancement With Optimized Classifier Selection: A Novel Approach to Breast Cancer Diagnosis

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-11-03 DOI: 10.1002/ima.23210

A. Arul Edwin Raj, Nabihah Binti Ahmad, S. Ananiah Durai, R. Renugadevi

Early diagnosis of breast cancer is essential for improving patient survival rates and reducing treatment costs. Although breast thermogram images are of high quality, doctors in developing countries often struggle with early diagnosis because subtle details are difficult to interpret. A Computer-Aided Diagnosis (CAD) system can assist doctors in analyzing these details accurately. This article presents an approach to breast cancer diagnosis using thermal images. The proposed method combines U-Net-based segmentation for automatic selection of the region of interest (ROI), advanced hybrid image enhancement techniques, and a machine learning classifier; it enhances the quality and clarity of relevant features while preserving sharp and curved edges. Subjective analysis compares the processed images with those of five conventional enhancement techniques, demonstrating the efficiency of the proposed method. Quantitative analysis further validates the proposed method against five conventional methods using four quality measures. The proposed method achieves superior performance, with PSNR of 15.27 for normal and 14.31 for malignant images, AMBE of 6.594 for normal and 7.46 for malignant images, SSIM of 0.829 for normal and 0.80 for malignant images, and DSSIM of 0.084 for normal and 0.14 for malignant images. The classification phase evaluates four classifiers using 13 features from three categories. The Random Forest (RF) classifier with Discrete Wavelet Transform (DWT) based features initially outperformed the other classifier and feature combinations but had limited performance, with accuracy, sensitivity, and specificity of 81.8%, 88.8%, and 91%, respectively. To improve this, the three categories of features were normalized and reduced to two principal components using Principal Component Analysis (PCA) to train the RF classifier, which then achieved 97.7% accuracy, 96.5% sensitivity, and 98.2% specificity. The dataset used in this article was obtained from the Indira Gandhi Centre for Atomic Research (IGCAR), Kalpakkam, India. The entire proposed model is implemented in a Jupyter notebook.
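Two of the four quality measures named in the abstract, PSNR and AMBE, have simple closed forms, and a minimal sketch may help make the reported figures concrete. The code below assumes 8-bit grayscale images stored as nested lists and uses the common textbook definitions; the function names are illustrative, and the abstract does not state which exact variants the authors used. SSIM and DSSIM need windowed statistics and are left out.

```python
import math

def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    pixels = len(reference) * len(reference[0])
    mse = sum(
        (a - b) ** 2
        for row_ref, row_enh in zip(reference, enhanced)
        for a, b in zip(row_ref, row_enh)
    ) / pixels
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

def ambe(reference, enhanced):
    """Absolute mean brightness error: |mean(reference) - mean(enhanced)|."""
    pixels = len(reference) * len(reference[0])
    mean_ref = sum(map(sum, reference)) / pixels
    mean_enh = sum(map(sum, enhanced)) / pixels
    return abs(mean_ref - mean_enh)

# Toy example: every pixel brightened by 10 gray levels.
original = [[100, 100], [100, 100]]
brighter = [[110, 110], [110, 110]]
toy_psnr = psnr(original, brighter)
toy_ambe = ambe(original, brighter)
```

Under these definitions a higher PSNR means the enhanced image stays closer to the input, and a lower AMBE means mean brightness is better preserved.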


Volume 34, Issue 6

Citations: 0

DAG-Net: Dual-Branch Attention-Guided Network for Multi-Scale Information Fusion in Lung Nodule Segmentation

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-11-02 DOI: 10.1002/ima.23209

Bojie Zhang, Hongqing Zhu, Ziying Wang, Lan Luo, Yang Yu

Deep learning plays an increasingly crucial role in assisting medical diagnosis. For lung cancer, a major threat to human health, auxiliary medical systems that assist in segmenting pulmonary nodules are of significant benefit. Such systems improve both the accuracy and the speed of diagnosis for physicians, thereby reducing the risk of patient mortality. However, pulmonary nodules have irregular shapes and a wide range of diameters, and they often lie among blood vessels and other tissue structures, which makes designing an automated lung nodule segmentation system challenging. To address this, we have developed a three-dimensional dual-branch attention-guided network (DAG-Net) for multi-scale information fusion, aimed at segmenting lung nodules of various types and sizes. First, a dual-branch encoding structure provides the network with prior knowledge about nodule texture, which helps it identify different types of lung nodules. Next, we designed a structure to extract global information, which improves the network's ability to localize lung nodules of different sizes by fusing information from multiple resolutions. We then fused multi-scale information in a parallel structure and used attention mechanisms to guide the network in suppressing the influence of non-nodule regions. Finally, we employed an attention-based structure that guides the network toward more accurate segmentation by progressively using high-level semantic information at each layer. Our proposed network achieved a DSC value of 85.6% on the LUNA16 dataset, outperforming state-of-the-art methods and demonstrating the effectiveness of the network.
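The DSC (Dice similarity coefficient) reported above measures overlap between a predicted mask and the ground-truth mask. As a minimal sketch, assuming binary masks flattened into lists of 0 and 1 (the function name and the empty-mask convention are illustrative choices, not taken from the paper):

```python
def dice_coefficient(predicted, truth):
    """DSC = 2 * |P intersect T| / (|P| + |T|) for flat binary masks."""
    intersection = sum(p * t for p, t in zip(predicted, truth))
    total = sum(predicted) + sum(truth)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * intersection / total

# Toy example: two voxels predicted, two voxels true, one in common.
toy_dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

A DSC of 1 means the masks coincide exactly and 0 means they do not overlap at all.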


Volume 34, Issue 6

Citations: 0

Embedded System-Based Malaria Detection From Blood Smear Images Using Lightweight Deep Learning Model

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-29 DOI: 10.1002/ima.23205

Abdus Salam, S. M. Nahid Hasan, Md. Jawadul Karim, Shamim Anower, Md Nahiduzzaman, Muhammad E. H. Chowdhury, M. Murugappan

Malaria, transmitted by female Anopheles mosquitoes, is highly contagious and causes numerous deaths across various regions. Microscopic examination of blood cells remains one of the most accurate methods for malaria diagnosis, but it is time-consuming and can occasionally produce inaccurate results. Thanks to advances in machine learning and deep learning for medical diagnosis, diagnostic accuracy can now be improved and costs reduced compared with conventional microscopy. This work uses an open-source dataset of 26 161 RGB blood smear images for malaria detection. Our preprocessing resized the images to 64 × 64 because of the computational constraints of embedded systems-based malaria detection. We present a novel embedded system approach that uses a lightweight 17-layer SqueezeNet model with 119 154 trainable parameters for the automatic detection of malaria. The model is only 1.72 MB in size. An evaluation of the model on the original NIH malaria dataset shows accuracy, precision, recall, and F1 scores of 96.37%, 95.67%, 97.21%, and 96.44%, respectively. On a modified dataset, the results improved further to 99.71% across all metrics. Our model significantly outperforms current deep learning models for malaria detection, making it well suited to embedded systems. The model has also been rigorously tested on the Jetson Nano B01 edge device, demonstrating a single-image prediction time of only 0.24 s. The fusion of deep learning with embedded systems makes this research a crucial step toward improving malaria diagnosis. In resource-constrained settings, the model's lightweight architecture and accuracy enhancements hold great promise for addressing the critical challenge of malaria detection.
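The four reported scores are linked: F1 is the harmonic mean of precision and recall, so the stated F1 can be checked against the other two figures. The sketch below uses the standard definitions; the confusion-matrix counts in the second example are made up purely for illustration and do not come from the paper.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall, f1_from(precision, recall)

# Consistency check using the precision and recall quoted in the abstract.
reported_f1 = f1_from(0.9567, 0.9721)

# Hypothetical counts, for illustration only.
example = classification_metrics(tp=90, fp=10, fn=5, tn=95)
```

With the quoted precision and recall the harmonic mean comes out at roughly 0.964, which agrees with the reported F1 of 96.44% up to rounding of the inputs.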


Volume 34, Issue 6

Citations: 0

Advancing Leukocyte Classification: A Cutting-Edge Deep Learning Approach for AI-Driven Clinical Diagnosis

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-28 DOI: 10.1002/ima.23204

Ahmadsaidulu Shaik, Abhishek Tiwari, Balachakravarthy Neelapu, Puneet Kumar Jain, Earu Banoth

White blood cells (WBCs) are crucial components of the immune system, responsible for detecting and eliminating pathogens. Accurate detection and classification of WBCs are essential for various clinical diagnostics. This study aims to develop an AI framework for detecting and classifying WBCs from microscopic images using a customized YOLOv5 model with three key modifications. Firstly, the C3 module in YOLOv5's backbone is replaced with the innovative C3TR structure to enhance feature extraction and reduce background noise. Secondly, the BiFPN is integrated into the neck to improve feature localization and discrimination. Thirdly, an additional layer in the head enhances detection of small WBCs. Experiments on the BCCD dataset, comprising 352 microscopic blood smear images with leukocytes, demonstrated the framework's superiority over state-of-the-art methods, achieving 99.4% accuracy. Furthermore, the model exhibits computational efficiency, operating over five times faster than existing YOLO models. These findings underscore the framework's promise in medical diagnostics, showcasing deep learning's supremacy in automated cell classification.
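For readers unfamiliar with BiFPN, its characteristic step is a learned, weighted fusion of feature maps from different scales. The sketch below shows the "fast normalized fusion" rule as described in the EfficientDet paper that introduced BiFPN. Whether the customized YOLOv5 neck uses exactly this rule is an assumption, and feature maps are reduced to flat lists of equal length to keep the example small.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with non-negative, normalized weights.

    features: list of flat feature maps (lists of floats, equal length)
    weights:  one learnable scalar per feature map
    """
    clipped = [max(0.0, w) for w in weights]   # ReLU keeps weights non-negative
    norm = sum(clipped) + eps                  # eps avoids division by zero
    length = len(features[0])
    return [
        sum(w * fmap[i] for w, fmap in zip(clipped, features)) / norm
        for i in range(length)
    ]

# Toy example: the second input gets three times the weight of the first.
fused = fast_normalized_fusion([[0.0, 4.0], [4.0, 0.0]], [1.0, 3.0])
```

Because the weights are normalized by their sum rather than by a softmax, the fusion is cheap to compute while still letting the network learn how much each scale contributes.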


Volume 34, Issue 6

Citations: 0

Fast-MedNeXt: Accelerating the MedNeXt Architecture to Improve Brain Tumour Segmentation Efficiency

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-27 DOI: 10.1002/ima.23196

Bin Liu, Bing Li, Yaojing Chen, Victor Sreeram, Shuofeng Li

With the rapid development of medical imaging technology, 3D image segmentation has gradually become a mainstream method, showing unique advantages in brain tumour detection and diagnosis in particular. The technique makes full use of 3D spatial information to locate and analyze tumours more accurately, and thus plays an important role in improving diagnostic accuracy, optimising treatment planning and promoting research. However, it also suffers from high computational cost and slow processing speed. In this paper, we propose an innovative optimisation scheme to address this problem. We thoroughly investigate the MedNeXt network and propose Fast-MedNeXt, which aims to increase processing speed while maintaining accuracy. First, we introduce the partial convolution (PConv) technique, which replaces the deep convolutional layers in the network. This improvement effectively reduces computation and memory requirements while maintaining efficient feature extraction. Second, based on PConv, we propose PConv-Down and PConv-Up modules, which are applied to the up-sampling and down-sampling modules to further optimise the network structure and improve efficiency. To confirm the efficacy of the approach, we carried out a series of tests on the multimodal brain tumour segmentation challenge 2021 (BraTS2021). Compared with the MedNeXt series of networks, Fast-MedNeXt reduced the latency by 22.1%, 20.5%, 15.8%, and 11.4% respectively, while the average accuracy also increased by 0.475% and 0.2% respectively. These significant performance improvements demonstrate the effectiveness of Fast-MedNeXt in 3D medical image segmentation tasks and provide a new and more efficient solution for the field.
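The saving from partial convolution comes from convolving only a fraction of the channels and passing the rest through unchanged. The sketch below illustrates that idea and the resulting cost ratio against a regular dense 3D convolution. The 1/4 channel ratio, the function names, and the stand-in `conv_fn` are assumptions for illustration; the abstract does not give the configuration used in Fast-MedNeXt.

```python
def partial_conv(channels, conv_fn, ratio=0.25):
    """Apply conv_fn to the first fraction of channels; keep the rest as-is."""
    c_p = max(1, int(len(channels) * ratio))
    return conv_fn(channels[:c_p]) + list(channels[c_p:])

def conv3d_macs(c_in, c_out, kernel, voxels):
    """Multiply-accumulate count of a dense 3D convolution."""
    return voxels * kernel ** 3 * c_in * c_out

# Stand-in for a real convolution: scales each selected channel by 10.
def toy_conv(selected):
    return [[value * 10 for value in channel] for channel in selected]

mixed = partial_conv([[1], [2], [3], [4]], toy_conv)

# Cost of convolving 8 of 32 channels relative to all 32 channels.
cost_ratio = conv3d_macs(8, 8, 3, 1000) / conv3d_macs(32, 32, 3, 1000)
```

With a quarter of the channels convolved, the multiply-accumulate count falls to one sixteenth of the dense case, since the cost scales with the product of input and output channel counts.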


Volume 34, Issue 6

Citations: 0

A Novel Dictionary Learning Algorithm Based on Prior Knowledge for fMRI Data Analysis

IF 3, Zone 4 (Computer Science), Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-27 DOI: 10.1002/ima.23195

Fangmin Sheng, Yuhu Shi, Lei Wang, Ying Li, Hua Zhang, Weiming Zeng

Task-based functional magnetic resonance imaging (fMRI) has been widely utilized for brain activation detection and functional network analysis. In recent years, the K-singular value decomposition (K-SVD) algorithm has gained increasing attention in the research of fMRI data analysis methods. In this study, we propose a novel temporal feature region-growing constrained K-SVD algorithm that incorporates task-based fMRI temporal prior knowledge and utilizes a region-growing algorithm to infer potential activation locations. The algorithm incorporates temporal and spatial constraints to enhance the detection of brain activation. Specifically, this paper improves the three stages of the traditional K-SVD algorithm. First, in the dictionary initialization stage, the automatic target generation process with an independent component analysis algorithm is utilized in conjunction with prior knowledge to enhance the accuracy of initialization. Second, in the sparse coding stage, the region-growing algorithm is employed to infer potential activation locations based on temporal prior knowledge, thereby imposing spatial constraints to limit the extent of activation regions. Finally, in the dictionary learning stage, soft constraints and low correlation constraints are applied to reinforce the consistency with prior knowledge and enhance the robustness of learning for task-related atoms. The proposed method was validated on simulated and real fMRI data, showing superior performance in detecting brain activation compared with traditional methods. The results indicate that the algorithm accurately identifies activated brain regions, providing an effective approach for studying brain function in clinical applications.
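The dictionary learning stage mentioned above builds on the atom update of the traditional K-SVD algorithm, which refits one dictionary atom at a time with a rank-1 SVD. The sketch below shows only that plain, unconstrained update; it assumes NumPy is available and deliberately omits the soft and low-correlation constraints, the region-growing step, and the initialization scheme that the paper adds. All variable names are illustrative.

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    """Refit atom k of dictionary D and row k of codes X in place.

    Y: data matrix (features x signals), D: dictionary (features x atoms),
    X: sparse codes (atoms x signals), so that Y is approximated by D @ X.
    """
    users = np.nonzero(X[k])[0]              # signals whose code uses atom k
    if users.size == 0:
        return                               # unused atom: leave it untouched
    # Residual with atom k's own contribution added back, users only.
    E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                        # best rank-1 fit: unit-norm atom
    X[k, users] = s[0] * Vt[0]               # matching coefficients

# Toy data: 8-dimensional signals, 5 atoms, random sparse codes.
rng = np.random.default_rng(0)
Y = rng.normal(size=(8, 20))
D = rng.normal(size=(8, 5))
D /= np.linalg.norm(D, axis=0)
X = rng.normal(size=(5, 20)) * (rng.random((5, 20)) < 0.4)

error_before = np.linalg.norm(Y - D @ X)
ksvd_atom_update(Y, D, X, k=2)
error_after = np.linalg.norm(Y - D @ X)
```

Restricting the update to the signals that already use the atom keeps the sparsity pattern of the codes fixed, and the rank-1 fit guarantees the reconstruction error does not increase.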


Volume 34, Issue 6

Citations: 0

Automated, Reproducible, and Reconfigurable Human Head Phantom for Experimental Testing of Microwave Systems for Stroke Classification

IF 3 Region 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-26 DOI: 10.1002/ima.23200

Tomas Pokorny, Tomas Drizdal, Marek Novak, Jan Vrba

Microwave systems for prehospital stroke classification are currently being developed. In the future, these systems should enable rapid recognition of the type of stroke, shorten the time to start treatment, and thus significantly improve the prognosis of patients. In this study, we realized a realistic and reconfigurable 3D human head phantom for the development, testing, and validation of these newly developed diagnostic methods. The phantom enables automated and reproducible measurements for different positions of the stroke model. The stroke model itself is also interchangeable, so measurements can be made for different types, sizes, and shapes of strokes. Furthermore, an extensive series of measurements was performed at a frequency of 1 GHz, and an SVM classification algorithm was deployed, which successfully identified ischemic stroke in 80% of the corresponding measured data. If similar classification accuracy could be achieved in patients, it would lead to a dramatic reduction in the consequences of strokes.

Citations: 0

Image Segmentation Evaluation With the Dice Index: Methodological Issues

IF 3 Region 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-24 DOI: 10.1002/ima.23203

Mohamed L. Seghier

In this editorial, I call for more clarity and transparency when calculating and reporting the Dice index to evaluate the performance of biomedical image segmentation methods. Despite many existing guidelines for best practices for assessing and reporting the performance of automated methods [1, 2], there is still a lack of clarity on why and how performance metrics were selected and assessed. I have seen articles where, for instance, Dice indices (i) were erroneously reported as smaller than intersection-over-union values, (ii) oddly increased from moderate to excellent values after including images with no actual positive instances, (iii) were drastically affected by image cropping or zero-padding, (iv) did not make sense in the light of the reported precision and sensitivity values, (v) showed opposite trends to F1 scores, (vi) were wrongly interpreted as accuracy measures, (vii) were used as a measure of detection success rather than segmentation success, (viii) were used to rank methods that varied considerably in terms of the number of false positives and false negatives, (ix) were averaged across segmented structures of interest with highly variable sizes, and (x) were directly compared to other Dice indices from previous studies despite being tested on completely different datasets. It is important to remind our authors what one can (or cannot) do with the Dice index for biomedical image segmentation.

As the Dice index is one of the preferred metrics to assess segmentation performance and is widely used in many challenges and benchmarks to rank models [3], it is paramount that authors calculate it correctly and report it clearly and transparently. Below, I discuss conceptual and methodological issues about the Dice index before providing a list of 10 simple rules for optimal and transparent reporting of the Dice index. By improving transparency and clarity, I believe readers will draw the right conclusions about methods evaluation, which will ultimately help improve interpretability and replicability in biomedical data processing.

The discussion below applies to any image segmentation problem, imaging modality, 2D (slices) or 3D (volumes) inputs, and segmentation tasks (e.g., segmenting abnormalities or typical structures and organs). Examples will be taken from the automated segmentation of stroke lesions in brain scans.

Put another way, the Dice index codes how the positives declared by an automated method match the actual positives of the ground truth. We have Dice(A, A) = 1 …

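This is why we have the following conditional statement: an excellent Dice is sufficient for excellent accuracy. Note that when the number of true negatives tends to zero, accuracy becomes similar to intersection over union (IOU). Dice is a monotonic function of the Jaccard index (or IOU). The two metrics quantify similar information, and if one of them is provided, the other is not needed when evaluating the performance of a given automated method [8]. Dice is widely used because, for the same overlap (TP) between two images, it yields higher values than IOU and thus gives more weight to TP. However, IOU offers a more intuitive assessment of segmentation performance because it directly measures the proportion of overlapping pixels/voxels relative to the total number of unique pixels/voxels in the two images. In any case, if both indices are reported, the results must be consistent (e.g., a Dice value smaller than the IOU value indicates that something went wrong in the calculation).

Because the Dice index is equivalent to the harmonic mean of precision and recall, it cannot distinguish between methods with different FP-to-FN ratios. In other words, the Dice index treats all segmentation errors equally, regardless of their location or importance. For example, two methods can show the same total number of erroneous instances (FP + FN), yet one may have poor sensitivity (large FN) while the other has poor specificity (large FP). It is therefore very useful to report precision and sensitivity/recall in addition to the Dice index. This matters when judging the usefulness of a given method for the application at hand. In some clinical applications, a method with high specificity may be preferred over a method with high sensitivity, or vice versa. For instance, preoperative segmentation of high-grade tumors may favor highly sensitive methods to ensure that all malignant tissue is delineated for subsequent resection. In contrast, preoperative segmentation of benign tumors may favor highly specific methods to minimize postoperative morbidity.

This limitation concerns cases where two methods have the same Dice but are affected by different features or locations of the structure of interest. For example, if one method consistently segments the medial part of a stroke lesion while another segments the lateral part better, Dice may not reflect this bias well if both methods produce a similar overlap with the ground truth. This can happen, for instance, when segmenting large stroke lesions that extend medially toward the ventricles. Some methods may over-segment the medial part of the lesion close to the ventricles more than other methods do (i.e., an FP in the medial part and an FP in the lateral part affect the Dice index in the same way).

This limitation has been discussed in several studies but is not always acknowledged, especially when segmenting lesions or structures whose size varies widely. Specifically, stroke lesions range in size from a few voxels (0.1 cm3) to thousands of voxels (300 cm3). Across such a wide range of lesion sizes, a small FP or FN affects the Dice index differently. For example, for an excellent segmentation method that produces only one FP or FN, the Dice index will be very high for a large lesion but moderate or low for a small lesion. This can lead to overestimating segmentation performance on images with large lesions while missing other small but clinically meaningful lesions. It is therefore recommended to provide the distribution (histogram) of lesion sizes for both the training and testing samples. It is also useful to compute the mean Dice separately for small, medium, and large lesions, or to provide a scatter plot of Dice against lesion size; the rationale is similar (see [9, 10]). This will help authors make the right recommendations about which types of lesions their method suits best (e.g., a given method may work very well for large lesions but poorly for tiny ones).

For example, a small lesion may be extremely difficult to segment completely, yet detecting it may still be clinically useful. If a small lesion is two voxels in size and the automated method shows an overlap of only one voxel with no FP, the Dice index is only 0.66, a moderate value, even though the method successfully detected such a small lesion. This is why it is essential to explain whether clinical relevance refers to delineating the lesion or detecting it in the image, because evaluating the latter with the Dice index does not always make sense.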
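The relationships described in the editorial can be checked numerically with a small sketch. It assumes the standard count-based definitions Dice = 2TP / (2TP + FP + FN) and IOU = TP / (TP + FP + FN); the `Counts` class, the function names, the example voxel counts other than the two-voxel lesion, and the choice to return NaN for two empty masks are illustrative assumptions and are not taken from the editorial.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Voxel counts for one predicted mask against its ground truth (hypothetical helper)."""
    tp: int  # voxels positive in both masks
    fp: int  # voxels positive only in the prediction
    fn: int  # voxels positive only in the ground truth


def dice(c: Counts) -> float:
    denom = 2 * c.tp + c.fp + c.fn
    # Assumption: Dice is treated as undefined (NaN) when both masks are empty.
    return 2 * c.tp / denom if denom else math.nan


def iou(c: Counts) -> float:
    denom = c.tp + c.fp + c.fn
    return c.tp / denom if denom else math.nan


def precision(c: Counts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else math.nan


def recall(c: Counts) -> float:
    denom = c.tp + c.fn
    return c.tp / denom if denom else math.nan


if __name__ == "__main__":
    # Two-voxel lesion, one voxel found, no false positives: Dice = 2/3.
    small = Counts(tp=1, fp=0, fn=1)
    # The same single missed voxel on a 1000-voxel lesion barely moves Dice.
    large = Counts(tp=999, fp=0, fn=1)
    # Same 100-voxel ground truth, opposite error types, identical Dice (8/9).
    under = Counts(tp=80, fp=0, fn=20)   # misses tissue: precision 1.0, recall 0.8
    over = Counts(tp=100, fp=25, fn=0)   # over-segments: precision 0.8, recall 1.0

    for name, c in [("small", small), ("large", large), ("under", under), ("over", over)]:
        print(f"{name:5s} dice={dice(c):.4f} iou={iou(c):.4f} "
              f"precision={precision(c):.2f} recall={recall(c):.2f}")
```

In every row Dice is at least as large as IOU and satisfies Dice = 2·IOU / (1 + IOU), which is the consistency check the editorial asks for when both indices are reported. The last two rows illustrate why precision and recall are worth reporting alongside Dice.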

Citations: 0

Multimodal Connectivity-Guided Glioma Segmentation From Magnetic Resonance Images via Cascaded 3D Residual U-Net

IF 3 Region 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-23 DOI: 10.1002/ima.23206

Xiaoyan Sun, Chuhan Hu, Wenhan He, Zhenming Yuan, Jian Zhang

Glioma is a type of brain tumor with a high mortality rate. Magnetic resonance imaging (MRI) is commonly used for examination, and the accurate segmentation of tumor regions from MR images is essential to computer-aided diagnosis. However, due to the intrinsic heterogeneity of brain glioma, precise segmentation is very challenging, especially for tumor subregions. This article proposes a two-stage cascaded method for brain tumor segmentation that considers the hierarchical structure of the target tumor subregions. The first stage aims to separate the whole tumor (WT) from the background area, and the second stage aims to achieve fine-grained segmentation of the subregions, including the enhanced tumor (ET) region and the tumor core (TC) region. Both stages apply a deep neural network that combines a modified 3D U-Net with a residual connection scheme for tumor region and subregion segmentation. Moreover, in the training phase, the generation of 3D masks for subregions with potentially incomplete connectivity is guided by the completely connected regions. Experiments were performed to evaluate the method on both area and boundary accuracy. The average Dice scores of the WT, TC, and ET regions on the BraTS 2020 dataset are 0.9168, 0.8992, and 0.8489, and the Hausdorff distances are 6.021, 9.203, and 12.171, respectively. The proposed method outperforms current works, especially in segmenting fine-grained tumor subregions.
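The boundary metric reported above can be illustrated with a minimal sketch of the symmetric Hausdorff distance between two sets of voxel coordinates. The abstract does not say whether the classical maximum form or a percentile variant was used, so the maximum form here is an assumption, and the function names and toy coordinates are hypothetical. The brute-force search is only suitable for small point sets.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def directed_hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance from any point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance (classical maximum form, an assumption here)."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    # Toy boundary voxels for a predicted and a reference mask.
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    # The reference voxel at x=4 lies 3 units from the nearest predicted voxel.
    print(hausdorff(predicted, reference))  # 3.0
```

Unlike the Dice score, this distance is driven by the single worst boundary error, which is why overlap and boundary metrics are usually reported together.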

Citations: 0

Recognition of Diabetic Retinopathy Grades Based on Data Augmentation and Attention Mechanisms

IF 3 Region 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology

Pub Date : 2024-10-22 DOI: 10.1002/ima.23201

Xueri Li, Li Wen, Fanyu Du, Lei Yang, Jianfang Wu

Diabetic retinopathy is a complication of diabetes and one of the leading causes of vision loss. Early detection and treatment are essential to prevent vision loss. Deep learning has made great strides in medical image processing and can serve as an aid for medical practitioners. However, unbalanced datasets, sparse focal areas, small differences between adjacent disease grades, and varied manifestations of the same disease grade make deep learning models hard to train, so their generalization performance and robustness are inadequate. To address the imbalance in sample numbers between classes, this work proposes using VQ-VAE to reconstruct affine-transformed images to enrich and balance the dataset. Test results show that the model's average reconstruction error is 0.0001 and that the mean structural similarity between reconstructed and original images is 0.967. This indicates that the reconstructed images differ from the originals yet belong to the same category, which expands and diversifies the dataset. To address focal area sparsity and disease grade disparity, this work uses ResNeXt50 as the backbone network and constructs diverse attention networks by modifying the network structure and embedding different attention modules. Experiments demonstrate that the convolutional attention network outperforms ResNeXt50 in terms of Precision, Sensitivity, Specificity, F1 Score, Quadratic Weighted Kappa Coefficient, Accuracy, and robustness against Salt and Pepper noise, Gaussian noise, and gradient perturbation. Finally, heat maps of each model recognizing the fundus image were plotted using the Grad-CAM method. The heat maps show that the attention networks attend to the fundus image more effectively than the non-attention network ResNeXt50.
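Among the metrics listed above, the Quadratic Weighted Kappa Coefficient is the one specific to ordinal grading, so a minimal sketch of its standard definition may help. It assumes integer grades in the range 0 to `n_classes - 1`; the five-grade scale in the example, the function name, and the toy labels are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Sequence


def quadratic_weighted_kappa(y_true: Sequence[int], y_pred: Sequence[int],
                             n_classes: int) -> float:
    """Cohen's kappa with quadratic weights for ordinal labels 0..n_classes-1."""
    if n_classes < 2:
        raise ValueError("need at least two classes")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")

    # Observed confusion matrix: rows are true grades, columns are predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    n = len(y_true)
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n

    # Undefined when chance disagreement is zero (e.g. all labels in one class).
    return 1.0 - num / den if den else math.nan


if __name__ == "__main__":
    true_grades = [0, 1, 2, 3, 4, 2, 1, 0]
    predicted = [0, 1, 2, 4, 4, 1, 1, 0]
    print(quadratic_weighted_kappa(true_grades, predicted, n_classes=5))
```

Because the penalty grows with the squared distance between grades, confusing adjacent grades costs far less than confusing distant ones, which suits the small differences between adjacent disease grades that the abstract mentions.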

Citations: 0