
Latest Publications in International Journal of Imaging Systems and Technology

Shallow Convolution and Parallel Coarse-To-Fine Attention for Brain Signal Classification
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-10 | DOI: 10.1002/ima.70249
Xiwen Qin, Jiayao Wang, Dingxin Xu, Siqi Zhang

Electroencephalography (EEG) signals, due to their non-invasiveness, high temporal resolution, and low cost, have demonstrated broad application prospects in fields such as Brain-Computer Interface (BCI), motor rehabilitation, and emotion recognition. However, the inherently low signal-to-noise ratio, non-stationarity, and high dimensionality of EEG signals pose substantial challenges for signal decoding. To address these issues and improve the accuracy and robustness of EEG classification, this study proposes a novel EEG classification architecture for motor imagery data—Parallel Hybrid CNN-Transformer (PHCT). The PHCT model consists of a shallow feature extraction module and parallel fine-grained and coarse-grained feature extractors. The former efficiently captures temporal and spatial local features using separable convolutions, while the latter integrates multi-head self-attention mechanisms with convolutional structures to model global and local dependencies. Additionally, data augmentation and Gaussian noise injection are introduced to enhance the model's generalization ability. Empirical studies conducted on the BCI Competition IV-2b dataset show that the proposed model outperforms existing methods across nine subjects, achieving at least a 4.2% improvement in average classification accuracy compared to the current best model, resulting in a significant model performance boost. The experimental results are visualized using the t-SNE method. This study provides a new effective pathway for EEG decoding in complex environments.
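
A minimal PyTorch sketch of the two ideas highlighted in this abstract, a shallow separable-convolution feature extractor and a parallel fine/coarse branch, is given below; the layer sizes, kernel lengths, and fusion-by-concatenation are illustrative assumptions rather than the authors' exact PHCT configuration.

```python
import torch
import torch.nn as nn

class ShallowSeparableExtractor(nn.Module):
    """Temporal convolution, depthwise spatial convolution across electrodes, pointwise mix."""
    def __init__(self, n_channels=3, n_filters=16, kernel_len=25):
        super().__init__()
        self.temporal = nn.Conv2d(1, n_filters, (1, kernel_len), padding=(0, kernel_len // 2))
        self.spatial = nn.Conv2d(n_filters, n_filters, (n_channels, 1), groups=n_filters)
        self.pointwise = nn.Conv2d(n_filters, n_filters, 1)
        self.act = nn.ELU()
        self.pool = nn.AvgPool2d((1, 4))

    def forward(self, x):                      # x: (batch, 1, electrodes, samples)
        x = self.act(self.temporal(x))
        x = self.act(self.spatial(x))          # collapses the electrode dimension
        x = self.pointwise(x)
        return self.pool(x)                    # (batch, filters, 1, samples // 4)

class ParallelCoarseFine(nn.Module):
    """Fine branch: multi-head self-attention over time tokens; coarse branch: wide 1-D convolution."""
    def __init__(self, dim=16, heads=4, n_classes=2):
        super().__init__()
        self.fine = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.coarse = nn.Sequential(nn.Conv1d(dim, dim, 7, padding=3), nn.ELU())
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, feats):                  # feats: (batch, filters, 1, time)
        tokens = feats.squeeze(2).transpose(1, 2)               # (batch, time, dim)
        fine, _ = self.fine(tokens, tokens, tokens)
        coarse = self.coarse(tokens.transpose(1, 2)).transpose(1, 2)
        fused = torch.cat([fine.mean(dim=1), coarse.mean(dim=1)], dim=-1)
        return self.head(fused)

# Toy forward pass on a BCI IV-2b-like batch: 3 EEG channels, 1000 samples, 2 classes.
eeg = torch.randn(8, 1, 3, 1000)
logits = ParallelCoarseFine()(ShallowSeparableExtractor()(eeg))
print(logits.shape)                            # torch.Size([8, 2])
```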

Citations: 0
An Interpretable Lesion-Aware Diagnostic Model for Full-Field Mammography Classification
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-08 | DOI: 10.1002/ima.70253
Jie Xu, Min Wei, Mingzhe Zhang, Huijian Chen, Yang Song, Muzhen He

Accurate diagnosis of breast cancer is critical for improving patient outcomes. Yet breast lesions are small and mammograms are high-resolution, so patch-based methods often ignore peritumoral context, undermining diagnostic accuracy. We therefore develop an interpretable lesion-aware diagnostic model (LADM), which directly identifies tumor regions from full-field mammograms to improve classification accuracy and clinical trust. LADM incorporates a hyper lesion-aware module that combines spatial- and channel-guided attention. LADM processes craniocaudal (CC) and mediolateral oblique (MLO) mammograms via three branches: CC, MLO, and dual-view. The single-view branches independently learn diagnostic features and perform classification, while the dual-view branch concatenates CC and MLO features for late-fusion prediction to exploit cross-view complementarity. Model interpretability and lesion localization are assessed with Grad-CAM++ heatmaps. On INBreast and CBIS-DDSM, LADM consistently outperforms single-view baselines. In the dual-view setting, it achieves an AUC of 0.950 and accuracy of 0.903 on INBreast, and an AUC of 0.911 and accuracy of 0.848 on CBIS-DDSM. Visualizations show that the model focuses on diagnostically relevant regions, supporting clinical interpretability. LADM learns lesion-aware features directly on full-field images, fuses CC and MLO views while remaining robust in single-view use, and offers transparent predictions with lesion maps. Together these yield accurate, interpretable classification of breast cancer.
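
The three-branch CC/MLO/dual-view layout with late fusion by concatenation can be sketched as follows; the ResNet-18 backbone and feature dimension are assumptions standing in for whatever encoder LADM actually uses.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def backbone(out_dim=256):
    """Small image encoder; pretrained weights could be loaded here instead of None."""
    net = resnet18(weights=None)
    net.fc = nn.Linear(net.fc.in_features, out_dim)
    return net

class DualViewClassifier(nn.Module):
    def __init__(self, feat_dim=256, n_classes=2):
        super().__init__()
        self.cc_branch = backbone(feat_dim)
        self.mlo_branch = backbone(feat_dim)
        self.cc_head = nn.Linear(feat_dim, n_classes)
        self.mlo_head = nn.Linear(feat_dim, n_classes)
        # Dual-view branch: late fusion by concatenating the two single-view features.
        self.fusion_head = nn.Linear(2 * feat_dim, n_classes)

    def forward(self, cc, mlo):
        f_cc, f_mlo = self.cc_branch(cc), self.mlo_branch(mlo)
        return (self.cc_head(f_cc),
                self.mlo_head(f_mlo),
                self.fusion_head(torch.cat([f_cc, f_mlo], dim=1)))

cc = torch.randn(2, 3, 224, 224)    # craniocaudal view
mlo = torch.randn(2, 3, 224, 224)   # mediolateral oblique view
out_cc, out_mlo, out_dual = DualViewClassifier()(cc, mlo)
print(out_dual.shape)               # torch.Size([2, 2])
```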

Citations: 0
Non-Invasive Prediction of Axillary Lymph Node Metastasis in Breast Cancer Using Combined Clinical, Radiomics, and Deep Learning Features
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-08 | DOI: 10.1002/ima.70248
Jing Chen, Xiaoying Qiu, Yun Zheng, Yuehui Liao, Weiji Yang, Jionghui Gu, Chunhong Yan, Lang Meng, Jing Cheng, Tian'an Jiang, Xiaobo Lai

Axillary lymph node (ALN) metastasis is a critical prognostic factor in breast cancer; accurate preoperative evaluation is crucial for guiding treatment decisions. This study aims to develop and validate a combined model integrating clinical, radiomics, and deep learning (DL) features derived from preoperative ultrasound images to non-invasively predict ALN metastasis in breast cancer patients, while systematically comparing the predictive value of features derived from different peritumoral regions and multiple feature sources. A total of 431 breast cancer patients, with axillary lymph node dissection pathology serving as the gold standard, were retrospectively enrolled and randomly assigned to training (n = 301) and test (n = 130) sets. Clinical features, radiomics features, and deep learning features (extracted from the penultimate layer of a pre-trained ResNet50 convolutional neural network using global average pooling) were obtained from intratumoral regions, combined intratumoral and peritumoral regions (1–3 mm margins), and whole ultrasound images. Machine learning models were constructed separately for clinical, radiomics, DL, and combined feature sets. Models were evaluated using the area under the curve (AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV). Clinical models demonstrated limited predictive performance (test set AUC: 0.591–0.611). Radiomics and DL models showed improved performance, particularly when including peritumoral information. A random forest-based combined model integrating clinical features, radiomics features (intratumoral+3-mm peritumoral region), and DL features yielded the highest performance, achieving a test set AUC of 0.869, with SEN/SPE/PPV/NPV values of 0.886/0.856/0.856/0.839, respectively. Additionally, the model demonstrated robust predictive accuracy across molecular subtypes. The combined model integrating clinical, radiomics, and deep learning features from intratumoral and peritumoral ultrasound images effectively and non-invasively predicts ALN metastasis in breast cancer patients. This approach shows potential for clinical decision support, may help reduce unnecessary sentinel lymph node biopsies, and shows promise for generalizability across different molecular subtypes.
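
The deep-feature step described here, globally average-pooled activations from the penultimate layer of a pre-trained ResNet50, can be reproduced roughly as below; the standard ImageNet preprocessing recipe is an assumption.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.IMAGENET1K_V1
model = resnet50(weights=weights)
model.fc = nn.Identity()           # drop the classifier; keep the pooled penultimate features
model.eval()

preprocess = weights.transforms()  # resize / center-crop / normalize as for ImageNet

@torch.no_grad()
def deep_features(batch_of_images):
    """batch_of_images: float tensor (N, 3, H, W), preprocessed -> (N, 2048) feature vectors."""
    return model(batch_of_images)

x = preprocess(torch.rand(3, 300, 300))        # stand-in for one ultrasound ROI crop
feats = deep_features(x.unsqueeze(0))
print(feats.shape)                             # torch.Size([1, 2048])
```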

Citations: 0
Mistral in Radiology: AI-Powered Classification of Normal and Abnormal Reports
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-08 | DOI: 10.1002/ima.70251
Pilar López-Úbeda, Teodoro Martín-Noguerol, Antonio Luna

This study investigates the potential of the Mistral large language model (LLM) to classify radiological reports as normal or abnormal using three techniques: Zero-Shot Learning (ZSL), Few-Shot Learning (FSL), and Fine-Tuning (FT), aiming to optimize radiology workflows and improve clinical decision-making. The dataset consisted of 124 807 radiology reports from MRI and CT scans conducted between 1 May 2024 and 1 November 2024 at our institution. After applying inclusion and exclusion criteria, 123 296 reports were selected for analysis. The Mistral LLM was tested with ZSL, FSL, and FT techniques. Quantitative metrics, including precision, recall, F1 score, and accuracy, were calculated for each technique. Confusion matrices and qualitative analyses of misclassified cases were also performed. ZSL yielded the lowest performance, with an F1 score of 0.191 for the normal class and an overall accuracy of 0.438, due to a high false-positive rate. FSL improved accuracy to 0.806 but still showed limitations in classifying normal reports (F1 = 0.404). FT achieved the best results, with F1 scores above 0.98 for both classes and an overall accuracy of 0.998, minimizing false positives and false negatives. Classifying radiological reports as normal or abnormal is crucial for prioritizing urgent cases and optimizing workflows. The Mistral LLM, particularly with Fine-Tuning, demonstrated strong potential for automating this task, outperforming ZSL and FSL.
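
A minimal sketch of the zero-shot and few-shot prompting setups is shown below; the `generate` callable and the prompt wording are placeholders for whatever Mistral inference endpoint or local runtime is available, not a specific vendor API.

```python
# Zero-shot: the model sees only the instruction and the report to classify.
ZERO_SHOT_TEMPLATE = (
    "You are a radiology assistant. Classify the following report strictly as "
    "NORMAL or ABNORMAL and answer with a single word.\n\nReport:\n{report}\n\nAnswer:"
)

def classify_report(report: str, generate) -> str:
    """generate: callable taking a prompt string and returning the model's text completion."""
    completion = generate(ZERO_SHOT_TEMPLATE.format(report=report))
    answer = completion.strip().upper()
    return "abnormal" if answer.startswith("ABNORMAL") else "normal"

def few_shot_prompt(examples, report):
    """Few-shot variant: prepend a handful of labelled example reports before the query."""
    shots = "\n\n".join(f"Report:\n{r}\nAnswer: {label.upper()}" for r, label in examples)
    return f"{shots}\n\nReport:\n{report}\nAnswer:"

# Toy usage with a fake model that always answers NORMAL.
print(classify_report("No acute intracranial abnormality.", lambda prompt: "NORMAL"))
```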

Citations: 0
Convolutional Autoencoder Effect on Parallel Magnetic Resonance Imaging
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-05 | DOI: 10.1002/ima.70254
Amel Korti, Amène Bekki, Alain Lalande

Parallel magnetic resonance imaging (pMRI) reduces MRI acquisition time, with Sensitivity Encoding (SENSE) being a widely used method that exploits coil sensitivity maps for efficient image reconstruction. However, SENSE can introduce aliasing and noise artifacts, especially at high acceleration factors. To address this limitation, we propose a deep learning-based postprocessing framework that enhances SENSE-reconstructed images using a Convolutional Autoencoder (CAE). The CAE is applied after the SENSE reconstruction to reduce artifacts and improve image quality, without modifying the original reconstruction pipeline. A dataset of 842 fully sampled anatomical images is used to simulate 8-channel coil data, with both uniform and variable-density (VD) undersampling applied at different acceleration factors. The CAE is trained on paired inputs (SENSE-reconstructed images) and targets (fully sampled references) to learn the mapping from degraded to high-quality images. Quantitative evaluation using Peak Signal-to-Noise Ratio (PSNR) and Normalized Mean Squared Error (NMSE), along with qualitative visual assessment, shows that the proposed CAE-SENSE framework significantly improves image fidelity, particularly with variable-density undersampling. These results demonstrate the potential of deep learning as a complementary tool to enhance conventional parallel imaging methods in accelerated MRI.
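
A toy version of the post-processing convolutional autoencoder, trained on pairs of SENSE reconstructions and fully sampled references, might look like the following; the depth, channel counts, and plain MSE loss are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CAE(nn.Module):
    """Two-level convolutional autoencoder acting on a single-channel magnitude image."""
    def __init__(self, ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (N, 1, H, W) SENSE reconstruction
        return self.decoder(self.encoder(x))

model = CAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# One toy training step on random 128x128 arrays; in practice the pair would be
# (SENSE reconstruction, fully sampled reference) for the same slice.
sense_img = torch.rand(4, 1, 128, 128)
reference = torch.rand(4, 1, 128, 128)
loss = loss_fn(model(sense_img), reference)
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(float(loss))
```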

Citations: 0
IGF-CNN: An Optimized Deep Learning Model for Covid-19 Classification
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-03 | DOI: 10.1002/ima.70247
Vinayak Tiwari, Sidharrth Kumar Singh, Umaisa Hassan, Amit Singhal

Recent advancements in deep learning and the use of pre-trained convolutional neural network (CNN) architectures have improved performance on classification tasks. However, these architectures often entail millions of training parameters, posing challenges for real-world deployment. In this work, we propose an iterative Gaussian feature extractor with a custom 3-layer CNN network (IGF-CNN), coupled with a feedforward artificial neural network (ANN) classifier. The input images undergo pre-processing before being fed to the proposed IGF-CNN, and the ANN then classifies the input into Covid-19, non-Covid-19, and pneumonia classes. The suggested model requires considerably fewer parameters, reduces training time substantially, and achieves accuracies of 99.80%, 98.78%, and 99.0%, respectively, across three different benchmark datasets. We have also performed cross-dataset validation and obtained consistently good results, further demonstrating the robustness of the proposed approach. The proposed architecture is accurate and efficient and can be integrated with real-time systems.
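
One plausible reading of the iterative Gaussian feature extractor is repeated smoothing with residuals stacked as extra channels for the downstream 3-layer CNN, sketched below; the number of iterations and the sigma value are illustrative guesses, not the authors' settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def iterative_gaussian_features(image: np.ndarray, iterations: int = 3, sigma: float = 1.5):
    """image: 2-D array -> (iterations*2 + 1, H, W) stack of blurred images and residuals."""
    feats, current = [image], image
    for _ in range(iterations):
        blurred = gaussian_filter(current, sigma=sigma)
        feats.append(blurred)             # progressively smoother structure
        feats.append(current - blurred)   # band-pass residual at this scale
        current = blurred
    return np.stack(feats, axis=0)

xray = np.random.rand(224, 224).astype(np.float32)
stack = iterative_gaussian_features(xray)
print(stack.shape)   # (7, 224, 224) -> channels for the downstream 3-layer CNN
```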

Citations: 0
MediFusionNet: A Novel Architecture for Multimodal Medical Image Analysis
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-03 | DOI: 10.1002/ima.70246
Muneeb A. Khan, Heemin Park, Dashdorj Yamkhin, Seonuck Paek

Medical image analysis typically employs modality-specific approaches, limiting comprehensive diagnostic capabilities. We introduce MediFusionNet, a deep learning architecture unifying brain MRI and chest X-ray analysis through specialized encoding, cross-modality attention, and uncertainty-aware predictions. Our architecture preserves modality-specific features while enabling knowledge transfer between anatomically distinct regions. Experiments demonstrate 97.73% overall accuracy, significantly outperforming specialized single-modality networks and existing multimodal approaches. MediFusionNet demonstrates positive cross-modal knowledge transfer (CMKTB, +2.61%) while providing calibrated uncertainty estimates (Expected Calibration Error, ECE = 0.027). This uncertainty quantification facilitates clinically meaningful workflow optimization, automatically processing 82.5% of cases with 99.3% accuracy. Ablation studies quantify each architectural component's contribution, providing insights for robust, uncertainty-aware, multimodal medical image analysis systems.
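
The calibration figure quoted here (Expected Calibration Error, ECE = 0.027) follows the standard binned definition, which the short function below computes; the 10-bin choice is a common convention and an assumption.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """confidences: predicted max-probabilities in [0, 1]; correct: 0/1 array of hits."""
    confidences, correct = np.asarray(confidences), np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap      # weight the gap by the fraction of samples in the bin
    return ece

conf = np.array([0.95, 0.80, 0.70, 0.99, 0.60])
hit = np.array([1, 1, 0, 1, 1])
print(round(expected_calibration_error(conf, hit), 3))
```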

Citations: 0
U-Net-Based Fetal Head Circumference Segmentation With Synthetic-Driven Generation Data Augmentation
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-11-01 | DOI: 10.1002/ima.70244
Niama Assia El Joudi, Mohamed Lazaar, François Delmotte, Hamid Allaoui, Oussama Mahboub

In modern healthcare, deep learning methods have gained considerable attention for medical image analysis over the last few years, achieving reliable outcomes in various segmentation tasks. However, these models rely heavily on large, high-quality annotated samples, which remain scarce and difficult to acquire in several medical fields, particularly obstetrics and gynecology, limiting the ability of deep learning models to generalize effectively to unseen datasets. Therefore, this study proposes a two-stage framework that enhances segmentation performance by incorporating synthetic ultrasound image generation for fetal head segmentation and, from the segmentation, measures the head circumference. In this paper, we propose a novel two-stage pipeline designed to segment the fetal head and measure its circumference. Initially, an improved Deep Convolutional Generative Adversarial Network is employed to generate synthetic fetal head ultrasound images of high quality and high structural similarity to real ones. Preliminary annotations were obtained through automated segmentation using a lightweight U-Net, followed by a refinement phase. These annotations were incorporated progressively into U-Net training to enhance the model's performance and effectiveness. The proposed framework was evaluated on the HC18 Grand Challenge dataset. The GAN-based synthetic images achieved a Peak Signal-to-Noise Ratio of 56 and a Structural Similarity Index Measure of 0.99, demonstrating the diversity and fidelity of the images and thus improving the U-Net segmentation Dice score by 1.62%. Compared to prior works using the same dataset, our model achieved the highest Dice coefficient of 98%, a Jaccard Index of 96.11%, and a Hausdorff Distance of 0.329 mm on the test set. Our lightweight U-Net, combined with GAN-based data augmentation, effectively addresses the challenge of data scarcity and enhances fetal head segmentation with precise delineation, providing a robust solution for the clinical application of early fetal anomaly detection and prenatal diagnosis.
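
The overlap metrics reported above (Dice and Jaccard) are computed from binary masks as in the sketch below; the smoothing epsilon is a common convention, not necessarily the paper's choice.

```python
import numpy as np

def dice_and_jaccard(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """pred, target: boolean or 0/1 masks of the fetal head region."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    jaccard = (inter + eps) / (np.logical_or(pred, target).sum() + eps)
    return dice, jaccard

# Toy example with two overlapping rectangular masks standing in for head segmentations.
pred = np.zeros((256, 256), dtype=bool); pred[60:200, 70:190] = True
gt = np.zeros((256, 256), dtype=bool);  gt[64:204, 66:186] = True
print(dice_and_jaccard(pred, gt))
```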

Citations: 0
Enhanced Breast Cancer Detection From Thermal Images Using DNN and Explainable AI
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-31 | DOI: 10.1002/ima.70243
Mukesh Prasanna, M. Abirami, R. Nithya, B. Santhi, G. R. Brindha, Muthu Thiruvengadam

Breast cancer remains a major global health issue among women, and early detection is essential for effective treatment and better survival rates. This study presents a non-invasive method for early breast cancer detection using thermal imaging and Explainable Artificial Intelligence (XAI) techniques. Thermal imaging is a radiation-free, safe, and comfortable substitute for mammography. It captures the heat generated by cancerous tissues, which results from their higher metabolic activity. Deep learning models are trained on these images to detect the features that distinguish the classes. The best models before and after segmentation were VGG19 and ResNet, with accuracies of 95.2% and 96.8%, respectively. XAI techniques, specifically DeepSHAP and Local Interpretable Model-agnostic Explanations (LIME), were applied to improve model interpretability and confidence by detecting the image regions revealing abnormalities. The model outcomes and explanations were then used to validate the reliability of YOLOv8's predictions for locating an abnormal contour. To address dataset scalability, a larger segmented dataset of 760 images was used with a novel method, PenFeatNet, which achieved 97.6% accuracy. This approach improved accuracy by 1.3% by isolating feature extraction from classification, reducing architectural complexity. These findings provide new avenues for further research and validation, potentially revolutionizing breast cancer screening.
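
A transfer-learning baseline in the spirit of the VGG19 model mentioned here could be set up as below; freezing the pretrained features and retraining only a new two-class head is an assumption about the training recipe, not a description of the authors' pipeline.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

model = vgg19(weights=VGG19_Weights.IMAGENET1K_V1)
for p in model.parameters():          # freeze the pretrained feature extractor
    p.requires_grad = False
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)   # new 2-class head

optimizer = torch.optim.Adam(model.classifier[6].parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One toy step on random "thermograms"; real inputs would be normalized 3-channel crops.
images = torch.rand(4, 3, 224, 224)
labels = torch.tensor([0, 1, 1, 0])
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(float(loss))
```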

Citations: 0
A Novel Noise Removal and Interpretable Deep Learning Model for Diabetic Retinopathy Detection
IF 2.5 | CAS Tier 4, Computer Science | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-10-31 | DOI: 10.1002/ima.70245
Sultan Alanazi, Sajid Ullah Khan, Faisal M. Alotaibi, Mohammed Alonazi

Diabetic retinopathy (DR) is a leading cause of visual impairment and blindness in individuals with diabetes worldwide. Timely detection of DR is essential to prevent vision loss in diabetics. However, noise and limited model transparency often compromise the accuracy of diagnosing retinal fundus images. Noise and interpretability are the two main challenges in imaging datasets, overshadowing concerns such as class imbalance or device variability. These distortions are present across datasets and devices, reducing the clarity of diagnostic signals at the pixel level and often obscuring early lesions within background noise. To address these challenges, this research introduces an innovative model called Explainable MINet-ViT, which combines advanced noise reduction techniques with explainable deep learning for more reliable identification of DR. The model incorporates a multi-level denoising network (MINet), augmented by a noise-specific pre-processing module that uses a Variance-Stabilizing Transform (VST) and deep residual feature mapping. A hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs) is employed to extract both local and global spatial information. We apply explainability strategies, such as Grad-CAM and SHAP, to ensure clinical interpretability by identifying the crucial retinal regions that influence model predictions. Quantitative and qualitative results show improved performance, robustness, and clinical applicability, achieving an accuracy of 97.6%, a sensitivity of 0.96, a specificity of 0.97, a Kappa of 0.92, and an AUC of 96.7%. Analyses on standard datasets reveal that the proposed model outperforms prior models in accuracy, noise robustness, and interpretability, rendering it well suited for real-world clinical applications.
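
The Variance-Stabilizing Transform mentioned here is often realized with the Anscombe transform for Poisson-like noise; the sketch below uses that choice as an illustrative assumption rather than the paper's exact VST.

```python
import numpy as np

def anscombe(x: np.ndarray) -> np.ndarray:
    """Map Poisson-distributed counts to roughly unit-variance, Gaussian-like values."""
    return 2.0 * np.sqrt(np.maximum(x, 0.0) + 3.0 / 8.0)

def inverse_anscombe(y: np.ndarray) -> np.ndarray:
    """Simple algebraic inverse (an unbiased inverse exists but is more involved)."""
    return (y / 2.0) ** 2 - 3.0 / 8.0

# Toy fundus-like photon counts: after the transform the variance is close to one.
fundus_counts = np.random.poisson(lam=30.0, size=(64, 64)).astype(float)
stabilized = anscombe(fundus_counts)
print(round(stabilized.var(), 2), round(inverse_anscombe(stabilized).mean(), 1))
```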

Citations: 0