
International Journal of Imaging Systems and Technology: Latest Publications

Non-Invasive Prediction of Axillary Lymph Node Metastasis in Breast Cancer Using Combined Clinical, Radiomics, and Deep Learning Features
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-08 | DOI: 10.1002/ima.70248
Jing Chen, Xiaoying Qiu, Yun Zheng, Yuehui Liao, Weiji Yang, Jionghui Gu, Chunhong Yan, Lang Meng, Jing Cheng, Tian'an Jiang, Xiaobo Lai

Axillary lymph node (ALN) metastasis is a critical prognostic factor in breast cancer; accurate preoperative evaluation is crucial for guiding treatment decisions. This study aims to develop and validate a combined model integrating clinical, radiomics, and deep learning (DL) features derived from preoperative ultrasound images to non-invasively predict ALN metastasis in breast cancer patients, while systematically comparing the predictive value of features derived from different peritumoral regions and multiple feature sources. A total of 431 breast cancer patients, with axillary lymph node dissection pathology serving as the gold standard, were retrospectively enrolled and randomly assigned to training (n = 301) and test (n = 130) sets. Clinical features, radiomics features, and deep learning features (extracted from the penultimate layer of a pre-trained ResNet50 convolutional neural network using global average pooling) were obtained from intratumoral regions, combined intratumoral and peritumoral regions (1–3 mm margins), and whole ultrasound images. Machine learning models were constructed separately for clinical, radiomics, DL, and combined feature sets. Models were evaluated using the area under the curve (AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV). Clinical models demonstrated limited predictive performance (test set AUC: 0.591–0.611). Radiomics and DL models showed improved performance, particularly when including peritumoral information. A random forest-based combined model integrating clinical features, radiomics features (intratumoral+3-mm peritumoral region), and DL features yielded the highest performance, achieving a test set AUC of 0.869, with SEN/SPE/PPV/NPV values of 0.886/0.856/0.856/0.839, respectively. Additionally, the model demonstrated robust predictive accuracy across molecular subtypes. 
The combined model integrating clinical, radiomics, and deep learning features from intratumoral and peritumoral ultrasound images effectively and non-invasively predicts ALN metastasis in breast cancer patients. This approach shows potential for clinical decision support, may help reduce unnecessary sentinel lymph node biopsies, and shows promise for generalizability across different molecular subtypes.
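The deep-feature step above collapses the penultimate ResNet50 activation into a fixed-length vector via global average pooling. A minimal NumPy sketch of that pooling, using a synthetic (2048, 7, 7) array in place of a real pretrained network's output:

```python
import numpy as np

def global_average_pool(feature_map: np.ndarray) -> np.ndarray:
    """Collapse each channel's spatial map to its mean value.

    feature_map: array of shape (C, H, W), e.g. the (2048, 7, 7)
    activation of ResNet50's penultimate stage.
    Returns a (C,) feature vector.
    """
    return feature_map.mean(axis=(1, 2))

# Mock activation standing in for a ResNet50 penultimate-layer output.
fmap = np.zeros((2048, 7, 7))
fmap[0] = 1.0          # channel 0 uniformly active
features = global_average_pool(fmap)
print(features.shape)  # (2048,)
print(features[0])     # 1.0
```

In the actual pipeline the feature map would come from a pretrained ResNet50 with its classification head removed; the array here is only a stand-in.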

Citations: 0
Mistral in Radiology: AI-Powered Classification of Normal and Abnormal Reports
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-08 | DOI: 10.1002/ima.70251
Pilar López-Úbeda, Teodoro Martín-Noguerol, Antonio Luna

This study investigates the potential of the Mistral large language model (LLM) to classify radiological reports as normal or abnormal using three techniques: Zero-Shot Learning (ZSL), Few-Shot Learning (FSL), and Fine-Tuning (FT), aiming to optimize radiology workflows and improve clinical decision-making. The dataset consisted of 124 807 radiology reports from MRI and CT scans conducted between 1 May 2024, and 1 November 2024, at our institution. After applying inclusion and exclusion criteria, 123 296 reports were selected for analysis. The Mistral LLM was tested with ZSL, FSL, and FT techniques. Quantitative metrics, including precision, recall, F1 score, and accuracy, were calculated for each technique. Confusion matrices and qualitative analyses of misclassified cases were also performed. ZSL yielded the lowest performance, with an F1 score of 0.191 for the normal class and an overall accuracy of 0.438, due to a high false-positive rate. FSL improved accuracy to 0.806 but still showed limitations in classifying normal reports (F1 = 0.404). FT achieved the best results, with F1 scores above 0.98 for both classes and an overall accuracy of 0.998, minimizing false positives and false negatives. Classifying radiological reports as normal or abnormal is crucial for prioritizing urgent cases and optimizing workflows. The Mistral LLM, particularly with Fine-Tuning, demonstrated strong potential for automating this task, outperforming ZSL and FSL.
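The reported precision, recall, F1, and accuracy all follow from a binary confusion matrix. A self-contained sketch using illustrative counts (not the paper's data) that mimic the high false-positive rate described for zero-shot classification:

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Precision, recall, F1 and accuracy from binary confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Illustrative counts only: many false positives, mimicking the
# zero-shot failure mode described above.
p, r, f1, acc = classification_metrics(tp=20, fp=80, fn=5, tn=45)
print(round(f1, 3), round(acc, 3))  # 0.32 0.433
```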

Citations: 0
Convolutional Autoencoder Effect on Parallel Magnetic Resonance Imaging
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-05 | DOI: 10.1002/ima.70254
Amel Korti, Amène Bekki, Alain Lalande

Parallel magnetic resonance imaging (pMRI) reduces MRI acquisition time, with Sensitivity Encoding (SENSE) being a widely used method that exploits coil sensitivity maps for efficient image reconstruction. However, SENSE can introduce aliasing and noise artifacts, especially at high acceleration factors. To address this limitation, we propose a deep learning-based postprocessing framework that enhances SENSE-reconstructed images using a Convolutional Autoencoder (CAE). The CAE is applied after the SENSE reconstruction to reduce artifacts and improve image quality, without modifying the original reconstruction pipeline. A dataset of 842 fully sampled anatomical images is used to simulate 8-channel coil data, with both uniform and variable-density (VD) undersampling applied at different acceleration factors. The CAE is trained on paired inputs (SENSE-reconstructed images) and targets (fully sampled references) to learn the mapping from degraded to high-quality images. Quantitative evaluation using Peak Signal-to-Noise Ratio (PSNR) and Normalized Mean Squared Error (NMSE), along with qualitative visual assessment, shows that the proposed CAE-SENSE framework significantly improves image fidelity, particularly with variable-density undersampling. These results demonstrate the potential of deep learning as a complementary tool to enhance conventional parallel imaging methods in accelerated MRI.
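PSNR and NMSE, the two quantitative metrics used above, are computed against the fully sampled reference. A short NumPy sketch, with random arrays standing in for reconstructed images:

```python
import numpy as np

def psnr(ref: np.ndarray, img: np.ndarray, data_range: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB against a fully sampled reference."""
    mse = np.mean((ref - img) ** 2)
    return float(10 * np.log10(data_range ** 2 / mse))

def nmse(ref: np.ndarray, img: np.ndarray) -> float:
    """Mean squared error normalised by the reference energy."""
    return float(np.sum((ref - img) ** 2) / np.sum(ref ** 2))

rng = np.random.default_rng(0)
ref = rng.random((64, 64))                           # stand-in reference image
noisy = ref + 0.01 * rng.standard_normal((64, 64))   # stand-in reconstruction
print(psnr(ref, noisy) > psnr(ref, ref + 0.1))  # True: less distortion, higher PSNR
```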

Citations: 0
IGF-CNN: An Optimized Deep Learning Model for Covid-19 Classification
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-03 | DOI: 10.1002/ima.70247
Vinayak Tiwari, Sidharrth Kumar Singh, Umaisa Hassan, Amit Singhal

Recent advancements in deep learning and the use of pre-trained convolutional neural network (CNN) architectures have led to improvements in classification tasks. However, these architectures often entail millions of training parameters, posing challenges for real-world deployment. In this work, we propose an iterative Gaussian feature extractor with a custom 3-layer CNN network (IGF-CNN) coupled with a feedforward artificial neural network (ANN) classifier. The input images undergo pre-processing before being fed to the proposed IGF-CNN, and the ANN then classifies the input into Covid-19, non-Covid-19, and pneumonia classes. The suggested model requires considerably fewer parameters, reduces training time substantially, and achieves accuracies of 99.80%, 98.78%, and 99.0%, respectively, across three different benchmark datasets. We also performed cross-dataset validation and obtained consistently good results, further demonstrating the robustness of the proposed approach. The proposed architecture is accurate and efficient and can be integrated with real-time systems.
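The abstract does not detail the iterative Gaussian feature extractor, so the following is only a hypothetical simplification of the general idea: stack progressively Gaussian-smoothed versions of the input as candidate feature maps.

```python
import numpy as np

def gaussian_kernel1d(sigma: float, radius: int) -> np.ndarray:
    """Normalised 1-D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_blur(img: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Separable Gaussian filter with reflect padding (no border darkening)."""
    radius = int(3 * sigma)
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img, radius, mode="reflect")              # (H+2r, W+2r)
    rows = np.apply_along_axis(
        lambda row: np.convolve(row, k, mode="valid"), 1, padded)   # (H+2r, W)
    return np.apply_along_axis(
        lambda col: np.convolve(col, k, mode="valid"), 0, rows)     # (H, W)

def iterative_gaussian_features(img: np.ndarray, n_iter: int = 3) -> np.ndarray:
    """Stack of progressively smoothed images as candidate feature maps."""
    feats, cur = [], img
    for _ in range(n_iter):
        cur = gaussian_blur(cur)
        feats.append(cur)
    return np.stack(feats)                                    # (n_iter, H, W)

img = np.random.default_rng(1).random((32, 32))
feats = iterative_gaussian_features(img)
print(feats.shape)  # (3, 32, 32)
```

Each successive pass reduces high-frequency variation, so later feature maps emphasise coarser structure; the authors' actual extractor may differ.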

Citations: 0
MediFusionNet: A Novel Architecture for Multimodal Medical Image Analysis
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-03 | DOI: 10.1002/ima.70246
Muneeb A. Khan, Heemin Park, Dashdorj Yamkhin, Seonuck Paek

Medical image analysis typically employs modality-specific approaches, limiting comprehensive diagnostic capabilities. We introduce MediFusionNet, a deep learning architecture unifying brain MRI and chest X-ray analysis through specialized encoding, cross-modality attention, and uncertainty-aware predictions. Our architecture preserves modality-specific features while enabling knowledge transfer between anatomically distinct regions. Experiments demonstrate 97.73% overall accuracy, significantly outperforming specialized single-modality networks and existing multimodal approaches. MediFusionNet demonstrates positive cross-modal knowledge transfer (CMKTB, +2.61%) while providing calibrated uncertainty estimates (Expected Calibration Error, ECE = 0.027). This uncertainty quantification facilitates clinically meaningful workflow optimization, automatically processing 82.5% of cases with 99.3% accuracy. Ablation studies quantify each architectural component's contribution, providing insights for robust, uncertainty-aware, multimodal medical image analysis systems.
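Expected Calibration Error (reported above as ECE = 0.027) bins predictions by confidence and takes a weighted average of the gap between mean confidence and accuracy in each bin. A minimal sketch with a perfectly calibrated toy batch:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: bin-weighted gap between mean confidence and accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap   # weight by fraction of samples in bin
    return float(ece)

# Perfectly calibrated toy batch: 75% of the 0.75-confidence cases are right.
conf = [0.75] * 20
corr = [1] * 15 + [0] * 5
print(expected_calibration_error(conf, corr))  # 0.0
```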

Citations: 0
U-Net-Based Fetal Head Circumference Segmentation With Synthetic-Driven Generation Data Augmentation
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-11-01 | DOI: 10.1002/ima.70244
Niama Assia El Joudi, Mohamed Lazaar, François Delmotte, Hamid Allaoui, Oussama Mahboub

In modern healthcare, deep learning methods have attracted considerable attention for medical image analysis over the last few years, achieving reliable outcomes in various segmentation tasks. However, these models rely heavily on large, high-quality annotated datasets, which remain scarce and hard to acquire in several medical fields, particularly obstetrics and gynecology, limiting deep learning models' ability to generalize to unseen datasets. Therefore, this study proposes a novel two-stage framework that enhances segmentation performance by incorporating synthetic ultrasound image generation for fetal head segmentation and subsequent head-circumference measurement. Initially, an improved Deep Convolutional Generative Adversarial Network is employed to generate synthetic fetal head ultrasound images, producing high-quality images with high structural similarity to real ones. Preliminary annotations were obtained through automated segmentation using a lightweight U-Net, followed by a refinement phase. These annotations were incorporated progressively into U-Net training to improve the model's performance and effectiveness. The proposed framework was evaluated on the HC18 Grand Challenge dataset. The GAN-based synthetic images achieved a Peak Signal-to-Noise Ratio of 56 and a Structural Similarity Index Measure of 0.99, demonstrating both the diversity of the generated images and their similarity to real ones, and improving the U-Net segmentation Dice score by 1.62%. Compared to prior works using the same dataset, our model achieved the highest Dice Coefficient of 98%, a Jaccard Index of 96.11%, and a Hausdorff Distance of 0.329 mm on the test set.
Our lightweight U-Net, combined with GAN-based data augmentation, effectively addresses the challenge of data scarcity and enhances the segmentation of the fetal head with precise delineation, providing a robust solution in clinical application for early fetal anomaly detection and prenatal diagnosis.
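The Dice Coefficient and Jaccard Index quoted above measure overlap between predicted and ground-truth masks. A short NumPy sketch with two synthetic 8x8 masks:

```python
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def jaccard(a: np.ndarray, b: np.ndarray) -> float:
    """Jaccard index (IoU); related to Dice by J = D / (2 - D)."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union

pred = np.zeros((8, 8), bool); pred[2:6, 2:6] = True   # 16 px predicted
gt = np.zeros((8, 8), bool);   gt[3:7, 3:7] = True     # 16 px truth, 9 px overlap
print(round(dice(pred, gt), 3), round(jaccard(pred, gt), 3))  # 0.562 0.391
```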

Citations: 0
Enhanced Breast Cancer Detection From Thermal Images Using DNN and Explainable AI
IF 2.5 | CAS Tier 4, Computer Science | Q2, Engineering, Electrical & Electronic | Pub Date: 2025-10-31 | DOI: 10.1002/ima.70243
Mukesh Prasanna, M. Abirami, R. Nithya, B. Santhi, G. R. Brindha, Muthu Thiruvengadam

Breast cancer remains a major global health issue among women, and early detection is essential for effective treatment and better survival rates. This study presents a non-invasive method for early breast cancer detection using thermal imaging and Explainable Artificial Intelligence (XAI) techniques. Thermal imaging is a radiation-free, safe, and comfortable alternative to mammography; it captures the excess heat generated by cancerous tissues due to their higher metabolic activity. Deep learning models are trained on these images to learn the discriminative features between classes. The best models before and after segmentation were VGG19 and ResNet, with accuracies of 95.2% and 96.8%, respectively. XAI techniques, specifically DeepSHAP and Local Interpretable Model-agnostic Explanations (LIME), were applied to improve model interpretability and confidence by highlighting the image regions that reveal abnormalities. The model outcomes and explanations were then used to validate the reliability of the predictions made by YOLOv8 for finding an abnormal contour. To address dataset scalability, a larger segmented dataset of 760 images was applied to a novel method, PenFeatNet, which achieved 97.6% accuracy. This approach improved accuracy by 1.3% by isolating feature extraction from classification, reducing architectural complexity. These findings provide new avenues for further research and validation, potentially revolutionizing breast cancer screening.
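As a toy illustration of the physical premise (cancerous tissue runs warmer than its surroundings), a simple percentile threshold can isolate candidate hot regions in a thermogram. This is a hypothetical pre-processing sketch, not the segmentation method used in the paper:

```python
import numpy as np

def hot_region_mask(thermal: np.ndarray, percentile: float = 90.0) -> np.ndarray:
    """Binary mask of the warmest pixels (hypothetical pre-segmentation step)."""
    return thermal >= np.percentile(thermal, percentile)

# Synthetic "thermogram": uniform background with one warm patch.
frame = np.full((50, 50), 30.0)   # 30 C background
frame[20:25, 20:25] = 34.0        # warm 5x5 patch
mask = hot_region_mask(frame, percentile=99.0)
print(mask.sum())  # 25 (only the warm patch survives)
```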

Enhanced Breast Cancer Detection From Thermal Images Using DNN and Explainable AI
IF 2.5 CAS Q4 (Computer Science) Pub Date: 2025-10-31 DOI: 10.1002/ima.70243
Mukesh Prasanna, M. Abirami, R. Nithya, B. Santhi, G. R. Brindha, Muthu Thiruvengadam

Breast cancer remains a major global health issue among women, and early detection is essential for effective treatment and better survival rates. This study presents a non-invasive method for early breast cancer detection using thermal imaging and Explainable Artificial Intelligence (XAI) techniques. Thermal imaging is a radiation-free, safe, and comfortable substitute for mammography; it captures the heat generated by cancerous tissues, which arises from their higher metabolic activity. Deep learning models are trained on these images to detect the features that distinguish the classes. The best models before and after segmentation were VGG19 and ResNet, with accuracies of 95.2% and 96.8%, respectively. XAI techniques, specifically DeepSHAP and Local Interpretable Model-agnostic Explanations (LIME), were applied to improve model interpretability and confidence by detecting the regions of an image that reveal abnormalities. The model outcomes and explanations were then used to validate the reliability of the predictions made by YOLOv8 when locating abnormal contours. To address dataset scalability, a larger segmented dataset of 760 images was used with a novel method, PenFeatNet, which achieved 97.6% accuracy. By isolating feature extraction from classification, this approach reduced architectural complexity and improved accuracy by 1.3%. These findings open new avenues for further research and validation, potentially revolutionizing breast cancer screening.
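The abstract above relies on XAI attribution (DeepSHAP, LIME) to highlight the image regions driving a prediction. As a minimal illustration of the shared underlying idea — perturb a region and measure how the model's score changes — here is an occlusion-sensitivity sketch in plain NumPy; the `toy_predict` classifier, patch size, and synthetic image are stand-ins for illustration, not the paper's VGG19/ResNet models.

```python
import numpy as np

def occlusion_map(image, predict_prob, patch=8, baseline=0.0):
    """Occlusion sensitivity: slide a masked patch over the image and
    record the drop in the classifier's score at each position.
    Large drops mark regions the model relies on (the same intuition
    behind LIME/SHAP region attribution)."""
    h, w = image.shape
    heat = np.zeros((h // patch, w // patch))
    base_score = predict_prob(image)
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline
            heat[i // patch, j // patch] = base_score - predict_prob(occluded)
    return heat

# Toy classifier (assumption, not the paper's model): score is the mean
# intensity of a fixed "suspicious" window.
def toy_predict(img):
    return img[8:16, 8:16].mean()

img = np.zeros((32, 32))
img[8:16, 8:16] = 1.0  # simulated warm region in a thermal image
heat = occlusion_map(img, toy_predict)
# The cell covering the warm region shows the largest score drop.
print(np.unravel_index(heat.argmax(), heat.shape))  # → (1, 1)
```

Masking the warm region zeroes the toy score, so only that cell gets a nonzero attribution; on a real model the map is dense but the hottest cells play the same role as LIME's highlighted superpixels.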
Citations: 0
A Novel Noise Removal and Interpretable Deep Learning Model for Diabetic Retinopathy Detection
IF 2.5 CAS Q4 (Computer Science) JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-31 DOI: 10.1002/ima.70245
Sultan Alanazi, Sajid Ullah Khan, Faisal M. Alotaibi, Mohammed Alonazi

Diabetic retinopathy (DR) is a primary reason for visual impairment and blindness in individuals with diabetes worldwide. Timely detection of DR is essential to prevent vision loss in diabetics. However, noise and limited model transparency often compromise the accuracy of diagnosing retinal fundus images. Noise and interpretability are the two main challenges occurring in imaging datasets, overshadowing concerns such as class imbalance or device variability. These distortions are present in all datasets and devices, reducing the clarity of diagnostic signals at the pixel level and often obscuring early lesions within background noise. Addressing these challenges, this research introduces an innovative model called Explainable MINet-ViT, which combines advanced noise reduction techniques with explainable deep learning for more reliable identification of DR. The model incorporates a multi-level denoising network (MINet), modified by a noise-specific pre-processing module using a Variance-Stabilizing Transform (VST) and deep residual feature mapping. A hybrid deep learning architecture that combines Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs) is employed to extract both local and global spatial information. We apply explainability strategies, such as Grad-CAM and SHAP, to ensure clinical interpretability by identifying the crucial retinal regions that influence model predictions. Quantitative and qualitative results show improved performance, robustness, and clinical applicability, achieving an accuracy of 97.6%, a sensitivity of 0.96, a specificity of 0.97, a Kappa of 0.92, and an AUC of 96.7%. Analyses of standard datasets reveal that our proposed model outperforms prior models in accuracy, noise robustness, and interpretability, rendering it exceptionally suitable for real-world clinical applications.
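The abstract mentions a noise-specific pre-processing module built on a Variance-Stabilizing Transform (VST) but does not specify which transform is used. A common choice for Poisson-dominated imaging noise is the Anscombe transform, sketched below as an assumption: after applying it, the variance of Poisson counts is approximately 1 regardless of the underlying intensity, which is exactly the "stabilizing" property a denoiser can exploit.

```python
import numpy as np

def anscombe(x):
    """Anscombe variance-stabilizing transform: for Poisson-distributed
    counts x, A(x) = 2*sqrt(x + 3/8) has variance close to 1
    independent of the underlying mean intensity."""
    return 2.0 * np.sqrt(np.asarray(x, dtype=float) + 3.0 / 8.0)

rng = np.random.default_rng(0)
for lam in (5.0, 50.0, 500.0):
    counts = rng.poisson(lam, size=200_000)
    # Raw variance grows with intensity (≈ lam); transformed variance ≈ 1.
    print(round(counts.var(), 1), round(anscombe(counts).var(), 2))
```

After stabilization, a single noise model (unit-variance, approximately Gaussian) covers bright and dark retinal regions alike, which is why such a transform pairs naturally with a learned denoiser like the MINet described above.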

Citations: 0
Respiratory Differencing: Enhancing Pulmonary Thermal Ablation Evaluation Through Pre- and Intraoperative Image Fusion
IF 2.5 CAS Q4 (Computer Science) JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-31 DOI: 10.1002/ima.70242
Wan Li, Wei Li, Hengmo Rong, Yutao Rao, Hui Tang, Yudong Zhang, Feng Wang

CT-guided thermal ablation is increasingly being used for the treatment of lung cancer; however, follow-up studies indicate that physicians' subjective intraoperative assessments often overestimate ablation success, potentially leading to incomplete treatment. To address this limitation, we developed Respiratory Differencing, a CT image-based intraoperative assistance system designed to improve ablation evaluation. The system first segments tumor regions in preoperative CT scans and then applies a multistage registration strategy to align them with intra- or postoperative CT/CBCT images, compensating for respiratory motion and treatment-induced anatomical changes. The system provides two key outputs. First, differential images are generated by subtracting the registered preoperative scan from the intraoperative scan, enabling direct visualization and quantitative comparison of pre- and posttreatment regions. These registered images, together with tumor masks, allow physicians to assess the spatial relationship between tumor and ablation zones—even when the tumor is no longer visible in postablation scans. Second, the system computes a quantitative Ablation Effectiveness Scale (AES) that measures the spatial discrepancy between the tumor region and the ablation zone, offering an objective index of treatment adequacy. By accounting for complex pulmonary deformations and integrating pre- and intraoperative data, this system enhances quality control in ablation procedures. In a retrospective study of 35 clinical cases, Respiratory Differencing significantly outperformed conventional subjective assessment in detecting under-ablation during or immediately after treatment, underscoring its potential to improve intraoperative decision-making and patient outcomes.
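The two outputs described above — voxel-wise differencing of registered scans, and a spatial tumor-versus-ablation discrepancy index — can be sketched as follows. The paper's AES formula is not given in this abstract, so `residual_tumor_fraction` below is an illustrative stand-in (the fraction of tumor voxels left outside the ablation mask), not the published metric.

```python
import numpy as np

def difference_image(intra_ct, pre_ct_registered):
    """Voxel-wise subtraction of the registered preoperative scan from the
    intraoperative scan, highlighting treatment-induced change."""
    return intra_ct.astype(float) - pre_ct_registered.astype(float)

def residual_tumor_fraction(tumor_mask, ablation_mask):
    """Illustrative adequacy index (assumption, not the paper's AES):
    fraction of tumor voxels NOT covered by the ablation zone.
    0.0 means the ablation zone fully covers the tumor."""
    tumor = tumor_mask.astype(bool)
    uncovered = tumor & ~ablation_mask.astype(bool)
    return uncovered.sum() / max(tumor.sum(), 1)

# Toy single-slice example: 10x10 tumor, ablation zone missing a 2-column strip.
tumor = np.zeros((64, 64), bool); tumor[20:30, 20:30] = True
ablation = np.zeros((64, 64), bool); ablation[20:30, 20:28] = True
print(residual_tumor_fraction(tumor, ablation))  # → 0.2 (20 of 100 voxels uncovered)
```

In the real system the subtraction only becomes meaningful after the multistage registration compensates for respiratory motion; on unregistered scans the difference image would be dominated by breathing displacement rather than ablation change.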

Citations: 0
Adaptive Binary Focal Loss: Enhancing Radiograph Image Classification With Balanced Specificity and Sensitivity
IF 2.5 CAS Q4 (Computer Science) JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-31 DOI: 10.1002/ima.70238
Gokaramaiah Thota, Nagaraju Karinagappa, Sathya Babu Korra
Convolutional neural networks (CNNs) are widely used to classify radiograph images, including those of musculoskeletal disorders (MSD) of the upper extremity (the shoulder, elbow, wrist, and hand, which provide movement, strength, and fine motor skills). However, their performance is often limited by class imbalance and the presence of hard samples. Although approaches such as ensemble models, capsule networks, and grouped regularised CNNs can address these issues, they require substantial computational resources. Adopting a suitable loss function, by contrast, requires no additional computational overhead. Focal loss prioritises hard samples (samples that are not easy to classify) but simultaneously suppresses the gradients for easy samples, which impairs learning. This can reduce accuracy and create an imbalance between sensitivity and specificity, an undesirable outcome in medical diagnostics. To overcome these limitations, adaptive binary focal loss (ABFL) is proposed here, which combines the strengths of binary cross-entropy and focal loss to achieve balanced learning between easy and hard samples. A balance parameter, λ, is introduced to adaptively weigh the contributions of binary cross-entropy and focal loss. This approach is further extended to multi-class classification tasks through the proposed adaptive categorical focal loss (ACFL). In addition, a procedure is introduced to automatically tune the three key hyperparameters λ, γ, and β based on the characteristics of the dataset, eliminating the need for manual intervention. ABFL and ACFL are compared with seven existing loss functions using DenseNet-169 and Inception-v3 on musculoskeletal radiograph images (MURA), a digital database for screening mammography (DDSM), and a garbage classification dataset. Compared to focal loss, Cohen's kappa score improved by 33.70% with ABFL on the MURA finger dataset. Similarly, ACFL achieved improvements of 58.07% and 20.23% on the DDSM and garbage datasets, respectively, while maintaining balanced sensitivity and specificity. These results show the robustness and effectiveness of both ABFL and ACFL in handling class imbalance and hard samples in CNN-based classification.
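The abstract describes ABFL as a λ-weighted combination of binary cross-entropy and focal loss but does not give the exact formula. The convex combination below is a plausible sketch under that assumption (the β hyperparameter's role is not detailed in the abstract and is omitted here).

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy per sample; keeps gradients on easy samples."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def focal(p, y, gamma=2.0, eps=1e-7):
    """Focal loss per sample: (1 - p_t)^gamma down-weights easy samples."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)  # probability of the true class
    return -((1 - pt) ** gamma) * np.log(pt)

def abfl(p, y, lam=0.5, gamma=2.0):
    """Sketch of adaptive binary focal loss (assumed form): a lambda-weighted
    blend of BCE and focal loss, balancing easy- and hard-sample learning."""
    return lam * bce(p, y) + (1 - lam) * focal(p, y, gamma)

p = np.array([0.9, 0.6, 0.2])  # predicted probabilities for the positive class
y = np.array([1.0, 1.0, 1.0])  # all positives: easy, medium, hard samples
print(np.round(abfl(p, y), 3))
```

With λ = 1 this reduces to plain BCE and with λ = 0 to plain focal loss; the paper's tuning procedure would pick λ and γ from dataset characteristics rather than fixing them as above.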
Citations: 0