
Latest Publications in International Journal of Imaging Systems and Technology

Automated Lumbar Disc Intensity Classification From MRI Scans Using Region-Based CNNs and Transformer Models
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-16 DOI: 10.1002/ima.70229
Hasan Ulutas, Mustafa Fatih Erkoc, Erdal Ozbay, Muhammet Emin Sahin, Mucella Ozbay Karakus, Esra Yuce

This study explores the effectiveness of deep learning methodologies in the detection and classification of lumbar disc intensity using MRI scans. Initially, region-based deep learning frameworks, including Faster R-CNN and Mask R-CNN with different backbones such as ResNet50 and ResNet101, are evaluated. Results demonstrate that backbone selection significantly impacts model performance, with Mask R-CNN combined with ResNet101 achieving a remarkable mAP@0.5 (AP50) of 99.83%. In addition to object detection models, Transformer-based classification architectures, including MaxViT, Vision Transformer (ViT), a Hybrid CNN-ViT model, and a Fine-Tuned Enhanced Pyramid Network (FT-EPN), are implemented. Among these, the Hybrid model achieved the highest classification accuracy (83.1%), while MaxViT yielded the highest precision (0.804). Comparative analyses highlighted that while Mask R-CNN models excelled in segmentation and detection tasks, Transformer-based models provided effective solutions for direct severity classification of lumbar discs. These findings emphasize the critical role of both backbone architecture and model type in optimizing diagnostic performance. The study demonstrates the potential of integrating region-based and Transformer-based models in advancing automated lumbar spine assessment, paving the way for more accurate and reliable medical diagnostic systems.
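
As a hedged illustration of the region-based baseline described above, the sketch below builds an off-the-shelf Mask R-CNN with a ResNet-50 FPN backbone via torchvision (the paper also evaluates ResNet101, which torchvision does not bundle for Mask R-CNN). The number of classes is a hypothetical placeholder, not the authors' label set.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Hypothetical label set: background + five disc-intensity grades.
num_classes = 6
model = maskrcnn_resnet50_fpn(weights=None, num_classes=num_classes)
model.eval()

# A dummy 3-channel MRI slice with values in [0, 1].
image = torch.rand(3, 512, 512)
with torch.no_grad():
    prediction = model([image])[0]  # dict with 'boxes', 'labels', 'scores', 'masks'
print(prediction["boxes"].shape, prediction["scores"].shape)
```

An AP50 figure such as the one reported above would then be computed by matching the predicted boxes against ground-truth annotations at an IoU threshold of 0.5.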

Citations: 0
A Novel Network With Spectrum Transformer and Triplet Attention for CT Image Segmentation
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-16 DOI: 10.1002/ima.70221
Ju Zhang, Jiahao Yu, Changgang Ying, Yun Cheng, Fanghong Wang

Deep learning-based methods have achieved great progress in CT image segmentation in recent years. However, the lack of unified large-scale datasets, unbalanced categories of segmented images, blurred boundaries between infected and healthy regions, and varying sizes and shapes of lesions mean that existing methods still face challenges in further improving segmentation accuracy in medical applications. In this work, a novel network with spectrum Transformer and triplet attention for CT image segmentation is proposed. The spectrum Transformer module and the triplet attention module (TAM) are fused in a parallel way using the parallel hybrid fusion module (PHFM), which extracts global and local contextual information from sequence features and cross-dimensional features. A spectrum Transformer block (STB) is proposed, which utilizes the fast Fourier transform (FFT) to learn the weights of each frequency component in the spectral space. Extensive comparison experiments are conducted on both the COVID-19-Seg dataset and MosMedData, showing that the proposed network achieves better accuracy than most existing methods for CT image segmentation tasks. In particular, on the two datasets the proposed model improves DSC by 1.29% and 2.28% and mIoU by 1.15% and 1.35%, respectively. SEN metrics increase by 1.45% and 1.48%, respectively. PRE also achieves the best results, showing a significant advantage in accurately segmenting medical images. SPE also shows quite good results on both datasets. Ablation studies show that the proposed STB and TAM modules improve segmentation performance significantly.
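
A minimal sketch of the idea behind such a spectrum block: transform features with a real-valued FFT, scale each frequency bin with a learnable weight, and transform back. Shapes, normalization, and the module name are assumptions for illustration, not the authors' exact STB.

```python
import torch
import torch.nn as nn

class SpectrumBlock(nn.Module):
    """Learn a multiplicative weight for every frequency bin of the feature map."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # rfft2 keeps width // 2 + 1 bins along the last axis.
        self.weight = nn.Parameter(torch.ones(channels, height, width // 2 + 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        spec = torch.fft.rfft2(x, norm="ortho")   # (B, C, H, W//2+1), complex
        spec = spec * self.weight                 # learned per-frequency scaling
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")

x = torch.randn(2, 64, 32, 32)
print(SpectrumBlock(64, 32, 32)(x).shape)  # torch.Size([2, 64, 32, 32])
```

Because the weights act directly on frequency bins, the block can amplify or suppress global frequency patterns that a purely spatial convolution would only capture with a very large receptive field.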

Citations: 0
Magnetic Resonance Imaging Guidance Method of Low-Intensity Transcranial Focused Ultrasound for Deep Brain Neuromodulation
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-15 DOI: 10.1002/ima.70226
Tao Zhang, Dechen Kong, Neil Roberts, Xiaoqi Huang, Jiayu Zhu, Yijing Dong, Qiang He, Haoyang Xing, Qiyong Gong

Low-Intensity Transcranial Focused Ultrasound (LITFU) is a highly anticipated non-invasive neuromodulation technique in terms of targeting accuracy, spatial resolution, stimulation depth, and reversibility. However, LITFU still faces challenges, most notably reduced precision in deep tissue targeting and limited stimulation depth due to distortion of the acoustic field by the skull. This study proposes a method for using a rectangular concave ultrasonic transducer to achieve deep brain stimulation with LITFU under Magnetic Resonance Imaging Acoustic Radiation Force Imaging guidance. First, k-Wave simulation modeling is performed using skull magnetic resonance imaging (MRI) with an ultrashort echo time (UTE) sequence to calculate the acoustic pressure distribution and its maximum value within the tissue. Then, the time required for maximum tissue displacement is calculated using a biological tissue elastic deformation model. Based on the displacement and time, the motion encoding gradient is optimized to obtain a more distinct phase contrast. We validated the feasibility of this method on water-filled balloons and ex vivo porcine brains, provided the optimized ultrasonic delay time coefficient, and achieved precise LITFU focusing on the amygdala located deep within the brains of human volunteers. The results demonstrate that the rectangular transducer achieves a maximum stimulation depth of 80 mm with a maximum peak acoustic pressure of 3 MPa. The simulated and ARFI-measured acoustic fields were compared, and their consistency and differences were demonstrated. The improved EPI-ARFI significantly enhances measurement sensitivity and reduces scanning time to 24 s, and three-dimensional imaging can be achieved. This work provides a localization and verification method for MRI-guided LITFU, supporting LITFU-based deep brain stimulation in future clinical applications.
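
As background intuition only (not the authors' code), the phase imprinted by a motion-encoding gradient on tissue displaced by x(t) is phi = gamma * integral of G(t) x(t) dt, which is why the encoding is timed to the moment of peak radiation-force displacement. The toy numbers below (gradient strength, displacement curve) are invented for illustration.

```python
import numpy as np

gamma = 2 * np.pi * 42.58e6              # 1H gyromagnetic ratio, rad/s/T
t = np.linspace(0, 10e-3, 1000)          # 10 ms encoding window
G = np.where(t < 5e-3, 20e-3, -20e-3)    # bipolar gradient, +/- 20 mT/m (in T/m)
x = 2e-6 * (1 - np.exp(-t / 3e-3))       # hypothetical displacement ramp toward ~2 um

phi = gamma * np.trapz(G * x, t)         # accumulated phase in radians
print(f"phase contrast ~ {phi:.4f} rad")
```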

Citations: 0
Enhancing Lung Cancer Screening With an Automated Deep Learning System: A Resource-Efficient Approach
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-15 DOI: 10.1002/ima.70225
Md. Rafiqul Islam, Md. Noman Hasan Sarker, Md. Kudrat-E-Shahinur Miah, Md. Hasibul Hossain

Lung cancer significantly contributes to cancer mortality worldwide, and prompt diagnosis is essential for enhancing patient outcomes. Identifying the condition at an early stage remains challenging, particularly in regions with limited medical facilities and experienced radiologists. The paper aims to introduce a fully automated deep learning (DL) system capable of identifying, segmenting, and classifying lung cancer at an early phase. This approach aims to enhance the precision and efficacy of lung cancer screening in resource-constrained environments. The recommended architecture has three stages: (1) lung preprocessing, employing a bilateral filter to enhance image quality; (2) lung segmentation, utilizing Otsu's thresholding to delineate lung areas; and (3) lung cancer classification, implementing a modified CNN referred to as the P-Model. The framework was assessed utilizing the publicly available IQ-OTH/NCCD dataset. The proposed framework exhibited high performance metrics, with lung segmentation accuracy at 96%, classification precision at 97%, sensitivity at 96%, and an F1 score of 96%. Additionally, the Matthews correlation coefficient was recorded at 0.9406. The findings indicate that our paradigm surpasses previous studies across all pertinent evaluation metrics. The proposed deep learning approach shows considerable potential for precise detection and classification of lung cancer, particularly in areas with limited resources. This study significantly advances the field of lung cancer detection and has the capacity to enhance healthcare outcomes and patient care standards globally.
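
A brief sketch of what the first two stages could look like with OpenCV: bilateral filtering for denoising followed by Otsu's threshold for a coarse lung mask. The filter parameters and the random stand-in image are assumptions, not the paper's settings.

```python
import cv2
import numpy as np

ct_slice = (np.random.rand(256, 256) * 255).astype(np.uint8)  # stand-in for a CT slice

# Stage 1: edge-preserving denoising with a bilateral filter.
smoothed = cv2.bilateralFilter(ct_slice, d=9, sigmaColor=75, sigmaSpace=75)

# Stage 2: coarse lung mask via Otsu's automatic threshold.
_, lung_mask = cv2.threshold(smoothed, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Stage 3 (classification by the "P-Model" CNN) would consume the masked region,
# e.g. p_model.predict(...) on a resized, normalized crop (p_model is hypothetical).
print(smoothed.shape, lung_mask.dtype)
```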

Citations: 0
A Hybrid Deep Learning Approach for Enhanced Classification of Lung Pathologies From Chest X-Ray
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-15 DOI: 10.1002/ima.70227
Samira Sajed, Habib Rostami, Jorge Esparteiro Garcia, Ahmad Keshavarz, Andreia Teixeira

The increasing global burden of lung diseases necessitates the development of improved diagnostic tools. According to the WHO, hundreds of millions of individuals worldwide are currently affected by various forms of lung disease. The rapid advancement of artificial neural networks has revolutionized lung disease diagnosis, enabling the development of highly effective detection and classification systems. This article presents a dual-channel neural network for image feature extraction, based on a classical CNN and vision transformers, for multi-label lung disease diagnosis. Two separate subnetworks are employed to capture both global and local feature representations, thereby facilitating the extraction of more informative and discriminative image features. The global network analyzes all-organ regions, while the local network simultaneously focuses on multiple single-organ regions. We then apply a novel feature fusion operation, leveraging a multi-head attention mechanism to weight global features according to the significance of localized features. Through this multi-channel approach, the framework is designed to identify complicated and subtle features within images, which often go unnoticed by the human eye. Evaluation on the ChestX-ray14 benchmark dataset demonstrates that our hybrid model consistently outperforms established state-of-the-art architectures, including ResNet-50, DenseNet-121, and CheXNet, by achieving significantly higher AUC scores across multiple thoracic disease classification tasks. By incorporating test-time augmentation, the model achieved an average accuracy of 95.7% and a specificity of 99%. The experimental findings indicated that our model attained an average testing AUC of 87%. In addition, our method tackles a more practical clinical problem, and preliminary results suggest its feasibility and effectiveness. It could assist clinicians in making timely decisions about lung diseases.
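
One plausible way to realize the described fusion step, using multi-head attention so that local (single-organ) features re-weight the global representation, is sketched below. Feature dimensions, the query/key assignment, and the classifier head are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

d_model = 256
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)

global_feats = torch.randn(4, 1, d_model)  # one pooled vector per chest X-ray
local_feats = torch.randn(4, 5, d_model)   # five single-organ / region vectors

# Local features query the global representation; the attended output is pooled and
# concatenated with the global vector before the multi-label head (14 ChestX-ray14 labels).
fused, _ = attn(query=local_feats, key=global_feats, value=global_feats)
head = nn.Linear(2 * d_model, 14)
logits = head(torch.cat([fused.mean(dim=1), global_feats.squeeze(1)], dim=-1))
print(logits.shape)  # torch.Size([4, 14])
```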

Citations: 0
Adaptive Transfer Learning for Surgical Tool Presence Detection in Laparoscopic Videos Through Gradual Freezing Fine-Tuning
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-14 DOI: 10.1002/ima.70218
Ana Davila, Jacinto Colan, Yasuhisa Hasegawa

Minimally invasive surgery can benefit significantly from automated surgical tool detection, enabling advanced analysis and assistance. However, the limited availability of annotated data in surgical settings poses a challenge for training robust deep learning models. This paper introduces a novel staged adaptive fine-tuning approach consisting of two steps: a linear probing stage to condition additional classification layers on a pre-trained CNN-based architecture and a gradual freezing stage to dynamically reduce the fine-tunable layers, aiming to regulate adaptation to the surgical domain. This strategy reduces network complexity and improves efficiency, requiring only a single training loop and eliminating the need for multiple iterations. We validated our method on the Cholec80 dataset, employing CNN architectures (ResNet-50 and DenseNet-121) pre-trained on ImageNet for detecting surgical tools in cholecystectomy endoscopic videos. Our results demonstrate that our method improves detection performance compared to existing approaches and established fine-tuning techniques, achieving a mean average precision (mAP) of 96.4%. To assess its broader applicability, the generalizability of the fine-tuning strategy was further confirmed on the CATARACTS dataset, a distinct domain of minimally invasive ophthalmic surgery. These findings suggest that gradual freezing fine-tuning is a promising technique for improving tool presence detection in diverse surgical procedures and may have broader applications in general image classification tasks.
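
The two-stage idea, linear probing of a new head followed by fine-tuning with progressively frozen shallow blocks, can be sketched as follows. The ResNet-50 stage list and the one-stage-per-epoch schedule are assumptions for illustration, not the authors' exact policy.

```python
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, 7)  # e.g., 7 tool-presence labels in Cholec80

# Stage 1: linear probing, train only the new head.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith("fc")

# Stage 2: gradual freezing, unfreeze everything, then re-freeze shallow stages over epochs.
stages = ["conv1", "bn1", "layer1", "layer2", "layer3"]

def apply_gradual_freezing(epoch: int) -> None:
    for p in model.parameters():
        p.requires_grad = True
    for stage in stages[: min(epoch, len(stages))]:
        for p in getattr(model, stage).parameters():
            p.requires_grad = False

apply_gradual_freezing(epoch=2)
print(sum(p.requires_grad for p in model.parameters()))  # trainable tensors remaining
```

Because the set of trainable layers shrinks as training proceeds, later epochs touch fewer parameters, which is consistent with the efficiency argument made in the abstract.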

Citations: 0
MaFMatch: Semi-Supervised Medical Image Segmentation Network Based on Mixed Data and Feature Augmentation
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-13 DOI: 10.1002/ima.70228
Jianwu Long, Yuwei Li

Medical image segmentation plays a crucial role in biomedical engineering and computer-aided medical systems. Fully supervised medical image segmentation algorithms have achieved significant performance improvements. However, these improvements often rely on large amounts of finely annotated data, while semi-supervised medical image segmentation can better address this issue. In this work, we propose the MaFMatch model based on the principle of weak-to-strong consistency. It effectively addresses the limitations of current semi-supervised medical image segmentation, such as noise generated by perturbation leading the model to learn in unfavorable directions, and simple feature perturbations being insufficient to explore a broader perturbation space. On the one hand, this approach introduces a mixed data perturbation flow to utilize all pixel information, making the model inclined to consider global semantic information rather than simply discarding unreliable pixels. On the other hand, to fully utilize feature perturbation information flow, we propose a feature augmentation perturbation scheme that simultaneously supplements and discards information from the original feature flow, enabling the model to effectively overcome the diminishing marginal returns brought about by multi-branch perturbations. MaFMatch achieved an mDsc of 90.8 on the automatic cardiac diagnosis challenge (ACDC) dataset. It outperforms most methods across major metrics on both ACDC and LA datasets. The code is available at https://github.com/HandsomeRed/MaFMatch-main.
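
A rough sketch of the weak-to-strong consistency idea with a CutMix-style mixed data perturbation on unlabeled images: predictions from a weakly augmented view supervise the mixed, strongly perturbed view. The fixed mixing box and the soft-target loss are assumptions for illustration, not the MaFMatch implementation.

```python
import torch
import torch.nn.functional as F

def mixed_consistency_loss(model, weak_batch, strong_batch):
    with torch.no_grad():
        pseudo = model(weak_batch).softmax(dim=1)   # teacher targets, (B, C, H, W)

    # Mix pairs of strong views (and their pseudo-labels) with the same fixed box.
    b, _, h, w = strong_batch.shape
    perm = torch.randperm(b)
    y0, x0, hh, ww = h // 4, w // 4, h // 2, w // 2
    mixed = strong_batch.clone()
    mixed[:, :, y0:y0 + hh, x0:x0 + ww] = strong_batch[perm, :, y0:y0 + hh, x0:x0 + ww]
    target = pseudo.clone()
    target[:, :, y0:y0 + hh, x0:x0 + ww] = pseudo[perm, :, y0:y0 + hh, x0:x0 + ww]

    return F.cross_entropy(model(mixed), target)    # soft-target (probability) loss

net = torch.nn.Conv2d(1, 3, kernel_size=3, padding=1)  # stand-in segmentation "model"
weak = torch.rand(4, 1, 32, 32)
strong = weak + 0.1 * torch.randn_like(weak)
print(mixed_consistency_loss(net, weak, strong).item())
```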

Citations: 0
Indexers Should Actively Support the Fight Against Paper Mills
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-13 DOI: 10.1002/ima.70230
Mohamed L. Seghier
Fake publications, proliferated by paper mills, are a symptom of the modern number-centric academia [1, 2]. The recent Richardson et al. study found that, while fraudulent papers make up a small portion of all publications, the scale of fraud has increased at a shocking rate, not suspected by many [3]. Specifically, the conclusion of Richardson et al.'s study is alarming: that is, "the ability to evade interventions is enabling the number of fraudulent publications to grow at a rate far outpacing that of legitimate science" [3]. All this is reinforced by a degree of impunity that perpetrators enjoy, leaving researchers disillusioned about the authenticity of existing scientific evidence. The question on the lips of all stakeholders is this: if it is still 'easy' to publish fake papers, why has academia allowed the problem to persist and undermine scholarly communication? Here, we discuss this problem from a pragmatic perspective, while stressing that this real problem should not be weaponized against science [4].

Fraud is neither new nor specific to academia. For example, academia has been wrestling with different types of fraudulent activities for decades, including cheating in exams, bogus colleges, forged degrees, and doctored CVs. Fraud in scholarly communication should be examined comprehensively within this broader context. The core underlying issue is that not all researchers have the necessary skills or resources to maintain high levels of research productivity. Consequently, some individuals may resort to unethical practices to achieve high h-index scores and publication counts like those of well-supported researchers at leading institutions, yet without investing equivalent effort. If scholarly communication is vulnerable to corruption [5], and the consequences for being caught are not harsh, why not game the system?

I believe here is where the problem lies: the whole purpose of fraudulent activities is to inflate research metrics through manipulation and fraud [6]. Yet, manipulations and fraud are rarely fed back to the system to adjust these metrics. This is why a solution must actively involve the entities that calculate and promote such research metrics: the indexers, like journal indexing and university ranking agencies. Specifically, indexers can impact individuals (and institutions) who engage in fraudulent practices by hurting their research metrics.

Scholarly communication was profoundly shaped by the introduction of research metrics, originally packaged as objective quantitative measures of research quality and impact [7]. These metrics are widely adopted in academia despite their known limitations and inherent biases [8, 9]. But academia also knew that these metrics can be manipulated, as they soon become bad metrics (Goodhart's law), making the system vulnerable to corruption (Campbell's law). Yet, aca
Citations: 0
Adaptive Dual-Model Federated Learning for Generalizable Brain Tumor Segmentation
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-12 DOI: 10.1002/ima.70223
Abdul Raheem, Zhen Yang, Malik Abdul Manan, Shahzad Ahmed, Fahad Sabah

Accurate segmentation of brain tumors in magnetic resonance imaging (MRI) is critical for diagnosis, treatment planning, and longitudinal monitoring. The development of robust and generalizable deep learning models for tumor segmentation is hindered by challenges such as data privacy, limited annotations, and domain variability across clinical institutions. To address these issues, we propose a dual-model federated learning framework for brain tumor segmentation that enables collaborative model training without sharing patient data. The framework employs two specialized architectures: a Multi-Scale Encoder U-Net (MSE-UNet) for fine-grained, multi-resolution feature extraction and a Residual Attention Transpose U-Net (ART-UNet) that leverages residual learning and dual attention mechanisms to enhance contextual sensitivity and robustness under non-IID conditions. To ensure effective learning across distributed, heterogeneous clients, we introduce a Dual-Model Architecture-Aware Aggregation (DAAA) strategy, which performs independent, performance-weighted aggregation of each architecture's updates. The proposed method is evaluated on two benchmark datasets, BraTS 2018 and TCGA-LGG, demonstrating superior performance compared to several state-of-the-art baselines. The model achieves Dice scores of 91.30% and 90.10% on BraTS and TCGA-LGG, respectively, with improved IoU, sensitivity, and boundary precision. Ablation studies confirm that each component, including auxiliary supervision, architectural duality, and adaptive aggregation, contributes significantly to overall performance. Clinically, this framework offers a scalable and privacy-preserving solution that can be integrated into real-world healthcare systems without compromising patient data security. By enabling cross-institutional collaboration and ensuring robust performance across diverse imaging protocols, the proposed model facilitates early and accurate tumor delineation, supporting radiologists in critical decision-making processes such as surgical planning, radiotherapy guidance, and follow-up assessment. This study establishes a foundation for practical deployment in federated clinical environments, particularly within resource-constrained or privacy-sensitive institutions.
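
A compact sketch of performance-weighted aggregation for one of the two architectures: each client's update is weighted by a validation score before averaging. Weighting by Dice, rather than by client sample count, is an assumption for illustration and not necessarily the DAAA rule.

```python
import torch

def aggregate(client_states, scores):
    """Average state_dicts from clients sharing one architecture, weighted by score."""
    weights = torch.tensor(scores, dtype=torch.float32)
    weights = weights / weights.sum()
    merged = {}
    for key in client_states[0]:
        stacked = torch.stack([s[key].float() for s in client_states], dim=0)
        shape = (-1,) + (1,) * (stacked.dim() - 1)
        merged[key] = (weights.view(shape) * stacked).sum(dim=0)
    return merged

m1, m2 = torch.nn.Linear(4, 2), torch.nn.Linear(4, 2)  # stand-ins for client models
global_state = aggregate([m1.state_dict(), m2.state_dict()], scores=[0.913, 0.88])
print(sorted(global_state))  # ['bias', 'weight']
```

In a dual-model setup, this step would run once per architecture, so MSE-UNet and ART-UNet clients never get mixed into a single average.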

Citations: 0
Skin Cancer Classification in Dermoscopic Images Using Multi-Scale Feature Map Fusion Based on Deep Learning
IF 2.5 Tier 4 Computer Science Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-10-10 DOI: 10.1002/ima.70219
Arvind Singh Rajpoot, Rahul Dixit, Anupam Shukla

Skin cancer is a prevalent and potentially life-threatening condition that requires accurate and timely diagnosis. Dermoscopic imaging aids in diagnosis but is often limited by manual interpretation, which can be time-intensive and error-prone. Automated classification systems using convolutional neural networks (CNNs) have shown significant promise in enhancing accuracy and efficiency. To further improve classification performance, we propose a novel deep learning-based Multi-Scale Feature Map Fusion (MFMF) model that extracts and fuses features from multiple convolutional layers. The MFMF module effectively combines these multi-scale features, enabling robust feature capture even with limited datasets. Additionally, a hair removal algorithm is proposed to enhance prediction accuracy through improved image preprocessing. On the HAM10000 dataset, our proposed model achieves an overall accuracy of 96.12%, an AUC of 98.7%, and an F1 Score of 93.29%, along with notable improvements in precision, recall, and sensitivity.
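
The hair-removal step can be illustrated with a common DullRazor-style recipe: detect dark strands with a blackhat morphology filter and inpaint them. Kernel size and threshold are assumptions, not the paper's values.

```python
import cv2
import numpy as np

def remove_hair(bgr_image: np.ndarray) -> np.ndarray:
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (17, 17))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)  # dark, thin structures
    _, mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
    return cv2.inpaint(bgr_image, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

lesion = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)  # stand-in dermoscopic image
print(remove_hair(lesion).shape)  # (224, 224, 3)
```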

Citations: 0