
International Journal of Imaging Systems and Technology: Latest Articles

Hybrid MRUNet+: Enhanced Multi-Structure Retinal Segmentation for Optic Cup, Disc, and Vascular Features in Complex Imaging Conditions
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-16 | DOI: 10.1002/ima.70293
Abdul Qadir Khan, Guangmin Sun, Anas Bilal, Abdulkareem Alzahrani, Abdullah Almuhaimeed

Precise segmentation of critical retinal structures, including the optic cup, optic disc, and vascular bifurcation sites, is essential for the early identification and management of severe ocular conditions such as glaucoma, diabetic retinopathy, hypertensive retinopathy, and age-related macular degeneration. This study proposes a hybrid multi-attention residual-based UNet (MRUNet+), a deep learning (DL) model developed to address the complex nature of retinal image processing using an advanced attention mechanism to enhance segmentation precision. MRUNet+ was trained on multiple high-quality datasets, including DRIVE, CHASE_DB1, Drishti-GS1, RIM-ONE, and REFUGE1, enabling it to address prevalent imaging issues such as variations in contrast and resolution and the presence of abnormalities. The model exhibited markedly enhanced performance compared to recent methodologies, with improved accuracy, Dice coefficient, and intersection over union (IoU), particularly in low-contrast or complex anatomical regions. Due to its adaptability and precision, MRUNet+ has the potential to function as a pivotal instrument in clinical settings, providing automated retinal assessments that aid in the prompt detection of various retinal and vascular pathologies, thereby improving patient management.
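The reported Dice coefficient and intersection over union are standard overlap metrics for segmentation masks; the paper's evaluation code is not given here, so the following is a minimal NumPy sketch of how these two scores are typically computed for a binary mask (the mask shapes and random placeholders are assumptions, not the authors' data).

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Compute Dice coefficient and IoU for binary segmentation masks.

    `pred` and `target` are boolean/0-1 arrays of the same shape
    (e.g., a predicted optic-disc mask vs. its ground-truth annotation).
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (np.logical_or(pred, target).sum() + eps)
    return float(dice), float(iou)

# Example: evaluate one predicted mask against a ground-truth mask
pred_mask = np.random.rand(512, 512) > 0.5   # placeholder prediction
gt_mask = np.random.rand(512, 512) > 0.5     # placeholder ground truth
print(dice_and_iou(pred_mask, gt_mask))
```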

Citations: 0
T2TCAN: A Twin-Token Transformer for Interpretable Grading of Invasive Ductal Carcinoma
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70296
Vivek Harshey, Amar Partap Singh Pharwaha

Accurate grading of invasive ductal carcinoma (IDC) remains difficult even for trained pathologists. The patient's therapeutic schedule depends on the observed grade, emphasizing the importance of accurate tissue grading. We propose a novel architecture called Twin Tokens Talking-heads Class Attention Net (T2TCAN), designed to grade IDC images. Based on the transformer architecture, this design enhances the model's ability to capture local and global tissue features. The model was trained on the PathoIDCG dataset, outperforming existing methods in terms of standard metrics. Further, introducing class and twin tokens late in the model, along with talking-heads class attention and per-channel scaling, made T2TCAN effective at distinguishing tissue grades. Our model outperformed conventional CNN-based IDC grading approaches, reporting the highest accuracy (98.82%), AUC (0.99), precision (98.83%), recall (98.80%), and F1-score (98.81%) on the PathoIDCG dataset. We plotted class attention maps for the TCT layers to provide insights into the model's decision-making process. These attention maps also help pathologists understand the discriminative features that contribute significantly to prognostic scoring. Further, our model performed well on a separate IDC grading dataset, producing state-of-the-art results and thus demonstrating its efficacy. The designed model shows adaptability and suitability for integration into the healthcare workflow for IDC grading.
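The abstract names talking-heads class attention as a key ingredient but includes no code. Below is a hedged PyTorch sketch of the general idea (a class token querying the patch tokens, with learned mixing of attention logits across heads), not the authors' T2TCAN implementation; all dimensions and module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TalkingHeadsClassAttention(nn.Module):
    """Minimal sketch: the class token attends to patch tokens, with learned
    mixing of attention logits across heads before and after the softmax."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        # talking heads: mix logits across the head dimension pre/post softmax
        self.pre_mix = nn.Linear(num_heads, num_heads, bias=False)
        self.post_mix = nn.Linear(num_heads, num_heads, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1 + num_patches, dim); token 0 is the class token
        b, n, d = x.shape
        q = self.q(x[:, :1]).reshape(b, 1, self.num_heads, self.head_dim).transpose(1, 2)
        kv = self.kv(x).reshape(b, n, 2, self.num_heads, self.head_dim).permute(2, 0, 3, 1, 4)
        k, v = kv[0], kv[1]                              # (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale    # (b, heads, 1, n)
        attn = self.pre_mix(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        attn = attn.softmax(dim=-1)
        attn = self.post_mix(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        cls = (attn @ v).transpose(1, 2).reshape(b, 1, d)
        return self.proj(cls)                            # updated class token

x = torch.randn(2, 197, 256)                             # 1 class token + 196 patch tokens
print(TalkingHeadsClassAttention(256)(x).shape)          # torch.Size([2, 1, 256])
```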

Citations: 0
Deep Fusion: A High-Performance AI Framework for Colorectal Cancer Grading
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70291
Muhammed Emin Bedir, Mesut Ersin Sönmez, Afife Uğuz, Burcu Sanal Yılmaz

Automated, precise histopathological grading of colorectal cancer (CRC) is vital for prognosis and treatment but is challenged by inter-observer variability and time demands. This study introduces and evaluates a novel deep learning framework for robust four-class grading of colorectal adenocarcinoma (Normal, Well, Moderately, and Poorly Differentiated), directly addressing limitations of prior studies focused on survival prediction or limited feature extraction. Our approach integrates Reinhard color normalization for stain variability, data augmentation for robustness, and synergistic feature extraction from fine-tuned MobileNetV2 and InceptionV3 Convolutional Neural Networks (CNNs). Computationally efficient classification was achieved using Principal Component Analysis (PCA) to reduce features to 100 components, followed by an optimized Support Vector Machine (SVM) with multiple kernel evaluations. The framework, with a polynomial kernel SVM, demonstrated superior performance, achieving 96.41% macro-averaged accuracy (95% CI, 95.8%–97.0%), with corresponding 0.9641 precision, recall, F1-score, and a 0.9915 macro area under the curve (AUC) in fivefold cross-validation. Polynomial and Radial Basis Function (RBF) kernels significantly outperformed others. The framework's primary contribution is a validated, synergistic feature fusion pipeline that leverages the complementary strengths of diverse CNN architectures and an optimized SVM classifier, representing a notable advancement in automated grading on clinically sourced data. This provides a comprehensive framework for multi-class CRC grading, showing considerable potential to enhance diagnostic accuracy, consistency, and efficiency.
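The described classification stage (fused CNN features, PCA to 100 components, kernel SVM) maps directly onto standard scikit-learn components. The sketch below assumes pre-extracted feature vectors and placeholder hyperparameters (feature width, degree, C); it illustrates the pipeline shape, not the authors' exact configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# deep_features: fused MobileNetV2 + InceptionV3 descriptors, one row per image (placeholder)
deep_features = np.random.rand(500, 2304)
labels = np.random.randint(0, 4, size=500)   # 4 grades: Normal / Well / Moderately / Poorly

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=100),                   # reduce fused features to 100 components
    SVC(kernel="poly", degree=3, C=1.0),     # polynomial-kernel SVM classifier
)
scores = cross_val_score(clf, deep_features, labels, cv=5, scoring="accuracy")
print(scores.mean())                         # fivefold cross-validated accuracy
```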

Citations: 0
Multi-Directional Context Modeling With HCSMIL: Enhancing Cancer Prediction and Subtype Classification From Whole Slide Images
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70287
Jingtao Qiu, Yucheng Liu

The Mamba model performs excellently in natural image processing but faces limitations in analyzing whole slide images (WSIs) for cancer prediction and subtype classification in digital pathology. Pathological images feature highly irregular lesion spatial distributions (especially complex small-lesion associations), while Mamba's inherent unidirectional or limited-direction scanning cannot effectively model such multi-dimensional spatial dependencies and therefore fails to capture key pathological structural features. To address this, we propose HCSMIL, an optimized Mamba-based framework tailored to clinical analysis of pathological images. It comprehensively captures local lesion spatial topology via multi-directional contextual modeling and integrates a multi-scale pyramid structure to extract global lesion distribution features, jointly enhancing diagnostic accuracy. Validation on authoritative datasets (Camelyon16, TCGA-LUNG, TCGA-Kidney) shows that HCSMIL significantly outperforms existing mainstream methods: on TCGA-LUNG, accuracy (ACC), F1 score, and AUC are 0.66%, 1.42%, and 1.25% higher than the second-best method; on TCGA-Kidney, these metrics increase by 1.47%, 0.09%, and 1.00%; on Camelyon16, ACC is 0.77% higher. Notably, HCSMIL achieves an 84% small-lesion recognition rate, substantially exceeding TransMIL (70.59%) and MambaMIL (64.71%), demonstrating its strength in capturing complexly distributed lesions and providing reliable technical support for cancer diagnosis.
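The core remedy described, multi-directional contextual modeling, amounts to scanning the 2-D grid of patch embeddings in several orders so that a sequence model sees each spatial neighbour from more than one direction. A minimal PyTorch sketch of such scan orders is shown below; the grid size, channel width, and the choice of four directions are assumptions, and the Mamba/state-space blocks themselves are not reproduced.

```python
import torch

def multi_directional_scans(patch_grid: torch.Tensor):
    """Flatten an (H, W, C) grid of patch embeddings into four 1-D sequences:
    row-major, reversed row-major, column-major, reversed column-major.
    Each sequence could then feed a separate state-space (Mamba-style) branch."""
    h, w, c = patch_grid.shape
    forward = patch_grid.reshape(h * w, c)                    # rows left->right, top->bottom
    backward = forward.flip(0)                                # reversed row-major order
    vertical = patch_grid.permute(1, 0, 2).reshape(h * w, c)  # columns top->bottom, left->right
    vertical_rev = vertical.flip(0)                           # reversed column-major order
    return [forward, backward, vertical, vertical_rev]

grid = torch.randn(16, 16, 512)        # 256 patch embeddings from one WSI region (placeholder)
for seq in multi_directional_scans(grid):
    print(seq.shape)                   # torch.Size([256, 512]) for each scan direction
```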

Citations: 0
Feature Derivative-Based Pixel Segmentation Method for Detecting Lung Tumors From CT Images
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-12 | DOI: 10.1002/ima.70281
S. P. Kavya, V. Seethalakshmi

Lung tumor segmentation using machine learning and artificial intelligence techniques improves diagnostic precision through accurate localization of the infected regions. The prominent factors are the features that reflect the infected region, concealed in patterns, boundaries, and edges. In this article, a novel Feature-Derivative Pixel Segmentation (FDPS) method is introduced to improve tumor segmentation accuracy affected by the disparity pixel distribution problem. The proposed method is assisted by tuneable recurrent learning (TRL) to vary the feature derivative count for varying segments. The learning inputs are modifiable using different extracted feature derivatives under parity and disparity pixel distributions. By identifying the maximum disparity pixels, the tuneable inputs for the recurrent learning are decided. The computation layer of the learning process identifies the maximally related regions under parity and disparity features. Such regions are segmented from multiple pixel distribution points up to the full image size. This process is iterated to identify the maximum conjoined features across different infected regions. The proposed method improves the accuracy by 9.63% and the true positive rate by 10.85%, and reduces the classification error by 10.06% for the maximum regions.
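As a rough illustration of what a pixel-level "feature derivative" can look like, the sketch below computes finite-difference derivatives of a per-pixel feature map and thresholds their magnitude to flag high-disparity pixels. This is only one plausible reading of the abstract, with an arbitrary threshold, and is not the FDPS/TRL method itself.

```python
import numpy as np

def feature_derivatives(feature_map: np.ndarray) -> np.ndarray:
    """Finite-difference derivatives of a per-pixel feature map: how sharply a
    feature changes between neighbouring pixels, which tends to highlight the
    boundaries of candidate tumor regions."""
    dy, dx = np.gradient(feature_map.astype(np.float64))
    return np.hypot(dx, dy)          # per-pixel derivative magnitude

ct_feature = np.random.rand(256, 256)                  # placeholder per-pixel feature map
deriv = feature_derivatives(ct_feature)
candidate = deriv > deriv.mean() + 2 * deriv.std()     # crude high-disparity pixel mask
print(candidate.sum(), "high-derivative pixels")
```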

Citations: 0
SDCSCF-Net: A High-Performance Spatial Channel Fusion Attention Network for Diabetic Retinopathy Classification
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-12 | DOI: 10.1002/ima.70290
Liwen Zhang, Baiyang Yang, Rongwei Xia, Qiang Zhang, Jinchan Wang

Diabetic retinopathy (DR) is a leading cause of blindness among individuals with diabetes. Timely diagnosis and precise classification of DR are essential for patients. However, the traditional diagnostic methods have limitations in precision, mainly relying on doctors' experiences and subjective judgments on DR images. Therefore, an efficient network model, named SDCSCF-Net, is proposed based on deep learning for DR diagnosis and classification. Firstly, the SE_Double_Conv (SDC) module is designed by integrating the Squeeze-and-Excitation (SE) attention mechanism into the first Double_Conv block of the encoder structure of U-Net to enhance feature representation and suppress redundant information. Secondly, a novel attention mechanism, spatial channel fusion attention (SCFA) module, is proposed to enhance the model's focus on lesion areas and the relationship between the channels, making the model more effectively distinguish subtle differences between adjacent DR classes. Finally, the proposed model is evaluated on the APTOS 2019 dataset, which contains 3662 fundus images. The results show that the proposed model demonstrates superior classification performance for DR compared to other existing approaches, and its accuracy, precision, recall, and F1-score for binary classification of DR are 99.18%, 99.47%, 98.98%, and 99.19%, respectively. For the five-class classification task, the model achieves an accuracy of 84.72%, a precision of 84.12%, a recall of 84.72%, and an F1-score of 84.02%. All the evaluation metrics are obtained from the testing phase of the model. In addition, the Grad-CAM technology is utilized to visualize the key lesion areas concerned by the model and further verifies the effectiveness of the proposed model. It is beneficial to promote the research and practical application in the intelligent diagnosis of DR.
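The SDC module combines a U-Net-style double convolution with Squeeze-and-Excitation (SE) channel attention. The SE mechanism itself is well documented, so the following PyTorch sketch shows a double 3 × 3 convolution followed by SE re-weighting; the reduction ratio and placement are assumptions rather than the paper's exact SDC design.

```python
import torch
import torch.nn as nn

class SEDoubleConv(nn.Module):
    """Sketch of a double 3x3 conv block followed by Squeeze-and-Excitation
    channel re-weighting, in the spirit of an SE-augmented Double_Conv block."""

    def __init__(self, in_ch: int, out_ch: int, reduction: int = 16):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )
        self.se = nn.Sequential(                     # squeeze (GAP) + excitation (two 1x1 convs)
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // reduction, out_ch, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.convs(x)
        return x * self.se(x)                        # channel-wise re-weighting

print(SEDoubleConv(3, 64)(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 64, 224, 224])
```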

Citations: 0
SlideInspect: From Pixel-Level Artifact Detection to Actionable Quality Metrics in Digital Pathology
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-11 | DOI: 10.1002/ima.70292
Manuela Scotto, Roberta Patti, Vincenzo L'imperio, Filippo Fraggetta, Filippo Molinari, Massimo Salvi

The presence of artifacts in whole slide images (WSIs), such as tissue folds, air bubbles, and out-of-focus regions, can significantly impact WSI digitization, pathologists' evaluation, and the accuracy of downstream analyses. We present SlideInspect, a novel AI-based framework for comprehensive artifact detection and quality control in digital pathology. Our system leverages deep learning techniques to segment multiple artifact types across diverse tissue types and staining methods. SlideInspect provides a hierarchical output: a color-coded slide quality indicator (green, yellow, red) with recommended actions (no action, re-scan, re-mount, re-cut) based on artifact type and extent, and pixel-level segmentation masks for detailed analysis. The system operates at multiple magnifications (1.25× for tissue segmentation, 5× for artifact detection) and also incorporates stain quality assessment for histological stain evaluation. We validated SlideInspect on a large, multi-centric, multi-scanner dataset of over 3000 WSIs, demonstrating robust performance across different tissue types, staining methods, and scanning platforms. The system achieves high segmentation accuracy for various artifacts while maintaining computational efficiency (average processing time: 72.7 s per WSI). Pathologist evaluations confirmed the clinical relevance and accuracy of SlideInspect's quality assessments. By providing actionable insights at multiple levels of granularity, SlideInspect significantly improves the efficiency and standardization of digital pathology workflows. Its vendor-agnostic design and multi-stain capability make it suitable for integration into diverse clinical and research settings.
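The hierarchical output boils down to mapping per-artifact tissue coverage to a traffic-light flag plus a recommended action. The Python sketch below illustrates that mapping with made-up thresholds and an assumed artifact-to-action table; the real SlideInspect rules are not published in this abstract.

```python
def slide_quality_indicator(artifact_fraction: dict) -> tuple:
    """Map per-artifact tissue coverage fractions to a traffic-light quality flag
    and a suggested action. Thresholds and the artifact-to-action mapping are
    illustrative assumptions, not the published SlideInspect rules."""
    actions = {"fold": "re-mount", "bubble": "re-mount",
               "blur": "re-scan", "cut_debris": "re-cut"}
    worst = max(artifact_fraction, key=artifact_fraction.get)  # dominant artifact type
    frac = artifact_fraction[worst]
    if frac < 0.05:
        return "green", "no action"
    if frac < 0.20:
        return "yellow", actions.get(worst, "re-scan")
    return "red", actions.get(worst, "re-scan")

print(slide_quality_indicator({"fold": 0.02, "bubble": 0.01, "blur": 0.12}))
# ('yellow', 're-scan')
```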

Citations: 0
Breast Tumor Detection via S-Parameter Contrast Using a 1 × 8 Miniaturized Metamaterial Antenna Array for UWB Microwave Imaging
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-11 | DOI: 10.1002/ima.70295
Sanaa Salama, Duaa Zyoud, Ashraf Abuelhaija, Muneera Altayeb, Ammar Al-Bassam

Due to the significant increase in breast cancer cases and the limitations of existing early-stage detection techniques, microwave imaging has emerged as a critical tool for the diagnosis of carcinogenic and malignant cells in various tissues. In this work, a 1 × 8 miniaturized metamaterial-based antenna array is conceived and developed for ultrawideband microwave imaging for early breast cancer diagnosis because of the improved accuracy it offers. The developed antenna array features a small dimension, a wide frequency band, a high gain, and broadside radiation properties. To achieve the wider bandwidth from 3.34 to 6.79 GHz, H-shaped unit cells and T-shaped feed network dimensions are optimized. The obtained wide bandwidth supports the generation of high-quality images. A partial ground plane structure is used to improve impedance matching and further enhance the bandwidth. The antenna performance is validated via numerical simulation and experimental measurements in free space. A numerical phantom with similar tissue properties is created with and without the tumor. Differences in the backscattered signals from the antenna array elements arise from the higher water content and larger dielectric constant of malignant cells compared to healthy ones, and these differences can be analyzed to identify the tumor. Here, eight antenna elements are arranged in a circle at a distance of 10 mm from the breast. The separation between adjacent antenna elements is 17 mm to reduce mutual coupling. Furthermore, the breast tissue is scanned at different angles. At any given time, one antenna is excited while the others operate in receiving mode. The collected signals are used to detect malignant cells. The existence of a tumor causes differences in the backscattered signals of the antenna elements. The absolute difference in transmission coefficients, with and without the presence of a tumor, is used to detect the existence of malignant cells. The suggested structure has demonstrated effective performance in microwave imaging using S-parameter contrast.
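Detection rests on the absolute difference of transmission coefficients measured with and without the tumor. The NumPy sketch below computes that contrast per antenna pair across the 3.34–6.79 GHz band and picks the pair with the largest deviation; the array sizes and the injected tumor signature are synthetic placeholders, not measured data.

```python
import numpy as np

def s_parameter_contrast(s21_with_tumor: np.ndarray, s21_reference: np.ndarray):
    """Absolute difference of transmission coefficients (|S21|, in dB) measured
    with and without a tumor phantom, per antenna pair and per frequency point.
    A pronounced contrast peak suggests a scatterer (tumor) along that path."""
    contrast = np.abs(s21_with_tumor - s21_reference)   # same shape: (pairs, freqs)
    peak_pair = contrast.max(axis=1).argmax()           # antenna pair with largest deviation
    return contrast, peak_pair

freqs = np.linspace(3.34e9, 6.79e9, 201)                # operating band of the array
ref = -40 + np.random.randn(28, freqs.size) * 0.1       # 8 antennas -> 28 Tx/Rx pairs (synthetic)
tum = ref.copy()
tum[5] += 1.5                                           # injected tumor signature on pair 5
contrast, pair = s_parameter_contrast(tum, ref)
print(pair, contrast[pair].max())
```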

Citations: 0
MorphoFormer: Dual-Branch Dilated Transformer With Pathological Prior Fusion for Cervical Cell Morphology Analysis
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-10 | DOI: 10.1002/ima.70273
Linhong Zhao, Xiao Shang, Zhenfeng Zhao, Yuhao Liu, Yueping Liu, Shenwen Wang

Cervical cancer is one of the most common malignant tumors among women worldwide, and accurate early diagnosis is critical for improving patient survival rates. Traditional cytological screening methods rely on manual microscopic examination, which suffers from low efficiency and high subjectivity. In recent years, deep learning has facilitated the automation of cervical cell image analysis, yet challenges such as insufficient modeling of pathological features and high computational cost remain. To address these issues, this study proposes a novel dual-branch multi-scale model, MorphoFormer. The model employs a multi-scale dilated Transformer (DilateFormer) as its backbone and innovatively incorporates specialized modules for each branch: A Local Context Aggregation (LCA) module in the local branch and a Global Focus Attention (GFA) module in the global branch. These modules respectively enhance the representation of local details and global semantics, and their features are fused to enable collaborative multi-scale information modeling. Experimental results on the publicly available SIPaKMeD dataset demonstrate that MorphoFormer achieves classification accuracies of 99.58%, 98.51%, and 98.14% for binary, three-class, and five-class tasks, respectively. Further validation on the Blood Cell Count and Detection (BCCD) dataset indicates strong cross-task robustness. Moreover, MorphoFormer requires only 8.22 GFLOPs for inference, highlighting its practical potential by achieving high performance with low computational overhead. Related codes: https://github.com/sijhb/MorphoFormer.
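MorphoFormer fuses a local-detail branch and a global-context branch before classification. As a toy sketch of that fusion step only (not the LCA/GFA modules themselves), the PyTorch snippet below concatenates the two pooled branch features and projects them to class logits; the feature width and five-class head are assumptions.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Toy sketch: fuse pooled features from a local-detail branch and a
    global-context branch into one representation before classification."""

    def __init__(self, dim: int, num_classes: int = 5):
        super().__init__()
        self.fuse = nn.Sequential(nn.Linear(2 * dim, dim), nn.GELU())
        self.head = nn.Linear(dim, num_classes)

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(torch.cat([local_feat, global_feat], dim=-1))
        return self.head(fused)

local_feat = torch.randn(4, 384)    # pooled output of the local branch (placeholder)
global_feat = torch.randn(4, 384)   # pooled output of the global branch (placeholder)
print(DualBranchFusion(384)(local_feat, global_feat).shape)   # torch.Size([4, 5])
```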

Citations: 0
Source-Free Domain Adaptive Fundus Image Segmentation With Multiscale Feature Fusion and Stepwise Attention Integration
IF 2.5 | Tier 4, Computer Science | Q2 Engineering, Electrical & Electronic | Pub Date: 2026-01-10 | DOI: 10.1002/ima.70285
Mingtao Liu, Yuxuan Li, Qingyun Huo, Zhengfei Li, Shunbo Hu, Qingman Ge

Traditional unsupervised domain adaptation methods usually depend on source domain data distribution for cross-domain alignment. However, direct access to source data is often restricted due to privacy concerns and intellectual property rights. Without using source data, Source-free unsupervised domain adaptation methods can align the pre-trained model with the target domain by generating pseudo-labels for target domain data, which are then used as labeled samples to guide transfer learning. However, methods that generate pseudo-labels solely through iterative averaging often neglect spatial correlations among pixels and are susceptible to noise, resulting in blurred label boundaries. To this end, we propose a source-free domain adaptation framework for fundus image segmentation, which consists of a Multiscale Feature Fusion module for generating high-quality pseudo-labels and a Stepwise Attention Integration module for enhancing model training. The Multiscale Feature Fusion module refines the initial pseudo-labels from the pre-trained model through neighborhood value filling, effectively reducing noise and sharpening label boundaries. The Stepwise Attention Integration module progressively integrates high-level and low-level feature information into the low-level representation. The fused features preserve high-resolution details and enrich semantic content, thereby substantially enhancing the model's recognition capability. Experimental results demonstrate that, without using any source domain images or modifying the pre-trained model, our method achieves performance comparable to or even surpassing state-of-the-art approaches.
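One simple way to realize the "neighborhood value filling" used to denoise pseudo-labels is a majority vote over each pixel's 3 × 3 neighborhood. The NumPy sketch below implements that for a binary mask; the window size and vote threshold are assumptions, not the paper's exact Multiscale Feature Fusion module.

```python
import numpy as np

def refine_pseudo_labels(pseudo: np.ndarray) -> np.ndarray:
    """Majority vote over each pixel's 3x3 neighbourhood, one simple form of
    neighbourhood value filling for denoising a binary pseudo-label mask."""
    h, w = pseudo.shape
    padded = np.pad(pseudo.astype(np.int32), 1, mode="edge")
    votes = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            votes += padded[dy:dy + h, dx:dx + w]      # sum over the 9 shifted windows
    return (votes >= 5).astype(pseudo.dtype)           # at least 5 of 9 neighbours agree

noisy = (np.random.rand(64, 64) > 0.5).astype(np.uint8)   # placeholder noisy pseudo-labels
print(refine_pseudo_labels(noisy).shape)                   # (64, 64)
```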

Citations: 0