International Journal of Imaging Systems and Technology: Latest Publications

A Robust Method for Automated Segmentation of Optic Disc Using Hypercolumn Deep Features With Probability Thresholding
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-21 | DOI: 10.1002/ima.70288
Kemal Akyol, Murat Uçar, Yusuf Yargı Baydilli, Ümit Atila

Glaucoma is a dangerous disease that can lead to blindness in advanced stages. It has been a hot topic among machine learning researchers because it can be diagnosed quickly and effectively through optic disc segmentation. However, anomalies in the optic disc region significantly complicate this task, and there is also a need for a model that produces stable results on different datasets. Therefore, in this study, we propose a novel method that can be used for early diagnosis and treatment of glaucoma. Hypercolumn deep features extracted from retinal fundus images by a deep learning architecture were used to train different classifiers, and classifier behavior was examined in depth across varying threshold values. Global and local thresholding approaches were developed to improve the predictions of the standard classifier. The proposed model achieved Dice scores of 0.9437, 0.9654, and 0.9407 on the DRIONS, DRISHTI, and RIMONE-v3 datasets, respectively. These results show that the proposed model is robust across these datasets and is competitive with other studies in the literature. In conclusion, we showed that the proposed model can be used effectively in glaucoma diagnosis.
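To make the hypercolumn idea concrete, here is a minimal PyTorch sketch; the backbone, layer choices, and 0.5 cutoff are illustrative assumptions, not the authors' configuration. Feature maps from several depths are upsampled to the input resolution and stacked, so every pixel gets one deep feature vector, which a per-pixel classifier turns into a probability map that is then thresholded.

```python
# Sketch of hypercolumn deep features with probability thresholding.
import torch
import torch.nn.functional as F
from torch import nn

class TinyBackbone(nn.Module):
    """Stand-in CNN; the paper's backbone and tap points are assumptions."""
    def __init__(self):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.b2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.b3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))

    def forward(self, x):
        f1 = self.b1(x); f2 = self.b2(f1); f3 = self.b3(f2)
        return [f1, f2, f3]

def hypercolumns(feature_maps, size):
    # Upsample every feature map to the input size, then stack on channels,
    # giving one deep feature vector per pixel.
    ups = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
           for f in feature_maps]
    return torch.cat(ups, dim=1)                 # (N, C_total, H, W)

x = torch.randn(1, 3, 128, 128)                  # dummy fundus image
hc = hypercolumns(TinyBackbone()(x), size=(128, 128))
pixel_clf = nn.Conv2d(hc.shape[1], 1, 1)         # 1x1 conv = per-pixel classifier
prob = torch.sigmoid(pixel_clf(hc))              # probability map
mask = (prob > 0.5).float()                      # global probability thresholding
```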

Citations: 0
A Two-Stage Multi-View Fusion Framework With Semantic-Spatial Alignment for Precise Diagnosis of Phalangeal Fractures
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-21 | DOI: 10.1002/ima.70298
Jialin Hong, Yongxin Ge, Qin Long, Changjun Hou, Danqun Huo, Yinglan Lei, Peng Yin, Xiaogang Luo

Hand fractures, particularly phalangeal fractures, are often subtle, concealed, and complex, posing significant challenges for accurate and efficient diagnosis with traditional radiography. To address the limitations of existing deep learning methods in handling complex anatomy, multi-view uncertainty, and causal discrimination, we propose a novel two-stage context-aware multi-view fusion framework for precise fracture classification and localization. In the first stage, a Context-Aware Multi-View Classification Network (CAMVC-Net) is developed. A Dual-path Semantic-Spatial Alignment (DSSA) module aligns low-level spatial details with high-level semantics, enhancing fine-grained fracture discrimination. A Context Aware Pyramid (CAP) module captures multi-scale context, while a Counterfactual Attention Learning (CAL) loss guides the network to focus on discriminative regions. Multi-view uncertainty is modeled using a Dirichlet distribution, and decision-level fusion is achieved by Dempster's rule to mimic clinical reasoning. In the second stage, a Detail-aware Fracture Localization Network (DFL-Net) is designed. To adapt to irregular fracture geometry, DFL-Net integrates deformable convolutions and incorporates DSSA into the feature pyramid to preserve fine spatial details during deep downsampling. Experiments on the MURA and clinical datasets demonstrate strong performance: the classification model achieved an accuracy of 92.89%, precision of 93.05%, recall of 96.96%, and F1 score of 94.97%. The localization network obtained 73.8% AP50 and 43.8% AP75, with a 10.4% improvement in AP75 over Faster R-CNN. These results indicate that the proposed framework provides an accurate and efficient tool for computer-aided fracture diagnosis, with the potential to reduce misdiagnosis and improve clinical decision-making efficiency.
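The uncertainty-aware fusion step can be illustrated with a short numerical sketch. The evidence values are toy numbers, and the formulas follow the common subjective-logic reading of a Dirichlet opinion rather than the paper's code: each view's evidence yields per-class beliefs plus an uncertainty mass, and two views are combined with Dempster's rule, discounting conflicting belief.

```python
# Sketch of Dirichlet-based evidential fusion of two views via Dempster's rule.
import numpy as np

def opinion(evidence):
    """Evidence e_k >= 0 -> Dirichlet alpha = e + 1 -> (beliefs, uncertainty)."""
    evidence = np.asarray(evidence, dtype=float)
    k = evidence.size
    s = np.sum(evidence + 1.0)              # Dirichlet strength S
    return evidence / s, k / s              # b_k = e_k / S, u = K / S

def dempster(b1, u1, b2, u2):
    """Reduced Dempster combination of two opinions over the same classes."""
    conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)   # sum over i != j
    scale = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    u = (u1 * u2) / scale
    return b, u

# Two radiographic views voting on {no fracture, fracture} (toy evidence).
b_a, u_a = opinion([9.0, 1.0])              # confident view A
b_b, u_b = opinion([2.0, 3.0])              # uncertain, mildly conflicting view B
b, u = dempster(b_a, u_a, b_b, u_b)
print(b, u, b.sum() + u)                    # beliefs + uncertainty sum to 1
```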

Citations: 0
Magnetic Resonance Neurography of the Brachial Plexus at 1.5 T: Diagnostic Reliability and Time Efficiency of Bilateral Ultra-Fast 3D STIR vs. Unilateral 2D Dixon
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-17 | DOI: 10.1002/ima.70294
Fabio Zecca, Frederik Abel, Egon Burian, Falko Ensle, Soléakhéna Ken, Winston J. Rennie, Luca Saba, Roman Guggenberger

Recent technological advancements may now allow shorter, reliable magnetic resonance neurography (MRN) scans of the brachial plexus (BP) even at 1.5 T. The objective of this study was to compare ultra-fast bilateral coronal 3D T2w TSE STIR (3D-IR) with unilateral sagittal 2D T2w TSE Dixon (2D-DX) in terms of image quality, diagnostic reliability, and time efficiency for BP MRN at 1.5 T. Free-breathing 3D-IR and 2D-DX images acquired at 1.5 T were retrospectively collected from an equal number of healthy and pathologic cases and compared blindly by three raters. Pediatric cases, different acquisition methods, and low-quality scans were excluded. Image quality was assessed objectively and subjectively. Image interpretability was assessed for each BP component. Diagnostic accuracy was evaluated against final radiologic reports, and inter-rater agreement was calculated. MRN scans of 72 BPs from 36 patients (13 females, 50.3 ± 17.6 y) were included. 2D-DX performed better for signal intensity (all p < 0.001), nerve-to-muscle contrast ratio (p = 0.016), signal- and contrast-to-noise ratios (all p < 0.001), sharpness of C6 (p = 0.001) and MT (p = 0.004), general depiction of nerves (p = 0.007), and inter-rater agreement, while 3D-IR was superior for noise intensity (p < 0.001), nerve-to-fat contrast ratio (p = 0.024), and vascular suppression (p < 0.001). Sharpness of the remaining nerves, overall neurographic quality, resistance to motion artifacts, nerve identifiability, and image interpretability were comparable between the two sequences. Diagnostic accuracy was similar for both techniques (3D-IR: 79.6%; 2D-DX: 78.7%), although bilateral coverage via 3D-IR required 46% of the acquisition time of 2D-DX (4′30″ vs. 8′20″). According to these results, 3D STIR seems a viable alternative to 2D Dixon for bilateral BP MRN at 1.5 T, offering a significant time advantage without compromising diagnostic reliability. A full-3D abbreviated protocol supported by time-efficient breathing control and advanced reconstruction tools could further optimize 1.5-T BP MRN workflows.
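For readers unfamiliar with the quantitative metrics compared above, the sketch below shows one common convention for computing the contrast ratio, SNR, and CNR from region-of-interest (ROI) statistics. All numbers are toy values, and the study's exact ROI definitions are not reproduced here.

```python
# Sketch of ROI-based contrast ratio, SNR, and CNR (assumed conventions).
import numpy as np

rng = np.random.default_rng(0)
nerve = rng.normal(520.0, 30.0, 200)      # toy ROI samples: nerve signal
muscle = rng.normal(260.0, 25.0, 200)     # toy ROI samples: muscle signal
noise_sd = 18.0                           # SD measured in a background ROI

contrast_ratio = nerve.mean() / muscle.mean()            # nerve-to-muscle CR
snr = nerve.mean() / noise_sd                            # signal-to-noise ratio
cnr = (nerve.mean() - muscle.mean()) / noise_sd          # contrast-to-noise ratio
print(f"CR={contrast_ratio:.2f}  SNR={snr:.1f}  CNR={cnr:.1f}")
```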

Citations: 0
CCEMSS-Unet++: An Enhanced Multi-Scale Context Fusion Network for Pulmonary Nodule Segmentation
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-16 | DOI: 10.1002/ima.70297
Zhen Cui, Qing Lu, Xia Wang

In CT imaging, the size, shape, margin, and density of pulmonary nodules are used to determine whether they are benign or malignant. However, pulmonary nodules often exhibit challenging characteristics in CT images, such as irregular shapes, and tiny nodules are prone to being overlooked. In addition, the density of nodules may be similar to that of surrounding tissues. When the nodules are small or close to the pulmonary wall, their resolution is low, making them difficult to distinguish. Because of these characteristics, it is quite challenging to segment pulmonary nodules automatically. This research proposes CCEMSS-Unet++, a medical image segmentation network that enhances feature fusion between local details and global context through a multi-scale design. It includes the CCEMS module and the SE attention mechanism. The SE module mitigates false negatives, particularly for small nodules; the CCEMS module is used to extract local information and connect global context, improving segmentation accuracy. This study used two datasets: the publicly available LIDC-IDRI dataset and a dataset collected from the CT Department of Yan'an University Affiliated Hospital. Compared with Unet++, the proposed model improves IoU and Dice by 2.62% and 2.15% on the LIDC-IDRI dataset, and by 5.10% and 7.11% on our dataset, respectively. In terms of segmentation accuracy and generalization, the experimental findings demonstrate that this approach outperforms other networks, including handling nodules with low resolution. The efficacy of the proposed enhancement method is further supported by the ablation experiments.
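As a reference for the SE attention mechanism mentioned above, here is a minimal PyTorch sketch of a standard Squeeze-and-Excitation block. Where and how CCEMSS-Unet++ places it is not detailed in the abstract, so this shows only the channel-reweighting mechanism itself.

```python
# Sketch of a standard Squeeze-and-Excitation (SE) channel-attention block.
import torch
from torch import nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global context
        self.fc = nn.Sequential(                     # excitation: channel gates
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                                 # reweight feature channels

feat = torch.randn(2, 64, 32, 32)                    # e.g., a decoder feature map
print(SEBlock(64)(feat).shape)                       # torch.Size([2, 64, 32, 32])
```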

Citations: 0
Hybrid MRUNet+: Enhanced Multi-Structure Retinal Segmentation for Optic Cup, Disc, and Vascular Features in Complex Imaging Conditions
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-16 | DOI: 10.1002/ima.70293
Abdul Qadir Khan, Guangmin Sun, Anas Bilal, Abdulkareem Alzahrani, Abdullah Almuhaimeed

Precise segmentation of critical retinal structures, including the optic cup, optic disc, and vascular bifurcation sites, is essential for the early identification and management of severe ocular conditions such as glaucoma, diabetic retinopathy, hypertensive retinopathy, and age-related macular degeneration. This study proposes a hybrid multi-attention residual-based UNet (MRUNet+), a deep learning (DL) model developed to address the complex nature of retinal image processing, using an advanced attention mechanism to enhance segmentation precision. MRUNet+ was trained on multiple high-quality datasets, including DRIVE, CHASE_DB1, Drishti-GS1, RIM-ONE, and REFUGE1, enabling it to address prevalent imaging issues such as variations in contrast and resolution and the presence of abnormalities. The model exhibited markedly enhanced performance compared to recent methodologies, with improved accuracy, Dice coefficient, and intersection over union (IoU), particularly in low-contrast or complex anatomical regions. Due to its adaptability and precision, MRUNet+ has the potential to function as a pivotal instrument in clinical settings, providing automated retinal assessments that aid in the prompt detection of various retinal and vascular pathologies, thereby improving patient management.
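The Dice and IoU metrics reported above can be computed for binary masks as in the following short sketch; this is a generic implementation with a small smoothing term, not the authors' evaluation code.

```python
# Sketch of Dice and IoU for binary segmentation masks.
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou

pred = np.zeros((64, 64)); pred[16:48, 16:48] = 1    # toy predicted cup mask
gt = np.zeros((64, 64)); gt[20:52, 20:52] = 1        # toy ground-truth mask
print("Dice=%.3f IoU=%.3f" % dice_iou(pred, gt))
```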

Citations: 0
T2TCAN: A Twin-Token Transformer for Interpretable Grading of Invasive Ductal Carcinoma
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70296
Vivek Harshey, Amar Partap Singh Pharwaha

Accurate grading of invasive ductal carcinoma (IDC) remains difficult even for trained pathologists. The patient's therapeutic schedule depends on the observed grade, underscoring the importance of accurate tissue grading. We propose a novel architecture called Twin Tokens Talking-heads Class Attention Net (T2TCAN), designed to grade IDC images. Based on the transformer architecture, this design enhances the model's ability to capture local and global tissue features. The model was trained on the PathoIDCG dataset and outperforms existing methods on standard metrics. Further, introducing class and twin tokens late in the model, together with talking-heads class attention and per-channel scaling, makes T2TCAN efficient at distinguishing tissue grades. Our model outperformed conventional CNN-based IDC grading approaches, achieving the highest accuracy of 98.82%, with an AUC of 0.99, precision of 98.83%, recall of 98.80%, and F1-score of 98.81% on the PathoIDCG dataset. We plotted class attention maps for the TCT layers to provide insight into the model's decision-making process; these attention maps also help pathologists understand the discriminative features that contribute significantly to prognostic scoring. Our model also performed well on a separate IDC grading dataset, producing state-of-the-art results and thus confirming its efficacy. The model's adaptability makes it suitable for integration into healthcare workflows for IDC grading.
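The talking-heads class attention named above combines two known ideas: a class token that queries the patch tokens (class attention), and a learned mixing of per-head attention logits before and after the softmax (talking heads). Below is a minimal PyTorch sketch with all dimensions assumed; it shows the mechanism, not T2TCAN itself.

```python
# Sketch of class attention with talking heads (assumed dimensions).
import torch
from torch import nn

class TalkingHeadsClassAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.h, self.dh = heads, dim // heads
        self.q = nn.Linear(dim, dim)                 # query from class token only
        self.kv = nn.Linear(dim, 2 * dim)            # keys/values from all tokens
        self.pre = nn.Linear(heads, heads, bias=False)   # mix logits across heads
        self.post = nn.Linear(heads, heads, bias=False)  # mix weights across heads
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                            # x: (N, 1+T, dim), token 0 = class
        n, t, d = x.shape
        q = self.q(x[:, :1]).view(n, 1, self.h, self.dh).transpose(1, 2)
        k, v = self.kv(x).view(n, t, 2, self.h, self.dh).permute(2, 0, 3, 1, 4)
        attn = (q @ k.transpose(-2, -1)) * self.dh ** -0.5   # (N, h, 1, T)
        attn = self.pre(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        attn = attn.softmax(dim=-1)
        attn = self.post(attn.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        cls = (attn @ v).transpose(1, 2).reshape(n, 1, d)
        return self.proj(cls)                        # updated class token

tokens = torch.randn(2, 1 + 196, 64)                 # class token + 14x14 patches
print(TalkingHeadsClassAttention(64)(tokens).shape)  # torch.Size([2, 1, 64])
```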

Citations: 0
Deep Fusion: A High-Performance AI Framework for Colorectal Cancer Grading
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70291
Muhammed Emin Bedir, Mesut Ersin Sönmez, Afife Uğuz, Burcu Sanal Yılmaz

Automated, precise histopathological grading of colorectal cancer (CRC) is vital for prognosis and treatment but is challenged by inter-observer variability and time demands. This study introduces and evaluates a novel deep learning framework for robust four-class grading of colorectal adenocarcinoma (Normal, Well, Moderately, and Poorly Differentiated), directly addressing limitations of prior studies focused on survival prediction or limited feature extraction. Our approach integrates Reinhard color normalization for stain variability, data augmentation for robustness, and synergistic feature extraction from fine-tuned MobileNetV2 and InceptionV3 Convolutional Neural Networks (CNNs). Computationally efficient classification was achieved using Principal Component Analysis (PCA) to reduce features to 100 components, followed by an optimized Support Vector Machine (SVM) with multiple kernel evaluations. The framework, with a polynomial kernel SVM, demonstrated superior performance, achieving 96.41% macro-averaged accuracy (95% CI, 95.8%–97.0%), with corresponding 0.9641 precision, recall, F1-score, and a 0.9915 macro area under the curve (AUC) in fivefold cross-validation. Polynomial and Radial Basis Function (RBF) kernels significantly outperformed others. The framework's primary contribution is a validated, synergistic feature fusion pipeline that leverages the complementary strengths of diverse CNN architectures and an optimized SVM classifier, representing a notable advancement in automated grading on clinically sourced data. This provides a comprehensive framework for multi-class CRC grading, showing considerable potential to enhance diagnostic accuracy, consistency, and efficiency.
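The classification tail described above maps naturally onto a scikit-learn pipeline. The sketch below uses random stand-ins for the fused CNN features (1280 and 2048 dimensions, the usual MobileNetV2 and InceptionV3 embedding sizes) and is illustrative rather than a reproduction of the study's code.

```python
# Sketch of feature fusion -> PCA(100) -> polynomial-kernel SVM with 5-fold CV.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
f_mobile = rng.normal(size=(400, 1280))      # stand-in MobileNetV2 features
f_incept = rng.normal(size=(400, 2048))      # stand-in InceptionV3 features
X = np.hstack([f_mobile, f_incept])          # feature-level fusion
y = rng.integers(0, 4, size=400)             # 4 grades: Normal/Well/Mod/Poor

clf = make_pipeline(StandardScaler(),
                    PCA(n_components=100),   # reduce fused features to 100 dims
                    SVC(kernel="poly", degree=3))
print(cross_val_score(clf, X, y, cv=5).mean())   # fivefold CV, as in the study
```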

Citations: 0
Multi-Directional Context Modeling With HCSMIL: Enhancing Cancer Prediction and Subtype Classification From Whole Slide Images
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-15 | DOI: 10.1002/ima.70287
Jingtao Qiu, Yucheng Liu

The Mamba model performs excellently in natural image processing but faces limitations when analyzing whole slide images (WSIs) for cancer prediction and subtype classification in digital pathology: pathological images feature highly irregular lesion spatial distributions (especially complex small-lesion associations), while Mamba's inherent unidirectional or limited-direction scanning cannot effectively model such multi-dimensional spatial dependencies and thus fails to capture key pathological structural features. To address this, we propose HCSMIL, a Mamba-based optimized framework tailored to clinical analysis of pathological images. It comprehensively captures local lesion spatial topology via multi-directional contextual modeling and integrates a multi-scale pyramid structure to extract global lesion distribution features, jointly enhancing diagnostic accuracy. Validation on authoritative datasets (Camelyon16, TCGA-LUNG, TCGA-Kidney) shows that HCSMIL significantly outperforms existing mainstream methods: on TCGA-LUNG, accuracy (ACC), F1 score, and AUC are 0.66%, 1.42%, and 1.25% higher than the second-best method; on TCGA-Kidney, these metrics increase by 1.47%, 0.09%, and 1.00%; and on Camelyon16, ACC is 0.77% higher. Notably, HCSMIL achieves an 84% small-lesion recognition rate, substantially exceeding TransMIL (70.59%) and MambaMIL (64.71%), fully demonstrating its strength in capturing complexly distributed lesions and providing reliable technical support for cancer diagnosis.
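The multi-directional scanning idea can be sketched as follows: the grid of patch features is flattened in four directions, each sequence is processed by a shared sequence model (a GRU stands in for the Mamba SSM here), and the outputs are mapped back to grid order and averaged. This is a schematic of the general technique, not HCSMIL itself.

```python
# Sketch of multi-directional context modeling over a grid of WSI patch features.
import torch
from torch import nn

def directional_orders(h, w):
    """Four flattening orders of an HxW grid: row-major, column-major, and
    their reversals."""
    row = torch.arange(h * w)
    col = row.reshape(h, w).t().reshape(-1)
    return [row, row.flip(dims=[0]), col, col.flip(dims=[0])]

class MultiDirContext(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.seq = nn.GRU(dim, dim, batch_first=True)   # stand-in for the Mamba SSM

    def forward(self, grid):                    # grid: (N, H, W, C) patch features
        n, h, w, c = grid.shape
        flat = grid.reshape(n, h * w, c)
        outs = []
        for order in directional_orders(h, w):
            inv = torch.argsort(order)          # map scan order back to grid order
            outs.append(self.seq(flat[:, order])[0][:, inv])
        return torch.stack(outs).mean(dim=0)    # fuse the four directions

patches = torch.randn(1, 8, 8, 32)              # toy 8x8 grid of patch embeddings
print(MultiDirContext(32)(patches).shape)       # torch.Size([1, 64, 32])
```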

Citations: 0
Feature Derivative-Based Pixel Segmentation Method for Detecting Lung Tumors From CT Images
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-12 | DOI: 10.1002/ima.70281
S. P. Kavya, V. Seethalakshmi

Lung tumor segmentation using machine learning and artificial intelligence techniques improves diagnostic precision through accurate localization of the affected regions. The prominent factors are the features that reflect the affected region, concealed within patterns, boundaries, and edges. In this article, a novel Feature-Derivative Pixel Segmentation (FDPS) method is introduced to improve tumor segmentation accuracy under the disparity pixel distribution problem. The proposed method is assisted by tuneable recurrent learning (TRL), which varies the feature-derivative count for different segments. The learning inputs can be modified using different extracted feature derivatives under parity and disparity pixel distributions. By identifying the maximum-disparity pixels, the tuneable inputs for the recurrent learning are decided. The computation layer of the learning process identifies the maximally related regions under parity and disparity features. Such regions are segmented from multiple pixel distribution points up to the image size. This process is iterated to identify the maximum conjoined features across different affected regions. The proposed method improves accuracy by 9.63% and the true positive rate by 10.85%, and reduces the classification error by 10.06% for the maximum regions.
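As a rough illustration of derivative-based pixel features, the sketch below uses generic gradient operators and a simple disparity rule: each pixel gets a gradient-magnitude feature, and pixels whose feature disagrees most with their neighborhood are kept as candidate boundaries. The TRL-driven derivative selection of FDPS is not reproduced here.

```python
# Sketch of derivative-based pixel features with a simple disparity threshold.
import numpy as np
from scipy import ndimage

img = np.zeros((96, 96)); img[30:60, 35:70] = 1.0      # toy CT slice with a blob
img += np.random.default_rng(0).normal(0, 0.05, img.shape)

gx = ndimage.sobel(img, axis=1)                        # d/dx feature derivative
gy = ndimage.sobel(img, axis=0)                        # d/dy feature derivative
grad = np.hypot(gx, gy)                                # per-pixel feature

local_mean = ndimage.uniform_filter(grad, size=5)      # neighborhood statistic
disparity = np.abs(grad - local_mean)                  # parity vs. disparity
mask = disparity > disparity.mean() + 2 * disparity.std()
print(mask.sum(), "candidate boundary pixels")
```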

Citations: 0
SDCSCF-Net: A High-Performance Spatial Channel Fusion Attention Network for Diabetic Retinopathy Classification
IF 2.5 | CAS Tier 4 (Computer Science) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-12 | DOI: 10.1002/ima.70290
Liwen Zhang, Baiyang Yang, Rongwei Xia, Qiang Zhang, Jinchan Wang

Diabetic retinopathy (DR) is a leading cause of blindness among individuals with diabetes, so timely diagnosis and precise classification of DR are essential. However, traditional diagnostic methods have limited precision, relying mainly on physicians' experience and subjective judgment of DR images. Therefore, an efficient deep learning network, named SDCSCF-Net, is proposed for DR diagnosis and classification. First, the SE_Double_Conv (SDC) module is designed by integrating the Squeeze-and-Excitation (SE) attention mechanism into the first Double_Conv block of the U-Net encoder to enhance feature representation and suppress redundant information. Second, a novel attention mechanism, the spatial channel fusion attention (SCFA) module, is proposed to strengthen the model's focus on lesion areas and on inter-channel relationships, enabling it to distinguish subtle differences between adjacent DR classes more effectively. Finally, the proposed model is evaluated on the APTOS 2019 dataset, which contains 3662 fundus images. The results show that the proposed model achieves superior DR classification performance compared with existing approaches: its accuracy, precision, recall, and F1-score for binary DR classification are 99.18%, 99.47%, 98.98%, and 99.19%, respectively, and for the five-class task it achieves an accuracy of 84.72%, a precision of 84.12%, a recall of 84.72%, and an F1-score of 84.02%. All evaluation metrics are reported from the model's testing phase. In addition, Grad-CAM is used to visualize the key lesion areas attended to by the model, further verifying its effectiveness. This work should help advance research on, and the practical application of, intelligent DR diagnosis.
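Since the abstract mentions Grad-CAM, here is a minimal self-contained sketch of that technique on a toy network (not SDCSCF-Net): gradients of the top-class score with respect to a convolutional feature map are pooled into channel weights, and the weighted, ReLU-ed feature map gives the heatmap.

```python
# Sketch of Grad-CAM on a toy two-part network.
import torch
import torch.nn.functional as F
from torch import nn

net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())   # "target layer"
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 5))

x = torch.randn(1, 3, 64, 64)                       # dummy fundus image
feat = net(x)                                       # (1, 8, 64, 64)
feat.retain_grad()                                  # keep grad of the activations
score = head(feat)[0].max()                         # top-class logit (5 DR grades)
score.backward()

weights = feat.grad.mean(dim=(2, 3), keepdim=True)  # GAP of gradients -> weights
cam = F.relu((weights * feat).sum(dim=1))           # weighted sum + ReLU
cam = cam / (cam.max() + 1e-8)                      # normalize heatmap to [0, 1]
print(cam.shape)                                    # torch.Size([1, 64, 64])
```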

Citations: 0