Latest Publications: International Journal of Imaging Systems and Technology

SlideInspect: From Pixel-Level Artifact Detection to Actionable Quality Metrics in Digital Pathology
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-11 | DOI: 10.1002/ima.70292
Manuela Scotto, Roberta Patti, Vincenzo L'imperio, Filippo Fraggetta, Filippo Molinari, Massimo Salvi

The presence of artifacts in whole slide images (WSIs), such as tissue folds, air bubbles, and out-of-focus regions, can significantly impact WSI digitization, pathologists' evaluation, and the accuracy of downstream analyses. We present SlideInspect, a novel AI-based framework for comprehensive artifact detection and quality control in digital pathology. Our system leverages deep learning techniques to segment multiple artifact types across diverse tissue types and staining methods. SlideInspect provides a hierarchical output: a color-coded slide quality indicator (green, yellow, red) with recommended actions (no action, re-scan, re-mount, re-cut) based on artifact type and extent, and pixel-level segmentation masks for detailed analysis. The system operates at multiple magnifications (1.25× for tissue segmentation, 5× for artifact detection) and also incorporates stain quality assessment for histological stain evaluation. We validated SlideInspect on a large, multi-centric, multi-scanner dataset of over 3000 WSIs, demonstrating robust performance across different tissue types, staining methods, and scanning platforms. The system achieves high segmentation accuracy for various artifacts while maintaining computational efficiency (average processing time: 72.7 s per WSI). Pathologist evaluations confirmed the clinical relevance and accuracy of SlideInspect's quality assessments. By providing actionable insights at multiple levels of granularity, SlideInspect significantly improves the efficiency and standardization of digital pathology workflows. Its vendor-agnostic design and multi-stain capability make it suitable for integration into diverse clinical and research settings.
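
A minimal Python sketch of the traffic-light logic described above, assuming per-artifact area fractions from the segmentation masks as input. The thresholds and the artifact-to-action mapping are hypothetical; the abstract does not disclose the actual decision rules.

    # Hypothetical decision rules; cutoffs are illustrative, not from the paper.
    ACTION_BY_ARTIFACT = {
        "out_of_focus": "re-scan",   # assumed remedy per artifact type
        "air_bubble": "re-mount",
        "tissue_fold": "re-cut",
    }

    def slide_quality(artifact_fractions):
        """artifact_fractions: dict of artifact name -> fraction of tissue covered."""
        worst = max(artifact_fractions, key=artifact_fractions.get)
        extent = artifact_fractions[worst]
        if extent < 0.02:            # illustrative "green" threshold
            return "green", "no action"
        if extent < 0.10:            # illustrative "yellow" threshold
            return "yellow", ACTION_BY_ARTIFACT.get(worst, "re-scan")
        return "red", ACTION_BY_ARTIFACT.get(worst, "re-scan")

    print(slide_quality({"tissue_fold": 0.01, "out_of_focus": 0.15, "air_bubble": 0.0}))
    # -> ('red', 're-scan')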

Citations: 0
Breast Tumor Detection via S-Parameter Contrast Using a 1 × 8 Miniaturized Metamaterial Antenna Array for UWB Microwave Imaging
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-11 | DOI: 10.1002/ima.70295
Sanaa Salama, Duaa Zyoud, Ashraf Abuelhaija, Muneera Altayeb, Ammar Al-Bassam

Due to the significant increase in breast cancer cases and the limitations of existing early-stage detection techniques, microwave imaging has emerged as a critical tool for diagnosing carcinogenic and malignant cells in various tissues. In this work, a 1 × 8 miniaturized metamaterial-based antenna array is conceived and developed for ultrawideband microwave imaging aimed at earlier, more accurate breast cancer diagnosis. The developed antenna array features small dimensions, a wide frequency band, high gain, and broadside radiation properties. To achieve a wide bandwidth from 3.34 to 6.79 GHz, the dimensions of the H-shaped unit cells and the T-shaped feed network are optimized; the resulting bandwidth supports the generation of high-quality images. A partial ground plane structure is used to improve impedance matching and further enhance the bandwidth. Antenna performance is validated via numerical simulation and experimental measurements in free space. A numerical phantom with realistic tissue properties is created with and without a tumor. Because malignant cells have higher water content and a larger dielectric constant than healthy ones, differences appear in the backscattered signals at the antenna array elements, and these differences can be analyzed to identify the tumor. Here, eight antenna elements are arranged in a circle at a distance of 10 mm from the breast, with a separation of 17 mm between adjacent elements to reduce mutual coupling, and the breast tissue is scanned at different angles. One antenna at a time is excited while the others operate in receive mode, and the collected signals are used to detect malignant cells. The absolute difference in transmission coefficients, with and without a tumor present, is used to detect the existence of malignant cells. The suggested structure demonstrates effective performance in microwave imaging using S-parameter contrast.
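
The detection criterion reduces to comparing transmission coefficients between a tumor-free baseline and the measured scan. A small numpy sketch of that contrast step, with synthetic stand-in data and an assumed detection threshold:

    import numpy as np

    n_antennas, n_freqs = 8, 256                  # 1x8 array swept across 3.34-6.79 GHz
    rng = np.random.default_rng(0)
    s21_healthy = rng.normal(size=(n_antennas, n_antennas, n_freqs))  # stand-in baseline
    s21_tumor = s21_healthy.copy()
    s21_tumor[2, 5] += 0.8                        # synthetic tumor-induced perturbation

    contrast = np.abs(s21_tumor - s21_healthy)    # |delta S21| per element pair, per frequency
    score = contrast.mean(axis=-1)                # average over the frequency sweep
    print(np.argwhere(score > 0.1))               # assumed threshold -> [[2 5]]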

Citations: 0
MorphoFormer: Dual-Branch Dilated Transformer With Pathological Prior Fusion for Cervical Cell Morphology Analysis
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-10 | DOI: 10.1002/ima.70273
Linhong Zhao, Xiao Shang, Zhenfeng Zhao, Yuhao Liu, Yueping Liu, Shenwen Wang

Cervical cancer is one of the most common malignant tumors among women worldwide, and accurate early diagnosis is critical for improving patient survival rates. Traditional cytological screening methods rely on manual microscopic examination, which suffers from low efficiency and high subjectivity. In recent years, deep learning has facilitated the automation of cervical cell image analysis, yet challenges such as insufficient modeling of pathological features and high computational cost remain. To address these issues, this study proposes a novel dual-branch multi-scale model, MorphoFormer. The model employs a multi-scale dilated Transformer (DilateFormer) as its backbone and innovatively incorporates a specialized module in each branch: a Local Context Aggregation (LCA) module in the local branch and a Global Focus Attention (GFA) module in the global branch. These modules enhance the representation of local details and global semantics, respectively, and their features are fused to enable collaborative multi-scale information modeling. Experimental results on the publicly available SIPaKMeD dataset demonstrate that MorphoFormer achieves classification accuracies of 99.58%, 98.51%, and 98.14% for binary, three-class, and five-class tasks, respectively. Further validation on the Blood Cell Count and Detection (BCCD) dataset indicates strong cross-task robustness. Moreover, MorphoFormer requires only 8.22 GFLOPs for inference, highlighting its practical potential by achieving high performance with low computational overhead. Code: https://github.com/sijhb/MorphoFormer.
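
A schematic PyTorch sketch of the dual-branch fusion idea, assuming a small-kernel convolution as a stand-in for the LCA module and plain multi-head self-attention as a stand-in for the GFA module; the real modules and dimensions differ.

    import torch
    import torch.nn as nn

    class DualBranchFusion(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.local = nn.Conv2d(dim, dim, kernel_size=3, padding=1)  # LCA stand-in
            self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)  # GFA stand-in
            self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)          # feature fusion

        def forward(self, x):                       # x: (B, C, H, W)
            b, c, h, w = x.shape
            local = self.local(x)                   # local detail branch
            tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, C) for attention
            glob, _ = self.attn(tokens, tokens, tokens)
            glob = glob.transpose(1, 2).reshape(b, c, h, w)  # global semantic branch
            return self.fuse(torch.cat([local, glob], dim=1))

    print(DualBranchFusion()(torch.randn(1, 64, 16, 16)).shape)  # (1, 64, 16, 16)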

Citations: 0
Source-Free Domain Adaptive Fundus Image Segmentation With Multiscale Feature Fusion and Stepwise Attention Integration
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-10 | DOI: 10.1002/ima.70285
Mingtao Liu, Yuxuan Li, Qingyun Huo, Zhengfei Li, Shunbo Hu, Qingman Ge

Traditional unsupervised domain adaptation methods usually depend on the source-domain data distribution for cross-domain alignment. However, direct access to source data is often restricted due to privacy concerns and intellectual property rights. Without using source data, source-free unsupervised domain adaptation methods can align the pre-trained model with the target domain by generating pseudo-labels for target-domain data, which are then used as labeled samples to guide transfer learning. However, methods that generate pseudo-labels solely through iterative averaging often neglect spatial correlations among pixels and are susceptible to noise, resulting in blurred label boundaries. To this end, we propose a source-free domain adaptation framework for fundus image segmentation, which consists of a Multiscale Feature Fusion module for generating high-quality pseudo-labels and a Stepwise Attention Integration module for enhancing model training. The Multiscale Feature Fusion module refines the initial pseudo-labels from the pre-trained model through neighborhood value filling, effectively reducing noise and sharpening label boundaries. The Stepwise Attention Integration module progressively integrates high-level and low-level feature information into the low-level representation. The fused features preserve high-resolution details and enrich semantic content, thereby substantially enhancing the model's recognition capability. Experimental results demonstrate that, without using any source domain images or modifying the pre-trained model, our method achieves performance comparable to or even surpassing state-of-the-art approaches.
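
The neighborhood-filling idea can be approximated with a simple majority filter: each pixel takes the most common label in its 3x3 neighborhood, which removes isolated noisy predictions and sharpens boundaries. A generic sketch of that pattern, not the paper's actual module:

    import numpy as np
    from scipy.ndimage import generic_filter

    def majority(values):
        vals, counts = np.unique(values.astype(int), return_counts=True)
        return vals[counts.argmax()]

    pseudo = np.zeros((8, 8), dtype=int)
    pseudo[2:6, 2:6] = 1                      # a coarse pseudo-labeled region
    pseudo[0, 7] = 1                          # an isolated noisy pixel
    cleaned = generic_filter(pseudo, majority, size=3)
    print(cleaned[0, 7])                      # -> 0: the stray label is removed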

Citations: 0
Clinically Aligned AI for Diabetic Retinopathy: Interpretable Grading Based on Lesion Segmentation
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-07 | DOI: 10.1002/ima.70286
Nabil Hezil, Ahmed Bouridane, Rifat Hamoudi, Mohamed Deriche, Fouzi Harrag

Retinopathy, a prevalent retinal disorder, poses a major risk of vision loss if not detected at an early stage. Automatic lesion segmentation plays a key role in effective diagnosis and disease monitoring. In this work, we present a complete and interpretable pipeline that combines YOLOv12 for lesion segmentation, SVD-CAM for visual explanation, and transformer-based Gradient Boosted Neural Networks (GBNN) and Random Forest classifiers for diabetic retinopathy (DR) severity grading. YOLOv12 (You Only Look Once, version 12), known for its real-time object detection capability, is adapted to the complex task of retinal lesion segmentation, delivering both high accuracy and speed. To enhance lesion localization, SVD-CAM generates precise heatmaps that highlight critical pathological regions influencing the grading decision. The segmented lesions are then quantified and used as input features for the grading stage, enabling clinically aligned DR classification. Our approach not only achieves state-of-the-art performance across three public datasets (IDRiD, DDR, and FGADR) but also provides lesion-level interpretability that improves clinical trust and adoption. Extensive experiments demonstrate that the proposed framework delivers accurate segmentation, reliable grading, and meaningful visual explanations, establishing a robust solution for automated DR analysis.
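
The lesion-to-grade handoff can be pictured as follows: quantify each lesion class from its segmentation mask and feed the resulting feature vector to a classifier. A hedged sklearn sketch with synthetic masks; the lesion classes, feature set, and classifier configuration are assumptions, not the paper's exact pipeline.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    LESIONS = ("microaneurysm", "hemorrhage", "exudate")   # assumed lesion classes

    def lesion_features(masks):
        """masks: dict of lesion name -> binary HxW segmentation mask."""
        return [masks[name].sum() / masks[name].size for name in LESIONS]  # area fractions

    rng = np.random.default_rng(0)
    X = np.array([lesion_features({n: rng.random((32, 32)) > 0.8 for n in LESIONS})
                  for _ in range(40)])                     # synthetic training set
    y = rng.integers(0, 5, size=40)                        # DR grades 0-4
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(clf.predict(X[:3]))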

Citations: 0
U-KAN for Multi-Nuclei Segmentation Using an Adaptive Sliding Window Approach
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-06 | DOI: 10.1002/ima.70283
Usman Ali, Jin Qi, Aiman Rashid, Muhammad Hammad Musaddiq

Accurate segmentation of nuclei in histopathological images is critical for improving diagnostic precision and advancing computational pathology. Deep learning models employed for this task must effectively handle structural variability while offering transparent and interpretable predictions to ensure clinical reliability. In this study, we investigate the integration of Kolmogorov–Arnold Networks (KANs) into the widely adopted U-Net architecture, forming a novel hybrid model referred to as U-KAN. To the best of our knowledge, we are the first to explore the application of U-KAN for multi-class nuclei segmentation on the challenging MoNuSAC2020 dataset, leveraging an adaptive sliding window strategy. Our results demonstrate that U-KAN achieves a 17.9% improvement in Dice similarity coefficient (DSC, 0.976) and a 25.7% increase in intersection over union (IoU, 0.954) over the U-Net baseline, while also delivering enhanced model interpretability. Gradient-based explanation techniques further confirm that U-KAN produces anatomically plausible predictions, with strong attention to nuclear boundaries. These findings suggest that symbolic-connectionist hybrids like U-KAN can meaningfully advance automated histopathological image analysis.
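
The abstract does not detail the adaptive sliding window, but the generic pattern is to tile the image with overlapping windows and snap the last window to the border so no pixel is missed. A minimal sketch under that assumption:

    def windows(h, w, win=256, stride=192):
        ys = list(range(0, max(h - win, 0) + 1, stride))
        xs = list(range(0, max(w - win, 0) + 1, stride))
        if ys[-1] != h - win: ys.append(h - win)   # snap final row of windows to the edge
        if xs[-1] != w - win: xs.append(w - win)   # snap final column of windows to the edge
        return [(y, x, y + win, x + win) for y in ys for x in xs]

    print(len(windows(1000, 1400)))  # number of patches handed to the segmenter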

Citations: 0
SpineDeep-Net: Dual-Self-Attention-Based Deep Neural Network for Automating Slice Selection and Precise Transverse Plane Localization in Lumbar Spine MRI for Intervertebral Disc Analysis
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-04 | DOI: 10.1002/ima.70280
Rashmi Singh, Rakesh Chandra Joshi, Suzain Rashid, Radim Burget, Malay Kishore Dutta

The rising prevalence of lumbar spine disorders demands scalable solutions for mass screening and automated diagnosis. Accurate analysis of specific MRI slices, such as mid-sagittal or transverse mid-height intervertebral disc (IVD) slices, is essential but currently relies on time-consuming, error-prone manual selection. Automating this process is crucial to enhance the efficiency and accuracy of computer-aided diagnostic systems. To address this need, this study introduces a novel deep learning-based framework—SpineDeep-Net that integrates self-attention mechanisms within a multi-layer convolutional neural network for automatic selection of optimal transverse planes of lumbar spine MRI disc slices. By focusing on mid-height slices of L3/L4, L4/L5, and L5/S1 IVDs—the most diagnostically relevant slices, SpineDeep-Net eliminates the reliance on manual selection processes, thereby accelerating and improving the diagnostic pipeline. Unlike standard attention, the proposed dual-self-attention employs two sequential attention stages that jointly enhance long-range spatial cue extraction and emphasize subtle disc-level differences. This mechanism enables the model to focus more effectively on diagnostically relevant regions within lumbar MRI slices by dynamically recalibrating feature maps and strengthening feature dependencies. Experimental evaluations demonstrate the superior performance of SpineDeep-Net, achieving 96.83% accuracy and 98.41% specificity, outperforming state-of-the-art methods. By automating the selection and classification of clinically critical disc slices, SpineDeep-Net addresses a key challenge in lumbar spine diagnostics, providing a reliable, scalable, and efficient tool that aids radiologists in making informed clinical decisions. The proposed framework highlights the transformative potential of self-attention-guided deep learning in advancing healthcare diagnostics. The source code is publicly available at https://github.com/rakeshchandrajoshi/spinedeepnet.
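
The "dual-self-attention" described above can be read as two self-attention stages applied back to back, each with a residual connection, so the second stage recalibrates features already weighted by the first. A schematic PyTorch rendering with illustrative dimensions, not the paper's exact architecture:

    import torch
    import torch.nn as nn

    class DualSelfAttention(nn.Module):
        def __init__(self, dim=128, heads=4):
            super().__init__()
            self.stage1 = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.stage2 = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):                  # x: (B, tokens, dim)
            a1, _ = self.stage1(x, x, x)
            x = x + a1                         # first recalibration of feature maps
            a2, _ = self.stage2(x, x, x)
            return x + a2                      # second stage strengthens dependencies

    print(DualSelfAttention()(torch.randn(2, 49, 128)).shape)  # (2, 49, 128)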

Citations: 0
TTNet: Three-Stages Tooth Segmentation Network Based on Tooth Masks in CBCT Images
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-04 | DOI: 10.1002/ima.70271
Jianfeng Lu, Yang Hu, Yiyang Hu, Renlin Xin, Chuhua Song, Mahmoud Emam

Accurate identification and segmentation of teeth in cone-beam computed tomography (CBCT) images are essential for dental diagnosis and treatment in digital dentistry. However, extracting regions of interest (ROI) from maxillofacial CBCT images remains difficult due to the low pixel ratio of tooth structures, especially in the apical area. Traditional tooth segmentation methods, such as threshold-based, region-based, and edge-based approaches, offer only limited accuracy under such challenging imaging conditions. In this paper, we propose TTNet, a three-stage tooth instance segmentation network designed to improve tooth segmentation in CBCT images. The proposed TTNet employs an intersection-based refinement of the tooth centroid heatmap, retaining only pixels that simultaneously lie within the predicted tooth mask and high-probability heatmap regions. This intersection operation eliminates noise in the tooth centroid heatmap, such as regions that may have been incorrectly labeled as teeth. Extensive experiments on a publicly available CBCT tooth dataset demonstrate that TTNet achieves superior performance compared to recent state-of-the-art methods.
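
The intersection step translates almost directly into array operations: a centroid-heatmap pixel survives only if it lies inside the predicted tooth mask and above a probability threshold. A numpy sketch of that filter; the 0.9 threshold and stand-in data are assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    heatmap = rng.random((64, 64))             # stand-in centroid probability map
    tooth_mask = np.zeros((64, 64), dtype=bool)
    tooth_mask[16:48, 16:48] = True            # predicted tooth region

    refined = heatmap * (tooth_mask & (heatmap > 0.9))   # intersection filter
    print(int((heatmap > 0.9).sum()), int((refined > 0).sum()))  # peaks before vs. after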

Citations: 0
Self-Supervised Transfer Learning of Cross-Domains Histopathological Images for Cancer Diagnosis
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-03 | DOI: 10.1002/ima.70278
Jianbo Zhu, Zihan Wang, Jinjin Wu, Chenbei Li, Lan Li, Linwei Shang, Huijie Wang, Chao Tu, Jianhua Yin

Whole-slide imaging assisted by deep learning has been employed to support digital pathology, but it is limited by the scarcity of paired labeled data. To address this issue, a novel self-supervised image modeling framework, PathMAE, is proposed to effectively enlarge the usable training data in a cross-domain way, where cross-dataset and even cross-disease histopathological images can be used for model training. PathMAE integrates masked image modeling and contrastive learning to effectively learn transferable visual representations from unlabeled whole-slide images (WSIs). The framework comprises two key components: a Swin-Transformer-based encoder-decoder (SMED) with a window-masking strategy for local feature reconstruction, and a Dynamic Memory Contrastive Learning (DMCL) module for enhancing global semantic alignment via memory-guided feature comparison. Experimental results on three public histopathology datasets demonstrate the robustness and generalizability of the proposed method. In cross-disease transfer (BreakHis → Osteosarcoma), PathMAE achieved 97.15% accuracy and 99.03% AUC; in cross-dataset transfer (BreakHis → Camelyon16), it obtained 84.67% accuracy and 88.04% AUC. These findings validate the capability of PathMAE as a scalable and domain-adaptive image analysis framework, offering new potential for building reliable computational pathology systems under limited supervision.
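
A conceptual PyTorch sketch of the masked-image-modeling half of the objective: hide a random subset of patch embeddings, reconstruct them, and score the loss only on the masked positions. The toy MLP stands in for the Swin-based SMED, the 60% mask ratio is an assumption, and the DMCL contrastive term is omitted entirely.

    import torch
    import torch.nn as nn

    patches = torch.randn(4, 196, 768)                  # (batch, patches, embed dim)
    mask = torch.rand(4, 196) < 0.6                     # ~60% of patches hidden
    model = nn.Sequential(nn.Linear(768, 256), nn.GELU(), nn.Linear(256, 768))

    visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)  # zero out masked patches
    recon = model(visible)                              # reconstruct all positions
    loss = ((recon - patches)[mask] ** 2).mean()        # MSE on masked patches only
    print(loss.item())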

Citations: 0
Real-Time Iris Recognition With Stand-Alone Embedded Processor Based on AI Model
IF 2.5 | CAS Tier 4 (Computer Science) | JCR Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-03 | DOI: 10.1002/ima.70279
Shih-Chang Hsia, Jhong-Hao Luo

This study focuses on iris recognition using deep learning techniques. The EfficientDet network model was employed for both iris detection and recognition tasks. Four datasets were utilized to train and evaluate the deep learning network. The model was trained to extract iris features and classify individuals based on their unique iris patterns. The proposed method achieved a high recognition rate of over 98% across multiple dataset evaluations. For real-time implementation on an embedded system, the trained model was quantized to an 8-bit integer format to accommodate resource-constrained devices. Despite this quantization, the recognition accuracy remained high, reaching 97%. By incorporating an Edge TPU accelerator alongside a Raspberry Pi system, the processing speed reached up to 10 frames per second during real-time iris camera testing, demonstrating the feasibility of real-time iris recognition. An intruder test was conducted to assess the system's robustness in preventing unauthorized access. The False Acceptance Rate (FAR) was measured to assess the likelihood of incorrectly accepting an unauthorized individual. Experimental results show that the FAR can be reduced to zero by applying additional temporal constraints, effectively preventing unauthorized individuals from passing the iris recognition-based access control system.
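
The temporal-constraint idea can be sketched as a gate that unlocks only when the same identity is predicted, above a confidence floor, for several consecutive frames, so a single confident impostor frame is not enough. The window length and threshold below are assumptions, not values from the paper:

    from collections import deque

    def make_gate(n_frames=5, min_conf=0.9):
        recent = deque(maxlen=n_frames)
        def gate(identity, confidence):
            recent.append(identity if confidence >= min_conf else None)
            ok = (len(recent) == n_frames and len(set(recent)) == 1
                  and recent[0] is not None)
            return "unlock" if ok else "deny"
        return gate

    gate = make_gate()
    for _ in range(5):
        decision = gate("alice", 0.95)     # five consistent, confident frames
    print(decision)                        # -> 'unlock'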

Citations: 0