
Journal of Imaging: Latest Publications

Camera-in-the-Loop Realization of Direct Search with Random Trajectory Method for Binary-Phase Computer-Generated Hologram Optimization.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-05 DOI: 10.3390/jimaging11120434
Evgenii Yu Zlokazov, Rostislav S Starikov, Pavel A Cheremkhin, Timur Z Minikhanov

High-speed realization of computer-generated holograms (CGHs) is a crucial problem in modern 3D visualization and optical image processing system development. Binary CGHs can be realized using high-resolution, high-speed spatial light modulators, such as ferroelectric liquid-crystal-on-silicon devices or digital micromirror devices, which provide the high throughput of optoelectronic systems. However, the quality of holographic images restored by binary CGHs often suffers from distortions, background noise, and speckle noise caused by the limitations and imperfections of optical system components. The present manuscript introduces a method that optimizes CGH models directly in the optical system, in a camera-in-the-loop configuration, using an efficient direct search with random trajectory algorithm. The method was experimentally verified. The results demonstrate a significant enhancement in the quality of holographic images optically restored by binary-phase CGH models optimized with this method compared to purely digitally generated models.
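To make the optimization loop concrete, below is a minimal simulation-only sketch of direct binary search with a random trajectory: each pixel of a binary-phase hologram is visited in random order, trial-flipped, and the flip is kept only if it reduces the reconstruction error. In the paper the error is measured by a camera in the optical loop; here the reconstruction is simulated with an FFT, and the target image, sizes, and iteration budget are illustrative assumptions.

```python
# Minimal simulation-only sketch of direct binary search with a random
# trajectory for a binary-phase CGH. In the paper the error is measured
# by a camera in the optical loop; here we simulate reconstruction with
# an FFT. Target, sizes, and iteration budget are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N = 64                                   # hologram side length (toy size)
target = np.zeros((N, N))
target[24:40, 24:40] = 1.0               # toy target: bright square
target /= np.linalg.norm(target)

phase = rng.integers(0, 2, size=(N, N))  # binary phase: 0 or pi

def recon_error(ph):
    """Error between normalized far-field intensity and the target."""
    field = np.exp(1j * np.pi * ph)
    img = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    img /= np.linalg.norm(img)
    return np.linalg.norm(img - target)

err = recon_error(phase)
for it in range(3):                       # a few passes over all pixels
    trajectory = rng.permutation(N * N)   # random pixel visiting order
    for idx in trajectory:
        r, c = divmod(idx, N)
        phase[r, c] ^= 1                  # trial flip of one binary pixel
        new_err = recon_error(phase)
        if new_err < err:
            err = new_err                 # keep flips that improve
        else:
            phase[r, c] ^= 1              # revert flips that worsen
    print(f"pass {it}: error = {err:.4f}")
```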

Citations: 0
DiagNeXt: A Two-Stage Attention-Guided ConvNeXt Framework for Kidney Pathology Segmentation and Classification.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-04 DOI: 10.3390/jimaging11120433
Hilal Tekin, Şafak Kılıç, Yahya Doğan

Accurate segmentation and classification of kidney pathologies from medical images remain a major challenge in computer-aided diagnosis due to complex morphological variations, small lesion sizes, and severe class imbalance. This study introduces DiagNeXt, a novel two-stage deep learning framework designed to overcome these challenges through an integrated use of attention-enhanced ConvNeXt architectures for both segmentation and classification. In the first stage, DiagNeXt-Seg employs a U-Net-based design incorporating Enhanced Convolutional Blocks (ECBs) with spatial attention gates and Atrous Spatial Pyramid Pooling (ASPP) to achieve precise multi-class kidney segmentation. In the second stage, DiagNeXt-Cls utilizes the segmented regions of interest (ROIs) for pathology classification through a hierarchical multi-resolution strategy enhanced by Context-Aware Feature Fusion (CAFF) and Evidential Deep Learning (EDL) for uncertainty estimation. The main contributions of this work include: (1) enhanced ConvNeXt blocks with large-kernel depthwise convolutions optimized for 3D medical imaging, (2) a boundary-aware compound loss combining Dice, cross-entropy, focal, and distance transform terms to improve segmentation precision, (3) attention-guided skip connections preserving fine-grained spatial details, (4) hierarchical multi-scale feature modeling for robust pathology recognition, and (5) a confidence-modulated classification approach integrating segmentation quality metrics for reliable decision-making. Extensive experiments on a large kidney CT dataset comprising 3847 patients demonstrate that DiagNeXt achieves 98.9% classification accuracy, outperforming state-of-the-art approaches by 6.8%. The framework attains near-perfect AUC scores across all pathology classes (Normal: 1.000, Tumor: 1.000, Cyst: 0.999, Stone: 0.994) while offering clinically interpretable uncertainty maps and attention visualizations. The superior diagnostic accuracy, computational efficiency (6.2× faster inference), and interpretability of DiagNeXt make it a strong candidate for real-world integration into clinical kidney disease diagnosis and treatment planning systems.
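As one concrete piece of the framework, the sketch below shows a plausible form of the Dice, cross-entropy, and focal terms of such a boundary-aware compound loss in PyTorch. The term weights and the focal gamma are illustrative assumptions, and the paper's distance-transform term is omitted.

```python
# Sketch of a compound segmentation loss combining Dice, cross-entropy,
# and focal terms, as described for DiagNeXt-Seg. Weights and the focal
# gamma are illustrative; the paper's distance-transform term is omitted.
import torch
import torch.nn.functional as F

def compound_loss(logits, target, w=(1.0, 1.0, 1.0), gamma=2.0, eps=1e-6):
    # logits: (B, C, H, W) raw scores; target: (B, H, W) int class labels
    ce = F.cross_entropy(logits, target, reduction="none")      # (B, H, W)

    pt = torch.exp(-ce)                                          # prob of true class
    focal = ((1.0 - pt) ** gamma * ce).mean()

    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2.0 * inter + eps) / (denom + eps)).mean()    # soft Dice

    return w[0] * dice + w[1] * ce.mean() + w[2] * focal

# Toy usage
logits = torch.randn(2, 4, 32, 32, requires_grad=True)
target = torch.randint(0, 4, (2, 32, 32))
loss = compound_loss(logits, target)
loss.backward()
```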

Citations: 0
UAV-TIRVis: A Benchmark Dataset for Thermal-Visible Image Registration from Aerial Platforms.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-04 DOI: 10.3390/jimaging11120432
Costin-Emanuel Vasile, Călin Bîră, Radu Hobincu

Registering UAV-based thermal and visible images is a challenging task due to differences in appearance across spectra and the lack of public benchmarks. To address this issue, we introduce UAV-TIRVis, a dataset of 80 accurately, manually registered UAV-based thermal (640 × 512) and visible (4K) image pairs captured across diverse environments. We benchmark the dataset using well-known registration methods, including feature-based (ORB, SURF, SIFT, KAZE), correlation-based, and intensity-based methods, as well as a custom heuristic intensity-based method. We evaluate the performance of these methods using four metrics: RMSE, PSNR, SSIM, and NCC, averaged per scenario and across the entire dataset. The results show that conventional methods often fail to generalize across scenes, yielding <0.6 NCC on average, whereas the heuristic method achieves 0.77 SSIM and 0.82 NCC, highlighting the difficulty of cross-spectral UAV alignment and the need for further research to improve optimization in existing registration methods.
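Of the four metrics, NCC is the most directly interpretable for cross-spectral alignment; a minimal sketch, assuming registered single-channel images of equal shape (values below are illustrative):

```python
# Sketch of the normalized cross-correlation (NCC) metric used to score
# thermal-to-visible alignment; both inputs are assumed to be registered
# single-channel arrays of equal shape.
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-normalized cross-correlation in [-1, 1]."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

rng = np.random.default_rng(1)
visible = rng.random((512, 640))
thermal = 0.8 * visible + 0.2 * rng.random((512, 640))  # partly correlated pair
print(f"NCC = {ncc(thermal, visible):.3f}")             # near 1 when well aligned
```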

Citations: 0
Editorial on the Special Issue "Image and Video Processing for Blind and Visually Impaired".
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-03 DOI: 10.3390/jimaging11120430
Zhigang Zhu, John-Ross Rizzo, Hao Tang

Over 2 [...].

Citations: 0
Hybrid AI Pipeline for Laboratory Detection of Internal Potato Defects Using 2D RGB Imaging.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-03 DOI: 10.3390/jimaging11120431
Slim Hamdi, Kais Loukil, Adem Haj Boubaker, Hichem Snoussi, Mohamed Abid

The internal quality assessment of potato tubers is a crucial task in agro-laboratory processing. Traditional methods struggle to detect internal defects such as hollow heart, internal bruises, and insect galleries from surface features alone. We present a novel, fully modular hybrid AI architecture designed for defect detection in RGB images of potato slices, suitable for integration into laboratory workflows. Our pipeline combines high-recall multi-threshold YOLO detection, contextual patch validation using ResNet, precise segmentation via the Segment Anything Model (SAM), and skin-contact analysis using VGG16 with a Random Forest classifier. Experimental results on a labeled dataset of over 6000 annotated instances show a recall above 95% and precision near 97.2% for most defect classes. The approach offers both robustness and interpretability, outperforming previous methods that rely on costly hyperspectral or MRI techniques. This system is scalable, explainable, and compatible with existing 2D imaging hardware.
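The high-recall multi-threshold detection stage can be illustrated generically: run the detector at several confidence thresholds, pool all candidate boxes, and de-duplicate with IoU-based non-maximum suppression. The sketch below stubs out the detector and assumes an (x1, y1, x2, y2, score) box format; it is not the authors' implementation.

```python
# Minimal sketch of the high-recall multi-threshold detection idea:
# pool candidate boxes gathered at several confidence thresholds, then
# de-duplicate with IoU-based non-maximum suppression. The detector is
# stubbed out; the (x1, y1, x2, y2, score) box format is an assumption.
import numpy as np

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(boxes, iou_thr=0.5):
    boxes = sorted(boxes, key=lambda r: r[4], reverse=True)
    kept = []
    for box in boxes:
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept

def detect(image, conf_thr):
    """Stub standing in for a YOLO forward pass at one threshold."""
    candidates = [(10, 10, 50, 50, 0.9), (12, 11, 52, 49, 0.6),
                  (80, 30, 120, 70, 0.35)]
    return [b for b in candidates if b[4] >= conf_thr]

image = np.zeros((128, 128))                 # placeholder slice image
pooled = []
for thr in (0.25, 0.5, 0.75):                # several thresholds -> recall
    pooled.extend(detect(image, thr))
print(nms(pooled))                           # de-duplicated candidates
```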

Citations: 0
How Good Is the Machine at the Imitation Game? On Stylistic Characteristics of AI-Generated Images.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-02 DOI: 10.3390/jimaging11120429
Adrien Deliège, Jeanne Marlot, Marc Van Droogenbroeck, Maria Giulia Dondero

Text-to-image generative models can be used to imitate historical artistic styles, but their effectiveness in doing so remains unclear. In this work, we propose an evaluation framework that leverages expert knowledge from art history and visual semiotics and combines it with quantitative analysis to assess stylistic fidelity. Three experts rated both historical artwork production and images generated with Midjourney v6 for five major movements (Abstract Art, Cubism, Expressionism, Impressionism, Surrealism) and ten associated painters (male and female pairs), using nine visual criteria grounded in Greimas's plastic categories and Wölfflin's stylistic oppositions. Ratings were expressed as 95% intervals on continuous 0-100 scales and compared using our Relative Ratings Map (RRMap), which summarizes relative shifts, relative dispersion, and distributional overlap (via the Bhattacharyya coefficient). The ratings were also discretized into four quality categories (bad, stereotype, fair, excellent). The results show strong inter-expert variability and more moderate intra-expert effects tied to movements, criteria, criterion groups, and modalities. Experts tend to agree that the model sometimes aligns with historical trends but also sometimes produces stereotyped versions of a movement or painter, or even completely misses its target, although no unanimous consensus emerges. We conclude that evaluating generative models requires both expert-driven interpretation and quantitative tools, and that stylistic fidelity is hard to quantify even with a rigorous framework.
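A sketch of the distributional-overlap ingredient of the RRMap follows. It assumes (this is not stated in the abstract) that each 95% rating interval on the 0-100 scale is modeled as a normal distribution, with the mean at the interval midpoint and the standard deviation equal to the half-width divided by 1.96, which yields a closed-form Bhattacharyya coefficient.

```python
# Sketch of a distributional-overlap score in the spirit of the RRMap's
# Bhattacharyya coefficient. Assumption (not from the paper): each 95%
# rating interval on the 0-100 scale is modeled as a normal with mean at
# the midpoint and sd = half-width / 1.96, giving a closed-form BC.
import math

def normal_from_interval(lo: float, hi: float):
    mu = 0.5 * (lo + hi)
    sigma = (hi - lo) / (2 * 1.96)      # 95% interval half-width / 1.96
    return mu, sigma

def bhattacharyya(mu1, s1, mu2, s2):
    """Closed-form BC of two normals; 1 = identical, 0 = disjoint."""
    v1, v2 = s1 * s1, s2 * s2
    scale = math.sqrt(2 * s1 * s2 / (v1 + v2))
    return scale * math.exp(-((mu1 - mu2) ** 2) / (4 * (v1 + v2)))

# Historical production vs. generated images on one plastic criterion
hist = normal_from_interval(55, 85)     # illustrative expert interval
gen = normal_from_interval(40, 70)
print(f"BC = {bhattacharyya(*hist, *gen):.3f}")
```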

Citations: 0
How Safe Are Oxygen-Ozone Therapy Procedures for Spine Disc Herniation? The SIOOT Protocols for Treating Spine Disorders.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-12-01 DOI: 10.3390/jimaging11120428
Marianno Franzini, Salvatore Chirumbolo, Francesco Vaiano, Luigi Valdenassi, Francesca Giannetti, Marianna Chierchia, Umberto Tirelli, Paolo Bonacina, Gianluca Poggi, Aniello Langella, Edoardo Maria Pieracci, Christian Giannetti, Roberto Antonio Giannetti

Oxygen-ozone (O2-O3) therapy is widely used for treating lumbar disc herniation. However, controversy remains regarding the safest and most effective route of administration. While intradiscal injection is purported to show clinical efficacy, it has also been associated with serious complications. In contrast, the intramuscular route can exhibit a more favourable safety profile and comparable pain outcomes, suggesting its potential as a safer alternative in selected patient populations. This mixed-method study combined computed tomography (CT) imaging, biophysical diffusion modelling, and a meta-analysis of clinical trials to evaluate whether intramuscular O2-O3 therapy can achieve disc penetration and therapeutic efficacy comparable to intradiscal nucleolysis, while minimizing procedural risk. Literature searches across PubMed, Scopus, and Cochrane databases identified seven eligible studies (four randomized controlled trials and three cohort studies), encompassing a total of 120 patients. Statistical analyses included Hedges' g, odds ratios, and number needed to harm (NNH). CT imaging demonstrated gas migration into the intervertebral disc within minutes after intramuscular injection, confirming the plausibility of diffusion through annular micro-fissures. The meta-analysis revealed substantial pain reduction with intramuscular therapy (Hedges' g = -1.55) and very high efficacy with intradiscal treatment (g = 2.87), though the latter was associated with significantly greater heterogeneity and higher complication rates. The relative risk of severe adverse events was 6.57 times higher for intradiscal procedures (NNH ≈ 1180). Intramuscular O2-O3 therapy thus offers a biologically plausible, safer, and effective alternative to intradiscal injection, supporting its adoption as a first-line, minimally invasive strategy for managing lumbar disc herniation.
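The two summary statistics named in the abstract are standard and easy to reproduce; a sketch with illustrative numbers (not the study's data) follows.

```python
# Sketch of the two summary statistics named in the abstract: Hedges' g
# (small-sample-corrected standardized mean difference) and the number
# needed to harm (NNH) from an absolute risk difference. All the input
# numbers below are illustrative, not the study's data.
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                         / (n1 + n2 - 2))
    d = (m1 - m2) / s_pooled
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)   # correction factor J
    return j * d

def nnh(risk_treated, risk_control):
    """Number needed to harm = 1 / absolute risk increase."""
    return 1.0 / (risk_treated - risk_control)

# Pain scores after vs. before intramuscular therapy (toy numbers):
g = hedges_g(m1=3.1, sd1=1.2, n1=30, m2=6.4, sd2=1.5, n2=30)
print(f"Hedges' g = {g:.2f}")              # negative = pain reduction
print(f"NNH = {nnh(0.0085, 0.0000):.0f}")  # rare-event toy example
```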

Citations: 0
Brain Tumour Classification Model Based on Spatial Block-Residual Block Collaborative Architecture with Strip Pooling Feature Fusion.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-11-29 DOI: 10.3390/jimaging11120427
Meilan Tang, Xinlian Zhou, Zhiyong Li

Precise classification of brain tumors is crucial for early diagnosis and treatment, but obtaining tumor masks is extremely challenging, limiting the application of traditional methods. This paper proposes a brain tumor classification model based on whole-brain images, combining a spatial block-residual block collaborative architecture with strip pooling feature fusion to achieve multi-scale feature representation without requiring tumor masks. The model extracts fine-grained morphological features through three shallow VGG spatial blocks while capturing global contextual information between tumors and surrounding tissues via four deep ResNet residual blocks. Residual connections mitigate gradient vanishing. To effectively fuse multi-level features, strip pooling modules are introduced after the third spatial block and the fourth residual block, enabling cross-layer feature integration and, in particular, improving the representation of irregular tumor regions. The fused features undergo cross-scale concatenation, integrating spatial perception and semantic information, and are ultimately classified by an end-to-end Softmax classifier. Experimental results demonstrate that the model achieves an accuracy of 97.29% in brain tumor image classification tasks, significantly outperforming traditional convolutional neural networks. This validates its effectiveness for high-precision, multi-scale feature learning and classification without brain tumor masks, holding potential clinical application value.
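For readers unfamiliar with strip pooling, the sketch below follows the widely used formulation (pool the feature map into one-pixel horizontal and vertical strips, refine each with a 1D-style convolution, and gate the input); the channel counts and kernel sizes are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a strip pooling module of the kind fused into the network
# after the third spatial block and fourth residual block. Channel
# counts and kernel sizes here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StripPooling(nn.Module):
    """Gates features with long-range context from 1-pixel-wide strips."""
    def __init__(self, channels: int):
        super().__init__()
        self.refine_col = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.refine_row = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        _, _, h, w = x.shape
        col = F.adaptive_avg_pool2d(x, (h, 1))       # vertical strip, H x 1
        row = F.adaptive_avg_pool2d(x, (1, w))       # horizontal strip, 1 x W
        col = self.refine_col(col).expand(-1, -1, h, w)
        row = self.refine_row(row).expand(-1, -1, h, w)
        gate = torch.sigmoid(self.fuse(F.relu(col + row)))
        return x * gate                              # context-gated features

x = torch.randn(1, 64, 32, 32)
print(StripPooling(64)(x).shape)                     # torch.Size([1, 64, 32, 32])
```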

Citations: 0
Optimized Hounsfield Units Transformation for Explainable Temporal Stage-Specific Ischemic Stroke Classification in CT Imaging.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-11-28 DOI: 10.3390/jimaging11120423
Radwan Qasrawi, Suliman Thwib, Ghada Issa, Ibrahem Qdaih, Razan Abu Ghoush, Hamza Arjah

Background: The early and accurate classification of ischemic stroke stages on computed tomography (CT) remains challenging due to subtle attenuation differences and significant scanner variability. This study developed a neural network framework to dynamically optimize Hounsfield Unit (HU) transformations and CLAHE parameters for temporal stage-specific stroke classification.

Methods: We analyzed 1480 CT cases from 68 patients across five stages (hyperacute, acute, subacute, chronic, and normal). The training data were augmented via horizontal flipping and ±7° rotation. A convolutional neural network (CNN) was used to optimize the linear transformation and CLAHE parameters through a combined loss function incorporating the effective measure of enhancement (EME), peak signal-to-noise ratio (PSNR), and regularization. The enhanced images were classified using logistic regression (LR), support vector machines (SVMs), and random forests (RFs) with 25-fold cross-validation. Model interpretability was evaluated using Grad-CAM.

Results: Neural network optimization significantly outperformed static parameters across validation metrics. Deep CLAHE achieved the following accuracies versus static CLAHE: hyperacute (0.9838 vs. 0.9754), acute (0.9904 vs. 0.9873), subacute (0.9948 vs. 0.9825), and chronic (near-perfect 0.9979 vs. 0.9808). Qualitative interpretability analysis confirmed that models focused on clinically relevant regions, with optimized enhancement producing more coherent attention patterns than static methods. Parameter analysis revealed stage-aware adaptation: conservative enhancement in early phases (slope: 1.249-1.257), maximized in subacute (slope: 1.290-1.292), and restrained in the chronic phase (slope: 1.240-1.258), reflecting underlying stroke pathophysiology.

Conclusions: A neural network-optimized framework with interpretability validation provides stage-specific stroke classification that achieves superior performance over static methods. Its pathophysiology-aligned parameter adaptation offers a clinically viable and transparent solution for emergency stroke assessment.
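The enhancement step being optimized can be sketched as a linear Hounsfield-unit transform followed by windowing and CLAHE. In the paper a CNN predicts the parameters per stage via the EME/PSNR loss; the fixed values below (a slope in the reported 1.24-1.29 range, a brain-like window, default-ish CLAHE settings) are illustrative only.

```python
# Sketch of the enhancement step the optimized network parameterizes:
# a linear Hounsfield-unit transform, windowing, then CLAHE. Slope,
# intercept, window, and CLAHE settings are fixed illustrative values;
# in the paper a CNN predicts them per stage via the EME/PSNR loss.
import numpy as np
import cv2

def enhance_ct(hu: np.ndarray, slope=1.25, intercept=0.0,
               window=(0.0, 80.0), clip_limit=2.0, tiles=(8, 8)):
    """Linear HU transform + windowing + CLAHE -> uint8 image."""
    hu = slope * hu.astype(np.float32) + intercept   # learned linear map
    lo, hi = window                                  # brain-like CT window
    img = np.clip((hu - lo) / (hi - lo), 0.0, 1.0)
    img8 = (img * 255).astype(np.uint8)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tiles)
    return clahe.apply(img8)

hu_slice = np.random.uniform(-100, 150, size=(256, 256))  # toy CT slice
out = enhance_ct(hu_slice)
print(out.dtype, out.shape, out.min(), out.max())
```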

Citations: 0
SegClarity: An Attribution-Based XAI Workflow for Evaluating Historical Document Layout Models.
IF 2.7 Q3 IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY Pub Date: 2025-11-28 DOI: 10.3390/jimaging11120424
Iheb Brini, Najoua Rahal, Maroua Mehri, Rolf Ingold, Najoua Essoukri Ben Amara

In recent years, deep learning networks have demonstrated remarkable progress in the semantic segmentation of historical documents. Nonetheless, their limited explainability remains a critical concern, as these models frequently operate as black boxes, thereby constraining confidence in the trustworthiness of their outputs. To enhance transparency and reliability in their deployment, increasing attention has been directed toward explainable artificial intelligence (XAI) techniques. These techniques typically produce fine-grained attribution maps in the form of heatmaps, illustrating feature contributions from different blocks and layers within a deep neural network (DNN). However, such maps often closely resemble the segmentation outputs themselves, and there is currently no consensus regarding appropriate explainability metrics for semantic segmentation. To overcome these challenges, we present SegClarity, a novel workflow designed to integrate explainability into the analysis of historical documents. The workflow combines visual and quantitative evaluations specifically tailored to segmentation-based applications. Furthermore, we introduce the Attribution Concordance Score (ACS), a new explainability metric that provides quantitative insights into the consistency and reliability of attribution maps. To evaluate the effectiveness of our approach, we conducted extensive qualitative and quantitative experiments using two datasets of historical document images, two U-Net model variants, and four attribution-based XAI methods. A qualitative assessment involved four XAI methods across multiple U-Net layers, including comparisons at the input level with state-of-the-art perturbation methods RISE and MiSuRe. Quantitatively, five XAI evaluation metrics were employed to benchmark these approaches comprehensively. Beyond historical document analysis, we further validated the workflow's generalization by demonstrating its transferability to the Cityscapes dataset, a challenging benchmark for urban scene segmentation. The results demonstrate that the proposed workflow substantially improves the interpretability and reliability of deep learning models applied to the semantic segmentation of historical documents. To enhance reproducibility, we have released SegClarity's source code along with interactive examples of the proposed workflow.
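The abstract does not spell out the ACS formula, so the sketch below is only a plausible stand-in in the same spirit: the mean pairwise rank correlation between flattened attribution heatmaps (for example, from different layers or XAI methods). It is an illustration, not the paper's definition.

```python
# The abstract names the Attribution Concordance Score (ACS) without
# giving its formula, so this is only a plausible stand-in in the same
# spirit: mean pairwise Spearman correlation between flattened
# attribution heatmaps. It is not the paper's definition of ACS.
import numpy as np
from scipy.stats import spearmanr
from itertools import combinations

def concordance(maps):
    """Mean pairwise rank correlation of attribution maps, in [-1, 1]."""
    flats = [m.ravel() for m in maps]
    rhos = [spearmanr(a, b)[0] for a, b in combinations(flats, 2)]
    return float(np.mean(rhos))

rng = np.random.default_rng(2)
base = rng.random((64, 64))                       # shared salient pattern
maps = [base + 0.3 * rng.random((64, 64)) for _ in range(3)]
print(f"concordance = {concordance(maps):.3f}")   # high = consistent maps
```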

Citations: 0