
Latest Publications in Computerized Medical Imaging and Graphics

SGRRG: Leveraging radiology scene graphs for improved and abnormality-aware radiology report generation
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-15 | DOI: 10.1016/j.compmedimag.2025.102644
Jun Wang, Lixing Zhu, Abhir Bhalerao, Yulan He
Radiology report generation (RRG) methods often lack sufficient medical knowledge to produce clinically accurate reports. A scene graph provides comprehensive information for describing objects within an image. However, automatically generated radiology scene graphs (RSG) may contain noisy annotations and highly overlapping regions, posing challenges in utilizing the RSG to enhance RRG. To this end, we propose Scene Graph aided RRG (SGRRG), a framework that leverages an automatically generated RSG and copes with noisy supervision in the RSG via a transformer-based module, effectively distilling medical knowledge in an end-to-end manner. SGRRG is composed of a dedicated scene graph encoder responsible for translating the radiograph into an RSG, and a scene graph-aided decoder that takes advantage of both patch-level and region-level visual information and mitigates the noisy annotation problem in the RSG. The incorporation of both patch-level and region-level features, alongside the integration of the essential RSG construction modules, enhances our framework's flexibility and robustness, enabling it to readily exploit prior advanced RRG techniques. A fine-grained, sentence-level attention method is designed to better distill the RSG information. Additionally, we introduce two proxy tasks to enhance the model's ability to produce clinically accurate reports. Extensive experiments demonstrate that SGRRG outperforms previous state-of-the-art methods in report generation and can better capture abnormal findings. Code is available at https://github.com/Markin-Wang/SGRRG.
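As a rough illustration of what sentence-level attention over an RSG might look like, the sketch below attends from sentence embeddings to region features with scaled dot-product attention. Everything here (shapes, names, the attention form) is an assumption made for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sentence_level_attention(sent_emb: torch.Tensor,
                             region_feats: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: each sentence embedding attends over the
    scene-graph region features and pools a sentence-specific visual
    context. sent_emb: (S, D); region_feats: (R, D)."""
    scores = sent_emb @ region_feats.T / region_feats.shape[-1] ** 0.5
    attn = F.softmax(scores, dim=-1)   # (S, R): per-sentence region weights
    return attn @ region_feats         # (S, D): distilled RSG context

# Toy usage: 4 sentences attending over 10 RSG regions of width 256.
ctx = sentence_level_attention(torch.randn(4, 256), torch.randn(10, 256))
```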
Citations: 0
Collect vascular specimens in one cabinet: A hierarchical prompt-guided universal model for 3D vascular segmentation
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-26 | DOI: 10.1016/j.compmedimag.2025.102650
Yinuo Wang, Cai Meng, Zhe Xu
Accurate segmentation of vascular structures in volumetric medical images is critical for disease diagnosis and surgical planning. While deep neural networks have shown remarkable effectiveness, existing methods often rely on separate models tailored to specific modalities and anatomical regions, resulting in redundant parameters and limited generalization. Recent universal models address broader segmentation tasks but struggle with the unique challenges of vascular structures. To overcome these limitations, we first present VasBench, a new comprehensive vascular segmentation benchmark comprising nine sub-datasets spanning diverse modalities and anatomical regions. Building on this foundation, we introduce VasCab, a novel prompt-guided universal model for volumetric vascular segmentation, designed to “collect vascular specimens in one cabinet”. Specifically, VasCab is equipped with learnable domain and topology prompts to capture shared and unique vascular characteristics across diverse data domains, complemented by a morphology perceptual loss to address complex morphological variations. Experimental results demonstrate that VasCab surpasses individual models and state-of-the-art medical foundation models across all test datasets, showcasing exceptional cross-domain integration and precise modeling of vascular morphological variations. Moreover, VasCab exhibits robust performance in downstream tasks, underscoring its versatility and potential for unified vascular analysis. This study marks a significant step toward universal vascular segmentation, offering a promising solution for unified vascular analysis across heterogeneous datasets. Code and dataset are available at https://github.com/mileswyn/VasCab.
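The prompt-guided idea can be sketched compactly: one learnable vector per data domain is injected into shared features before the segmentation head. This is a minimal sketch of the general mechanism under the assumption of additive prompt injection; the module names and injection point are illustrative, not VasCab's actual design.

```python
import torch
import torch.nn as nn

class PromptedSegHead(nn.Module):
    """Minimal sketch of prompt-guided universal segmentation: a learnable
    per-domain prompt is added to shared volumetric features before a
    1x1x1 classification head. Illustrative only."""
    def __init__(self, n_domains: int, dim: int, n_classes: int = 2):
        super().__init__()
        self.domain_prompts = nn.Embedding(n_domains, dim)
        self.head = nn.Conv3d(dim, n_classes, kernel_size=1)

    def forward(self, feats: torch.Tensor, domain_id: torch.Tensor):
        # feats: (B, C, D, H, W); domain_id: (B,) source-domain index
        prompt = self.domain_prompts(domain_id)[:, :, None, None, None]
        return self.head(feats + prompt)

# Toy usage: two samples from domains 0 and 3 of a 9-domain benchmark.
logits = PromptedSegHead(9, 32)(torch.randn(2, 32, 8, 16, 16),
                                torch.tensor([0, 3]))
```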
Citations: 0
Enhancing intracranial vessel segmentation using diffusion models without manual annotation for 3D Time-of-Flight Magnetic Resonance Angiography
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-30 | DOI: 10.1016/j.compmedimag.2025.102651
Jonghun Kim, Inye Na, Jiwon Chung, Ha-Na Song, Kyungseo Kim, Seongvin Ju, Mi-Yeon Eun, Woo-Keun Seo, Hyunjin Park
Intracranial vessel segmentation is essential for managing brain disorders, facilitating early detection and precise intervention for stroke and aneurysm. Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) is a commonly used vascular imaging technique for segmenting brain vessels. Traditional rule-based MRA segmentation methods are efficient but suffer from instability and poor performance. Deep learning models, including diffusion models, have recently gained attention in medical image segmentation. However, they require ground truth for training, which is labor-intensive and time-consuming to obtain. We propose a novel segmentation method that combines the strengths of rule-based and diffusion models to improve segmentation without relying on explicit labels. Our model adopts a Frangi filter to help with vessel detection and modifies the diffusion models to exclude memory-intensive attention modules to improve efficiency. Our condition network concatenates the feature maps to further enhance the segmentation process. Quantitative and qualitative evaluations on two datasets demonstrate that our approach not only maintains the integrity of the vascular regions but also substantially reduces noise, offering a robust solution for segmenting intracranial vessels. Our results provide a basis for improved patient care in disorders involving brain vessels. Our code is available at github.com/jongdory/Vessel-Diffusion.
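The Frangi filter mentioned here is a classical Hessian-based vesselness measure with an off-the-shelf implementation in scikit-image. The snippet below shows one plausible way a rule-based vesselness prior could be produced from a TOF-MRA volume; the threshold and sigma range are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from skimage.filters import frangi

# Stand-in TOF-MRA volume; in practice this would be a loaded NIfTI array.
volume = np.random.rand(64, 128, 128).astype(np.float32)

# Multi-scale vesselness: bright tubular structures on a dark background.
vesselness = frangi(volume, sigmas=range(1, 4), black_ridges=False)

# Assumed threshold to turn the vesselness map into a rough pseudo-label.
pseudo_label = vesselness > 0.05
```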
Citations: 0
An arbitrary-modal fusion network for volumetric cranial nerves tract segmentation
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-08-30 | DOI: 10.1016/j.compmedimag.2025.102635
Lei Xie, Huajun Zhou, Junxiong Huang, Qingrun Zeng, Jiahao Huang, Jianzhong He, Jiawei Zhang, Baohua Fan, Mingchu Li, Guoqiang Xie, Hao Chen, Yuanjing Feng
The segmentation of cranial nerve (CN) tracts provides a valuable quantitative tool for analyzing the morphology and trajectory of individual CNs. Multimodal CN segmentation networks, e.g., CNTSeg, which combine structural Magnetic Resonance Imaging (MRI) and diffusion MRI, have achieved promising segmentation performance. However, it is laborious or even infeasible to collect complete multimodal data in clinical practice due to limitations in equipment, user privacy, and working conditions. In this work, we propose a novel arbitrary-modal fusion network for volumetric CN segmentation, called CNTSeg-v2, which trains one model to handle different combinations of available modalities. Instead of directly combining all the modalities, we select T1-weighted (T1w) images as the primary modality, since they are simple to acquire and contribute most to the results, and use them to supervise the information selection of the other auxiliary modalities. Our model encompasses an Arbitrary-Modal Collaboration Module (ACM) designed to effectively extract informative features from the auxiliary modalities, guided by the supervision of T1w images. Meanwhile, we construct a Deep Distance-guided Multi-stage (DDM) decoder to correct small errors and discontinuities through signed distance maps to improve segmentation accuracy. We evaluate our CNTSeg-v2 on the Human Connectome Project (HCP) dataset and the clinical Multi-shell Diffusion MRI (MDM) dataset. Extensive experimental results show that our CNTSeg-v2 achieves state-of-the-art segmentation performance, outperforming all competing methods.
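Signed distance maps, the quantity the DDM decoder is described as using, are straightforward to compute from a binary mask: negative inside, positive outside, zero at the boundary. A minimal sketch with SciPy follows; the normalization to [-1, 1] is an assumption.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance map of a binary mask: negative inside the object,
    positive outside, zero on the boundary, scaled to [-1, 1]."""
    mask = mask.astype(bool)
    if not mask.any() or mask.all():   # degenerate mask: no boundary exists
        return np.zeros(mask.shape, dtype=np.float32)
    inside = distance_transform_edt(mask)
    outside = distance_transform_edt(~mask)
    sdm = outside - inside
    return (sdm / np.abs(sdm).max()).astype(np.float32)
```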
Citations: 0
A self-attention model for robust rigid slice-to-volume registration of functional MRI
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-13 | DOI: 10.1016/j.compmedimag.2025.102643
Samah Khawaled, Onur Afacan, Simon K. Warfield, Moti Freiman
Functional Magnetic Resonance Imaging (fMRI) is vital in neuroscience, enabling investigations into brain disorders, treatment monitoring, and brain function mapping. However, head motion during fMRI scans, occurring between shots of slice acquisition, can result in distortion, biased analyses, and increased costs due to the need for scan repetitions. Therefore, retrospective slice-level motion correction through slice-to-volume registration (SVR) is crucial. Previous studies have utilized deep learning (DL) based models to address the SVR task; however, they overlooked the uncertainty stemming from the input stack of slices and did not assign weighting or scoring to each slice. Treating all slices equally ignores the variability in their relevance, leading to suboptimal predictions. In this work, we introduce an end-to-end SVR model for aligning 2D fMRI slices with a 3D reference volume, incorporating a self-attention mechanism to enhance robustness against input data variations and uncertainties. Our SVR model utilizes independent slice and volume encoders and a self-attention module to assign pixel-wise scores for each slice. We used the publicly available Healthy Brain Network (HBN) dataset. We split the volumes into training (64%), validation (16%), and test (20%) sets. To conduct the simulated motion study, we synthesized rigid transformations across a wide range of parameters and applied them to the reference volumes. Slices were then sampled according to the acquisition protocol to generate 2,000, 500, and 200 3D volume–2D slice pairs for the training, validation, and test sets, respectively. Our experimental results demonstrate that our model achieves competitive performance in terms of alignment accuracy compared to state-of-the-art deep learning-based methods (Euclidean distance of 0.93 mm vs. 1.86 mm; paired t-test, p<0.03). Furthermore, our approach exhibits faster registration speed compared to conventional iterative methods (0.096 s vs. 1.17 s). Our end-to-end SVR model facilitates real-time head motion tracking during fMRI acquisition, ensuring reliability and robustness against uncertainties in the inputs.
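The simulated-motion setup lends itself to a short sketch: sample a random rigid transform and apply it to the reference volume before resampling slices. The parameter ranges below are illustrative assumptions, not the study's actual sampling scheme.

```python
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(seed=0)

def random_rigid_transform(max_angle_deg: float = 15.0,
                           max_shift_mm: float = 10.0) -> np.ndarray:
    """Return a random 4x4 homogeneous rigid transform (rotation about
    x, y, z plus translation) with illustrative parameter bounds."""
    angles = rng.uniform(-max_angle_deg, max_angle_deg, size=3)
    shift = rng.uniform(-max_shift_mm, max_shift_mm, size=3)
    T = np.eye(4)
    T[:3, :3] = Rotation.from_euler("xyz", angles, degrees=True).as_matrix()
    T[:3, 3] = shift
    return T
```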
Citations: 0
Inference time correction based on confidence and uncertainty for improved deep-learning model performance and explainability in medical image classification
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-08-13 | DOI: 10.1016/j.compmedimag.2025.102630
Joel Jeffrey, Ashwin RajKumar, Sudhanshu Pandey, Lokesh Bathala, Phaneendra K. Yalavarthy
The major challenge faced by artificial intelligence (AI) models for medical image analysis is the class imbalance of training data and limited explainability. This study introduces a Confidence and Entropy-based Uncertainty Thresholding Algorithm (CEbUTAl), a novel post-processing method designed to enhance both model performance and explainability. CEbUTAl modifies model predictions during inference, based on uncertainty and confidence measures, to improve classification in scenarios with class imbalance. CEbUTAl's inference-time correction addresses explainability while simultaneously improving performance, contrary to the prevailing notion that explainability necessitates a compromise in performance. The algorithm was evaluated across five medical imaging tasks: intracranial hemorrhage detection, optical coherence tomography analysis, breast cancer detection, carpal tunnel syndrome detection, and multi-class skin lesion classification. Results demonstrate that CEbUTAl improves accuracy by approximately 5% and increases sensitivity across multiple deep learning architectures, loss functions, and tasks. Comparative studies indicate that CEbUTAl outperforms state-of-the-art methods in addressing class imbalance and quantifying uncertainty. The model-agnostic, task-agnostic, and post-processing nature of CEbUTAl makes it appealing for enhancing both performance and trustworthiness in medical image analysis. This study provides a generalizable approach to mitigate biases arising from class imbalance, while improving the explainability of AI models, thus increasing their utility in clinical practice.
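Although the paper's exact rule is not reproduced here, the general idea of confidence- and entropy-based inference-time correction can be sketched: flag predictions whose softmax confidence is low and whose normalized entropy is high, then route those cases to the minority class to raise sensitivity. The thresholds and reassignment rule below are assumptions, not CEbUTAl's published procedure.

```python
import numpy as np

def uncertainty_corrected_predictions(probs: np.ndarray,
                                      conf_thresh: float = 0.9,
                                      ent_thresh: float = 0.5,
                                      minority_class: int = 1):
    """Hypothetical sketch of inference-time correction: probs is an
    (N, C) array of softmax outputs; uncertain samples are reassigned
    to the minority class. Not the paper's exact rule."""
    conf = probs.max(axis=1)                            # top-class confidence
    ent = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # predictive entropy
    ent /= np.log(probs.shape[1])                       # normalize to [0, 1]
    pred = probs.argmax(axis=1)
    uncertain = (conf < conf_thresh) & (ent > ent_thresh)
    pred[uncertain] = minority_class
    return pred, uncertain
```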
Citations: 0
Unveiling hidden risks: A Holistically-Driven Weak Supervision framework for ultra-short-term ACS prediction using CCTA
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-15 | DOI: 10.1016/j.compmedimag.2025.102636
Zhen Liu, Bangkang Fu, Jiahui Mao, Junjie He, Jiangyue Xiang, Hongjin Li, Yunsong Peng, Bangguo Li, Rongpin Wang
This paper proposes MH-STR, a novel end-to-end framework for predicting the three-month risk of Acute Coronary Syndrome (ACS) from Coronary CT Angiography (CCTA) images. The model combines hybrid attention mechanisms with convolutional networks to capture subtle and irregular lesion patterns that are difficult to detect visually. A stage-wise transfer learning strategy helps distill general features and transfer vascular-specific knowledge. To reconcile feature scale mismatches in the dual-branch architecture, we introduce a wavelet-based multi-scale fusion module for effective integration across scales. Experiments show that MH-STR achieves an AUC of 0.834, an F1 score of 0.82, and a precision of 0.92, outperforming existing methods and highlighting its potential for improving ACS risk prediction.
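A wavelet-based multi-scale fusion step of the kind described can be sketched with PyWavelets: decompose each branch's feature map into sub-bands, fuse band-by-band, and reconstruct. The Haar wavelet and the average/maximum fusion rules are assumptions; MH-STR's actual module is not specified here.

```python
import numpy as np
import pywt

def wavelet_fuse(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """Illustrative single-level 2D DWT fusion of two feature maps of the
    same (even) spatial size: average the low-frequency bands, keep the
    stronger response in each high-frequency band."""
    ll_a, highs_a = pywt.dwt2(feat_a, "haar")   # (LL, (LH, HL, HH))
    ll_b, highs_b = pywt.dwt2(feat_b, "haar")
    ll = 0.5 * (ll_a + ll_b)
    highs = tuple(np.maximum(ha, hb) for ha, hb in zip(highs_a, highs_b))
    return pywt.idwt2((ll, highs), "haar")

# Toy usage: fuse two 32x32 single-channel feature maps.
fused = wavelet_fuse(np.random.rand(32, 32), np.random.rand(32, 32))
```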
Citations: 0
SSIFNet: Spatial–temporal stereo information fusion network for self-supervised surgical video inpainting
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-08-25 | DOI: 10.1016/j.compmedimag.2025.102622
Xiaoyang Zou, Zhuyuan Zhang, Derong Yu, Wenyuan Sun, Wenyong Liu, Donghua Hang, Wei Bao, Guoyan Zheng
During minimally invasive robot-assisted surgical procedures, surgeons rely on stereo endoscopes to provide image guidance. Nevertheless, the field-of-view is typically restricted owing to the limited size of the endoscope and constrained workspace. Such a visualization challenge becomes even more severe when surgical instruments are inserted into the already restricted field-of-view, where important anatomical landmarks and relevant clinical contents may become occluded by the inserted instruments. To address the challenge, in this work, we propose a novel end-to-end trainable spatial–temporal stereo information fusion network, referred to as SSIFNet, for inpainting surgical videos of the surgical scene under instrument occlusions in robot-assisted endoscopic surgery. The proposed SSIFNet features three essential modules including a novel optical flow-guided deformable feature propagation (OFDFP) module, a novel spatial–temporal stereo focal transformer (S2FT)-based information fusion module, and a novel stereo-consistency enforcement (SE) module. These three modules work synergistically to inpaint occluded regions in the surgical scene. More importantly, SSIFNet is trained in a self-supervised manner on simulated occlusions with a novel loss function, designed to combine flow completion, disparity matching, cross-warping consistency, warping-consistency, image, and adversarial loss terms to generate high-fidelity and accurate occlusion reconstructions in both views. After training, the trained model can be applied directly to inpainting surgical videos with true instrument occlusions to generate results with not only spatial and temporal consistency but also stereo-consistency. Comprehensive quantitative and qualitative experimental results demonstrate that SSIFNet outperforms state-of-the-art (SOTA) video inpainting methods. The source code of this study will be released at https://github.com/SHAUNZXY/SSIFNet.
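The multi-term training objective described above can be sketched as a weighted sum. The weights and the L1 image term below are assumptions made for illustration; the paper's exact formulation of each term may differ.

```python
import torch
import torch.nn.functional as F

def inpainting_loss(pred, target, flow_term, disparity_term,
                    warp_term, adv_term,
                    weights=(1.0, 1.0, 1.0, 1.0, 0.01)):
    """Hypothetical weighted combination of the loss terms named in the
    abstract: image reconstruction (L1 here), flow completion, disparity
    matching, warping consistency, and an adversarial term."""
    image_term = F.l1_loss(pred, target)
    terms = (image_term, flow_term, disparity_term, warp_term, adv_term)
    return sum(w * t for w, t in zip(weights, terms))
```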
Citations: 0
SA2Net: Scale-adaptive structure-affinity transformation for spine segmentation from ultrasound volume projection imaging
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-25 | DOI: 10.1016/j.compmedimag.2025.102649
Hao Xie, Zixun Huang, Yushen Zuo, Yakun Ju, Frank H.F. Leung, N.F. Law, Kin-Man Lam, Yong-Ping Zheng, Sai Ho Ling
Spine segmentation, based on ultrasound volume projection imaging (VPI), plays a vital role in intelligent scoliosis diagnosis in clinical applications. However, this task faces several significant challenges. Firstly, the global contextual knowledge of spines may not be well-learned if we neglect the high spatial correlation of different bone features. Secondly, the spine bones contain rich structural knowledge regarding their shapes and positions, which deserves to be encoded into the segmentation process. To address these challenges, we propose a novel scale-adaptive structure-aware network (SA2Net) for effective spine segmentation. First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images. Second, motivated by the consistency between multi-head self-attention in Transformers and semantic-level affinity, we propose a structure-affinity transformation that transforms semantic features with class-specific affinity and combines it with a Transformer decoder for structure-aware reasoning. In addition, we adopt a feature mixing loss aggregation method to enhance model training. This method improves the robustness and accuracy of the segmentation process. The experimental results demonstrate that our SA2Net achieves superior segmentation performance compared to other state-of-the-art methods. Moreover, the adaptability of SA2Net to various backbones enhances its potential as a promising tool for advanced scoliosis diagnosis using intelligent spinal image analysis.
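The structure-affinity idea — propagating semantic features through a pixel-affinity matrix such as averaged multi-head self-attention maps — can be sketched in a few lines. The row-softmax normalization and the shapes are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def structure_affinity_transform(sem_feats: torch.Tensor,
                                 affinity: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch: sem_feats (B, N, C) are per-pixel semantic
    features; affinity (B, N, N) holds pairwise affinities (e.g., averaged
    self-attention maps). Each pixel aggregates semantics from pixels it
    is structurally affine to."""
    weights = F.softmax(affinity, dim=-1)  # normalize each pixel's row
    return torch.bmm(weights, sem_feats)   # (B, N, C) structure-aware feats

# Toy usage: batch of 2, 64 pixels, 16 channels.
out = structure_affinity_transform(torch.randn(2, 64, 16),
                                   torch.randn(2, 64, 64))
```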
Citations: 0
Mamba-based context-aware local feature network for vessel detail enhancement
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2025-10-01 | Epub Date: 2025-09-12 | DOI: 10.1016/j.compmedimag.2025.102645
Keyi Han, Anqi Xiao, Jie Tian, Zhenhua Hu

Objective

Blood vessel analysis is essential in various clinical fields. Detailed vascular imaging enables clinicians to assess abnormalities and make timely, effective interventions. Near-infrared-II (NIR-II, 1000–1700 nm) fluorescence imaging offers superior resolution, sensitivity, and deeper tissue visualization, making it highly promising for vascular imaging. However, deep vessels exhibit relatively low contrast, making differentiation challenging, and accurate vessel segmentation remains a difficult task.

Methods

We propose CALFNet, a context-aware local feature network based on the Mamba module, which can segment more vascular details in low-contrast regions. Overall, CALFNet follows a UNet-like architecture, with a ResNet-based encoder for extracting local features and a Mamba-based context-aware module in the latent space for capturing global context. By incorporating global vessel contextual information, the network can enhance segmentation performance in locally low-contrast areas, capturing finer vessel structures more effectively. Furthermore, a feature-enhance module between the encoder and decoder is designed to preserve critical historical local features from the encoder and use them to further refine the vascular details in the decoder's feature representations.
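A context-aware bottleneck of this shape can be sketched generically: flatten the latent feature map to a token sequence, run a sequence model over it, and fuse the result back residually. A GRU stands in below for the Mamba block, which is not reimplemented here; the whole snippet is illustrative, not CALFNet's implementation.

```python
import torch
import torch.nn as nn

class ContextAwareBottleneck(nn.Module):
    """Illustrative latent-space context module: tokens scanned by a
    sequence model (GRU as a stand-in for Mamba), fused residually."""
    def __init__(self, dim: int):
        super().__init__()
        self.seq = nn.GRU(dim, dim, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape                   # x: (B, C, H, W) latent map
        tokens = x.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        ctx, _ = self.seq(tokens)              # global context per token
        return x + ctx.transpose(1, 2).reshape(b, c, h, w)

# Toy usage on an 8x8 latent map with 32 channels.
y = ContextAwareBottleneck(32)(torch.randn(2, 32, 8, 8))
```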

Results

We conducted experiments on two types of clinical datasets, including an NIR-II fluorescent vascular imaging dataset and retinal vessel datasets captured under visible light. The results show that CALFNet outperforms the comparison methods, demonstrating superior robustness and achieving more accurate vessel segmentation, particularly in low-contrast regions.

Conclusion and Significance

CALFNet is an effective vessel segmentation network that performs particularly well at accurately segmenting vessels in low-contrast regions. It can enhance the capability of NIR-II fluorescence imaging for vascular analysis, providing valuable support for clinical diagnosis and medical intervention.
Citations: 0