
Latest publications in Computerized Medical Imaging and Graphics

Path and bone-contour regularized unpaired MRI-to-CT translation
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-13 · DOI: 10.1016/j.compmedimag.2025.102656
Teng Zhou, Jax Luo, Yuping Sun, Yiheng Tan, Shun Yao, Nazim Haouchine, Scott Raymond
Accurate MRI-to-CT translation promises the integration of complementary imaging information without the need for additional imaging sessions. Given the practical challenges associated with acquiring paired MRI and CT scans, the development of robust methods capable of leveraging unpaired datasets is essential for advancing MRI-to-CT translation. Current unpaired MRI-to-CT translation methods, which predominantly rely on cycle consistency and contrastive learning frameworks, frequently encounter challenges in accurately translating anatomical features that are highly discernible on CT but less distinguishable on MRI, such as bone structures. This limitation renders these approaches less suitable for applications in radiation therapy, where precise bone representation is essential for accurate treatment planning. To address this challenge, we propose a path- and bone-contour regularized approach for unpaired MRI-to-CT translation. In our method, MRI and CT images are projected to a shared latent space, where the MRI-to-CT mapping is modeled as a continuous flow governed by neural ordinary differential equations. The optimal mapping is obtained by minimizing the transition path length of the flow. To enhance the accuracy of translated bone structures, we introduce a trainable neural network to generate bone contours from MRI and implement mechanisms to directly and indirectly encourage the model to focus on bone contours and their adjacent regions. Evaluations conducted on three datasets demonstrate that our method outperforms existing unpaired MRI-to-CT translation approaches, achieving lower overall error rates. Moreover, in a downstream bone segmentation task, our approach exhibits superior performance in preserving the fidelity of bone structures. Our code is available at: https://github.com/kennysyp/PaBoT.
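As a rough illustration of the flow formulation described above, the sketch below integrates a learned velocity field over latent codes with fixed-step Euler and penalizes the transition path length. All names, the MLP field, and the regularization weight are our own assumptions for exposition, not the released PaBoT code.

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Time-conditioned velocity field f(z, t) over latent codes."""
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        t_feat = t.expand(z.shape[0], 1)          # broadcast time to the batch
        return self.net(torch.cat([z, t_feat], dim=1))

def integrate_with_path_length(field, z0, steps: int = 10):
    """Euler integration of dz/dt = f(z, t), accumulating the path-length
    integral sum ||f(z, t)|| * dt, which serves as the flow regularizer."""
    dt = 1.0 / steps
    z, path_len = z0, z0.new_zeros(z0.shape[0])
    for k in range(steps):
        t = z0.new_tensor([[k * dt]])
        v = field(z, t)
        path_len = path_len + v.norm(dim=1) * dt
        z = z + v * dt
    return z, path_len

if __name__ == "__main__":
    field = VelocityField(dim=64)
    z_mri = torch.randn(8, 64)                    # latent codes of MRI images
    z_ct_hat, plen = integrate_with_path_length(field, z_mri)
    recon = (z_ct_hat - torch.randn(8, 64)).pow(2).mean()  # stand-in CT target
    loss = recon + 0.1 * plen.mean()              # path weight is a guess
    loss.backward()
```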
Citations: 0
ESAM2-BLS: Enhanced segment anything model 2 for efficient breast lesion segmentation in ultrasound imaging
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-10 · DOI: 10.1016/j.compmedimag.2025.102654
Lishuang Guo, Haonan Zhang, Chenbin Ma
Ultrasound imaging, as an economical, efficient, and non-invasive diagnostic tool, is widely used for breast lesion screening and diagnosis. However, the segmentation of lesion regions remains a significant challenge due to factors such as noise interference and the variability in image quality. To address this issue, we propose a novel deep learning model named enhanced segment anything model 2 (SAM2) for breast lesion segmentation (ESAM2-BLS). This model is an optimized version of the SAM2 architecture. ESAM2-BLS customizes and fine-tunes the pre-trained SAM2 model by introducing an adapter module, specifically designed to accommodate the unique characteristics of breast ultrasound images. The adapter module directly addresses ultrasound-specific challenges including speckle noise, low contrast boundaries, shadowing artifacts, and anisotropic resolution through targeted architectural elements such as channel attention mechanisms, specialized convolution kernels, and optimized skip connections. This optimization significantly improves segmentation accuracy, particularly for low-contrast and small lesion regions. Compared to traditional methods, ESAM2-BLS fully leverages the generalization capabilities of large models while incorporating multi-scale feature fusion and axial dilated depthwise convolution to effectively capture multi-level information from complex lesions. During the decoding process, the model enhances the identification of fine boundaries and small lesions through depthwise separable convolutions and skip connections, while maintaining a low computational cost. Visualization of the segmentation results and interpretability analysis demonstrate that ESAM2-BLS achieves average Dice scores of 0.9077 and 0.8633 under five-fold cross-validation on two datasets comprising over 1,600 patients. These results significantly improve segmentation accuracy and robustness. This model provides an efficient, reliable, and specialized automated solution for early breast cancer screening and diagnosis.
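A minimal sketch of the adapter-based fine-tuning pattern described above: a lightweight bottleneck with a channel-attention gate is wrapped around a frozen pre-trained block, so only the adapter is trained. Module names, dimensions, and the placement are assumptions, not the ESAM2-BLS implementation.

```python
import torch
import torch.nn as nn

class ChannelAttentionAdapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        # Squeeze-and-excitation style gate over the channel dimension.
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // 4), nn.ReLU(),
            nn.Linear(dim // 4, dim), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (B, N, C)
        h = self.up(self.act(self.down(x)))
        w = self.gate(x.mean(dim=1, keepdim=True))          # (B, 1, C) channel weights
        return x + h * w                                    # residual update

class FrozenBlockWithAdapter(nn.Module):
    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():                   # freeze pre-trained weights
            p.requires_grad = False
        self.adapter = ChannelAttentionAdapter(dim)

    def forward(self, x):
        return self.adapter(self.block(x))

if __name__ == "__main__":
    backbone_block = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
    layer = FrozenBlockWithAdapter(backbone_block, dim=256)
    tokens = torch.randn(2, 196, 256)    # patch tokens from an ultrasound image
    print(layer(tokens).shape)           # torch.Size([2, 196, 256])
```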
Citations: 0
Trends and applications of variational autoencoders in medical imaging analysis
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-09 · DOI: 10.1016/j.compmedimag.2025.102647
Pauline Shan Qing Yeoh, Khairunnisa Hasikin, Xiang Wu, Siew Li Goh, Khin Wee Lai
Automated medical imaging analysis plays a crucial role in modern healthcare, with deep learning emerging as a widely adopted solution. However, traditional supervised learning methods often struggle to achieve optimal performance due to increasing challenges such as data scarcity and variability. In response, generative artificial intelligence has gained significant attention, particularly Variational Autoencoders (VAEs), which have been extensively utilized to address various challenges in medical imaging. This review analyzed 118 articles published in the Web of Science database between 2018 and 2024. Bibliometric analysis was conducted to map research trends, while a curated compilation of datasets and evaluation metrics was extracted to underscore the importance of standardization in deep learning workflows. VAEs have been applied across multiple healthcare applications, including anomaly detection, segmentation, classification, synthesis, registration, harmonization, and clustering. Findings suggest that VAE-based models are increasingly applied in medical imaging, with Magnetic Resonance Imaging emerging as the dominant modality and image synthesis as a primary application. The growing interest in this field highlights the potential of VAEs to enhance medical imaging analysis by overcoming existing limitations in data-driven healthcare solutions. This review serves as a valuable resource for researchers looking to integrate VAE models into healthcare applications, offering an overview of current advancements.
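For readers new to the model family this review surveys, here is a minimal, generic VAE sketch: an encoder predicts a mean and log-variance, the reparameterization trick samples a latent code, and the loss combines reconstruction with the KL term of the ELBO. This is the standard textbook formulation, not code from any reviewed paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim: int = 784, latent: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent)
        self.logvar = nn.Linear(256, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    rec = F.mse_loss(x_hat, x, reduction="mean")              # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL term
    return rec + kl

if __name__ == "__main__":
    model = VAE()
    x = torch.rand(16, 784)               # e.g. flattened 28x28 image patches
    x_hat, mu, logvar = model(x)
    vae_loss(x, x_hat, mu, logvar).backward()
```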
Citations: 0
Twin-ViMReg: DXR driven synthetic dynamic Standing-CBCTs through Twin Vision Mamba-based 2D/3D registration
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102648
Jiashun Wang, Hao Tang, Zhan Wu, Yikun Zhang, Yan Xi, Yang Chen, Chunfeng Yang, Yixin Zhou, Hui Tang
Medical imaging of the knee joint under physiological weight bearing is crucial for diagnosing and analyzing knee lesions. Existing modalities have limitations: Standing Cone-Beam Computed Tomography (Standing-CBCT) provides high-resolution 3D data but with long acquisition time and only a single static view, while Dynamic X-ray Imaging (DXR) captures continuous motion but lacks 3D structural information. These limitations motivate the need for dynamic 3D knee generation through 2D/3D registration of Standing-CBCT and DXR. Anatomically, although the femur, patella, and tibia–fibula undergo rigid motion, the joint as a whole exhibits non-rigid behavior. Consequently, existing rigid or non-rigid 2D/3D registration methods fail to fully address this scenario. We propose Twin-ViMReg, a twin-stream 2D/3D registration framework for multiple correlated objects in the knee joint. It extends the conventional 2D/3D registration paradigm by establishing a pair of twinned sub-tasks. By introducing a Multi-Objective Spatial Transformation (MOST) module, it models inter-object correlations and enhances registration robustness. The Vision Mamba-based encoder also strengthens the representation capacity of the method. We used 1,500 simulated data pairs from 10 patients for training and 56 real data pairs from 3 patients for testing. Quantitative evaluation shows that the mean TRE reached 3.36 mm, and the RSR was 8.93% higher than that of the SOTA methods. With an average computation time of 1.22 s per X-ray image, Twin-ViMReg enables efficient 2D/3D knee joint registration within seconds, making it a practical and promising solution.
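A hedged sketch of the multi-object rigid transformation idea: each bone gets its own rigid parameters, the masked sub-volumes are warped independently, and the results are recombined. The per-bone decomposition, the z-axis rotation parameterization, and the additive recombination below are our own assumptions for illustration, not the authors' MOST implementation.

```python
import torch
import torch.nn.functional as F

def rigid_theta(angle_z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Build a (1, 3, 4) affine matrix: rotation about z plus translation."""
    c, s = torch.cos(angle_z), torch.sin(angle_z)
    R = torch.stack([
        torch.stack([c, -s, torch.zeros_like(c)]),
        torch.stack([s,  c, torch.zeros_like(c)]),
        torch.stack([torch.zeros_like(c), torch.zeros_like(c), torch.ones_like(c)]),
    ])                                                        # (3, 3)
    return torch.cat([R, t.view(3, 1)], dim=1).unsqueeze(0)   # (1, 3, 4)

def warp_objects(volume, masks, angles, trans):
    """volume: (1, 1, D, H, W); masks: dict name -> binary (1, 1, D, H, W).
    Each bone is warped with its own rigid transform, then summed back."""
    out = torch.zeros_like(volume)
    for name, mask in masks.items():
        theta = rigid_theta(angles[name], trans[name])
        grid = F.affine_grid(theta, volume.shape, align_corners=False)
        out = out + F.grid_sample(volume * mask, grid, align_corners=False)
    return out

if __name__ == "__main__":
    vol = torch.rand(1, 1, 32, 64, 64)
    masks = {k: (torch.rand(1, 1, 32, 64, 64) > 0.5).float()
             for k in ("femur", "patella", "tibia_fibula")}
    angles = {k: torch.tensor(0.05) for k in masks}           # small rotations
    trans = {k: torch.tensor([0.02, 0.0, 0.01]) for k in masks}
    print(warp_objects(vol, masks, angles, trans).shape)      # (1, 1, 32, 64, 64)
```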
Citations: 0
Collect vascular specimens in one cabinet: A hierarchical prompt-guided universal model for 3D vascular segmentation
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102650
Yinuo Wang, Cai Meng, Zhe Xu
Accurate segmentation of vascular structures in volumetric medical images is critical for disease diagnosis and surgical planning. While deep neural networks have shown remarkable effectiveness, existing methods often rely on separate models tailored to specific modalities and anatomical regions, resulting in redundant parameters and limited generalization. Recent universal models address broader segmentation tasks but struggle with the unique challenges of vascular structures. To overcome these limitations, we first present VasBench, a new comprehensive vascular segmentation benchmark comprising nine sub-datasets spanning diverse modalities and anatomical regions. Building on this foundation, we introduce VasCab, a novel prompt-guided universal model for volumetric vascular segmentation, designed to “collect vascular specimens in one cabinet”. Specifically, VasCab is equipped with learnable domain and topology prompts to capture shared and unique vascular characteristics across diverse data domains, complemented by morphology perceptual loss to address complex morphological variations. Experimental results demonstrate that VasCab surpasses individual models and state-of-the-art medical foundation models across all test datasets, showcasing exceptional cross-domain integration and precise modeling of vascular morphological variations. Moreover, VasCab exhibits robust performance in downstream tasks, underscoring its versatility and potential for unified vascular analysis. This study marks a significant step toward universal vascular segmentation, offering a promising solution for unified vascular analysis across heterogeneous datasets. Code and dataset are available at https://github.com/mileswyn/VasCab.
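A minimal sketch of the learnable-prompt mechanism referenced above: per-domain and shared topology prompt tokens are prepended to image tokens before a transformer encoder, then stripped afterwards. Token counts, the encoder depth, and the wiring are assumptions, not the released VasCab code.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, dim=256, n_domains=9, n_domain_tok=4, n_topo_tok=4, depth=4):
        super().__init__()
        # One prompt bank per data domain plus shared topology prompts.
        self.domain_prompts = nn.Parameter(torch.randn(n_domains, n_domain_tok, dim) * 0.02)
        self.topo_prompts = nn.Parameter(torch.randn(n_topo_tok, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens: torch.Tensor, domain_id: int) -> torch.Tensor:
        b = tokens.shape[0]
        dom = self.domain_prompts[domain_id].expand(b, -1, -1)   # domain-specific
        topo = self.topo_prompts.expand(b, -1, -1)               # shared across domains
        x = self.encoder(torch.cat([dom, topo, tokens], dim=1))
        n_prompt = dom.shape[1] + topo.shape[1]
        return x[:, n_prompt:]          # drop prompts, keep refined image tokens

if __name__ == "__main__":
    enc = PromptedEncoder()
    patches = torch.randn(2, 512, 256)  # tokens from one volumetric scan, say
    print(enc(patches, domain_id=3).shape)   # torch.Size([2, 512, 256])
```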
Citations: 0
Enhancing intracranial vessel segmentation using diffusion models without manual annotation for 3D Time-of-Flight Magnetic Resonance Angiography
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102651
Jonghun Kim, Inye Na, Jiwon Chung, Ha-Na Song, Kyungseo Kim, Seongvin Ju, Mi-Yeon Eun, Woo-Keun Seo, Hyunjin Park
Intracranial vessel segmentation is essential for managing brain disorders, facilitating early detection and precise intervention of stroke and aneurysm. Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) is a commonly used vascular imaging technique for segmenting brain vessels. Traditional rule-based MRA segmentation methods are efficient but suffer from instability and poor performance. Deep learning models, including diffusion models, have recently gained attention in medical image segmentation. However, they require ground truth for training, which is labor-intensive and time-consuming to obtain. We propose a novel segmentation method that combines the strengths of rule-based and diffusion models to improve segmentation without relying on explicit labels. Our model adopts a Frangi filter to help with vessel detection and modifies the diffusion models to exclude memory-intensive attention modules to improve efficiency. Our condition network concatenates the feature maps to further enhance the segmentation process. Quantitative and qualitative evaluations on two datasets demonstrate that our approach not only maintains the integrity of the vascular regions but also substantially reduces noise, offering a robust solution for segmenting intracranial vessels. Our results suggest a basis for improved patient care in disorders involving brain vessels. Our code is available at github.com/jongdory/Vessel-Diffusion.
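A small sketch of the rule-based conditioning step: Frangi vesselness computed on a TOF-MRA volume is binarized into a rough pseudo-mask that could condition a learned model. The threshold and sigma range are assumptions; only the use of a Frangi filter is stated in the abstract.

```python
import numpy as np
from skimage.filters import frangi

def frangi_pseudo_mask(volume: np.ndarray, threshold: float = 0.15) -> np.ndarray:
    """volume: 3D TOF-MRA array with intensities in [0, 1]; returns a binary mask."""
    # Vessels are bright on TOF-MRA -> black_ridges=False; multi-scale sigmas
    # cover thin cortical vessels up to larger arteries.
    vesselness = frangi(volume, sigmas=(1, 2, 3), black_ridges=False)
    vesselness = vesselness / (vesselness.max() + 1e-8)   # normalize to [0, 1]
    return (vesselness > threshold).astype(np.uint8)

if __name__ == "__main__":
    vol = np.random.rand(16, 128, 128).astype(np.float32)  # stand-in MRA volume
    mask = frangi_pseudo_mask(vol)
    print(mask.shape, mask.dtype, mask.mean())             # fraction of voxels flagged
```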
Citations: 0
Coronary artery calcification segmentation with sparse annotations in intravascular OCT: Leveraging self-supervised learning and consistency regularization
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102653
Chao Li, Zhifeng Qin, Zhenfei Tang, Yidan Wang, Bo Zhang, Jinwei Tian, Zhao Wang
Assessing coronary artery calcification (CAC) is crucial in evaluating the progression of atherosclerosis and planning percutaneous coronary intervention (PCI). Intravascular Optical Coherence Tomography (OCT) is a commonly used imaging tool for evaluating CAC at the micrometer scale and in three dimensions for optimizing PCI. While existing deep learning methods have proven effective in OCT image analysis, they are hindered by the lack of large-scale, high-quality labels to train deep neural networks that can reach human level performance in practice. In this work, we propose an annotation-efficient approach for segmenting CAC in intravascular OCT images, leveraging self-supervised learning and consistency regularization. We employ a transformer encoder paired with a simple linear projection layer for self-supervised pre-training on unlabeled OCT data. Subsequently, a transformer-based segmentation model is fine-tuned on sparsely annotated OCT pullbacks with a contrast loss using a combination of unlabeled and labeled data. We collected 2,549,073 unlabeled OCT images from 7,108 OCT pullbacks for pre-training, and 1,106,347 sparsely annotated OCT images from 3,025 OCT pullbacks for model training and testing. The proposed approach consistently outperformed existing sparsely supervised methods on both internal and external datasets. In addition, extensive comparisons under full, partial, and sparse annotation schemes substantiated its high annotation efficiency. With an 80% reduction in image labeling effort, our method has the potential to expedite the development of deep learning models for processing large-scale medical image data.
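A hedged sketch of combining sparse supervision with consistency regularization in the spirit described above: annotated pixels contribute a cross-entropy term via an ignore index, while predictions on two augmented views of the same frames are pulled together. The loss weighting, the noise augmentation, and the stand-in model are assumptions, not the authors' pipeline.

```python
import torch
import torch.nn.functional as F

IGNORE = 255   # marks pixels without annotation in sparsely labeled pullbacks

def sparse_ce_loss(logits: torch.Tensor, sparse_labels: torch.Tensor) -> torch.Tensor:
    # logits: (B, C, H, W); sparse_labels: (B, H, W) with IGNORE where unlabeled.
    return F.cross_entropy(logits, sparse_labels, ignore_index=IGNORE)

def consistency_loss(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    # MSE between softened predictions of two augmented views of the same input.
    return F.mse_loss(logits_a.softmax(dim=1), logits_b.softmax(dim=1))

if __name__ == "__main__":
    model = torch.nn.Conv2d(1, 2, 3, padding=1)        # stand-in segmenter
    x = torch.rand(4, 1, 64, 64)                       # OCT frames
    labels = torch.full((4, 64, 64), IGNORE, dtype=torch.long)
    labels[:, 30:34, :] = 1                            # a few annotated rows
    view_a = model(x + 0.05 * torch.randn_like(x))     # two noisy views
    view_b = model(x + 0.05 * torch.randn_like(x))
    loss = sparse_ce_loss(view_a, labels) + 0.5 * consistency_loss(view_a, view_b)
    loss.backward()
```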
Citations: 0
SA2Net: Scale-adaptive structure-affinity transformation for spine segmentation from ultrasound volume projection imaging
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102649
Hao Xie, Zixun Huang, Yushen Zuo, Yakun Ju, Frank H.F. Leung, N.F. Law, Kin-Man Lam, Yong-Ping Zheng, Sai Ho Ling
Spine segmentation, based on ultrasound volume projection imaging (VPI), plays a vital role in intelligent scoliosis diagnosis in clinical applications. However, this task faces several significant challenges. Firstly, the global contextual knowledge of spines may not be well-learned if we neglect the high spatial correlation of different bone features. Secondly, the spine bones contain rich structural knowledge regarding their shapes and positions, which deserves to be encoded into the segmentation process. To address these challenges, we propose a novel scale-adaptive structure-aware network (SA2Net) for effective spine segmentation. First, we propose a scale-adaptive complementary strategy to learn the cross-dimensional long-distance correlation features for spinal images. Second, motivated by the consistency between multi-head self-attention in Transformers and semantic level affinity, we propose structure-affinity transformation to transform semantic features with class-specific affinity and combine it with a Transformer decoder for structure-aware reasoning. In addition, we adopt a feature mixing loss aggregation method to enhance model training. This method improves the robustness and accuracy of the segmentation process. The experimental results demonstrate that our SA2Net achieves superior segmentation performance compared to other state-of-the-art methods. Moreover, the adaptability of SA2Net to various backbones enhances its potential as a promising tool for advanced scoliosis diagnosis using intelligent spinal image analysis.
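An illustrative reading of the attention-as-affinity idea mentioned above: a pairwise token affinity map derived from self-attention redistributes per-class evidence across the image. This generic sketch (names, scaling, and the single-head form are assumptions) is not the authors' exact module.

```python
import torch
import torch.nn as nn

class AffinityRefine(nn.Module):
    def __init__(self, dim: int, n_classes: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.cls_head = nn.Linear(dim, n_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:   # tokens: (B, N, C)
        q, k = self.q(tokens), self.k(tokens)
        # Softmax-normalized pairwise affinity between all token positions.
        affinity = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
        scores = self.cls_head(tokens)       # (B, N, n_classes) raw class scores
        return affinity @ scores             # propagate scores along affinities

if __name__ == "__main__":
    refine = AffinityRefine(dim=128, n_classes=2)    # spine vs. background
    feats = torch.randn(2, 256, 128)                 # tokens from a VPI image
    print(refine(feats).shape)                       # torch.Size([2, 256, 2])
```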
Citations: 0
Deep learning for automatic vertebra analysis: A methodological survey of recent advances
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-10-01 · DOI: 10.1016/j.compmedimag.2025.102652
Zhuofan Xie, Zishan Lin, Enlong Sun, Fengyi Ding, Jie Qi, Shen Zhao
Automated vertebra analysis (AVA), encompassing vertebra detection and segmentation, plays a critical role in computer-aided diagnosis, surgical planning, and postoperative evaluation in spine-related clinical workflows. Despite notable progress, AVA continues to face key challenges, including variations in the field of view (FOV), complex vertebral morphology, limited availability of high-quality annotated data, and performance degradation under domain shifts. Over the past decade, numerous studies have employed deep learning (DL) to tackle these issues, introducing advanced network architectures and innovative learning paradigms. However, the rapid evolution of these methods has not been comprehensively captured by existing surveys, resulting in a knowledge gap regarding the current state of the field. To address this, this paper presents an up-to-date review that systematically summarizes recent advances. The review begins by consolidating publicly available datasets and evaluation metrics to support standardized benchmarking. Recent DL-based AVA approaches are then analyzed from two methodological perspectives: network architecture improvement and learning strategies design. Finally, an examination of persistent technical barriers and emerging clinical needs that are shaping future research directions is provided. These include multimodal learning, domain generalization, and the integration of foundation models. As the most current survey in the field, this review provides a comprehensive and structured synthesis aimed at guiding future research toward the development of robust, generalizable, and clinically deployable AVA systems in the era of intelligent medical imaging.
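Since the review consolidates datasets and evaluation metrics for standardized benchmarking, here is a minimal example of the most common segmentation metric in this space, the Dice coefficient, in its standard binary-mask form (independent of any surveyed method):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))

if __name__ == "__main__":
    gt = np.zeros((64, 64), dtype=np.uint8); gt[20:40, 20:40] = 1
    pred = np.zeros_like(gt);                pred[22:42, 22:42] = 1
    print(round(dice_score(pred, gt), 3))    # 0.81 for two shifted squares
```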
Citations: 0
SGRRG: Leveraging radiology scene graphs for improved and abnormality-aware radiology report generation
IF 4.9 · CAS Tier 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2025-09-15 · DOI: 10.1016/j.compmedimag.2025.102644
Jun Wang, Lixing Zhu, Abhir Bhalerao, Yulan He
Radiology report generation (RRG) methods often lack sufficient medical knowledge to produce clinically accurate reports. A scene graph provides comprehensive information for describing objects within an image. However, automatically generated radiology scene graphs (RSG) may contain noisy annotations and highly overlapping regions, posing challenges in utilizing RSG to enhance RRG. To this end, we propose Scene Graph aided RRG (SGRRG), a framework that leverages an automatically generated RSG and copes with noisy supervision problems in the RSG with a transformer-based module, effectively distilling medical knowledge in an end-to-end manner. SGRRG is composed of a dedicated scene graph encoder responsible for translating the radiography into a RSG, and a scene graph-aided decoder that takes advantage of both patch-level and region-level visual information and mitigates the noisy annotation problem in the RSG. The incorporation of both patch-level and region-level features, alongside the integration of the essential RSG construction modules, enhances our framework’s flexibility and robustness, enabling it to readily exploit prior advanced RRG techniques. A fine-grained, sentence-level attention method is designed to better distill the RSG information. Additionally, we introduce two proxy tasks to enhance the model’s ability to produce clinically accurate reports. Extensive experiments demonstrate that SGRRG outperforms previous state-of-the-art methods in report generation and can better capture abnormal findings. Code is available at https://github.com/Markin-Wang/SGRRG.
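A hedged sketch of sentence-level attention over scene-graph region features: word states belonging to one sentence are mean-pooled into a sentence query that cross-attends to region tokens. The pooling scheme and all names are assumptions for illustration, not the released SGRRG code.

```python
import torch
import torch.nn as nn

class SentenceRegionAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, word_states, sent_ids, region_tokens):
        # word_states: (B, T, C); sent_ids: (B, T) sentence index per word;
        # region_tokens: (B, R, C) features of scene-graph regions.
        n_sent = int(sent_ids.max().item()) + 1
        queries = []
        for s in range(n_sent):                      # mean-pool words per sentence
            m = (sent_ids == s).unsqueeze(-1).float()
            queries.append((word_states * m).sum(1) / m.sum(1).clamp_min(1.0))
        q = torch.stack(queries, dim=1)              # (B, n_sent, C)
        ctx, weights = self.attn(q, region_tokens, region_tokens)
        return ctx, weights                          # per-sentence region context

if __name__ == "__main__":
    attn = SentenceRegionAttention()
    words = torch.randn(2, 12, 256)
    sids = torch.tensor([[0]*6 + [1]*6, [0]*4 + [1]*8])
    regions = torch.randn(2, 10, 256)                # pooled RSG region features
    ctx, w = attn(words, sids, regions)
    print(ctx.shape, w.shape)   # torch.Size([2, 2, 256]) torch.Size([2, 2, 10])
```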
Citations: 0