
Latest articles in IEEE transactions on medical imaging

A Benchmark Framework for the Right Atrium Cavity Segmentation From LGE-MRIs
Pub Date: 2025-07-22 | DOI: 10.1109/TMI.2025.3590694
Jieyun Bai;Jinwen Zhu;Zhiting Chen;Ziduo Yang;Yaosheng Lu;Lei Li;Qince Li;Wei Wang;Henggui Zhang;Kuanquan Wang;Jie Gan;Jichao Zhao;Hua Lu;Suining Li;Jiawen Huang;Xiaoming Chen;Xiaoshen Zhang;Xiaowei Xu;Lulu Li;Yanfeng Tian;Víctor M. Campello;Karim Lekadir
The right atrium (RA) is critical for cardiac hemodynamics but is often overlooked in clinical diagnostics. This study presents a benchmark framework for RA cavity segmentation from late gadolinium-enhanced magnetic resonance imaging (LGE-MRIs), leveraging a two-stage strategy and a novel 3D deep learning network, RASnet. The architecture addresses challenges in class imbalance and anatomical variability by incorporating multi-path input, multi-scale feature fusion modules, Vision Transformers, context interaction mechanisms, and deep supervision. Evaluated on datasets comprising 354 LGE-MRIs, RASnet achieves SOTA performance with a Dice score of 92.19% on a primary dataset and demonstrates robust generalizability on an independent dataset. The proposed framework establishes a benchmark for RA cavity segmentation, enabling accurate and efficient analysis for cardiac imaging applications. Open-source code (https://github.com/zjinw/RAS) and data (https://zenodo.org/records/15524472) are provided to facilitate further research and clinical adoption.
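As an illustration of the deep-supervision component mentioned in the abstract, the minimal sketch below shows a generic deeply supervised Dice loss for a binary cavity mask in PyTorch; the side-output weights, tensor shapes, and function names are assumptions for illustration, not RASnet's actual implementation.

import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    # Soft Dice loss for a binary right-atrium mask; logits and target: (B, 1, D, H, W).
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(2, 3, 4))
    union = prob.sum(dim=(2, 3, 4)) + target.sum(dim=(2, 3, 4))
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()

def deep_supervision_loss(side_outputs, target, weights=(1.0, 0.5, 0.25)):
    # side_outputs: list of decoder logits from fine to coarse resolution.
    total = 0.0
    for w, logits in zip(weights, side_outputs):
        # Resample the float ground-truth mask to the side output's resolution.
        t = F.interpolate(target, size=logits.shape[2:], mode="nearest")
        total = total + w * dice_loss(logits, t)
    return total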
Citations: 0
Instrument-Tissue-Guided Surgical Action Triplet Detection via Textual-Temporal Trail Exploration
Pub Date: 2025-07-18 | DOI: 10.1109/TMI.2025.3590457
Jialun Pei;Jiaan Zhang;Guanyi Qin;Kai Wang;Yueming Jin;Pheng-Ann Heng
Surgical action triplet detection offers intuitive intraoperative scene analysis for dynamically perceiving laparoscopic surgical workflows and analyzing the interaction between instruments and tissues. The current challenge of this task lies in simultaneously localizing surgical instruments while performing more accurate surgical triplet recognition to enhance a comprehensive understanding of intraoperative surgical scenes. To fully leverage the spatial localization of surgical instruments in association with triplet detection, we propose an Instrument-Tissue-Guided Triplet detector, termed ITG-Trip, which navigates the confluence of surgical action cues through instrument and tissue pseudo-localization labeling to optimize action triplet detection. To exploit textual and temporal trails, our framework incorporates a Visual-Linguistic Association (VLA) module that uses a pre-trained text encoder to distill textual prior knowledge, enhancing the semantic information in global visual features and compensating for the perception of rare interaction classes. In addition, we introduce a Mamba-enhanced Spatial-temporal Perception (MSP) decoder, which weaves Mamba and Transformer blocks to explore subject- and object-aware spatial and temporal information to improve the accuracy of action triplet detection in long surgical video sequences. Experimental results on the CholecT50 benchmark indicate that our method significantly outperforms existing state-of-the-art methods in both instrument localization and action triplet detection. The code is available at github.com/PJLallen/ITG-Trip.
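The VLA module above distills prior knowledge from a pre-trained text encoder into visual features. A minimal sketch of one common fusion pattern, in which visual tokens cross-attend to frozen text embeddings of the triplet classes, follows; the module structure, dimensions, and names are illustrative assumptions rather than the paper's exact design.

import torch
import torch.nn as nn

class TextGuidedFusion(nn.Module):
    # Illustrative fusion: visual tokens attend to frozen text-class embeddings.
    def __init__(self, vis_dim, txt_dim, num_heads=8):
        super().__init__()
        self.proj = nn.Linear(txt_dim, vis_dim)          # align text width to visual width
        self.attn = nn.MultiheadAttention(vis_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(vis_dim)

    def forward(self, vis_tokens, text_embeds):
        # vis_tokens: (B, N, vis_dim); text_embeds: (C, txt_dim), one per triplet class.
        txt = self.proj(text_embeds).unsqueeze(0).expand(vis_tokens.size(0), -1, -1)
        fused, _ = self.attn(query=vis_tokens, key=txt, value=txt)
        return self.norm(vis_tokens + fused)             # residual, text-enriched tokens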
Citations: 0
Self-Supervised Neuron Morphology Representation With Graph Transformer
Pub Date: 2025-07-18 | DOI: 10.1109/TMI.2025.3590484
Pengpeng Sheng;Gangming Zhao;Tingting Han;Lei Qu
Effective representation of neuronal morphology is essential for cell typing and understanding brain function. However, the complexity of neuronal morphology manifests not only in inter-class structural differences but also in intra-class variations across developmental stages and environmental conditions. Such diversity poses significant challenges for existing methods in balancing robustness and discriminative power when representing neuronal morphology. To address this, we propose SGTMorph, a hybrid Graph Transformer framework that leverages the local topological modeling capabilities of graph neural networks and the global relational reasoning strengths of Transformers to explicitly encode neuronal structural information. SGTMorph incorporates a random walk-based positional encoding scheme to facilitate effective information propagation across neuronal graphs and introduces a spatially invariant encoding mechanism to improve adaptability to diverse morphologies. This integrated approach enables a robust and comprehensive representation of neuronal morphology while maintaining biological fidelity. To enable label-free feature learning, we devise a self-supervised learning strategy grounded in geometric and topological similarity metrics. Extensive experiments on five datasets demonstrate SGTMorph’s superior performance in neuron morphology classification and retrieval tasks. Furthermore, its practical utility in neuronal function research is validated through the accurate prediction of two functional features: the laminar distribution of somas and axonal projection patterns. The code is available at https://github.com/big-rain/SGTMorph.
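The random walk-based positional encoding mentioned above is commonly realized as the diagonal return probabilities of powers of the random-walk matrix; the sketch below assumes this standard form, which may differ from SGTMorph's exact scheme.

import numpy as np

def random_walk_pe(adj, k=8):
    # adj: (n, n) symmetric adjacency of the neuron graph (e.g., from SWC edges).
    deg = adj.sum(axis=1, keepdims=True)
    P = adj / np.clip(deg, 1e-12, None)        # row-stochastic random-walk matrix
    pe = np.zeros((adj.shape[0], k))
    Pk = np.eye(adj.shape[0])
    for step in range(k):
        Pk = Pk @ P                            # (step+1)-hop transition probabilities
        pe[:, step] = np.diag(Pk)              # probability of returning to each node
    return pe                                  # (n, k) positional feature per node

Because the return probabilities depend only on graph structure, this encoding is invariant to how the neuron is positioned in space, which is in the spirit of the spatially invariant mechanism the abstract describes.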
Citations: 0
Scaling Chest X-Ray Foundation Models From Mixed Supervisions for Dense Prediction
Pub Date: 2025-07-16 | DOI: 10.1109/TMI.2025.3589928
Fuying Wang;Lequan Yu
Foundation models have significantly revolutionized the field of chest X-ray diagnosis with their ability to transfer across various diseases and tasks. However, previous works have predominantly utilized self-supervised learning from medical image-text pairs, which falls short in dense medical prediction tasks due to their sole reliance on such coarse pair supervision, thereby limiting their applicability to detailed diagnostics. In this paper, we introduce a Dense Chest X-ray Foundation Model (DCXFM), which utilizes mixed supervision types (i.e., text, label, and segmentation masks) to significantly enhance the scalability of foundation models across various medical tasks. Our model involves two training stages: we first employ a novel self-distilled multimodal pretraining paradigm to exploit text and label supervision, along with local-to-global self-distillation and soft cross-modal contrastive alignment strategies to enhance localization capabilities. Subsequently, we introduce an efficient cost aggregation module, comprising spatial and class aggregation mechanisms, to further advance dense prediction tasks with densely annotated datasets. Comprehensive evaluations on three tasks (phrase grounding, zero-shot semantic segmentation, and zero-shot classification) demonstrate DCXFM’s superior performance over other state-of-the-art medical image-text pretraining models. Remarkably, DCXFM exhibits powerful zero-shot capabilities across various datasets in phrase grounding and zero-shot semantic segmentation, underscoring its superior generalization in dense prediction tasks.
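As a rough sketch of the cross-modal contrastive alignment described above, the snippet below implements a CLIP-style symmetric loss with optional soft targets; the temperature, target construction, and function names are assumptions for illustration, not the exact DCXFM objective.

import torch
import torch.nn.functional as F

def soft_contrastive_loss(img_emb, txt_emb, soft_targets=None, tau=0.07):
    # img_emb, txt_emb: (B, D) embeddings of paired chest X-rays and report texts.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / tau                       # (B, B) similarity matrix
    if soft_targets is None:                                   # hard one-to-one pairing
        soft_targets = torch.eye(len(logits), device=logits.device)
    t_i2t = soft_targets / soft_targets.sum(dim=1, keepdim=True)
    t_t2i = soft_targets.t() / soft_targets.t().sum(dim=1, keepdim=True)
    loss_i2t = -(t_i2t * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss_t2i = -(t_t2i * F.log_softmax(logits.t(), dim=1)).sum(dim=1).mean()
    return 0.5 * (loss_i2t + loss_t2i)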
Citations: 0
Ray-Bundle-Based X-Ray Representation and Reconstruction: An Alternative to Classic Tomography on Voxelized Volumes
Pub Date: 2025-07-16 | DOI: 10.1109/TMI.2025.3589946
Yuanwei He;Dan Ruan
Tomography recovers internal volume from projection measurements. Formulated as an inverse problem, classic computed tomography generally reconstructs the attenuation property on a preset Cartesian grid. While this is intuitive and convenient for digital display, such discretization leads to forward-backward projection inconsistency and a discrepancy between digital and effective resolution. We take a different perspective by considering the image volume as continuous and modeling forward projection as a hybrid continuous-to-discrete mapping from volume to detector elements, which we call “ray bundles”. The ray bundle can be regarded as an unconventional heterogeneous coordinate. Projections are modeled as line integrations along ray bundles in the continuous volume space and approximated by numerical integration using customized sample points. This modeling approach is conveniently supported by an implicit neural representation. By representing the volume as a function mapping spatial coordinates to attenuation properties and leveraging ray bundle projection, this approach reflects transmission physics and eliminates the need for explicit interpolation, intersection calculations, or matrix inversions. A novel sampling strategy is further developed to adaptively distribute points along the ray bundles, emphasizing high-gradient regions to allocate computational resources to heterogeneous structures and details. We call this system T-ReX to indicate Transmission Ray bundles for X-ray geometry. We validate T-ReX through comprehensive experiments across three scenarios: simulated full-fan projections with primary signal only, half-fan setups with simulated scatter and noise, and an in-house dataset with realistic acquisition conditions. These results highlight the effectiveness of T-ReX in sparse-view X-ray tomography.
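The line-integration view above can be illustrated with a simple quadrature over sample points queried from an implicit attenuation field under the Beer-Lambert model; the uniform sampling and the function signature below are assumptions and do not reproduce T-ReX's adaptive, gradient-aware sampling.

import torch

def ray_bundle_projection(field, origins, directions, t_near, t_far, n_samples=128):
    # field: callable mapping (..., 3) points to attenuation mu (implicit neural representation).
    # origins, directions: (R, 3) per detector element (one "ray bundle" each).
    t = torch.linspace(0.0, 1.0, n_samples, device=origins.device)
    t = t_near + (t_far - t_near) * t                     # (n_samples,) sample depths
    pts = origins[:, None, :] + t[None, :, None] * directions[:, None, :]   # (R, S, 3)
    mu = field(pts)                                       # attenuation at each sample point
    dt = (t_far - t_near) / n_samples
    line_integral = (mu * dt).sum(dim=1)                  # quadrature along each ray bundle
    return torch.exp(-line_integral)                      # Beer-Lambert transmission per detector element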
Citations: 0
Frenet–Serret Frame-Based Decomposition for Part Segmentation of 3-D Curvilinear Structures
Pub Date: 2025-07-16 | DOI: 10.1109/TMI.2025.3589543
Shixuan Leslie Gu;Jason Ken Adhinarta;Mikhail Bessmeltsev;Jiancheng Yang;Yongjie Jessica Zhang;Wenjie Yin;Daniel Berger;Jeff W. Lichtman;Hanspeter Pfister;Donglai Wei
Accurate segmentation of anatomical substructures within 3D curvilinear structures in medical imaging remains challenging due to their complex geometry and the scarcity of diverse, large-scale datasets for algorithm development and evaluation. In this paper, we use dendritic spine segmentation as a case study and address these challenges by introducing a novel Frenet-Serret Frame-based Decomposition, which decomposes 3D curvilinear structures into a globally smooth continuous curve that captures the overall shape, and a cylindrical primitive that encodes local geometric properties. This approach leverages Frenet-Serret Frames and arc length parameterization to preserve essential geometric features while reducing representational complexity, facilitating data-efficient learning, improved segmentation accuracy, and generalization on 3D curvilinear structures. To rigorously evaluate our method, we introduce two datasets: CurviSeg, a synthetic dataset for 3D curvilinear structure segmentation that validates our method’s key properties, and DenSpineEM, a benchmark for dendritic spine segmentation, which comprises 4,476 manually annotated spines from 70 dendrites across three public electron microscopy datasets, covering multiple brain regions and species. Our experiments on DenSpineEM demonstrate exceptional cross-region and cross-species generalization: models trained on the mouse somatosensory cortex subset achieve 94.43% Dice, maintaining strong performance in zero-shot segmentation on both mouse visual cortex (95.61% Dice) and human frontal lobe (86.63% Dice) subsets. Moreover, we test the generalizability of our method on the IntrA dataset, where it achieves 77.08% Dice (5.29% higher than prior arts) on intracranial aneurysm segmentation from entire artery models. These findings demonstrate the potential of our approach for accurately analyzing complex curvilinear structures across diverse medical imaging fields. Our dataset, code, and models are available at https://github.com/VCG/FFD4DenSpineEM to support future research.
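For reference, the Frenet-Serret frame at each point of a curve consists of the unit tangent, normal, and binormal vectors; the sketch below computes them for a discretely sampled 3D centerline with finite differences. It only illustrates the underlying geometry and does not reproduce the paper's arc-length parameterization or smoothing.

import numpy as np

def frenet_serret_frames(curve, eps=1e-12):
    # curve: (n, 3) points sampled along a smooth 3D centerline (e.g., a dendrite).
    d1 = np.gradient(curve, axis=0)                 # first derivative (velocity)
    d2 = np.gradient(d1, axis=0)                    # second derivative (acceleration)
    T = d1 / (np.linalg.norm(d1, axis=1, keepdims=True) + eps)            # unit tangent
    # Remove the tangential component of d2 to obtain the curvature direction.
    d2_perp = d2 - (d2 * T).sum(axis=1, keepdims=True) * T
    N = d2_perp / (np.linalg.norm(d2_perp, axis=1, keepdims=True) + eps)  # unit normal
    B = np.cross(T, N)                              # unit binormal completes the frame
    return T, N, B

Note that the normal is undefined on perfectly straight segments (zero curvature), which is why the eps guard is needed; robust pipelines typically propagate the frame from neighboring points in that case.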
Citations: 0
Debiasing Medical Knowledge for Prompting Universal Model in CT Image Segmentation
Pub Date: 2025-07-15 | DOI: 10.1109/TMI.2025.3589399
Boxiang Yun;Shitian Zhao;Qingli Li;Alex Kot;Yan Wang
With the assistance of large language models, which offer universal medical prior knowledge via text prompts, state-of-the-art Universal Models (UM) have demonstrated considerable potential in the field of medical image segmentation. Semantically detailed text prompts, on the one hand, indicate comprehensive knowledge; on the other hand, they bring biases that may not be applicable to specific cases involving heterogeneous organs or rare cancers. To this end, we propose a Debiased Universal Model (DUM) to consider instance-level context information and remove knowledge biases in text prompts from a causal perspective. We are the first to discover and mitigate the bias introduced by universal knowledge. Specifically, we propose to extract organ-level text prompts via language models and instance-level context prompts from the visual features of each image. We aim to place more emphasis on factual instance-level information and mitigate organ-level knowledge bias. This process can be derived and theoretically supported by a causal graph, and instantiated by designing a standard UM (SUM) and a biased UM. The debiased output is finally obtained by subtracting the likelihood distribution output by the biased UM from that of the SUM. Experiments on three large-scale multi-center external datasets and MSD internal tumor datasets show that our method enhances the model’s generalization ability in handling diverse medical scenarios and reduces potential biases, with an improvement of 4.16% over a popular universal model on the AbdomenAtlas dataset, demonstrating strong generalizability. The code is publicly available at https://github.com/DeepMed-Lab-ECNU/DUM.
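The final debiasing step stated above, subtracting the biased UM's likelihood distribution from the standard UM's, can be sketched as follows; whether the subtraction is applied to probabilities or logits, and the scaling factor, are assumptions rather than details given in the abstract.

import torch

def debiased_prediction(sum_logits, biased_logits, alpha=1.0):
    # sum_logits / biased_logits: (B, C, D, H, W) outputs of the standard and biased UMs.
    p_sum = torch.softmax(sum_logits, dim=1)        # likelihood under text + instance-context prompts
    p_bias = torch.softmax(biased_logits, dim=1)    # likelihood under organ-level text prompts only
    debiased = p_sum - alpha * p_bias               # subtract the knowledge-bias component (assumed scaling alpha)
    return debiased.argmax(dim=1)                   # per-voxel segmentation labels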
Citations: 0
Leveraging Segment Anything Model for Source-Free Domain Adaptation via Dual Feature Guided Auto-Prompting
Pub Date: 2025-07-15 | DOI: 10.1109/TMI.2025.3587733
Zheang Huai;Hui Tang;Yi Li;Zhuangzhuang Chen;Xiaomeng Li
Source-free domain adaptation (SFDA) for segmentation aims to adapt a model trained in the source domain to perform well in the target domain with only the source model and unlabeled target data. Inspired by the recent success of the Segment Anything Model (SAM), which can segment images of various modalities and domains given human-annotated prompts such as bounding boxes or points, we explore, for the first time, the potential of SAM for SFDA by automatically finding an accurate bounding box prompt. We find that the bounding boxes directly generated with existing SFDA approaches are defective due to the domain gap. To tackle this issue, we propose a novel Dual Feature Guided (DFG) auto-prompting approach to search for the box prompt. Specifically, the source model is first trained in a feature aggregation phase, which not only preliminarily adapts the source model to the target domain but also builds a feature distribution well-prepared for box prompt search. In the second phase, based on two feature distribution observations, we gradually expand the box prompt with the guidance of the target model feature and the SAM feature to handle the class-wise clustered target features and the class-wise dispersed target features, respectively. To remove the potentially enlarged false positive regions caused by the over-confident prediction of the target model, the refined pseudo-labels produced by SAM are further postprocessed based on connectivity analysis. Experiments on 3D and 2D datasets indicate that our approach yields superior performance compared to conventional methods. Code is available at https://github.com/xmed-lab/DFG.
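The connectivity-analysis postprocessing mentioned above is sketched below as connected-component filtering that keeps only SAM components supported by the target-model prediction; this is one plausible realization, and the overlap criterion is an assumption rather than the authors' exact rule.

import numpy as np
from scipy import ndimage

def connectivity_filter(sam_mask, ref_mask, min_overlap=0.1):
    # sam_mask: binary pseudo-label produced by SAM; ref_mask: the target-model
    # prediction used to generate the box prompt. Keep only SAM components
    # that sufficiently overlap the reference prediction.
    labeled, num = ndimage.label(sam_mask)
    keep = np.zeros_like(sam_mask, dtype=bool)
    for comp in range(1, num + 1):
        region = labeled == comp
        overlap = np.logical_and(region, ref_mask).sum() / max(region.sum(), 1)
        if overlap >= min_overlap:
            keep |= region                      # retain components anchored to the prediction
    return keep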
Citations: 0
Robust Polyp Detection and Diagnosis Through Compositional Prompt-Guided Diffusion Models
Pub Date: 2025-07-15 | DOI: 10.1109/TMI.2025.3589456
Jia Yu;Yan Zhu;Peiyao Fu;Tianyi Chen;Junbo Huang;Quanlin Li;Pinghong Zhou;Zhihua Wang;Fei Wu;Shuo Wang;Xian Yang
Colorectal cancer (CRC) is a significant global health concern, and early detection through screening plays a critical role in reducing mortality. While deep learning models have shown promise in improving polyp detection, classification, and segmentation, their generalization across diverse clinical environments, particularly with out-of-distribution (OOD) data, remains a challenge. Multi-center datasets like PolypGen have been developed to address these issues, but their collection is costly and time-consuming. Traditional data augmentation techniques provide limited variability, failing to capture the complexity of medical images. Diffusion models have emerged as a promising solution for generating synthetic polyp images, but the image generation process in current models mainly relies on segmentation masks as the condition, limiting their ability to capture the full clinical context. To overcome these limitations, we propose a Progressive Spectrum Diffusion Model (PSDM) that integrates diverse clinical annotations–such as segmentation masks, bounding boxes, and colonoscopy reports–by transforming them into compositional prompts. These prompts are organized into coarse and fine components, allowing the model to capture both broad spatial structures and fine details, generating clinically accurate synthetic images. By augmenting training data with PSDM-generated samples, our model significantly improves polyp detection, classification, and segmentation. For instance, on the PolypGen dataset, PSDM increases the F1 score by 2.12% and the mean average precision by 3.09%, demonstrating superior performance in OOD scenarios and enhanced generalization.
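As an illustration of how mask, box, and report annotations could be turned into a single compositional conditioning sequence for a diffusion model, a hedged sketch follows; the encoders, embedding width, and coarse-to-fine ordering are assumptions, not PSDM's actual prompt construction.

import torch
import torch.nn as nn

class CompositionalPromptEncoder(nn.Module):
    # Illustrative: turn mask, box, and report embeddings into one conditioning
    # sequence for a diffusion model's cross-attention (coarse -> fine ordering).
    def __init__(self, dim):
        super().__init__()
        self.mask_enc = nn.Sequential(nn.Conv2d(1, dim, 4, stride=4), nn.Flatten(2))
        self.box_enc = nn.Linear(4, dim)          # (x1, y1, x2, y2), normalized box coordinates
        self.text_proj = nn.Linear(768, dim)      # report embedding from an assumed frozen text encoder

    def forward(self, mask, box, report_emb):
        # mask: (B, 1, H, W); box: (B, 4); report_emb: (B, 768)
        coarse = self.text_proj(report_emb).unsqueeze(1)      # (B, 1, dim) global finding
        mid = self.box_enc(box).unsqueeze(1)                  # (B, 1, dim) polyp location
        fine = self.mask_enc(mask).transpose(1, 2)            # (B, T, dim) shape detail
        return torch.cat([coarse, mid, fine], dim=1)          # conditioning tokens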
Citations: 0
Attention-Based Shape-Deformation Networks for Artifact-Free Geometry Reconstruction of Lumbar Spine From MR Images
Pub Date: 2025-07-15 | DOI: 10.1109/TMI.2025.3588831
Linchen Qian;Jiasong Chen;Linhai Ma;Timur Urakov;Weiyong Gu;Liang Liang
Lumbar disc degeneration, a progressive structural wear and tear of the lumbar intervertebral discs, is regarded as playing an essential role in low back pain, a significant global health concern. Automated lumbar spine geometry reconstruction from MR images will enable fast measurement of medical parameters to evaluate the lumbar status and determine a suitable treatment. Existing image segmentation-based techniques often generate erroneous segments or unstructured point clouds, unsuitable for medical parameter measurement. In this work, we present UNet-DeformSA and TransDeformer: novel attention-based deep neural networks that reconstruct the geometry of the lumbar spine with high spatial accuracy and mesh correspondence across patients, and we also present a variant of TransDeformer for error estimation. Specifically, we devise new attention modules with a new attention formula, which integrate tokenized image features and tokenized shape features to predict the displacements of the points on a shape template. The deformed template reveals the lumbar spine geometry in an image. Experimental results show that our networks generate artifact-free geometry outputs, and the variant of TransDeformer can predict the errors of a reconstructed geometry. Our code is available at https://github.com/linchenq/TransDeformer-Mesh.
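The attention-driven displacement prediction described above can be sketched generically as template vertex tokens cross-attending to image tokens, followed by an MLP that regresses per-vertex offsets; this sketch uses standard attention rather than the paper's new attention formula, and all names and dimensions are illustrative.

import torch
import torch.nn as nn

class ShapeDeformHead(nn.Module):
    # Illustrative: template vertex tokens attend to image tokens, then an MLP
    # regresses a 3D displacement for every vertex of the lumbar template mesh.
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, 3))

    def forward(self, shape_tokens, image_tokens, template_xyz):
        # shape_tokens: (B, V, dim); image_tokens: (B, N, dim); template_xyz: (B, V, 3)
        ctx, _ = self.attn(query=shape_tokens, key=image_tokens, value=image_tokens)
        displacement = self.mlp(ctx)            # (B, V, 3) per-vertex offsets
        return template_xyz + displacement      # deformed template = reconstructed geometry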
Citations: 0