
Latest publications in Medical image analysis

DDTracking: A diffusion model-based deep generative framework with local-global spatiotemporal modeling for diffusion MRI tractography
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-29 · DOI: 10.1016/j.media.2026.103967
Yijie Li, Wei Zhang, Xi Zhu, Ye Wu, Yogesh Rathi, Lauren J. O'Donnell, Fan Zhang
Diffusion MRI (dMRI) tractography is an advanced technique that uniquely enables in vivo mapping of brain fiber pathways. Traditional methods rely on tissue modeling to estimate fiber orientations for streamline propagation; they are computationally intensive and remain sensitive to noise and artifacts. Recent deep learning-based approaches enable data-driven fiber tracking by directly mapping dMRI signals to orientations, demonstrating both improved efficiency and accuracy. However, existing methods typically either leverage local signal information or learn global dependencies along streamlines, but rarely both. This paper presents DDTracking, a deep generative framework for tractography. One key innovation is the reformulation of streamline propagation as a conditional denoising diffusion process. To the best of our knowledge, this is the first work to apply diffusion models to fiber tracking. Our network architecture incorporates two new designs: (1) a dual-pathway encoding scheme that extracts complementary local spatial features and global temporal context, and (2) a conditional diffusion model module that integrates the spatiotemporal features to predict propagation orientations. All components are trained jointly in an end-to-end manner without any pretraining. In this way, DDTracking can capture fine-scale structural details at each point while ensuring long-range consistency across the entire streamline. We conduct a comprehensive evaluation across diverse datasets, including both synthetic and clinical data. Experiments demonstrate that DDTracking outperforms traditional model-based and state-of-the-art deep learning-based methods in terms of tracking accuracy and computational efficiency. Furthermore, our results highlight DDTracking’s high generalizability across heterogeneous datasets, spanning varying health conditions, age groups, imaging protocols, and scanner types. Code is available at: https://github.com/yishengpoxiao/DDTracking.git.
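To make the core idea concrete, below is a minimal sketch, under illustrative assumptions, of how streamline propagation can be cast as a conditional denoising diffusion process: at each tracking step a unit direction is sampled by iteratively denoising Gaussian noise, conditioned on a feature vector standing in for the fused local-global encoding. The denoiser architecture, noise schedule, feature dimension, and all names are assumptions, not the authors' implementation.

```python
# Minimal sketch: streamline propagation as conditional denoising diffusion.
import torch
import torch.nn as nn

T = 50                                        # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)         # standard DDPM noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ConditionalDenoiser(nn.Module):
    """Predicts the noise added to a 3D direction, given step t and condition."""
    def __init__(self, cond_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1 + cond_dim, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, 3),
        )

    def forward(self, x_t, t, cond):
        t_emb = t.float().unsqueeze(-1) / T   # trivial timestep embedding
        return self.net(torch.cat([x_t, t_emb, cond], dim=-1))

@torch.no_grad()
def sample_direction(model, cond):
    """Ancestral DDPM sampling of one propagation direction per streamline."""
    x = torch.randn(cond.shape[0], 3)         # start from pure noise
    for t in reversed(range(T)):
        t_batch = torch.full((cond.shape[0],), t)
        eps = model(x, t_batch, cond)
        a, ab = alphas[t], alpha_bars[t]
        mean = (x - (1 - a) / torch.sqrt(1 - ab) * eps) / torch.sqrt(a)
        x = mean + torch.sqrt(betas[t]) * torch.randn_like(x) if t > 0 else mean
    return nn.functional.normalize(x, dim=-1) # unit-norm tracking direction

model = ConditionalDenoiser()
cond = torch.randn(8, 128)                    # fused local-global features (assumed)
print(sample_direction(model, cond).shape)    # torch.Size([8, 3])
```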
Citations: 0
ESM-AnatTractNet: Advanced deep learning model of true positive eloquent white matter tractography to improve preoperative evaluation of pediatric epilepsy surgery
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-29 · DOI: 10.1016/j.media.2026.103969
Min-Hee Lee, Bohan Xiao, Soumyanil Banerjee, Hiroshi Uda, Yoon Ho Hwang, Csaba Juhász, Eishi Asano, Ming Dong, Jeong-Won Jeong
Accurate preoperative identification of true positive white matter pathways involved in critical eloquent functions such as motor, language, and vision plays a vital role in minimizing the risk of postoperative functional deficits and improving postoperative functional outcomes in pediatric epilepsy surgery. This study proposes a novel deep learning model, “ESM-AnatTractNet”, that can accurately classify true positive eloquent white matter pathways across preoperative diffusion-weighted imaging tractography data of 85 drug-resistant epilepsy patients (age: 10.70 ± 4.41 years). To enhance geometric and anatomical consistency of true positive tract classification, ESM-AnatTractNet integrates two features in a point-cloud-based framework: 1) electrophysiologically confirmed spatial coordinates from electrical stimulation mapping (ESM), and 2) anatomically contextualized labels of the end-to-end neural connection from a standard brain atlas. Its overall performance was validated by accurately classifying 14 eloquent functional areas in the whole brain, objectively optimizing resection margins to preserve eloquent functions using a Kalman filter, and precisely predicting postoperative language outcomes using canonical correlation. Our ESM-AnatTractNet outperformed other baseline models, achieving an accuracy of 97% in correctly classifying eloquent areas within the 10 mm spatial resolution of clinical subdural grid electroencephalography. The Kalman filter analysis achieved 94% accuracy in predicting no deficits when the ESM-AnatTractNet-defined preservation zones were not resected. Postoperative decrease in language-related white matter connection efficacy defined by the ESM-AnatTractNet analysis was significantly associated with worse postoperative language outcome (R=0.73, p < 0.001). Our findings demonstrate that ESM-AnatTractNet improves non-invasive localization of true positive eloquent white matter pathways, supporting its potential to enhance current preoperative evaluation of pediatric epilepsy surgery.
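The abstract names two cues fused in a point-cloud framework: ESM-confirmed coordinates and atlas-based end-to-end connection labels. The sketch below shows one plausible way to assemble such per-point inputs; the distance-based ESM encoding, region count, and function names are illustrative assumptions rather than the published pipeline.

```python
# Minimal sketch: per-point features combining ESM coordinates and atlas labels.
import numpy as np

def build_tract_features(streamline, esm_sites, start_region, end_region, n_regions):
    """streamline: (N,3) point coordinates; esm_sites: (M,3) ESM coordinates."""
    # Spatial cue: distance from every streamline point to the nearest ESM site.
    d = np.linalg.norm(streamline[:, None, :] - esm_sites[None, :, :], axis=-1)
    esm_dist = d.min(axis=1, keepdims=True)                      # (N, 1)
    # Anatomical cue: one-hot atlas labels of the two endpoints, tiled per point.
    conn = np.zeros(2 * n_regions)
    conn[start_region] = 1.0
    conn[n_regions + end_region] = 1.0
    conn = np.tile(conn, (len(streamline), 1))                   # (N, 2R)
    return np.concatenate([streamline, esm_dist, conn], axis=1)  # (N, 4 + 2R)

pts = np.random.rand(64, 3) * 100          # a toy streamline in mm
esm = np.random.rand(5, 3) * 100           # ESM-confirmed coordinates (toy)
feats = build_tract_features(pts, esm, start_region=3, end_region=17, n_regions=90)
print(feats.shape)                         # (64, 184) -> fed to a point-cloud classifier
```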
Citations: 0
AsymTrack: asymmetric fiber orientation mapping for accurate tractography in brain disorders via unsupervised deep learning
IF 10.9 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-29 · DOI: 10.1016/j.media.2026.103968
Di Zhang, Ziyu Li, Xiaofeng Deng, Zekun Han, Alan Wang, Yong Liu, Fangrong Zong
{"title":"AsymTrack: asymmetric fiber orientation mapping for accurate tractography in brain disorders via unsupervised deep learning","authors":"Di Zhang, Ziyu Li, Xiaofeng Deng, Zekun Han, Alan Wang, Yong Liu, Fangrong Zong","doi":"10.1016/j.media.2026.103968","DOIUrl":"https://doi.org/10.1016/j.media.2026.103968","url":null,"abstract":"","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"50 1","pages":""},"PeriodicalIF":10.9,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A navigation-guided 3D breast ultrasound scanning and reconstruction system for automated multi-lesion spatial localization and diagnosis
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-28 · DOI: 10.1016/j.media.2026.103965
Yi Zhang, Yulin Yan, Kun Wang, Muyu Cai, Yifei Xiang, Yan Guo, Puxun Tu, Tao Ying, Xiaojun Chen
Handheld ultrasound (HHUS) is indispensable for breast cancer screening but remains compromised by operator-dependent acquisition, subjective 2D interpretation, and clock-face annotation. Existing spatial tracking systems for HHUS typically lack integration, adaptability, flexibility, and robust 3D representation. Additionally, current deep learning diagnostic methods are predominantly based on single ultrasound images, whereas video-based malignancy classification approaches suffer from limited temporal interpretability. In this study, we develop an intelligent navigation-guided breast ultrasound scanning system delivering seamless 3D reconstruction, nipple-centric lesion localization, and video-based malignancy prediction with full adaptation to the routine workflow. Specifically, a Hybrid Lesion-informed Spatiotemporal Transformer (HLST) is proposed to selectively fuse intra- and peri-lesional dynamics augmented from a prompt-driven BUS-SAM-2 foundation model for sequence-level classification. Moreover, a geometry-adaptive clock projection and analysis method is designed to enable automated, standardized clock-face orientation and lesion-to-nipple distance measurement for breasts of arbitrary shape, eliminating patient-attached fiducials or pre-marked landmarks. Validation on three breast phantoms demonstrated high correlations with the CT reference (r > 0.99 for distance, r > 0.97 for 3D size, and r = 1.00 for clockwise angle, p < 0.0001). Clinical evaluation in 43 female patients (30 abnormal breasts) yielded median clock-face orientation and size discrepancies of 0 h and 0.7 mm × 0.6 mm, respectively, versus conventional reports. Meanwhile, HLST achieved superior performance (86.1% accuracy) on the BUV dataset. By coupling precise 3D spatial annotation with foundation-model-enhanced spatiotemporal characterization, the proposed system offers a reliable, streamlined workflow that standardizes follow-up, guides biopsies, and promotes diagnostic confidence in HHUS practice.
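As a concrete illustration of nipple-centric clock-face annotation, the sketch below projects a lesion into a nipple-centered plane, converts the in-plane angle to a clock hour (30° per hour), and reports the 3D lesion-to-nipple distance. The fixed axis conventions and laterality handling are simplifying assumptions; the paper's geometry-adaptive method handles arbitrary breast shapes without these fixed axes.

```python
# Minimal sketch: clock-face position and lesion-to-nipple distance (assumed geometry).
import numpy as np

def clock_position(lesion, nipple, superior=(0, 0, 1), lateral=(1, 0, 0)):
    """All inputs are 3D coordinates/directions in a common patient frame (mm)."""
    v = np.asarray(lesion, float) - np.asarray(nipple, float)
    dist = np.linalg.norm(v)                      # lesion-to-nipple distance
    up, right = np.asarray(superior, float), np.asarray(lateral, float)
    # In-plane angle measured clockwise from 12 o'clock (the superior direction).
    angle = np.degrees(np.arctan2(v @ right, v @ up)) % 360.0
    hour = int(round(angle / 30.0)) % 12 or 12    # 30 degrees per clock hour
    return hour, dist

hour, dist = clock_position(lesion=[25.0, 0.0, 43.3], nipple=[0.0, 0.0, 0.0])
print(f"{hour} o'clock, {dist:.1f} mm from the nipple")   # 1 o'clock, 50.0 mm
```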
Citations: 0
Multimodal sparse fusion transformer network with spatio-temporal decoupling for breast tumor classification
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-28 · DOI: 10.1016/j.media.2026.103966
Jiahao Xu, Shuxin Zhuang, Yi He, Haolin Wang, Zhemin Zhuang, Huancheng Zeng
Accurate analysis of tumor morphology, vascularity, and tissue stiffness under multimodal ultrasound imaging plays a critical role in the diagnosis of breast cancer. However, manual interpretation across multiple modalities is time-consuming and heavily dependent on the radiologist’s expertise. Computer-aided classification offers an efficient alternative, yet remains challenging due to significant modality heterogeneity, inconsistent image quality, and redundant information across modalities. To address these issues, we propose a novel Multimodal Sparse Fusion Transformer Network (MSFT-Net). First, a Spatio-Temporal Decoupling Attention architecture (STDA) is introduced to disentangle and extract dynamic and static features from different modalities along spatial and temporal dimensions, capturing modality-specific motion and morphological characteristics independently. Second, the Mixed-Scale Convolution Module (MSCM) obtains tumor features at multiple scales, enhancing geometric detail representation and improving receptive field coverage. Third, the Sparse Cross-Attention Module (SCAM) adaptively retains the most effective query-key interactions between modalities, thereby facilitating the aggregation of high-quality features for robust multimodal information fusion. MSFT-Net is trained and tested on a curated dataset comprising multimodal breast tumor videos collected from 458 patients, including ultrasound (US), superb microvascular imaging (SMI), and strain elastography (SE), and its generalizability is further validated on the public BraTS'21 MRI dataset. Extensive experiments demonstrate that MSFT-Net achieves superior performance in multimodal breast tumor classification compared to state-of-the-art methods, providing fast and reliable support for radiologists in diagnostic tasks.
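To illustrate the idea behind SCAM's adaptive retention of query-key interactions, here is a minimal top-k sparse cross-attention sketch: per query, only the k highest-scoring keys from the other modality survive the softmax. The value of k, the dimensions, and the masking scheme are illustrative assumptions, not the published module.

```python
# Minimal sketch: top-k sparse cross-attention between two modalities.
import torch
import torch.nn.functional as F

def sparse_cross_attention(q, k, v, top_k=4):
    """q: (B, Nq, D) from one modality; k, v: (B, Nk, D) from another."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (B, Nq, Nk)
    # Keep only the top-k keys per query; mask everything else to -inf.
    kth = scores.topk(top_k, dim=-1).values[..., -1:]       # k-th largest score
    masked = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(masked, dim=-1) @ v                    # (B, Nq, D)

q = torch.randn(2, 16, 64)      # e.g., US tokens
k = v = torch.randn(2, 32, 64)  # e.g., SMI/SE tokens
print(sparse_cross_attention(q, k, v).shape)  # torch.Size([2, 16, 64])
```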
Citations: 0
Revisiting lesion tracking in 3D total body photography
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-28 · DOI: 10.1016/j.media.2026.103963
Wei-Lun Huang, Minghao Xue, Zhiyou Liu, Davood Tashayyod, Jun Kang, Amir Gandjbakhche, Misha Kazhdan, Mehran Armand
Melanoma is the deadliest form of skin cancer. Tracking the evolution of nevi and detecting new lesions across the body is essential for the early detection of melanoma. Despite prior work on longitudinal tracking of skin lesions in 3D total body photography, several challenges remain, including 1) low accuracy in finding correct lesion pairs across scans, 2) sensitivity to noisy lesion detection, and 3) a lack of large-scale datasets with numerous annotated lesion pairs. We propose a framework that takes in a pair of 3D textured meshes, matches lesions in the context of total body photography, and identifies unmatchable lesions. We start by computing correspondence maps that bring the source and target meshes onto a template mesh. Using these maps to define source/target signals over the template domain, we construct a flow field aligning the mapped signals. The initial correspondence maps are then refined by advecting forward/backward along the flow field. Finally, lesion assignment is performed using the refined correspondence maps. We also introduce the first large-scale dataset for skin lesion tracking, with 25K lesion pairs across 198 subjects. The proposed method achieves a success rate of 90.1% (at the 10 mm criterion) for all pairs of annotated lesions and a matching accuracy of 98.1% for subjects with more than 200 lesions.
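The final lesion-assignment step can be illustrated with a small sketch: lesions mapped from the source scan into the target frame are globally matched with the Hungarian algorithm, and pairs farther apart than a distance gate (the paper's 10 mm criterion is reused here as an assumed gate) are declared unmatchable, leaving unmatched target lesions as new-lesion candidates. This is an assumed reading of the assignment step, not the authors' exact procedure.

```python
# Minimal sketch: gated Hungarian matching of lesions across two scans.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_lesions(mapped_src, tgt, gate_mm=10.0):
    """mapped_src: (M,3) source lesions mapped to target space; tgt: (N,3)."""
    cost = np.linalg.norm(mapped_src[:, None] - tgt[None, :], axis=-1)  # (M, N)
    rows, cols = linear_sum_assignment(cost)          # min-cost global matching
    pairs = [(int(i), int(j)) for i, j in zip(rows, cols) if cost[i, j] <= gate_mm]
    matched_tgt = {j for _, j in pairs}
    new_lesions = [j for j in range(len(tgt)) if j not in matched_tgt]
    return pairs, new_lesions                         # new_lesions: new-mole candidates

src = np.array([[0, 0, 0], [50, 0, 0]], float)
tgt = np.array([[2, 1, 0], [120, 5, 0], [49, 1, 0]], float)
pairs, new = match_lesions(src, tgt)
print(pairs, new)   # [(0, 0), (1, 2)] [1]
```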
Citations: 0
DPFR: Semi-supervised gland segmentation via density perturbation and feature recalibration
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-27 · DOI: 10.1016/j.media.2026.103962
Jiejiang Yu, Yu Liu
In recent years, semi-supervised methods have attracted considerable attention in gland segmentation of histopathological images, as they can substantially reduce the annotation burden for pathologists. The most widely adopted approach is the Mean-Teacher framework based on consistency regularization, which exploits unlabeled data through consistency constraints. However, due to the morphological complexity of glands in histopathological images, existing methods still suffer from confusion between glands and background, as well as gland adhesion. To address these challenges, we propose a semi-supervised gland segmentation method based on Density Perturbation and Feature Recalibration (DPFR). Specifically, we first design a normalizing-flow-based density estimator to effectively model the feature density distributions of glands, contours, and background. The gradient information of the estimator is then exploited to determine the descent direction in low-density regions, along which perturbations are applied to enhance feature discriminability. Furthermore, a contrastive-learning-based feature recalibration module is designed to alleviate inter-class distribution confusion, thereby improving gland-background separability and mitigating gland adhesion. Extensive experiments on three public gland segmentation datasets demonstrate that the proposed method consistently outperforms existing semi-supervised approaches, achieving state-of-the-art performance by a substantial margin. Code is available at https://github.com/Methow0/DPFR.
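A minimal sketch of density-guided perturbation, under illustrative assumptions: a normalizing flow supplies log p(x) for features, and since the gradient of log-density points toward higher density, stepping against it pushes ambiguous low-density features further into low-density regions where consistency training is hardest. The toy affine flow below stands in for a real flow model; the step size and dimensions are assumptions.

```python
# Minimal sketch: perturb features along the density-descent direction.
import torch

class ToyAffineFlow(torch.nn.Module):
    """z = s * x + t with a standard-normal base; stands in for a real flow."""
    def __init__(self, dim):
        super().__init__()
        self.log_s = torch.nn.Parameter(torch.zeros(dim))
        self.t = torch.nn.Parameter(torch.zeros(dim))

    def log_prob(self, x):
        z = x * self.log_s.exp() + self.t
        base = -0.5 * (z ** 2 + torch.log(torch.tensor(2 * torch.pi))).sum(-1)
        return base + self.log_s.sum()            # add log|det Jacobian|

def density_perturb(flow, feats, eps=0.5):
    feats = feats.detach().requires_grad_(True)
    logp = flow.log_prob(feats).sum()
    (grad,) = torch.autograd.grad(logp, feats)    # ascent direction of density
    step = -eps * torch.nn.functional.normalize(grad, dim=-1)
    return feats.detach() + step                  # move toward lower density

flow = ToyAffineFlow(dim=32)
feats = torch.randn(10, 32)                       # per-pixel features (toy)
print(density_perturb(flow, feats).shape)         # torch.Size([10, 32])
```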
Citations: 0
Reliable uncertainty quantification for 2D/3D anatomical landmark localization using multi-output conformal prediction
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-27 · DOI: 10.1016/j.media.2026.103953
Jef Jonkers, Frank Coopman, Luc Duchateau, Glenn Van Wallendael, Sofie Van Hoecke
Automatic anatomical landmark localization in medical imaging requires not just accurate predictions but reliable uncertainty quantification for effective clinical decision support. Current uncertainty quantification approaches often fall short, particularly when combined with normality assumptions, systematically underestimating total predictive uncertainty. This paper introduces conformal prediction as a framework for reliable uncertainty quantification in anatomical landmark localization, addressing a critical gap in the field. We present two novel approaches guaranteeing finite-sample validity for multi-output prediction: multi-output regression-as-classification conformal prediction (M-R2CCP) and its variant, multi-output regression-to-classification conformal prediction set to region (M-R2C2R). Unlike conventional methods that produce axis-aligned hyperrectangular or ellipsoidal regions, our approaches generate flexible, non-convex prediction regions that better capture the underlying uncertainty structure of landmark predictions. Through extensive empirical evaluation across multiple 2D and 3D datasets, we demonstrate that our methods consistently outperform existing multi-output conformal prediction approaches in both validity and efficiency. This work represents a significant advancement in reliable uncertainty estimation for anatomical landmark localization, providing clinicians with trustworthy confidence measures for their diagnoses. While developed for medical imaging, these methods show promise for broader applications in multi-output regression problems.
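For readers new to conformal prediction, the sketch below shows the basic split-conformal recipe on a synthetic 2D landmark task: using the Euclidean error as the nonconformity score yields circular prediction regions with a finite-sample coverage guarantee. The paper's M-R2CCP/M-R2C2R methods build on the same guarantee but produce flexible non-convex regions instead of discs; data and model here are synthetic stand-ins.

```python
# Minimal sketch: split conformal prediction for a 2D landmark.
import numpy as np

rng = np.random.default_rng(0)
n_cal, alpha = 500, 0.1                     # calibration size, 10% miscoverage

# Pretend calibration set: predicted vs. true landmark positions (pixels).
y_true = rng.uniform(0, 256, size=(n_cal, 2))
y_pred = y_true + rng.normal(0, 3, size=(n_cal, 2))   # stand-in model error

# Nonconformity score: Euclidean distance between prediction and truth.
scores = np.linalg.norm(y_pred - y_true, axis=1)

# Conformal quantile with the finite-sample correction ceil((n+1)(1-alpha))/n.
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
radius = np.quantile(scores, q_level, method="higher")

# At test time the region is a disc of this radius around the new prediction;
# it contains the true landmark with probability >= 1 - alpha.
new_pred = np.array([120.0, 88.0])
print(f"predict disc centered at {new_pred}, radius {radius:.2f} px")
```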
Citations: 0
MIL-Adapter: Coupling multiple instance learning and vision-language adapters for few-shot slide-level classification
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-26 · DOI: 10.1016/j.media.2026.103964
Pablo Meseguer, Rocío del Amor, Valery Naranjo
Contrastive language-image pretraining has greatly enhanced visual representation learning and enabled zero-shot classification. Vision-language models (VLMs) have succeeded in few-shot learning by leveraging adaptation modules fine-tuned for specific downstream tasks. In computational pathology (CPath), accurate whole-slide image (WSI) prediction is crucial for aiding cancer diagnosis, and multiple instance learning (MIL) remains essential for managing the gigapixel scale of WSIs. At the intersection of CPath and VLMs, the literature still lacks adapters that handle the particular complexity of whole slides. To close this gap, we introduce MIL-Adapter, a novel approach designed to obtain consistent slide-level classification under few-shot learning scenarios. In particular, our framework is the first to combine trainable MIL aggregation functions and lightweight vision-language adapters to improve the performance of histopathological VLMs. MIL-Adapter relies on textual ensemble learning to construct discriminative zero-shot prototypes, which serve as a solid starting point, surpassing MIL models with randomly initialized classifiers in data-constrained settings. With our experimentation, we demonstrate the value of textual ensemble learning and the robust predictive performance of MIL-Adapter across diverse datasets and few-shot configurations, while providing crucial insights into model interpretability. The code is publicly accessible at https://github.com/cvblab/MIL-Adapter.
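Two ingredients named in the abstract, textual ensemble prototypes and trainable MIL aggregation, can be sketched as follows. The stand-in text encoder, gated-attention-style pooling, prompt wording, and dimensions are illustrative assumptions rather than the released code.

```python
# Minimal sketch: textual ensemble prototypes + attention-based MIL pooling.
import torch
import torch.nn.functional as F

def encode_text(prompts):                       # stand-in text encoder (assumed)
    g = torch.Generator().manual_seed(sum(map(len, prompts)))
    return F.normalize(torch.randn(len(prompts), 512, generator=g), dim=-1)

def class_prototypes(prompt_sets):
    """Ensemble: average the embeddings of all prompts describing one class."""
    return F.normalize(
        torch.stack([encode_text(p).mean(0) for p in prompt_sets]), dim=-1)

class AttnMILPool(torch.nn.Module):
    """Attention-style pooling of patch embeddings into one slide embedding."""
    def __init__(self, dim=512):
        super().__init__()
        self.score = torch.nn.Linear(dim, 1)

    def forward(self, patches):                 # (N_patches, dim)
        w = torch.softmax(self.score(patches), dim=0)
        return F.normalize((w * patches).sum(0), dim=-1)

prompts = [["an image of benign tissue", "benign breast tissue, H&E"],
           ["an image of carcinoma", "invasive carcinoma, H&E"]]
protos = class_prototypes(prompts)              # (2, 512)
slide = AttnMILPool()(F.normalize(torch.randn(1000, 512), dim=-1))
print((slide @ protos.T).softmax(-1))           # zero-shot slide-level probabilities
```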
Citations: 0
Towards boundary confusion for volumetric medical image segmentation
IF 11.8 · CAS Tier 1 (Medicine) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-01-25 · DOI: 10.1016/j.media.2026.103961
Xin You, Ming Ding, Minghui Zhang, Hanxiao Zhang, Junyang Wu, Yi Yu, Jie Yang, Yun Gu
Accurate boundary segmentation of volumetric images is a critical task for image-guided diagnosis and computer-assisted intervention. Addressing boundary confusion with explicit constraints is challenging, and existing boundary-refinement methods overemphasize slender structures while overlooking the dynamic interactions between boundaries and neighboring regions. In this paper, we reconceptualize the mechanism of boundary generation by introducing pushing and pulling interactions, and propose a unified network termed PP-Net to model the shape characteristics of confused boundary regions. Specifically, we first propose a semantic difference module (SDM) in the pushing branch to drive the boundary towards the ground truth under diffusion guidance. Additionally, a class clustering module (CCM) in the pulling branch is introduced to stretch the intersected boundary in the opposite direction. Thus, the pushing and pulling branches furnish two adversarial forces that enhance representation capabilities for faint boundaries. Experiments are conducted on four public datasets and one in-house dataset plagued by boundary confusion. The results demonstrate the superiority of PP-Net over other segmentation networks, especially on the evaluation metrics of Hausdorff Distance and Average Symmetric Surface Distance. Besides, SDM and CCM can serve as plug-and-play modules to enhance classic U-shape baseline models, including recent SAM-based foundation models. Source codes are available at https://github.com/EndoluminalSurgicalVision-IMR/PnPNet.
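One illustrative reading of a "semantic difference" signal (an assumption, not the published SDM, which additionally uses diffusion guidance): the dissimilarity between each voxel's feature and its local neighborhood average peaks where classes meet, giving a map that can localize and supervise confused boundary regions.

```python
# Minimal sketch: a semantic-difference boundary map over volumetric features.
import torch
import torch.nn.functional as F

def semantic_difference(feat, kernel=3):
    """feat: (B, C, D, H, W) decoder features -> (B, 1, D, H, W) boundary map."""
    pad = kernel // 2
    neighborhood = F.avg_pool3d(feat, kernel, stride=1, padding=pad)
    # Cosine dissimilarity to the neighborhood mean peaks at class boundaries.
    diff = 1.0 - F.cosine_similarity(feat, neighborhood, dim=1, eps=1e-6)
    return diff.unsqueeze(1)

feat = torch.randn(1, 32, 16, 64, 64)      # toy volumetric feature map
bmap = semantic_difference(feat)
print(bmap.shape)                          # torch.Size([1, 1, 16, 64, 64])
```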
Citations: 0