
Latest articles in IEEE Transactions on Medical Imaging

Lesion Asymmetry Screening Assisted Global Awareness Multi-View Network for Mammogram Classification
Pub Date : 2025-09-09 DOI: 10.1109/TMI.2025.3607877
Xinchuan Liu;Luhao Sun;Chao Li;Bowen Han;Wenzong Jiang;Tianhao Yuan;Weifeng Liu;Zhaoyun Liu;Zhiyong Yu;Baodi Liu
Mammography is a primary method for early screening, and developing deep learning-based computer-aided systems is of great significance. However, current deep learning models typically treat each image as an independent entity for diagnosis, rather than integrating images from multiple views to diagnose the patient. These methods do not fully consider the complex interactions between different views, resulting in poor diagnostic performance and interpretability. To address this issue, this paper proposes a novel end-to-end framework for breast cancer diagnosis: the lesion asymmetry screening assisted global awareness multi-view network (LAS-GAM). Unlike most common image-level diagnostic models, LAS-GAM operates at the patient level, simulating the workflow of radiologists analyzing mammographic images. The framework processes the four views of a patient and revolves around two key modules: a global module and a lesion screening module. The global module simulates the comprehensive assessment performed by radiologists, integrating complementary information from the craniocaudal (CC) and mediolateral oblique (MLO) views of both breasts to generate global features that represent the patient's overall condition. The lesion screening module mimics the process of locating lesions by comparing symmetric regions in contralateral views, identifying potential lesion areas and extracting lesion-specific features with a lightweight model. By combining the global features and lesion-specific features, LAS-GAM simulates the diagnostic process and makes patient-level predictions. Moreover, it is trained using only patient-level labels, significantly reducing data-annotation costs. Experiments on the Digital Database for Screening Mammography (DDSM) and an in-house dataset validate LAS-GAM, achieving AUCs of 0.817 and 0.894, respectively.
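The patient-level fusion of global and lesion-specific features can be caricatured in a few lines of Python. Everything here (function name, fusion weights, max-pooling over lesions, feature values) is invented for illustration; the paper's modules are learned networks:

```python
import math

def patient_level_score(view_feats, lesion_feats, w_global=0.6, w_lesion=0.4):
    """Toy fusion of global (four-view) and lesion-specific features into one
    patient-level malignancy score. Weights and pooling are illustrative."""
    # Global branch: aggregate the four view features (CC/MLO of both breasts).
    g = sum(view_feats) / len(view_feats)
    # Lesion branch: keep the strongest candidate-lesion response.
    l = max(lesion_feats)
    logit = w_global * g + w_lesion * l
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> probability

score = patient_level_score([0.2, 0.4, 0.1, 0.3], [1.5, 0.7])
```

The point of the sketch is only the structure: one score per patient, produced from all four views plus screened lesion candidates, which is what allows training from patient-level labels alone.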
IEEE Transactions on Medical Imaging, vol. 45, no. 2, pp. 777–788.
Citations: 0
Co-Activation Pattern Analysis Based on Hidden Semi-Markov Model for Brain Spatiotemporal Dynamics
Pub Date : 2025-09-08 DOI: 10.1109/TMI.2025.3607113
Zihao Yuan;Jiaqing Chen;Han Qiu;Houxiang Wang;Yangxin Huang;Fuchun Lin
Analyzing the spontaneous activity of the human brain using dynamic approaches can reveal functional organizations. The co-activation pattern (CAP) analysis of signals from different brain regions is used to characterize brain neural networks that may serve specialized functions. However, CAP is based on spatial information but ignores temporally reproducible transition patterns, and lacks robustness to low signal-to-noise ratio (SNR) data. To address these issues, this study proposes a new CAP framework based on the hidden semi-Markov model (HSMM), called HSMM-CAP analysis, which investigates the spatiotemporal CAPs (stCAPs) of the brain. HSMM-CAP uses empirical spatial distributions of stCAPs as emission models, and assumes that the state sequence of stCAPs follows a semi-Markov process. Based on the assumptions of sparsity, heterogeneity, and the semi-Markov property of stCAPs, the HSMM-CAP-K-means method is constructed to infer the state sequence and transition parameters of stCAPs. In addition, HSMM-CAP provides an inverse relationship between the number of states and sparsity. Simulation studies verify the performance of HSMM-CAP at different levels of SNR. The proposed method also reveals the spatiotemporal dynamics of stCAPs on real-world resting-state fMRI data. Our method provides a new data-driven computational framework for revealing the brain spatiotemporal dynamics of resting-state fMRI data.
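The defining property of a semi-Markov process, explicit state durations instead of per-step self-transitions, can be sketched as follows. The transition matrix and mean durations are toy values, not the paper's fitted parameters, and the geometric-like duration model is a simplification:

```python
import random

def sample_hsmm_states(n_steps, trans, dur_mean, seed=0):
    """Sample a state sequence from a toy hidden semi-Markov model:
    each state persists for an explicitly sampled duration, then jumps
    according to a transition matrix with no self-transitions."""
    rng = random.Random(seed)
    states, s = [], 0
    while len(states) < n_steps:
        # Explicit duration: stay in state s for d consecutive steps.
        d = 1 + int(rng.expovariate(1.0 / dur_mean[s]))
        states.extend([s] * d)
        # Jump to the next state according to trans[s].
        r, cum = rng.random(), 0.0
        for nxt, p in enumerate(trans[s]):
            cum += p
            if r < cum:
                s = nxt
                break
    return states[:n_steps]

# Two states that strictly alternate, with mean dwell times of 5 and 3 steps.
seq = sample_hsmm_states(50, trans=[[0.0, 1.0], [1.0, 0.0]], dur_mean=[5, 3])
```

In the full method, such a latent sequence is coupled with empirical spatial emission distributions and inferred jointly; this sketch shows only the duration-aware state dynamics that distinguish an HSMM from a plain Markov chain.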
IEEE Transactions on Medical Imaging, vol. 45, no. 2, pp. 843–852.
Citations: 0
MetaSSL: A General Heterogeneous Loss for Semi-Supervised Medical Image Segmentation
Pub Date : 2025-09-03 DOI: 10.1109/TMI.2025.3605617
Weiren Zhao;Lanfeng Zhong;Xin Liao;Wenjun Liao;Sichuan Zhang;Shaoting Zhang;Guotai Wang
Semi-Supervised Learning (SSL) is important for reducing the annotation cost of medical image segmentation models. State-of-the-art SSL methods such as Mean Teacher, FixMatch and Cross Pseudo Supervision (CPS) are mainly based on consistency regularization or pseudo-label supervision between a reference prediction and a supervised prediction. Despite their effectiveness, these methods overlook the potential noise in the labeled data, and mainly focus on strategies for generating the reference prediction while ignoring the heterogeneous value of different unlabeled pixels. We argue that effectively mining the rich information contained in the two predictions through the loss function, rather than the specific strategy for obtaining a reference prediction, is more essential for SSL, and propose a universal framework MetaSSL based on a spatially heterogeneous loss that assigns different weights to pixels by simultaneously leveraging the uncertainty and consistency information between the reference and supervised predictions. Specifically, we split the predictions on unlabeled data into four regions with decreasing weights in the loss: Unanimous and Confident (UC), Unanimous and Suspicious (US), Discrepant and Confident (DC), and Discrepant and Suspicious (DS), where an adaptive threshold is proposed to distinguish confident predictions from suspicious ones. The heterogeneous loss is also applied to labeled images for robust learning, considering the potential annotation noise. Our method is plug-and-play and general to most existing SSL methods. The experimental results showed that it improved the segmentation performance significantly when integrated with existing SSL frameworks on different datasets. Code is available at https://github.com/HiLab-git/MetaSSL
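The four-region split (UC/US/DC/DS) is straightforward to sketch per pixel. Here the adaptive threshold is replaced by the mean reference confidence, and the four weights are illustrative values, not the paper's:

```python
def region_weights(p_ref, p_sup, weights=(1.0, 0.7, 0.4, 0.1)):
    """Toy per-pixel loss weights in the spirit of the four MetaSSL regions:
    Unanimous vs. Discrepant (do the two binary predictions agree?) crossed
    with Confident vs. Suspicious (is the reference confidence above a
    threshold?). The threshold here is a simple mean, standing in for the
    paper's adaptive threshold."""
    w_uc, w_us, w_dc, w_ds = weights
    conf = [max(p, 1 - p) for p in p_ref]        # reference confidence
    thr = sum(conf) / len(conf)                  # stand-in adaptive threshold
    out = []
    for pr, ps, c in zip(p_ref, p_sup, conf):
        unanimous = (pr > 0.5) == (ps > 0.5)
        confident = c >= thr
        if unanimous:
            out.append(w_uc if confident else w_us)
        else:
            out.append(w_dc if confident else w_ds)
    return out

w = region_weights([0.9, 0.6, 0.2, 0.45], [0.8, 0.4, 0.3, 0.7])
```

Pixels where both predictions agree with high confidence dominate the loss, while confidently contradictory or uncertain pixels are down-weighted, which is the core idea of the heterogeneous loss.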
IEEE Transactions on Medical Imaging, vol. 45, no. 2, pp. 751–763.
Citations: 0
Teacher–Student Instance-Level Adversarial Augmentation for Single Domain Generalized Medical Image Segmentation
Pub Date : 2025-09-02 DOI: 10.1109/TMI.2025.3605162
Zhengshan Wang;Long Chen;Xuelin Xie;Yang Zhang;Yunpeng Cai;Weiping Ding
Recently, single-source domain generalization (SDG) has gained popularity in medical image segmentation. As a prominent technique, adversarial image augmentation can generate synthetic training data that are challenging for the segmentation model to recognize. To avoid the over-augmentation problem, existing adversarial-based works often employ augmenters with relatively simple structures for medical images, typically operating at the image level, which limits the diversity of the augmented images. In this paper, we propose a Teacher-Student Instance-level Adversarial Augmentation (TSIAA) model for generalized medical image segmentation. The objective of TSIAA is to derive domain-generalizable representations by exploring out-of-source data distributions. First, we construct an Instance-level Image Augmenter (IIAG) using several Instance-level Augmentation Modules (IAMs), which are based on a learnable constrained Bézier transformation function. Compared to image-level adversarial augmentation, instance-level adversarial augmentation breaks the uniformity of augmentation rules across different structures within an image, thereby providing greater diversity. Then, TSIAA conducts Teacher-Student (TS) learning through an adversarial approach, alternating novel image augmentation and generalized representation learning. The former delves into out-of-source yet plausible data, while the latter continuously updates both the student and teacher to ensure the original and augmented features maintain consistent and generalized characteristics. By integrating both strategies, our proposed TSIAA model achieves significant improvements over state-of-the-art methods in four challenging SDG tasks. The code can be accessed at https://github.com/Wangzs0228/TSIAA
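A constrained Bézier intensity transformation can be sketched as a cubic curve with endpoints pinned at (0,0) and (1,1) and two free control values that bend the mapping. Treating the Bézier parameter t as the input intensity x is a simplification, and the control values are assumptions for illustration:

```python
def bezier_intensity(x, p1, p2):
    """Toy constrained cubic Bézier intensity transform. Fixed endpoints
    (0,0) and (1,1) preserve the intensity range; p1 and p2 play the role
    of the learnable control values that reshape the curve. We evaluate
    at t = x for simplicity (the true curve is parametric)."""
    t = x
    return ((1 - t) ** 3 * 0.0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * 1.0)

lo = bezier_intensity(0.0, 0.3, 0.8)   # endpoint preserved
hi = bezier_intensity(1.0, 0.3, 0.8)   # endpoint preserved
mid = bezier_intensity(0.5, 0.3, 0.8)  # bent mid-tone
```

Applying such a curve with different control points to different structures within one image is what distinguishes instance-level from image-level augmentation.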
IEEE Transactions on Medical Imaging, vol. 45, no. 2, pp. 764–776.
Citations: 0
Adaptive Sequential Bayesian Iterative Learning for Myocardial Motion Estimation on Cardiac Image Sequences
Pub Date : 2025-08-18 DOI: 10.1109/TMI.2025.3599487
Shuxin Zhuang;Heye Zhang;Dong Liang;Hui Liu;Zhifan Gao
Motion estimation of the left-ventricular myocardium on cardiac image sequences is crucial for assessing cardiac function. However, the intensity variation of cardiac image sequences brings the challenge of uncertain interference to myocardial motion estimation. Such imaging-related uncertain interference appears in different cardiac imaging modalities. We propose adaptive sequential Bayesian iterative learning to overcome this challenge. Specifically, our method applies adaptive structural inference to state transition and observation to cope with complex myocardial motion under uncertain settings. In state transition, adaptive structural inference establishes a hierarchical structure recurrence to obtain the complex latent representation of cardiac image sequences. In state observation, adaptive structural inference forms a chain-structure mapping to correlate the latent representation of the cardiac image sequence with that of the motion. Extensive experiments on US, CMR, and TMR datasets covering 1270 patients (650 CMR, 500 US, and 120 TMR) have shown the effectiveness of our method, as well as its superiority to eight state-of-the-art motion estimation methods.
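The state-transition/observation split the abstract describes is the classical Bayesian filtering pattern. A minimal 1-D linear-Gaussian stand-in makes the two-step structure concrete (the paper's model is hierarchical and adaptively structured, which this sketch does not attempt; the noise values are arbitrary):

```python
def kalman_1d(observations, q=0.01, r=0.25):
    """Minimal 1-D Bayesian filter: a predict step driven by the state
    transition (process noise q) and an update step driven by the
    observation (measurement noise r). Illustrative only."""
    x, p = 0.0, 1.0              # state mean and variance (vague prior)
    estimates = []
    for z in observations:
        p = p + q                # predict: transition inflates uncertainty
        k = p / (p + r)          # update: gain trades prior vs. observation
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates

est = kalman_1d([1.0, 1.2, 0.9, 1.1])
```

Each estimate blends the predicted state with the new observation, which is the same recursive pattern the adaptive structural inference instantiates with far richer transition and observation models.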
IEEE Transactions on Medical Imaging, vol. 45, no. 1, pp. 406–420.
Citations: 0
Hierarchical Contrastive Learning for Precise Whole-Body Anatomical Localization in PET/CT Imaging
Pub Date : 2025-08-18 DOI: 10.1109/TMI.2025.3599197
Yaozong Gao;Yiran Shu;Mingyang Yu;Yanbo Chen;Jingyu Liu;Shaonan Zhong;Weifang Zhang;Yiqiang Zhan;Xiang Sean Zhou;Xinlu Wang;Meixin Zhao;Dinggang Shen
Automatic anatomical localization is critical for radiology report generation. While many studies focus on lesion detection and segmentation, anatomical localization—accurately describing lesion positions in radiology reports—has received less attention. Conventional segmentation-based methods are limited to organ-level localization and often fail in severe disease cases due to low segmentation accuracy. To address these limitations, we reformulate anatomical localization as an image-to-text retrieval task. Specifically, we propose a CLIP-based framework that aligns lesion image patches with anatomically descriptive text embeddings in a shared multimodal space. By projecting lesion features into the semantic space and retrieving the most relevant anatomical descriptions in a coarse-to-fine manner, our method achieves fine-grained lesion localization with high accuracy across the entire body. Our main contributions are as follows: (1) hierarchical anatomical retrieval, which organizes 387 locations into a two-level hierarchy, by retrieving from the first level of 124 coarse categories to narrow down the search space and reduce localization complexity; (2) augmented location descriptions, which integrate domain-specific anatomical knowledge for enhancing semantic representation and improving visual—text alignment; and (3) semi-hard negative sample mining, which improves training stability and discriminative learning by avoiding selecting the overly similar negative samples that may introduce label noise or semantic ambiguity. We validate our method on two whole-body PET/CT datasets, achieving an 84.13% localization accuracy on the internal test set and 80.42% on the external test set, with a per-lesion inference time of 34 ms. The proposed framework also demonstrated superior robustness in complex clinical cases compared to segmentation-based approaches.
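Coarse-to-fine retrieval by cosine similarity in a shared embedding space can be sketched directly. The category names and 2-D embeddings below are made up; the paper uses 124 coarse and 387 fine anatomical labels with CLIP-style learned embeddings:

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den

def coarse_to_fine(query, coarse_embs, fine_embs):
    """Toy two-level anatomical retrieval: pick the best coarse category
    first, then search only the fine labels under it, shrinking the
    search space as in the paper's hierarchy."""
    best_c = max(coarse_embs, key=lambda c: cosine(query, coarse_embs[c]))
    best_f = max(fine_embs[best_c],
                 key=lambda f: cosine(query, fine_embs[best_c][f]))
    return best_c, best_f

coarse = {"thorax": [1.0, 0.1], "abdomen": [0.1, 1.0]}
fine = {
    "thorax": {"left upper lobe": [0.9, 0.2], "mediastinum": [0.7, 0.6]},
    "abdomen": {"liver": [0.2, 0.9], "spleen": [0.1, 0.8]},
}
hit = coarse_to_fine([1.0, 0.2], coarse, fine)
```

Restricting the fine search to one coarse branch is what keeps inference cheap while still resolving among hundreds of fine-grained locations.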
IEEE Transactions on Medical Imaging, vol. 45, no. 1, pp. 391–405.
Citations: 0
SynthAorta: A 3D Mesh Dataset of Parametrized Physiological Healthy Aortas
Pub Date : 2025-08-18 DOI: 10.1109/TMI.2025.3599937
Domagoj Bošnjak;Gian Marco Melito;Richard Schussnig;Katrin Ellermann;Thomas-Peter Fries
The effects of the aortic geometry on its mechanics and blood flow, and subsequently on aortic pathologies, remain largely unexplored. The main obstacle lies in obtaining patient-specific aorta models, an extremely difficult procedure in terms of ethics and availability, segmentation, mesh generation, and all of the accompanying processes. By contrast, idealized models are easy to build but do not faithfully represent patient-specific variability. Additionally, a unified aortic parametrization in clinical medicine and engineering has not yet been achieved. To bridge this gap, we introduce a new set of statistical parameters to generate synthetic models of the aorta. The parameters possess geometric significance and fall within physiological ranges, effectively bridging the disciplines of clinical medicine and engineering. Smoothly blended, realistic representations are recovered with convolution surfaces. These enable high-quality visualization and biological appearance, whereas the structured mesh generation paves the way for numerical simulations. The only requirement of the approach is one patient-specific aorta model and the statistical data for parameter values obtained from the literature. The output of this work is SynthAorta, a dataset of ready-to-use synthetic, physiological aorta models, each containing a centerline, a surface representation, and a structured hexahedral finite element mesh. The meshes are structured and fully consistent between different cases, making them eminently suitable for reduced-order modeling and machine learning approaches.
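The idea of generating geometry from a handful of interpretable parameters can be illustrated with a toy centerline. This two-parameter sketch (arch radius, descent length) is purely hypothetical and is not the dataset's parametrization, which uses a richer statistical parameter set:

```python
import math

def aorta_centerline(n_pts=20, arch_radius=30.0, descent=120.0):
    """Toy parametrized centerline: a half-circular 'arch' in the x-z plane
    followed by a straight descending segment. Illustrates geometry from
    a few named parameters; values and shape are invented."""
    n_arch = n_pts // 2
    pts = []
    for i in range(n_arch):                          # ascending side + arch
        theta = math.pi * i / max(n_arch - 1, 1)
        pts.append((arch_radius * math.cos(theta), 0.0,
                    arch_radius * math.sin(theta)))
    n_desc = n_pts - n_arch
    for i in range(n_desc):                          # descending segment
        pts.append((-arch_radius, 0.0, -descent * (i + 1) / n_desc))
    return pts

centerline = aorta_centerline()
```

In the actual pipeline such a centerline would be swept with cross-sectional radii and blended via convolution surfaces before hexahedral meshing; the sketch covers only the first, parametric step.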
"SynthAorta: A 3D Mesh Dataset of Parametrized Physiological Healthy Aortas" — Domagoj Bošnjak;Gian Marco Melito;Richard Schussnig;Katrin Ellermann;Thomas-Peter Fries. IEEE Transactions on Medical Imaging, vol. 45, no. 1, pp. 421–430, 2025-08-18. DOI: 10.1109/TMI.2025.3599937
An Unsupervised Learning Approach for Reconstructing 3T-Like Images From 0.3T MRI Without Paired Training Data
Pub Date : 2025-08-11 DOI: 10.1109/TMI.2025.3597401
Huaishui Yang;Shaojun Liu;Yilong Liu;Lingyan Zhang;Shoujin Huang;Jiayu Zheng;Jingzhe Liu;Hua Guo;Ed X. Wu;Mengye Lyu
Magnetic resonance imaging (MRI) is powerful in medical diagnostics, yet high-field MRI, despite offering superior image quality, incurs significant costs for procurement, installation, maintenance, and operation, restricting its availability and accessibility, especially in low- and middle-income countries. Addressing this, our study proposes an unsupervised learning algorithm based on cycle-consistent generative adversarial networks. This framework transforms 0.3T low-field MRI into higher-quality 3T-like images, bypassing the need for paired low/high-field training data. The proposed architecture integrates two novel modules to enhance reconstruction quality: (1) an attention block that dynamically balances high-field-like features with the original low-field input, and (2) an edge block that refines boundary details, providing more accurate structural reconstruction. The proposed generative model is trained on large-scale, unpaired, public datasets, and further validated on paired low/high-field acquisitions of three major clinical MRI sequences: T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) imaging. It demonstrates notable improvements in tissue contrast and signal-to-noise ratio while preserving anatomical fidelity. This approach utilizes rich information from publicly available MRI resources, providing a data-efficient unsupervised alternative that complements supervised methods to enhance the utility of low-field MRI.
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5358–5371, 2025-08-11. DOI: 10.1109/TMI.2025.3597401
EPDiff: Erasure Perception Diffusion Model for Unsupervised Anomaly Detection in Preoperative Multimodal Images
Pub Date : 2025-08-11 DOI: 10.1109/TMI.2025.3597545
Jiazheng Wang;Min Liu;Wenting Shen;Renjie Ding;Yaonan Wang;Erik Meijering
Unsupervised anomaly detection (UAD) methods typically detect anomalies by learning and reconstructing the normative distribution. However, since anomalies constantly invade and affect their surroundings, sub-healthy areas at the junction exhibit structural deformations that are easily misidentified as anomalies, posing difficulties for UAD methods that learn only the normative distribution. Multimodal images can help address these challenges, as they provide complementary information about anomalies. Therefore, this paper proposes a novel method for UAD in preoperative multimodal images, called the Erasure Perception Diffusion model (EPDiff). First, the Local Erasure Progressive Training (LEPT) framework is designed to better rebuild sub-healthy structures around anomalies through the diffusion model with a two-phase process. Initially, healthy images are used to capture deviation features labeled as potential anomalies. Then, these anomalies are locally erased in multimodal images to progressively learn sub-healthy structures, yielding a more detailed reconstruction around anomalies. Second, the Global Structural Perception (GSP) module is developed in the diffusion model to realize global structural representation and correlation within images and between modalities through interactions of high-level semantic information. In addition, a training-free module, the Multimodal Attention Fusion (MAF) module, performs weighted fusion of anomaly maps across modalities and produces binary anomaly outputs. Experimental results show that EPDiff improves the AUPRC and mDice scores by 2% and 3.9% on BraTS2021, and by 5.2% and 4.5% on Shifts over state-of-the-art methods, demonstrating the applicability of EPDiff to diverse anomaly diagnosis. The code is available at https://github.com/wjiazheng/EPDiff
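The training-free fusion step can be sketched as a weighted average of per-modality anomaly maps followed by thresholding into a binary mask. This is an illustration only — the function name, the fixed weights, and the 0.5 threshold are assumptions, not EPDiff's actual attention-derived weighting:

```python
import numpy as np

def fuse_anomaly_maps(maps, weights, threshold=0.5):
    """Weighted fusion of per-modality anomaly maps, then thresholding
    into a binary anomaly mask (a training-free post-processing step)."""
    maps = np.stack(maps)                  # (M, H, W)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                        # normalize modality weights
    fused = np.tensordot(w, maps, axes=1)  # contract over modalities -> (H, W)
    return fused, (fused > threshold).astype(np.uint8)

# Toy 2x2 anomaly maps from two hypothetical modalities
t1 = np.array([[0.9, 0.1], [0.2, 0.8]])
flair = np.array([[0.7, 0.2], [0.1, 0.9]])
fused, mask = fuse_anomaly_maps([t1, flair], weights=[0.4, 0.6])
print(mask)  # [[1 0]
             #  [0 1]] — only pixels where both modalities agree survive
```

Pixels flagged strongly in both modalities cross the threshold, while a response in a single modality is damped by the weighted average — the complementarity the abstract refers to.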
IEEE Transactions on Medical Imaging, vol. 45, no. 1, pp. 379–390, 2025-08-11. DOI: 10.1109/TMI.2025.3597545
Automatic Choroid Segmentation and Thickness Measurement Based on Mixed Attention-Guided Multiscale Feature Fusion Network
Pub Date : 2025-08-08 DOI: 10.1109/TMI.2025.3597026
Xiaoyu Zhu;Shiyin Li;HongLiang Bi;Lina Guan;Haiyang Liu;Zhaolin Lu
Choroidal thickness variations serve as critical biomarkers for numerous ophthalmic diseases. Accurate segmentation and quantification of the choroid in optical coherence tomography (OCT) images is essential for clinical diagnosis and disease-progression monitoring. Because public OCT datasets cover only a small number of disease types involving choroidal-thickness changes and no publicly available labeled dataset exists, we constructed the Xuzhou Municipal Hospital (XZMH)-Choroid dataset. This dataset contains annotated OCT images of normal eyes and eight choroid-related diseases. However, segmentation of the choroid in OCT images remains a formidable challenge due to the confounding factors of blurred boundaries, non-uniform texture, and lesions. To overcome these challenges, we propose a mixed attention-guided multiscale feature fusion network (MAMFF-Net). This network integrates a Mixed Attention Encoder (MAE) for enhanced fine-grained feature extraction, a deformable multiscale feature fusion path (DMFFP) for adaptive feature integration across lesion deformations, and a multiscale pyramid layer aggregation (MPLA) module for improved contextual representation learning. In comparative experiments, MAMFF-Net achieved better segmentation performance than other deep learning methods (mDice: 97.44, mIoU: 95.11, mAcc: 97.71). Based on the choroidal segmentation implemented in MAMFF-Net, an algorithm for automated choroidal thickness measurement was developed, and the automated measurement results approached the level of senior specialists.
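Once a choroid segmentation mask is available, the thickness measurement essentially reduces to counting segmented pixels per A-scan column and scaling by the axial resolution. A minimal sketch (the function name, the 3.9 µm/px spacing, and the toy mask layout are assumptions for illustration, not the paper's implementation):

```python
import numpy as np

def choroid_thickness_um(mask, axial_um_per_px):
    """Per-A-scan choroidal thickness from a binary segmentation mask:
    count choroid pixels in each column, scale by axial resolution."""
    # mask: (rows, cols) binary B-scan mask; returns thickness per column
    return mask.sum(axis=0) * axial_um_per_px

# Toy 5x4 B-scan mask: choroid occupies rows 1-3 in the middle columns
mask = np.zeros((5, 4), dtype=np.uint8)
mask[1:4, 1:3] = 1
thickness = choroid_thickness_um(mask, axial_um_per_px=3.9)
# columns 1 and 2 -> 3 px x 3.9 um = 11.7 um; edge columns -> 0
```

In practice the per-column values would be averaged over a region of interest (e.g. subfoveal) before comparison with an expert's caliper measurement.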
IEEE Transactions on Medical Imaging, vol. 45, no. 1, pp. 350–363, 2025-08-08. DOI: 10.1109/TMI.2025.3597026