
Medical image analysis: Latest Articles

Fundus image quality assessment in retinopathy of prematurity via multi-label graph evidential network
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.media.2026.103959
Donghan Wu , Wenyue Shen , Lu Yuan , Heng Li , Huaying Hao , Juan Ye , Yitian Zhao
Retinopathy of Prematurity (ROP) is a leading cause of childhood blindness worldwide. In clinical practice, fundus imaging serves as a primary diagnostic tool for ROP, making the accurate quality assessment of these images critically important. However, existing automated methods for evaluating ROP fundus images face significant challenges. First, there is a high degree of visual similarity between lesions and factors that influence quality. Second, there is a paucity of trustworthy outputs and interpretable or clinically friendly designs, which limits their reliability and effectiveness. In this work, we propose a ROP image quality assessment framework, termed Q-ROP. This framework leverages fine-grained multi-label annotations based on key image factors such as artifacts, illumination, spatial positioning, and structural clarity. Additionally, the integration of a label graph network with evidential learning theory enables the model to explicitly capture the relationships between quality grades and influencing factors, thereby improving both robustness and accuracy. This approach facilitates interpretable analysis by directing the model’s focus toward relevant image features and reducing interference from lesion-like artifacts. Furthermore, the incorporation of evidential learning theory serves to quantify the uncertainty inherent in quality ratings, thereby ensuring the trustworthiness of the assessments. Trained and tested on a dataset of 6677 ROP images across three quality levels (i.e., acceptable, potentially acceptable, and unacceptable), Q-ROP achieved state-of-the-art performance with 95.82% accuracy. Its effectiveness was further validated in a downstream ROP staging task, where it significantly improved the performance of typical classification models. These results demonstrate Q-ROP’s strong potential as a reliable and robust tool for clinical decision support.
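The abstract gives no implementation details; the sketch below is only a minimal illustration of the evidential-learning idea it mentions, using a standard Dirichlet-based evidential classification head that turns non-negative evidence into class probabilities and a per-image uncertainty score for the three quality grades. Layer sizes and names are assumptions, not the authors' Q-ROP code.

```python
# Minimal sketch (assumption): a Dirichlet-based evidential classification head,
# illustrating how per-image uncertainty could be quantified for the three
# quality grades named in the abstract. This is NOT the authors' Q-ROP code.
import torch
import torch.nn as nn
import torch.nn.functional as F

QUALITY_GRADES = ["acceptable", "potentially acceptable", "unacceptable"]

class EvidentialHead(nn.Module):
    """Maps backbone features to non-negative evidence for each quality grade."""
    def __init__(self, in_dim: int, num_classes: int = len(QUALITY_GRADES)):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)

    def forward(self, feats: torch.Tensor):
        evidence = F.softplus(self.fc(feats))       # e_k >= 0
        alpha = evidence + 1.0                      # Dirichlet parameters
        strength = alpha.sum(dim=-1, keepdim=True)  # S = sum_k alpha_k
        prob = alpha / strength                     # expected class probabilities
        uncertainty = alpha.shape[-1] / strength    # u = K / S (vacuity)
        return prob, uncertainty.squeeze(-1)

# usage with random features standing in for a CNN/graph backbone output
head = EvidentialHead(in_dim=256)
prob, u = head(torch.randn(4, 256))
print(prob.shape, u.shape)  # torch.Size([4, 3]) torch.Size([4])
```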
Citations: 0
Fréchet radiomic distance (FRD): A versatile metric for comparing medical imaging datasets
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.media.2026.103943
Nicholas Konz , Richard Osuala , Preeti Verma , Yuwen Chen , Hanxue Gu , Haoyu Dong , Yaqian Chen , Andrew Marshall , Lidia Garrucho , Kaisar Kushibar , Daniel M. Lang , Gene S. Kim , Lars J. Grimm , John M. Lewin , James S. Duncan , Julia A. Schnabel , Oliver Diaz , Karim Lekadir , Maciej A. Mazurowski
Determining whether two sets of images belong to the same or different distributions or domains is a crucial task in modern medical image analysis and deep learning; for example, to evaluate the output quality of image generative models. Currently, metrics used for this task either rely on the (potentially biased) choice of some downstream task, such as segmentation, or adopt task-independent perceptual metrics (e.g., Fréchet Inception Distance/FID) from natural imaging, which we show insufficiently capture anatomical features. To this end, we introduce a new perceptual metric tailored for medical images, FRD (Fréchet Radiomic Distance), which utilizes standardized, clinically meaningful, and interpretable image features. We show that FRD is superior to other image distribution metrics for a range of medical imaging applications, including out-of-domain (OOD) detection, the evaluation of image-to-image translation (by correlating more with downstream task performance as well as anatomical consistency and realism), and the evaluation of unconditional image generation. Moreover, FRD offers additional benefits such as stability and computational efficiency at low sample sizes, sensitivity to image corruptions and adversarial attacks, feature interpretability, and correlation with radiologist-perceived image quality. Additionally, we address key gaps in the literature by presenting an extensive framework for the multifaceted evaluation of image similarity metrics in medical imaging—including the first large-scale comparative study of generative models for medical image translation—and release an accessible codebase to facilitate future research. Our results are supported by thorough experiments spanning a variety of datasets, modalities, and downstream tasks, highlighting the broad potential of FRD for medical image analysis.
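As a rough illustration of the underlying metric, the sketch below computes the Fréchet distance between Gaussian fits of two feature sets, the same formula FID uses; FRD applies this distance to standardized radiomic features rather than Inception embeddings. The radiomic feature extractor itself is assumed to exist upstream and is not shown.

```python
# Minimal sketch (assumption): the Fréchet distance between Gaussian fits of two
# feature sets, as used by FID; FRD applies the same distance to standardized
# radiomic features rather than Inception embeddings.
import numpy as np
from scipy import linalg

def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """feats_*: (n_samples, n_features) arrays of per-image feature vectors."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_a @ cov_b, disp=False)
    if np.iscomplexobj(covmean):  # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))

# toy usage: two sets of 100-dimensional "radiomic" feature vectors
rng = np.random.default_rng(0)
print(frechet_distance(rng.normal(0, 1, (200, 100)), rng.normal(0.3, 1, (200, 100))))
```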
Citations: 0
MADAT: Missing-aware dynamic adaptive transformer model for medical prognosis prediction with incomplete multimodal data
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1016/j.media.2026.103958
Jianbin He , Guoheng Huang , Xiaochen Yuan , Chi-Man Pun , Guo Zhong , Qi Yang , Ling Guo , Siyu Zhu , Baiying Lei , Haojiang Li
Multimodal medical prognosis prediction has shown great potential in improving diagnostic accuracy by integrating various data types. However, incomplete multimodality, where certain modalities are missing, poses significant challenges to model performance. Current methods, including dynamic adaptation and modality completion, have limitations in handling incomplete multimodality comprehensively. Dynamic adaptation methods fail to fully utilize modality interactions as they only process available modalities. Modality completion methods address inter-modal relationships but risk generating unreliable data, especially when key modalities are missing, since existing modalities cannot replicate unique features of absent ones. This compromises fusion quality and degrades model performance. To address these challenges, we propose the Missing-aware Dynamic Adaptive Transformer (MADAT) model, which integrates two phases: the Decoupling Generalization Completion Phase (DGCP) and the Adaptive Cross-Fusion Phase (ACFP). The DGCP reconstructs missing modalities by generating inter-modal and intra-modal shared information using Progressive Transformation Recursive Gated Convolutions (PTRGC) and Wavelet Alignment Domain Generalization (WADG). The ACFP, which incorporates Cross-Agent Attention (CAA) and Generation Quality Feedback Regulation (GQFR), adaptively fuses the original and generated modality features. CAA ensures thorough integration and alignment of the features, while GQFR dynamically adjusts the model’s reliance on the generated features based on their quality, preventing over-dependence on low-quality data. Experiments on three private nasopharyngeal carcinoma datasets demonstrate that MADAT outperforms existing methods, achieving superior robustness in medical multimodal prediction under conditions of incomplete multimodality.
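A minimal sketch of the quality-gating idea described for GQFR follows, assuming a simple learned quality score that down-weights generated modality features when the modality was not actually acquired; the real MADAT fusion is considerably more elaborate, and all names below are hypothetical.

```python
# Minimal sketch (assumption): quality-gated fusion of observed and generated
# modality features, illustrating the idea behind GQFR as described in the
# abstract. The actual MADAT gating and fusion modules are more elaborate.
import torch
import torch.nn as nn

class QualityGatedFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # predicts a per-sample quality score in (0, 1) for generated features
        self.quality = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                     nn.Linear(dim // 2, 1), nn.Sigmoid())

    def forward(self, feat: torch.Tensor, generated: torch.Tensor,
                observed_mask: torch.Tensor) -> torch.Tensor:
        """observed_mask: (batch, 1), 1 if the modality was actually acquired."""
        q = self.quality(generated)  # reliance on generated features
        # use the real feature when available, otherwise the quality-weighted generation
        return observed_mask * feat + (1.0 - observed_mask) * q * generated

fusion = QualityGatedFusion(dim=128)
out = fusion(torch.randn(4, 128), torch.randn(4, 128),
             torch.tensor([[1.], [0.], [1.], [0.]]))
print(out.shape)  # torch.Size([4, 128])
```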
Citations: 0
IUGC: A benchmark of landmark detection in end-to-end intrapartum ultrasound biometry
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.media.2026.103960
Jieyun Bai , Yitong Tang , Xiao Liu , Jiale Hu , Yunda Li , Xufan Chen , Yufeng Wang , Chen Ma , Yunshu Li , Bowen Guo , Jing Jiao , Yi Huang , Kun Wang , Lifei Li , Yuzhang Ma , Xiaoxin Han , Haochen Shao , Zi Yang , Qingchen Liu , Yuchen Hu , Shuo Li
Accurate intrapartum biometry plays a crucial role in monitoring labor progression and preventing complications. However, its clinical application is limited by challenges such as the difficulty in identifying anatomical landmarks and the variability introduced by operator dependency. To overcome these challenges, the Intrapartum Ultrasound Grand Challenge (IUGC) 2025, in collaboration with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), was organized to accelerate the development of automatic measurement techniques for intrapartum ultrasound analysis. The challenge featured a large-scale, multi-center dataset comprising over 32,000 images from 24 hospitals and research institutes. These images were annotated with key anatomical landmarks of the pubic symphysis (PS) and fetal head (FH), along with the corresponding biometric parameter, the angle of progression (AoP). Ten participating teams proposed a variety of end-to-end and semi-supervised frameworks, incorporating advanced strategies such as foundation model distillation, pseudo-label refinement, anatomical segmentation guidance, and ensemble learning. A comprehensive evaluation revealed that the winning team achieved superior accuracy, with a Mean Radial Error (MRE) of 6.53 ± 4.38 pixels for the right PS landmark, 8.60 ± 5.06 pixels for the left PS landmark, 19.90 ± 17.55 pixels for the FH tangent landmark, and an absolute AoP difference of 3.81 ± 3.12°. This top-performing method demonstrated accuracy comparable to expert sonographers, emphasizing the clinical potential of automated intrapartum ultrasound analysis. However, challenges remain, such as the trade-off between accuracy and computational efficiency, the lack of segmentation labels and video data, and the need for extensive multi-center clinical validation. IUGC 2025 thus sets the first benchmark for landmark-based intrapartum biometry estimation and provides an open platform for developing and evaluating real-time, intelligent ultrasound analysis solutions for labor management.
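For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute the pixel-level Mean Radial Error and the angle of progression from landmark coordinates; the landmark convention assumed here (angle at the distal PS point between the PS axis and the line to the FH tangent point) may differ from the challenge's exact definition.

```python
# Minimal sketch (assumption): pixel-level Mean Radial Error (MRE) between
# predicted and reference landmarks, and one common geometric formulation of
# the angle of progression (AoP). The challenge's exact landmark conventions
# may differ.
import numpy as np

def mean_radial_error(pred: np.ndarray, ref: np.ndarray) -> float:
    """pred, ref: (n_points, 2) landmark coordinates in pixels."""
    return float(np.linalg.norm(pred - ref, axis=1).mean())

def angle_of_progression(ps_proximal, ps_distal, fh_tangent) -> float:
    """Angle (degrees) between the PS axis and the distal-PS -> FH-tangent line."""
    v1 = np.asarray(ps_proximal, float) - np.asarray(ps_distal, float)
    v2 = np.asarray(fh_tangent, float) - np.asarray(ps_distal, float)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

print(mean_radial_error(np.array([[10., 12.], [30., 28.]]),
                        np.array([[12., 12.], [30., 30.]])))
print(angle_of_progression((100, 50), (120, 90), (220, 110)))
```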
Citations: 0
Generating synthetic MRI scans for improving Alzheimer’s disease diagnosis
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-23 | DOI: 10.1016/j.media.2026.103947
Rosanna Turrisi, Giuseppe Patané
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder and the leading cause of dementia. Magnetic Resonance Imaging (MRI) combined with Machine Learning (ML) enables early diagnosis, but ML models often underperform when trained on small, heterogeneous medical datasets. Transfer Learning (TL) helps mitigate this limitation, yet models pre-trained on 2D natural images still fall short of those trained directly on related 3D MRI data. To address this gap, we introduce an intermediate strategy based on synthetic data generation. Specifically, we propose a conditional Denoising Diffusion Probabilistic Model (DDPM) to synthesise 2D projections (axial, coronal, sagittal) of brain MRI scans across three clinical groups: Cognitively Normal (CN), Mild Cognitive Impairment (MCI), and AD. A total of 9000 synthetic images are used for pre-training 2D models, which are subsequently extended to 3D via axial, coronal, and sagittal convolutions and fine-tuned on real-world small datasets. Our method achieves 91.3% accuracy in binary (CN vs. AD) and 74.5% in three-class (CN/MCI/AD) classification on the 3T ADNI dataset, outperforming both models trained from scratch and those pre-trained on ImageNet. Our 2D ADnet achieved state-of-the-art performance on OASIS-2 (59.3% accuracy, 57.6% F1), surpassing all competitor models and confirming the robustness of synthetic data pre-training. These results show synthetic diffusion-based pre-training as a promising bridge between natural image TL and medical MRI data.
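As a hedged illustration of the generative component, the sketch below shows the standard DDPM forward-noising step and noise-prediction loss with class conditioning on the three diagnostic groups; the authors' architecture, noise schedule, and conditioning mechanism are not specified here, and the stand-in network is purely illustrative.

```python
# Minimal sketch (assumption): the standard DDPM forward-noising step and
# noise-prediction training loss for a class-conditional model (CN / MCI / AD),
# using a placeholder network in place of the authors' architecture.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product of (1 - beta_t)

def ddpm_loss(model, x0: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """x0: (B, C, H, W) clean slices; labels: (B,) class indices for conditioning."""
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
    noise = torch.randn_like(x0)
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # q(x_t | x_0)
    return F.mse_loss(model(x_t, t, labels), noise)         # predict the added noise

# toy usage with a trivial stand-in "model" that ignores t and labels
model = lambda x, t, y: torch.zeros_like(x)
print(ddpm_loss(model, torch.randn(2, 1, 64, 64), torch.tensor([0, 2])).item())
```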
Citations: 0
Channel-wise joint disentanglement representation learning for B-mode and super-resolution ultrasound based CAD of breast cancer
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-22 | DOI: 10.1016/j.media.2026.103957
Yuhang Zheng , Jiale Xu , Qing Hua , Xiaohong Jia , Xueqin Hou , Yanfeng Yao , Zheng Wei , Yulu Zhang , Fanggang Wu , Wei Guo , Yuan Tian , Jun Wang , Shujun Xia , Yijie Dong , Jun Shi , Jianqiao Zhou
B-mode ultrasound (BUS) is widely used in breast cancer diagnosis, while the emerging super-resolution ultrasound (SRUS) provides microvascular information with high spatial resolution, which has shown great potential in improving breast cancer diagnosis. However, as SRUS is a new ultrasound modality, diagnosis with it remains highly dependent on the clinical experience of sonologists, highlighting the need for reliable computer-aided diagnosis (CAD) approaches. In this work, a novel dual-branch network with a Channel-Wise Joint Disentanglement Representation Learning (CW-JDRL) method is proposed for the multimodal ultrasound-based CAD of breast cancer, where one branch processes BUS and the other analyzes multimodal SRUS data. The CW-JDRL is implemented on the SRUS branch by grouping the final-layer network channels to capture both common and specific properties. It consists of two modules, namely the Gradient-guided Disentanglement (GD) module and the Gramian-based Contrastive Learning Disentanglement (GCLD) module. The former performs disentanglement under gradient guidance to encourage consistency among common channels and distinctiveness among specific ones, and the latter disentangles common and specific representations by integrating them into a unified contrastive objective. Extensive experiments on a multicenter SRUS dataset demonstrate that the proposed dual-branch network with CW-JDRL achieves superior performance over the compared algorithms and maintains robust generalizability to external data. These results suggest not only the effectiveness of SRUS for breast cancer diagnosis, but also the potential of the proposed CAD model in clinical practice. The codes are publicly available at https://github.com/Zyh-AIUltra/CW-JDRL#
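The sketch below is a generic stand-in for the channel-wise disentanglement idea: final-layer features from two modalities are split into common and specific channel groups, with a consistency term on the former and a separation term on the latter. It is an assumption-based simplification, not the GD or GCLD modules themselves.

```python
# Minimal sketch (assumption): a generic channel-wise disentanglement loss that
# splits final-layer features from two SRUS sub-modalities into "common" and
# "specific" channel groups. The actual GD and GCLD modules are more involved.
import torch
import torch.nn.functional as F

def channel_disentangle_loss(feat_a: torch.Tensor, feat_b: torch.Tensor,
                             n_common: int) -> torch.Tensor:
    """feat_*: (B, C) pooled features from two modalities; the first n_common
    channels are treated as the shared group, the rest as modality-specific."""
    common_a, spec_a = feat_a[:, :n_common], feat_a[:, n_common:]
    common_b, spec_b = feat_b[:, :n_common], feat_b[:, n_common:]
    # common channels should agree across modalities
    consistency = 1.0 - F.cosine_similarity(common_a, common_b, dim=-1).mean()
    # specific channels should stay distinguishable across modalities
    separation = F.cosine_similarity(spec_a, spec_b, dim=-1).abs().mean()
    return consistency + separation

print(channel_disentangle_loss(torch.randn(8, 64), torch.randn(8, 64), n_common=32))
```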
Citations: 0
Anatomy-guided prompting with cross-modal self-alignment for whole-body PET-CT breast cancer segmentation
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-22 | DOI: 10.1016/j.media.2026.103956
Jiaju Huang , Xiao Yang , Xinglong Liang , Shaobin Chen , Yue Sun , Greta Sp Mok , Shuo Li , Ying Wang , Tao Tan
Accurate segmentation of breast cancer in PET-CT images is crucial for precise staging, monitoring treatment response, and guiding personalized therapy. However, the small size and dispersed nature of metastatic lesions, coupled with the scarcity of annotated data and heterogeneity between modalities that hinders effective information fusion, make this task challenging. This paper proposes a novel anatomy-guided cross-modal learning framework to address these issues. Our approach first generates organ pseudo-labels through a teacher-student learning paradigm, which serve as anatomical prompts to guide cancer segmentation. We then introduce a self-aligning cross-modal pre-training method that aligns PET and CT features in a shared latent space through masked 3D patch reconstruction, enabling effective cross-modal feature fusion. Finally, we initialize the segmentation network’s encoder with the pre-trained encoder weights, and incorporate organ labels through a Mamba-based prompt encoder and Hypernet-Controlled Cross-Attention mechanism for dynamic anatomical feature extraction and fusion. Notably, our method outperforms eight state-of-the-art methods, including CNN-based, transformer-based, and Mamba-based approaches, on two datasets encompassing primary breast cancer, metastatic breast cancer, and other types of cancer segmentation tasks.
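A minimal sketch of the masked 3D patch reconstruction objective mentioned in the abstract follows, assuming random cubic patches of the PET volume are masked and reconstructed with help from the CT input; the paper's shared-latent cross-modal alignment is richer than this stand-alone loss, and the stand-in network is hypothetical.

```python
# Minimal sketch (assumption): masked 3D patch reconstruction as a pre-training
# objective, masking random cubic patches of one modality and scoring the
# reconstruction on the masked regions only. Not the authors' exact objective.
import torch
import torch.nn.functional as F

def mask_random_patches(vol: torch.Tensor, patch: int = 16, ratio: float = 0.5):
    """vol: (B, 1, D, H, W). Returns the masked volume and the float mask."""
    B, _, D, H, W = vol.shape
    gd, gh, gw = D // patch, H // patch, W // patch
    keep = torch.rand(B, 1, gd, gh, gw, device=vol.device) > ratio
    mask = keep.float().repeat_interleave(patch, 2)\
                       .repeat_interleave(patch, 3)\
                       .repeat_interleave(patch, 4)
    return vol * mask, mask

def masked_recon_loss(model, pet: torch.Tensor, ct: torch.Tensor) -> torch.Tensor:
    pet_masked, mask = mask_random_patches(pet)
    recon = model(pet_masked, ct)  # reconstruct PET from both inputs
    return F.mse_loss(recon * (1 - mask), pet * (1 - mask))  # masked regions only

model = lambda p, c: torch.zeros_like(p)  # stand-in encoder-decoder
print(masked_recon_loss(model, torch.randn(1, 1, 32, 32, 32),
                        torch.randn(1, 1, 32, 32, 32)))
```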
Citations: 0
UNISELF: A unified network with instance normalization and self-ensembled lesion fusion for multiple sclerosis lesion segmentation
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-20 | DOI: 10.1016/j.media.2026.103954
Jinwei Zhang , Lianrui Zuo , Blake E. Dewey , Samuel W. Remedios , Yihao Liu , Savannah P. Hays , Dzung L. Pham , Ellen M. Mowry , Scott D. Newsome , Peter A. Calabresi , Shiv Saidha , Aaron Carass , Jerry L. Prince
Automated segmentation of multiple sclerosis (MS) lesions using multicontrast magnetic resonance (MR) images improves efficiency and reproducibility compared to manual delineation, with deep learning (DL) methods achieving state-of-the-art performance. However, these DL-based methods have yet to simultaneously optimize in-domain accuracy and out-of-domain generalization when trained on a single source with limited data, or their performance has been unsatisfactory. To fill this gap, we propose a method called UNISELF, which achieves high accuracy within a single training domain while demonstrating strong generalizability across multiple out-of-domain test datasets. UNISELF employs a novel test-time self-ensembled lesion fusion to improve segmentation accuracy, and leverages test-time instance normalization (TTIN) of latent features to address domain shifts and missing input contrasts. Trained on the ISBI 2015 longitudinal MS segmentation challenge training dataset, UNISELF ranks among the best-performing methods on the challenge test dataset. Additionally, UNISELF outperforms all benchmark methods trained on the same ISBI training data across diverse out-of-domain test datasets with domain shifts and missing contrasts, including the public MICCAI 2016 and UMCL datasets, as well as a private multisite dataset. These test datasets exhibit domain shifts and/or missing contrasts caused by variations in acquisition protocols, scanner types, and imaging artifacts arising from imperfect acquisition. Our code is available at https://github.com/Jinwei1209/UNISELF.
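The sketch below illustrates, under stated assumptions, the two test-time ideas named in the title: instance normalization that recomputes feature statistics per test volume, and a simple self-ensembling that averages lesion probabilities over flip augmentations. UNISELF's actual lesion fusion rule is more specific than plain averaging.

```python
# Minimal sketch (assumption): test-time instance normalization (per-volume
# statistics, no running averages) and a simple self-ensembled fusion over
# flip-based test-time augmentations. Not the authors' exact fusion rule.
import torch
import torch.nn as nn

# InstanceNorm3d with track_running_stats=False recomputes statistics for every
# test volume, which is what makes it usable under domain shift at test time.
ttin = nn.InstanceNorm3d(num_features=1, affine=True, track_running_stats=False)

def self_ensembled_prediction(model, volume: torch.Tensor) -> torch.Tensor:
    """Average lesion probabilities over identity and single-axis flips."""
    probs = []
    for dims in [(), (2,), (3,), (4,)]:
        x = torch.flip(volume, dims) if dims else volume
        p = torch.sigmoid(model(x))
        probs.append(torch.flip(p, dims) if dims else p)
    return torch.stack(probs).mean(dim=0)

model = ttin  # stand-in: normalization only, in place of a full segmentation network
fused = self_ensembled_prediction(model, torch.randn(1, 1, 16, 32, 32))
print(fused.shape)  # torch.Size([1, 1, 16, 32, 32])
```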
Citations: 0
VHU-Net: Variational Hadamard U-Net for body MRI bias field correction
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-20 | DOI: 10.1016/j.media.2026.103955
Xin Zhu , Ahmet Enis Cetin , Gorkem Durak , Batuhan Gundogdu , Ziliang Hong , Hongyi Pan , Ertugrul Aktas , Elif Keles , Hatice Savas , Aytekin Oto , Hiten Patel , Adam B. Murphy , Ashley Ross , Frank Miller , Baris Turkbey , Ulas Bagci
Bias field artifacts in magnetic resonance imaging (MRI) scans introduce spatially smooth intensity inhomogeneities that degrade image quality and hinder downstream analysis. To address this challenge, we propose a novel variational Hadamard U-Net (VHU-Net) for effective body MRI bias field correction. The encoder comprises multiple convolutional Hadamard transform blocks (ConvHTBlocks), each integrating convolutional layers with a Hadamard transform (HT) layer. Specifically, the HT layer performs channel-wise frequency decomposition to isolate low-frequency components, while a subsequent scaling layer and semi-soft thresholding mechanism suppress redundant high-frequency noise. To compensate for the HT layer’s inability to model inter-channel dependencies, the decoder incorporates an inverse HT-reconstructed transformer block, enabling global, frequency-aware attention for the recovery of spatially consistent bias fields. The stacked decoder ConvHTBlocks further enhance the capacity to reconstruct the underlying ground-truth bias field. Building on the principles of variational inference, we formulate a new evidence lower bound (ELBO) as the training objective, promoting sparsity in the latent space while ensuring accurate bias field estimation. Comprehensive experiments on body MRI datasets demonstrate the superiority of VHU-Net over existing state-of-the-art methods in terms of intensity uniformity. Moreover, the corrected images yield substantial downstream improvements in segmentation accuracy. Our framework offers computational efficiency, interpretability, and robust performance across multi-center datasets, making it suitable for clinical deployment. The codes are available at https://github.com/Holmes696/Probabilistic-Hadamard-U-Net.
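As a rough illustration of the HT layer, the sketch below applies an orthonormal Hadamard transform along the channel dimension and soft-thresholds the coefficients before inverting the transform; the learnable scaling and semi-soft thresholding described in the abstract are simplified to a fixed threshold here.

```python
# Minimal sketch (assumption): a channel-wise Hadamard transform followed by
# soft thresholding of the coefficients, illustrating the kind of frequency
# decomposition the HT layer performs. VHU-Net's learnable scaling and
# semi-soft thresholding are simplified to a fixed soft threshold here.
import torch
from scipy.linalg import hadamard

def hadamard_threshold(feats: torch.Tensor, thr: float = 0.1) -> torch.Tensor:
    """feats: (B, C, ...) with C a power of two. Transforms along the channel axis."""
    C = feats.shape[1]
    H = torch.tensor(hadamard(C), dtype=feats.dtype) / C ** 0.5    # orthonormal HT matrix
    coeffs = torch.einsum('ij,bj...->bi...', H, feats)             # channel-wise HT
    coeffs = torch.sign(coeffs) * torch.clamp(coeffs.abs() - thr, min=0.0)  # soft threshold
    return torch.einsum('ij,bj...->bi...', H, coeffs)              # inverse (H is symmetric, orthonormal)

x = torch.randn(2, 64, 8, 8)
print(hadamard_threshold(x).shape)  # torch.Size([2, 64, 8, 8])
```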
Citations: 0
Hippocampal surface morphological variation-based genome-wide association analysis network for biomarker detection of Alzheimer’s disease
IF 11.8 | Tier 1, Medicine | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-18 | DOI: 10.1016/j.media.2026.103952
Xiumei Chen , Xinyue Zhang , Wei Xiong , Tao Wang , Aiwei Jia , Qianjin Feng , Meiyan Huang
Performing genome-wide association analysis (GWAS) between hippocampus and whole-genome data can facilitate disease-related biomarker detection of Alzheimer’s disease (AD). However, most existing studies have prioritized hippocampal volume changes and ignored the morphological variations and subfield differences of the hippocampus in AD progression. This disregard restricts the comprehensive understanding of the associations between hippocampus and whole-genome data, which may result in some potentially specific biomarkers of AD being missed. Moreover, the representation of the complex associations between ultra-high-dimensional imaging and whole-genome data remains an unresolved problem in GWAS. To address these issues, we propose an end-to-end hippocampal surface morphological variation-based genome-wide association analysis network (HSM-GWAS) to explore the nonlinear associations between hippocampal surface morphological variations and whole-genome data for AD-related biomarker detection. First, a multi-modality feature extraction module that includes a graph convolution network and an improved diet network is presented to extract imaging and genetic features from non-Euclidean hippocampal surface and whole-genome data, respectively. Second, a dual contrastive learning-based association analysis module is introduced to map and align genetic features to imaging features, thus narrowing the gap between these features and helping explore the complex associations between hippocampal and whole-genome data. Last, a dual cross-attention fusion module is applied to combine imaging and genetic features for disease diagnosis and biomarker detection of AD. Extensive experiments on the real Alzheimer’s Disease Neuroimaging Initiative dataset and simulated data demonstrate that HSM-GWAS considerably improves biomarker detection and disease diagnosis. These findings highlight the ability of HSM-GWAS to discover disease-related biomarkers, suggesting its potential to provide new insights into pathological mechanisms and aid in AD diagnosis. The codes are to be made publicly available at https://github.com/Meiyan88/HSM-GWAS.
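A minimal sketch of the cross-modal alignment idea follows, using a symmetric InfoNCE-style contrastive loss between paired imaging and genetic embeddings; the paper's dual contrastive formulation and network details are not reproduced here, so everything in the sketch is an assumption.

```python
# Minimal sketch (assumption): a symmetric InfoNCE-style contrastive loss that
# aligns paired imaging and genetic embeddings in a shared space, illustrating
# the alignment idea behind the dual contrastive module.
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(img_emb: torch.Tensor, gen_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """img_emb, gen_emb: (B, D) embeddings of the same B subjects."""
    img = F.normalize(img_emb, dim=-1)
    gen = F.normalize(gen_emb, dim=-1)
    logits = img @ gen.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(img.shape[0], device=img.device)
    # matched subject pairs are positives in both directions
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

print(contrastive_alignment_loss(torch.randn(8, 128), torch.randn(8, 128)))
```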
Citations: 0