{"title":"A General Framework for Efficient Medical Image Analysis via Shared Attention Vision Transformer","authors":"Yihang Liu, Ying Wen, Longzhen Yang, Lianghua He, Mengchu Zhou","doi":"10.1109/tmi.2025.3644949","DOIUrl":"https://doi.org/10.1109/tmi.2025.3644949","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"53 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145770724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-17. DOI: 10.1109/tmi.2025.3644811
Anwai Archit, Luca Freckmann, Constantin Pape
{"title":"MedicoSAM: Robust Improvement of SAM for Medical Imaging","authors":"Anwai Archit, Luca Freckmann, Constantin Pape","doi":"10.1109/tmi.2025.3644811","DOIUrl":"https://doi.org/10.1109/tmi.2025.3644811","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"155 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145770725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-11. DOI: 10.1109/tmi.2025.3642294
Lan Yang, Yao Li, Chen Qiao
The accurate diagnosis of early mild cognitive impairment is crucial for timely intervention and treatment of dementia, but the condition is difficult to distinguish from normal aging because of its complex pathology and mild symptoms. Effective hyper-connectivity identified through directed hypergraphs has recently emerged as a promising approach for the early detection of mild cognitive impairment and for exploring its underlying neural mechanisms, because it captures directional higher-order interactions across multiple brain regions. However, current methods face limitations: inefficiency in high-dimensional spaces, sensitivity to noise, reliance on manually defined structures, lack of global structural information, and static learning mechanisms. To address these issues, we integrate robust dictionary learning with directed hypergraph structure learning in a unified framework that jointly estimates low-dimensional sparse representations and the directed hypergraph. The two processes dynamically reinforce each other: refining the directed hypergraph improves the estimation of the low-dimensional sparse representations, which in turn enhances the quality of the hypergraph estimate. Experiments on simulated data confirm this positive interplay, demonstrating the effectiveness of the proposed collaborative learning strategy. Furthermore, results on real-world brain signal data show that the proposed method is highly competitive for the early detection of mild cognitive impairment, highlighting its ability to identify effective hyper-connectivity networks with significant group differences.
{"title":"Constructing Effective Hyper-Connectivity Networks through Adaptive Directed Hypergraph Embedded Dictionary Learning: Application to Early Mild Cognitive Impairment Detection.","authors":"Lan Yang,Yao Li,Chen Qiao","doi":"10.1109/tmi.2025.3642294","DOIUrl":"https://doi.org/10.1109/tmi.2025.3642294","url":null,"abstract":"The accurate diagnosis of early mild cognitive impairment is crucial for timely intervention and treatment of dementia. But it is challenging to distinguish from normal aging due to its complex pathology and mild symptoms. Recently, effective hyper-connectivity identified through directed hypergraph can be considered as an effective analysis approach for early detection of mild cognitive impairment and exploration of its underlying neural mechanisms, because it captures directional higher-order interactions across multiple brain regions. However, current methods face limitations, including inefficiency in high-dimensional spaces, sensitivity to noise, reliance on manually defined structures, lack of global structural information, and static learning mechanisms. To address these issues, we integrate robust dictionary learning with directed hypergraph structure learning within a unified framework. This approach jointly estimates low-dimensional sparse representations and the directed hypergraph. The integration allows both processes to dynamically reinforce each other, leading to the refinement of the directed hypergraph, which improves the estimation of low-dimensional sparse representations and, in turn, enhances the quality of the directed hypergraph estimation. Experimental analyses on simulated data confirm the positive interplay between these processes, demonstrating the effectiveness of the proposed collaborative learning strategy. 
Furthermore, results on real-world brain signal data show that the proposed method is highly competitive in early detection of mild cognitive impairment, highlighting its ability to identify effective hyper-connectivity networks with significant differences.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"29 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145728473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
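The alternating scheme the abstract describes, one step estimating sparse codes, the next re-estimating structure from those codes, can be sketched in a minimal form. The ISTA-style update, the cosine-similarity affinity standing in for the directed hypergraph, and the fixed dictionary are all illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def joint_sparse_graph(X, D, lam=0.1, n_iter=20, step=None):
    """Toy alternating scheme: sparse codes Z for X ~ D @ Z, plus an
    affinity W re-estimated from the codes each round.

    X: (d, n) signals; D: (d, k) dictionary (kept fixed here for brevity,
    unlike full dictionary learning)."""
    d, n = X.shape
    k = D.shape[1]
    if step is None:
        step = 1.0 / np.linalg.norm(D.T @ D, 2)  # ISTA step size 1/L
    Z = np.zeros((k, n))
    W = np.zeros((n, n))
    for _ in range(n_iter):
        # 1) sparse-coding step: one ISTA update on the codes
        Z = soft_threshold(Z - step * D.T @ (D @ Z - X), step * lam)
        # 2) structure step: affinity between signals from code similarity
        G = Z.T @ Z
        norms = np.sqrt(np.maximum(np.diag(G), 1e-12))
        W = G / np.outer(norms, norms)  # cosine similarity of codes
    return Z, W
```

Each round, the codes determine the affinity and the (updated) affinity could in turn regularize the next coding step; the real method couples the two directions explicitly, which this sketch omits.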
Pub Date: 2025-12-11. DOI: 10.1109/tmi.2025.3638977
Rina Bao, Anna N Foster, Ya'nan Song, Rutvi Vyas, Ankush Kesri, Imad Eddine Toubal, Elham Soltani Kazemi, Gani Rahmon, Taci Kucukpinar, Mohamed Almansour, Mai-Lan Ho, K Palaniappan, Dean Ninalga, Chiranjeewee Prasad Koirala, Sovesh Mohapatra, Gottfried Schlaug, Marek Wodzinski, Henning Muller, David G Ellis, Michele R Aizenberg, M Arda Aydin, Elvin Abdinli, Gozde Unal, Nazanin Tahmasebi, Kumaradevan Punithakumar, Tian Song, Yun Peng, Sara V Bates, Randy Hirschtick, P Ellen Grant, Yangming Ou
Hypoxic Ischemic Encephalopathy (HIE) is a brain dysfunction affecting approximately 1 to 5 per 1000 full-term neonates. Precise delineation and segmentation of HIE-related lesions in neonatal brain Magnetic Resonance Imaging (MRI) are pivotal for predicting outcomes, identifying high-risk patients, elucidating neurological manifestations, and assessing treatment efficacy. Despite this importance, the development of algorithms for segmenting HIE lesions from MRI volumes has been impeded by data scarcity. To address this critical gap, we organized the first BONBID-HIE challenge on HIE lesion segmentation from diffusion MRI data (Apparent Diffusion Coefficient, ADC, maps), held in conjunction with MICCAI 2023. In total, 14 algorithms were submitted, covering a range of state-of-the-art automatic machine-learning-based segmentation methods. Our comprehensive analysis of HIE lesion segmentation and the submitted algorithms provides an in-depth evaluation of the current state of the art, outlines directions for future advancements, and highlights persistent hurdles. To foster ongoing research and benchmarking, the annotated HIE dataset, algorithm Docker containers, and unified evaluation code are accessible through a dedicated online platform (https://bonbid-hie2023.grand-challenge.org).
{"title":"BONBID-HIE 2023: Lesion Segmentation Challenge in BOston Neonatal Brain Injury Data for Hypoxic Ischemic Encephalopathy.","authors":"Rina Bao,Anna N Foster,Ya'nan Song,Rutvi Vyas,Ankush Kesri,Imad Eddine Toubal,Elham Soltani Kazemi,Gani Rahmon,Taci Kucukpinar,Mohamed Almansour,Mai-Lan Ho,K Palaniappan,Dean Ninalga,Chiranjeewee Prasad Koirala,Sovesh Mohapatra,Gottfried Schlaug,Marek Wodzinski,Henning Muller,David G Ellis,Michele R Aizenberg,M Arda Aydin,Elvin Abdinli,Gozde Unal,Nazanin Tahmasebi,Kumaradevan Punithakumar,Tian Song,Yun Peng,Sara V Bates,Randy Hirschtick,P Ellen Grant,Yangming Ou","doi":"10.1109/tmi.2025.3638977","DOIUrl":"https://doi.org/10.1109/tmi.2025.3638977","url":null,"abstract":"Hypoxic Ischemic Encephalopathy (HIE) represents a brain dysfunction, affecting approximately 1 to 5 per 1000 full-term neonates. The precise delineation and segmentation of HIE-related lesions in neonatal brain Magnetic Resonance Images (MRI) are pivotal in advancing outcome predictions, identifying patients at high risk, elucidating neurological manifestations, and assessing treatment efficacies. Despite its importance, the development of algorithms for segmenting HIE lesions from MRI volumes has been impeded by data scarcity. Addressing this critical gap, we organized the first BONBID-HIE challenge with diffusion MRI data (Apparent Diffusion Coefficient (ADC) maps) for HIE lesion segmentation, in conjunction with the MICCAI 2023. Totally 14 algorithms were submitted, employing a gamut of cutting-edge automatic machine-learning-based segmentation algorithms. Our comprehensive analysis of HIE lesion segmentation and submitted algorithms facilitates an in-depth evaluation of the current technological zenith, outlines directions for future advancements, and highlights persistent hurdles. 
To foster ongoing research and benchmarking, the annotated HIE dataset, developed algorithm dockers, and unified evaluation codes are accessible through a dedicated online platform (https://bonbid-hie2023.grand-challenge.org).","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"31 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145728460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
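As a minimal sketch of the kind of overlap metric commonly used to score lesion segmentation in such challenges, the Dice similarity coefficient compares a predicted binary mask against the annotation. The empty-mask convention shown is an assumption; the challenge's exact ranking metrics are not stated here:

```python
import numpy as np

def dice_score(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary lesion masks:
    2 * |pred AND gt| / (|pred| + |gt|).

    Returns 1.0 when both masks are empty (one common convention for
    lesion-free scans; challenges differ on this choice)."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    if total == 0:
        return 1.0
    return 2.0 * inter / (total + eps)
```

Identical masks score ~1, disjoint masks score 0, and partial overlap falls in between; diffuse HIE lesions make this metric especially sensitive to small boundary disagreements.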
Pub Date: 2025-12-11. DOI: 10.1109/tmi.2025.3642381
Jiaqi Zhang, Xiuzhe Wu, Jiahui Liu, Chunyu Zou, Fengze Nie, Zicheng Sun, Xiaojuan Qi, Jiang Liu
High-fidelity reconstruction of the Posterior Eyeball Shape (PES) is crucial for early diagnosis and timely intervention of sight-threatening diseases such as high myopia, diabetic retinopathy, and glaucoma. However, existing magnetic resonance imaging (MRI)- and optical coherence tomography (OCT)-based methods either provide only coarse scleral geometry or suffer from suboptimal PES representations due to limited field of view (FOV) and detail loss, hindering accurate assessment of intact retinal pigment epithelium (RPE) abnormalities. In this study, we propose the Polar Subarea-Aware Fusion Net (PSAFNet), a novel end-to-end framework that reconstructs a complete, high-fidelity PES directly from a single local OCT scan, even under clinically common settings with only 6.25% FOV. To avoid information loss, we reformulate PES reconstruction as a 2D dense regression task and introduce the Ocular Shape Map (OSM), a lossless 2D representation that encodes 3D coordinate attributes into corresponding image channels. PSAFNet then leverages three dedicated modules, a Subarea Feature Embedding Module (SFEM), Channel- and Patch-wise Fusion Blocks (CFB/PFB), and a Reassemble and Up-sample Module (RUM), to enhance positional awareness, integrate local and global features, and achieve high-resolution OSM prediction. Furthermore, we construct two large-scale datasets, POSDiag and PESGen, comprising 794 ultra-widefield OCT scans from diverse health conditions and imaging devices, providing a comprehensive benchmark for PES reconstruction. Extensive experiments demonstrate that PSAFNet consistently outperforms existing methods (e.g., EMD = 5.58, AAL = 97.3%) and exhibits strong clinical relevance, validated by superior performance in downstream disease classification and by ophthalmologist evaluations (Expert-Score = 82.78%). The source code of the proposed PSAFNet is released at https://github.com/HKUZJ77/PSAFNet.
{"title":"Polar Subarea-Aware Fusion Net for Posterior Eyeball Shape Reconstruction.","authors":"Jiaqi Zhang,Xiuzhe Wu,Jiahui Liu,Chunyu Zou,Fengze Nie,Zicheng Sun,Xiaojuan Qi,Jiang Liu","doi":"10.1109/tmi.2025.3642381","DOIUrl":"https://doi.org/10.1109/tmi.2025.3642381","url":null,"abstract":"High-fidelity reconstruction of the Posterior Eyeball Shape (PES) is crucial for early diagnosis and timely intervention of sight-threatening diseases such as high myopia, diabetic retinopathy, and glaucoma. However, existing magnetic resonance imaging (MRI)- and optical coherence tomography (OCT)-based methods either provide only coarse scleral geometry or suffer from suboptimal PES representations due to limited field of view (FOV) and detail loss, hindering accurate assessment of intact retinal pigment epithelium (RPE) abnormalities. In this study, we propose the Polar Subarea-Aware Fusion Net (PSAFNet), a novel end-to-end framework that reconstructs complete and high-fidelity PES directly from a single local OCT scan, even under clinically common settings with only 6.25% FOV. To avoid information loss, we reformulate PES reconstruction as a 2D dense regression task and introduce the Ocular Shape Map (OSM), an innovative lossless 2D representation that encodes 3D coordinate attributes into corresponding image channels. PSAFNet then leverages three dedicated modules-Subarea Feature Embedding Module (SFEM), Channel- and Patch-wise Fusion Blocks (CFB/PFB), and Reassemble and Up-sample Module (RUM)-to enhance positional awareness, integrate local-global features, and achieve high-resolution OSM prediction. Furthermore, we construct two large-scale datasets, POSDiag and PESGen, comprising 794 ultra-widefield OCT scans from diverse health conditions and imaging devices, providing a comprehensive benchmark for PES reconstruction. 
Extensive experiments demonstrate that PSAFNet consistently outperforms existing methods (e.g., EMD=5.58, AAL=97.3%) and exhibits strong clinical relevance, validated by superior performance in downstream disease classification and ophthalmologist evaluations (Expert-Score=82.78%). The source code of the proposed PSAFNet is released at https://github.com/HKUZJ77/PSAFNet.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"38 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145728474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
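The core idea of a lossless 2D encoding of 3D shape, packing each coordinate into its own image channel so that dense 2D regression can recover the full geometry, can be illustrated with a minimal round-trip sketch. The function names and the plain rectangular grid parameterisation are assumptions; the paper's OSM uses a polar subarea layout not reproduced here:

```python
import numpy as np

def to_shape_map(points):
    """Encode an (H, W, 3) grid of 3D surface coordinates as a (3, H, W)
    image whose channels are the x, y, z coordinates. Nothing is discretised
    or projected away, so the mapping is exactly invertible (lossless)."""
    return np.stack([points[..., 0], points[..., 1], points[..., 2]], axis=0)

def from_shape_map(osm):
    """Invert the encoding: recover the (H, W, 3) coordinate grid from the
    (3, H, W) shape map."""
    return np.stack([osm[0], osm[1], osm[2]], axis=-1)
```

Because the map is just a channel rearrangement, a network regressing the 3-channel image is effectively regressing the 3D surface itself, which is what makes the dense-regression reformulation lossless.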
Pub Date: 2025-12-09. DOI: 10.1109/tmi.2025.3641894
Wang Yin, Chunling Huang, Linxi Chen, Xinrui Huang, Zhaohong Wang, Yang Bian, Yuan Zhou, You Wan, Tongyan Han, Ming Yi
General movement assessment (GMA) is a non-invasive method for evaluating neuromotor behavior in infants under six months of age and is considered a reliable tool for the early detection of cerebral palsy (CP). However, traditional GMA relies on the subjective judgment of multiple internationally certified physicians, making it time-consuming and limiting its accessibility for widespread use. Artificial intelligence (AI) approaches may overcome these limitations, but they are usually based on motion skeletons and cannot capture detailed body information. Here, we propose CoGMA (Collaborative General Movements Assessment), a novel multi-modality co-learning framework for GMA. By integrating a multimodal large language model as an auxiliary network during training, CoGMA incorporates four types of input data (skeleton data, clinical information, RGB video, and text descriptions) to enhance representation learning. During inference, however, CoGMA achieves efficient and accurate prediction using only skeleton data and clinical information. Experimental evaluations indicate that CoGMA performs robustly across both the writhing and fidgety movement stages and also excels in zero-shot evaluation of fidgety movements, thereby mitigating the scarcity of training samples in this stage. This framework significantly enhances the GMA methodology and lays the groundwork for future advances in the early detection and study of infant neuromotor behavior. Additionally, to facilitate anonymized data sharing, we introduce InfantAnimator, a tool that generates non-identifiable videos while preserving essential motion features, thereby supporting broader research and collaboration. The code is available on GitHub: https://github.com/wwYinYin/CoGMA.
{"title":"Facilitate Robust Early Screening of Cerebral Palsy via General Movements Assessment with Multi-Modality Co-Learning.","authors":"Wang Yin,Chunling Huang,Linxi Chen,Xinrui Huang,Zhaohong Wang,Yang Bian,Yuan Zhou,You Wan,Tongyan Han,Ming Yi","doi":"10.1109/tmi.2025.3641894","DOIUrl":"https://doi.org/10.1109/tmi.2025.3641894","url":null,"abstract":"General movement assessment (GMA) is a non-invasive method used to evaluate neuromotor behavior in infants under six months of age and is considered a reliable tool for the early detection of cerebral palsy (CP). However, traditional GMA relies on the subjective judgment of multiple internationally certified physicians, making it time-consuming and limiting its accessibility for widespread use. Furthermore, artificial intelligence (AI) approaches may overcome these limitations but are usually based on motion skeletons and lack the ability to capture detailed body information. Here, we propose CoGMA (Collaborative General Movements Assessment), a novel multi-modality co-learning framework for GMA. By integrating multimodal large language model as auxiliary network during training, CoGMA incorporates four types of input data-skeleton data, clinical information, RGB video, and text descriptions-to enhance representation learning. During inference, however, CoGMA achieves efficient and accurate prediction using only skeleton data and clinical information. Experimental evaluations indicate that CoGMA demonstrates robust performance across both the writhing and fidgety movement stages, while also excelling in zero-shot evaluation of fidget movement, thereby mitigating the issue of limited training samples in this stage. This framework significantly enhances the GMA methodology and lays the groundwork for future advancements in early detection and research on infant neuromotor behavior. 
Additionally, to facilitate anonymized data sharing, we introduce InfantAnimator, a tool that generates non-identifiable videos while preserving essential motion features, thereby supporting broader research and collaboration. The code is available at GitHub: https://github.com/wwYinYin/CoGMA.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"22 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145710799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
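The train-with-auxiliary-modalities, infer-from-skeleton-only pattern described above can be sketched in a minimal form. The linear embeddings, the L2 distillation loss, and all names here are illustrative assumptions, not the CoGMA architecture:

```python
import numpy as np

class CoLearner:
    """Toy co-learning sketch: during training the model also sees an
    auxiliary feature vector (a stand-in for video/text embeddings) and
    pulls its primary (skeleton + clinical) embedding toward it; at
    inference only the primary branch runs."""

    def __init__(self, d_primary, d_aux, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.Wp = rng.normal(scale=0.1, size=(d_hidden, d_primary))
        self.Wa = rng.normal(scale=0.1, size=(d_hidden, d_aux))  # frozen

    def train_step(self, x_primary, x_aux, lr=0.1):
        """One gradient step on 0.5 * ||Wp x_primary - Wa x_aux||^2,
        updating only the primary branch."""
        hp = self.Wp @ x_primary
        ha = self.Wa @ x_aux
        grad = np.outer(hp - ha, x_primary)  # d/dWp of the L2 objective
        self.Wp -= lr * grad
        return float(0.5 * np.sum((hp - ha) ** 2))

    def infer(self, x_primary):
        """Inference uses the primary branch only; no auxiliary input."""
        return self.Wp @ x_primary
```

The point of the pattern is that the richer modalities shape the primary embedding during training but impose no cost or data requirement at deployment time.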