Pub Date: 2025-12-11 DOI: 10.1109/tmi.2025.3642381
Jiaqi Zhang,Xiuzhe Wu,Jiahui Liu,Chunyu Zou,Fengze Nie,Zicheng Sun,Xiaojuan Qi,Jiang Liu
High-fidelity reconstruction of the Posterior Eyeball Shape (PES) is crucial for the early diagnosis of, and timely intervention in, sight-threatening diseases such as high myopia, diabetic retinopathy, and glaucoma. However, existing magnetic resonance imaging (MRI)- and optical coherence tomography (OCT)-based methods either provide only coarse scleral geometry or yield suboptimal PES representations because of a limited field of view (FOV) and detail loss, which hinders accurate assessment of abnormalities across the intact retinal pigment epithelium (RPE). In this study, we propose the Polar Subarea-Aware Fusion Net (PSAFNet), a novel end-to-end framework that reconstructs a complete, high-fidelity PES directly from a single local OCT scan, even under clinically common settings with only 6.25% FOV. To avoid information loss, we reformulate PES reconstruction as a 2D dense regression task and introduce the Ocular Shape Map (OSM), an innovative lossless 2D representation that encodes 3D coordinate attributes into corresponding image channels. PSAFNet then leverages three dedicated modules, namely the Subarea Feature Embedding Module (SFEM), the Channel- and Patch-wise Fusion Blocks (CFB/PFB), and the Reassemble and Up-sample Module (RUM), to enhance positional awareness, integrate local and global features, and achieve high-resolution OSM prediction. Furthermore, we construct two large-scale datasets, POSDiag and PESGen, comprising 794 ultra-widefield OCT scans from diverse health conditions and imaging devices, providing a comprehensive benchmark for PES reconstruction. Extensive experiments demonstrate that PSAFNet consistently outperforms existing methods (e.g., EMD=5.58, AAL=97.3%) and exhibits strong clinical relevance, validated by superior performance in downstream disease classification and ophthalmologist evaluations (Expert-Score=82.78%). The source code of the proposed PSAFNet is released at https://github.com/HKUZJ77/PSAFNet.
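The abstract does not say how the OSM is laid out, so the following is only a generic illustration of the idea of storing 3D coordinates in image channels, not the authors' representation. It assumes a hypothetical polar (radius, angle) sampling grid and a spherical-cap toy surface; the function names `sample_spherical_cap`, `points_to_map`, and `map_to_points` are invented for this sketch.

```python
import numpy as np

def sample_spherical_cap(n_radii, n_angles, sphere_radius=12.0, max_r=8.0):
    """Toy surface: points on a spherical cap, sampled on a polar grid.

    All numbers here are arbitrary illustration values, not clinical ones.
    """
    r = np.linspace(0.0, max_r, n_radii)
    theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, tt = np.meshgrid(r, theta, indexing="ij")
    x = rr * np.cos(tt)
    y = rr * np.sin(tt)
    z = sphere_radius - np.sqrt(sphere_radius**2 - rr**2)
    # Flatten to an ordinary point list, ordered radius-major.
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def points_to_map(points, n_radii, n_angles):
    """Pack an ordered point list into an (n_radii, n_angles, 3) image.

    Channel 0 holds x, channel 1 holds y, channel 2 holds z.
    """
    points = np.asarray(points, dtype=float)
    if points.shape != (n_radii * n_angles, 3):
        raise ValueError("expected one 3D point per grid cell")
    return points.reshape(n_radii, n_angles, 3)

def map_to_points(shape_map):
    """Recover the point list from the 3-channel image."""
    return shape_map.reshape(-1, 3)

points = sample_spherical_cap(n_radii=4, n_angles=6)
shape_map = points_to_map(points, n_radii=4, n_angles=6)
recovered = map_to_points(shape_map)
```

Because the packing is a pure reshape, the round trip returns exactly the original points, which is the sense in which such a map can be lossless; a network that regresses the three channels densely is then predicting a surface.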
Title: Polar Subarea-Aware Fusion Net for Posterior Eyeball Shape Reconstruction. Journal: IEEE Transactions on Medical Imaging
Pub Date: 2025-12-09 DOI: 10.1109/tmi.2025.3641894
Wang Yin,Chunling Huang,Linxi Chen,Xinrui Huang,Zhaohong Wang,Yang Bian,Yuan Zhou,You Wan,Tongyan Han,Ming Yi
General movement assessment (GMA) is a non-invasive method used to evaluate neuromotor behavior in infants under six months of age and is considered a reliable tool for the early detection of cerebral palsy (CP). However, traditional GMA relies on the subjective judgment of multiple internationally certified physicians, which makes it time-consuming and limits its accessibility for widespread use. Artificial intelligence (AI) approaches may overcome these limitations, but they are usually based on motion skeletons and cannot capture detailed body information. Here, we propose CoGMA (Collaborative General Movements Assessment), a novel multi-modality co-learning framework for GMA. By integrating a multimodal large language model as an auxiliary network during training, CoGMA incorporates four types of input data (skeleton data, clinical information, RGB video, and text descriptions) to enhance representation learning. During inference, however, CoGMA achieves efficient and accurate prediction using only skeleton data and clinical information. Experimental evaluations indicate that CoGMA performs robustly across both the writhing and fidgety movement stages and also excels in zero-shot evaluation of fidgety movements, thereby mitigating the issue of limited training samples in this stage. This framework significantly enhances the GMA methodology and lays the groundwork for future advancements in early detection and research on infant neuromotor behavior. Additionally, to facilitate anonymized data sharing, we introduce InfantAnimator, a tool that generates non-identifiable videos while preserving essential motion features, thereby supporting broader research and collaboration. The code is available on GitHub: https://github.com/wwYinYin/CoGMA.
Title: Facilitate Robust Early Screening of Cerebral Palsy via General Movements Assessment with Multi-Modality Co-Learning. Journal: IEEE Transactions on Medical Imaging
Pub Date: 2025-12-08 DOI: 10.1109/tmi.2025.3639759
Ziang Zhang, Hong Song, Jingfan Fan, Long Shao, Tianyu Fu, Danni Ai, Deqiang Xiao, Yuanyuan Wang, Yucong Lin, Jian Yang
Title: SG-3DGS: Sequential Growing 3D Gaussian Splatting for Scene Reconstruction of Monocular Endoscope Video. Journal: IEEE Transactions on Medical Imaging
Pub Date: 2025-12-03 DOI: 10.1109/tmi.2025.3639776
Haodong Zhong,Gaiying Li,Yi Wang,Jianqi Li
Quantitative susceptibility mapping (QSM) is a magnetic resonance imaging technique that quantifies tissue magnetic susceptibility by deconvolving the measured signal phase data. Accurate background field removal is essential for QSM, especially in surface regions of the brain, such as the cerebral cortex, where background field interference is substantial. Existing methods estimate the background field inaccurately near organ boundaries, such as that of the brain, because of their underlying assumptions or the loss of low-frequency information. A novel Green's function total field inversion (gTFI) method is proposed here that models the background field using integral equations composed of Green's function and boundary conditions, thereby eliminating the need for traditional filtering, assumptions, or regularization. The gTFI method simultaneously determines the background field at the boundary and the tissue susceptibility from the measured phase data. Numerical simulations and in vivo experiments demonstrate that gTFI effectively separates the background field and reconstructs whole-brain QSM images without boundary erosion, offering superior performance over existing methods, particularly in cortical regions.
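For readers unfamiliar with the deconvolution mentioned above, here is a minimal sketch of the standard QSM forward model, in which the field perturbation is the susceptibility map convolved with a unit dipole kernel. It is not the gTFI method, which the abstract describes only at a high level. The function names, the choice of B0 along the last array axis, and the default voxel size are assumptions made for this illustration.

```python
import numpy as np

def dipole_kernel(shape, voxel_size=(1.0, 1.0, 1.0)):
    """Unit dipole kernel in k-space: D(k) = 1/3 - kz^2 / |k|^2.

    Assumes the main field B0 points along the last (z) axis.
    """
    freqs = [np.fft.fftfreq(n, d=d) for n, d in zip(shape, voxel_size)]
    kx, ky, kz = np.meshgrid(*freqs, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    with np.errstate(divide="ignore", invalid="ignore"):
        kernel = 1.0 / 3.0 - kz**2 / k2
    # D is undefined at k = 0; setting it to zero is a common convention.
    kernel[k2 == 0] = 0.0
    return kernel

def forward_field(chi, voxel_size=(1.0, 1.0, 1.0)):
    """Relative field shift produced by a susceptibility map chi.

    The convolution with the dipole is done as a product in k-space.
    """
    kernel = dipole_kernel(chi.shape, voxel_size)
    return np.real(np.fft.ifftn(kernel * np.fft.fftn(chi)))

# Toy phantom: a small cube of susceptibility 1 in an empty volume.
chi = np.zeros((16, 16, 16))
chi[6:10, 6:10, 6:10] = 1.0
field = forward_field(chi)
```

The kernel vanishes wherever kz^2 / |k|^2 = 1/3, so recovering susceptibility from the field is ill-posed, and any background field left in the data is inverted along with the tissue signal. This is why background removal near the brain boundary matters for the final map.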
Title: Green's Function Total Field Inversion for Quantitative Susceptibility Mapping. Journal: IEEE Transactions on Medical Imaging