Pub Date : 2026-01-13 DOI: 10.1109/tmi.2026.3654000
Adjacent-aware Modality Recovery based on Incomplete Multi-Modal Brain Disease Diagnosis
Jinrong Cui, Weihao Ye, Shengrong Li, Jie Wen, Qi Zhu
Multi-modal learning is extensively applied to diagnose brain diseases such as epilepsy and Alzheimer's disease. However, incomplete multi-modal data, where some modalities are unavailable or difficult to collect, limits the effectiveness of conventional methods. Additionally, existing approaches often overlook both the semantic relationships between same-label neighbors and the latent information in missing modalities. To address these challenges, we propose an adjacent-aware distillation recovery framework designed for incomplete multi-modal learning, with a focus on diagnosing representative brain diseases, i.e., epilepsy and Alzheimer's disease. The key novelty of our framework lies in its joint design of adjacent-aware modality recovery and multi-modal representation learning in a single end-to-end pipeline. Specifically, we introduce a label-guided adjacent-aware recovery module that uses a self-attention mechanism to exploit neighbor semantics and generate distribution-consistent features for high-quality modality reconstruction. The recovered features are then refined through a knowledge distillation pathway into a modality generator, enhancing generalization under severe data incompleteness. For multi-modal representation learning, the recovered modality information is fused with the original incomplete information to enhance feature extraction and representation. Extensive experiments demonstrate the effectiveness of our method in diagnosing epilepsy and Alzheimer's disease.
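
As a rough illustration of the two ideas in this abstract, the PyTorch sketch below recovers a missing modality's features by self-attending over same-label neighbors, then distills toward the recovered output with a standard soft-label loss. All names (AdjacentAwareRecovery, kd_loss, the tensor shapes) are assumptions made for exposition, not the authors' implementation.

    import torch.nn as nn
    import torch.nn.functional as F

    class AdjacentAwareRecovery(nn.Module):
        # Recover features of a missing modality by attending over same-label
        # neighbours drawn from the available modality (label-guided selection).
        def __init__(self, dim, heads=4):
            super().__init__()  # dim must be divisible by heads
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.proj = nn.Linear(dim, dim)

        def forward(self, query_feat, neighbor_feats):
            # query_feat: (B, 1, D) sample with a missing modality
            # neighbor_feats: (B, K, D) features of K same-label neighbours
            recovered, _ = self.attn(query_feat, neighbor_feats, neighbor_feats)
            return self.proj(recovered)

    def kd_loss(student_logits, teacher_logits, T=2.0):
        # Standard soft-label distillation objective; here it would push a
        # modality generator (student) toward the recovery module's output.
        p = F.log_softmax(student_logits / T, dim=-1)
        q = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(p, q, reduction="batchmean") * (T * T)
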
{"title":"Adjacent-aware Modality Recovery based on Incomplete Multi-Modal Brain Disease Diagnosis.","authors":"Jinrong Cui,Weihao Ye,Shengrong Li,Jie Wen,Qi Zhu","doi":"10.1109/tmi.2026.3654000","DOIUrl":"https://doi.org/10.1109/tmi.2026.3654000","url":null,"abstract":"Multi-modal learning is extensively applied to diagnose brain diseases such as epilepsy and Alzheimer's disease. However, incomplete multi-modal data, where some modalities are unavailable or difficult to collect, limits the effectiveness of conventional methods. Additionally, existing approaches often overlook semantic relationships between neighbors with the same-label and latent information in missing modalities. To address these challenges, we propose an adjacent-aware distillation recovery framework designed for incomplete multi-modal learning, with a focus on diagnosing representative brain diseases, i.e. epilepsy and Alzheimer's disease. The key novelty of our framework lies in its joint design of adjacent-aware modality recovery and multi-modal representation learning in a single end-to-end pipeline. Specifically, we introduce a label-guided adjacent-aware recovery module that uses a self-attention mechanism to exploit neighbor semantics and generate distribution-consistent features for high-quality modality reconstruction. The recovered features are then refined through a knowledge distillation pathway into a modality generator, enhancing generalization under severe data incompleteness. For multi-modal representation learning, the recovered modality information is fused with the original incomplete information to enhance feature extraction and representation. Extensive experiments demonstrate the effectiveness of our method in diagnosing epilepsy and Alzheimer's disease.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"87 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145961413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 DOI: 10.1109/tmi.2026.3653667
High-Speed Volumetric Dual-Mode Ultrasound and Photoacoustic Tomography with a Single-Element Detector
Jigmi Basumatary, Yousuf Aborahama, Yang Zhang, Yide Zhang, Yushun Zeng, Cindy Z. Liu, Qifa Zhou, Lihong V. Wang
{"title":"High-Speed Volumetric Dual-Mode Ultrasound and Photoacoustic Tomography with a Single-Element Detector","authors":"Jigmi Basumatary, Yousuf Aborahama, Yang Zhang, Yide Zhang, Yushun Zeng, Cindy Z. Liu, Qifa Zhou, Lihong V. Wang","doi":"10.1109/tmi.2026.3653667","DOIUrl":"https://doi.org/10.1109/tmi.2026.3653667","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"27 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-12 DOI: 10.1109/tmi.2026.3652170
CeLR: A Transformer-based Regression Network for Accurate Cephalometric Landmark Detection in High-Resolution X-ray Imaging
Jiakai Zhou, Yang Wang, Chaolin Huang, Chao Dai, Chunyu Tan
{"title":"CeLR: A Transformer-based Regression Network for Accurate Cephalometric Landmark Detection in High-Resolution X-ray Imaging","authors":"Jiakai Zhou, Yang Wang, Chaolin Huang, Chao Dai, Chunyu Tan","doi":"10.1109/tmi.2026.3652170","DOIUrl":"https://doi.org/10.1109/tmi.2026.3652170","url":null,"abstract":"","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"20 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145955130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-06 DOI: 10.1109/tmi.2026.3651389
Semi-Supervised Landmark Tracking in Echocardiography Video via Spatial-Temporal Co-Training and Perception-Aware Attention
Han Wu, Haoyuan Chen, Lin Zhou, Qi Xu, Zhiming Cui, Dinggang Shen
Precise landmark annotation in cardiac ultrasound images is fundamental for quantitative cardiac health assessment. However, the time-intensive nature of manual annotation typically constrains clinicians to annotate only selected key frames, limiting comprehensive temporal analysis. While recent automated landmark detection methods have demonstrated success for key-frame analysis, they fail to effectively utilize the intrinsic temporal information across cardiac sequences. To bridge this gap, we present SemiEchoTracker, a novel semi-supervised framework that enables comprehensive landmark tracking throughout echocardiography sequences while requiring supervision only on key frames. Our framework introduces three key strategies: 1) a co-training mechanism that enforces mutual consistency between spatial detection and temporal tracking, enabling accurate intermediate-frame detection without additional annotations; 2) a guided DINOv2 pretraining strategy specially tailored to extracting fine-grained, echocardiography-specific spatial features; and 3) a perception-aware spatial-temporal (PAST) attention module that efficiently captures inter- and intra-frame relationships in echocardiography videos. Extensive validation on three datasets across multiple cardiac views demonstrates that our method not only achieves state-of-the-art detection performance on key frames but also yields accurate frame-by-frame predictions, which are important for dynamic cardiac analysis in clinical practice.
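
The co-training mechanism can be sketched as a mutual-consistency loss between a per-frame spatial detector and a temporal tracker on unlabeled intermediate frames. The callables detector and tracker and the heatmap shapes below are hypothetical placeholders, not SemiEchoTracker's actual interfaces.

    import torch
    import torch.nn.functional as F

    def co_training_consistency(detector, tracker, frames, key_landmarks):
        # frames: (B, T, C, H, W) echo clip; key_landmarks: key-frame annotation.
        # detector maps one frame to landmark heatmaps (B, L, H, W); tracker
        # propagates key-frame landmarks through the clip to (B, T, L, H, W).
        det = torch.stack([detector(frames[:, t]) for t in range(frames.size(1))], dim=1)
        trk = tracker(frames, key_landmarks)
        # Mutual consistency: each branch supervises the other on frames that
        # carry no manual annotation (stop-gradient on the "teacher" side).
        return F.mse_loss(det, trk.detach()) + F.mse_loss(trk, det.detach())
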
{"title":"Semi-Supervised Landmark Tracking in Echocardiography Video via Spatial-Temporal Co-Training and Perception-Aware Attention.","authors":"Han Wu,Haoyuan Chen,Lin Zhou,Qi Xu,Zhiming Cui,Dinggang Shen","doi":"10.1109/tmi.2026.3651389","DOIUrl":"https://doi.org/10.1109/tmi.2026.3651389","url":null,"abstract":"Precise landmark annotation in cardiac ultrasound images is fundamental for quantitative cardiac health assessment. However, the time-intensive nature of manual annotation typically constrains clinicians to annotate only selected key frames, limiting comprehensive temporal analysis capabilities. While recent automated landmark detection methods have demonstrated success for key-frame analysis, they fail to effectively utilize the intrinsic temporal information across cardiac sequence. To bridge this gap, we present SemiEchoTracker, a novel semi-supervised framework that enables comprehensive landmark tracking throughout echocardiography sequences while requiring supervision only on key frames. Our framework introduces three key innovative strategies: 1) a co-training mechanism that enforces mutual consistency between spatial detection and temporal tracking, enabling accurate intermediate frame detection without additional annotations, 2) a guided DINOv2 pretraining strategy that is specially tailored for extracting fine-grained echocardiography-specific spatial features, and 3) a perception-aware spatial-temporal (PAST) attention module that efficiently captures inter- and intra-frame relationships in echocardiography videos. Extensive validation on three datasets across multiple cardiac views demonstrates that our method not only achieves state-of-the-art detection performance on the keyframes but also yields accurate frame-by-frame prediction, which is important for dynamic cardiac analysis in clinicians.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"183 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2026-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145907534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-12-26 DOI: 10.1109/tmi.2025.3648788
FedRS: Federated Learning Under Reliable Supervision for Multi-Organ Segmentation With Inconsistent Labels
Jie Du, Haoyang Luo, Wenbing Chen, Peng Liu, Tianfu Wang
Existing multi-organ segmentation methods usually rely on large, fully labeled datasets for training. However, medical image datasets are typically decentralized due to privacy constraints and partially labeled due to the high cost of full annotation in clinical practice, resulting in label inconsistency across medical centers. Federated learning offers privacy-preserving decentralized training, but label inconsistency leads to significant divergence in local model parameters across medical centers, thereby hindering convergence to the global optimum. To resolve this issue, an effective and communication-efficient framework, Federated Learning under Reliable Supervision (FedRS), is proposed, which ensures that: i) the local models are trained with reliable supervisory information through the proposed Less-Forgetting and Less-Constraint loss functions, thereby reducing the divergence in local model parameters; and ii) the global model is aggregated based on the consistency of predictions between each local model (after local training) and the global model (received before training), thereby enhancing the reliability of the global model. Extensive experimental results on nine publicly available 3D abdominal CT image datasets show that FedRS outperforms localized, centralized, and state-of-the-art federated learning methods on both in-federation and out-of-federation datasets, demonstrating its effectiveness and strong generalization capability. Notably, FedRS uses a backbone with only 4.1M parameters, significantly reducing its communication cost. The source code is publicly available at https://github.com/luohy812/FedRS.
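
Point ii) suggests a consistency-weighted aggregation rule. The sketch below weights each client's parameters by how closely its post-training predictions match the previous global model on a probe batch; the exponential weighting and all names are assumptions for illustration, and the paper's exact rule may differ (see the released code for the real one).

    import copy
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def consistency_weighted_aggregate(global_model, local_models, probe_batch):
        # Score each client by how consistent its post-training predictions are
        # with the previous global model, then average parameters by that score.
        g = global_model(probe_batch)
        scores = torch.stack([torch.exp(-F.mse_loss(m(probe_batch), g))
                              for m in local_models])
        weights = scores / scores.sum()
        new_state = copy.deepcopy(global_model.state_dict())
        for key, val in new_state.items():
            if val.is_floating_point():  # skip integer buffers (e.g. BN counters)
                new_state[key] = sum(w * m.state_dict()[key]
                                     for w, m in zip(weights, local_models))
        global_model.load_state_dict(new_state)
        return global_model
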
{"title":"FedRS: Federated Learning Under Reliable Supervision for Multi-Organ Segmentation With Inconsistent Labels.","authors":"Jie Du,Haoyang Luo,Wenbing Chen,Peng Liu,Tianfu Wang","doi":"10.1109/tmi.2025.3648788","DOIUrl":"https://doi.org/10.1109/tmi.2025.3648788","url":null,"abstract":"Existing multi-organ segmentation methods usually rely on large and fully labeled datasets for training. However, medical image datasets are typically decentralized by privacy constraints and partially labeled due to the high costs of full annotation in clinical practice, resulting in label inconsistency across medical centers. Federated learning offers privacy-preserving decentralized training, but the label inconsistency leads to significant divergence in local model parameters across medical centers, thereby hindering the achievement of the global optimum. To resolve this issue, an effective and communication-efficient Federated Learning under Reliable Supervision (FedRS) is proposed, which ensures: i) the local models are trained with reliable supervisory information through the proposed Less-Forgetting and Less-Constraint loss functions, thereby reducing the divergence in local model parameters; and ii) the global model is aggregated based on the consistency of predictions between each local model (after local training) and the global model (received before training), thereby enhancing the reliability of the global model. Extensive experimental results on nine publicly available 3D abdominal CT image datasets show that our FedRS outperforms localized, centralized, and state-of-the-art federated learning methods on both in-federation and out-of-federation datasets, demonstrating its effectiveness and strong generalization capability. In particular, our FedRS only utilizes a model with only 4.1M parameters as its backbone, thereby significantly reducing its communication cost. The source code is publicly available at https://github.com/luohy812/FedRS.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"1 1","pages":""},"PeriodicalIF":10.6,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}