Pub Date: 2024-09-06 | DOI: 10.1109/TMI.2024.3455931
Huajun Zhou;Fengtao Zhou;Hao Chen
Recently, we have witnessed impressive achievements in cancer survival analysis through the integration of multimodal data, e.g., pathology images and genomic profiles. However, the heterogeneity and high dimensionality of these modalities pose significant challenges to extracting discriminative representations while maintaining good generalization. In this paper, we propose a Cohort-individual Cooperative Learning (CCL) framework that advances cancer survival analysis by combining knowledge decomposition with cohort guidance. Specifically, we first propose a Multimodal Knowledge Decomposition (MKD) module to explicitly decompose multimodal knowledge into four distinct components: redundancy, synergy, and the uniqueness of each of the two modalities. Such a comprehensive decomposition enables the model to perceive easily overlooked yet important information, facilitating effective multimodal fusion. Second, we propose a Cohort Guidance Modeling (CGM) scheme to mitigate the risk of overfitting to task-irrelevant information, promoting a more comprehensive and robust understanding of the underlying multimodal data and enhancing the generalization ability of the model. By coupling knowledge decomposition with cohort guidance, we develop a robust multimodal survival analysis model with enhanced discrimination and generalization abilities. Extensive experimental results on five cancer datasets demonstrate the effectiveness of our model in integrating multimodal data for survival analysis. Our code is available at https://github.com/moothes/CCL-survival.
Published as: "Cohort-Individual Cooperative Learning for Multimodal Cancer Survival Analysis," IEEE Transactions on Medical Imaging, vol. 44, no. 2, pp. 656-667.
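The four-way decomposition above (redundancy, synergy, and per-modality uniqueness) can be illustrated with a minimal numpy sketch. This is not the authors' CCL implementation: the projections here are random stand-ins for learned layers, and the split is a simple shared-space construction chosen only to make the four components concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def decompose(path_feat, gene_feat, dim=16):
    """Toy decomposition of two modality embeddings (e.g., pathology and
    genomics) into four components: redundancy (shared), synergy
    (interaction), and one uniqueness component per modality.
    Weights are random placeholders for learned projections."""
    d_p, d_g = path_feat.shape[-1], gene_feat.shape[-1]
    W_p = rng.standard_normal((d_p, dim)) / np.sqrt(d_p)
    W_g = rng.standard_normal((d_g, dim)) / np.sqrt(d_g)
    W_syn = rng.standard_normal((2 * dim, dim)) / np.sqrt(2 * dim)

    common_p = path_feat @ W_p                  # pathology in shared space
    common_g = gene_feat @ W_g                  # genomics in shared space
    redundancy = 0.5 * (common_p + common_g)    # information both modalities carry
    unique_p = common_p - redundancy            # pathology-only residual
    unique_g = common_g - redundancy            # genomics-only residual
    # synergy: information that emerges only from the joint view
    synergy = np.concatenate([common_p, common_g], axis=-1) @ W_syn
    return redundancy, synergy, unique_p, unique_g
```

In the actual framework these components would be produced by trained encoders and supervised with the cohort guidance objective; the sketch only shows why the decomposition yields four parts for two modalities.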
Pub Date: 2024-09-04 | DOI: 10.1109/TMI.2024.3454268
Jianning Chi;Zhiyi Sun;Liuyi Meng;Siqi Wang;Xiaosheng Yu;Xiaolin Wei;Bin Yang
Anatomical structures in low-dose computed tomography (LDCT) images often appear distorted when zoomed in for close observation, owing to the quantum noise caused by the small number of detected photons. Super-resolution (SR) methods have been proposed as post-processing approaches that enhance LDCT image quality without increasing the radiation dose to patients, but they suffer from incorrect prediction of degradation information and incomplete use of the internal connections within the 3D CT volume, leading to an imbalance between noise removal and detail sharpening in the super-resolution results. In this paper, we propose a novel LDCT SR network in which degradation information self-parsed from each LDCT slice and 3D anatomical information captured from the LDCT volume are integrated to guide the backbone network. A prior degradation estimator (PDE) is proposed, following a contrastive learning strategy, to estimate the degradation features in LDCT images without paired low- and normal-dose CT images. A self-guidance fusion module (SGFM) is designed to capture anatomical features with internal 3D consistency between the squashed images along the coronal, sagittal, and axial views of the CT volume. Finally, the features representing degradation and anatomical structures are integrated to recover CT images at higher resolutions. We apply the proposed method to the 2016 NIH-AAPM Mayo Clinic LDCT Grand Challenge dataset and our collected LDCT dataset to evaluate its ability to recover LDCT images. Experimental results demonstrate the superiority of our network in both quantitative metrics and qualitative observations, showing its potential for recovering detail-sharp, noise-free, higher-resolution CT images from practical LDCT images.
Published as: "Low-Dose CT Image Super-Resolution With Noise Suppression Based on Prior Degradation Estimator and Self-Guidance Mechanism," IEEE Transactions on Medical Imaging, vol. 44, no. 2, pp. 601-617.
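The "squashed images along the coronal, sagittal, and axial views" that feed the SGFM can be sketched as simple intensity projections of the 3D volume. This is an assumption for illustration only: the paper does not specify the squashing operator, so mean projection along each axis is used here as a plausible stand-in.

```python
import numpy as np

def squashed_views(volume):
    """Collapse a 3D CT volume of shape (depth, height, width) into three
    2D 'squashed' images by mean intensity projection along the axial,
    coronal, and sagittal axes — a simplified stand-in for the view
    extraction that supplies 3D anatomical context to the SGFM."""
    axial    = volume.mean(axis=0)  # (H, W): viewed along the body axis
    coronal  = volume.mean(axis=1)  # (D, W): front-to-back view
    sagittal = volume.mean(axis=2)  # (D, H): side view
    return axial, coronal, sagittal
```

Each projection summarizes structures that stay consistent across slices, which is the kind of internal 3D consistency the fusion module is designed to exploit.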
Photon-counting computed tomography (PCCT) may dramatically benefit clinical practice through its versatile capabilities, such as dose reduction and material characterization. However, the limited number of photons detected in each individual energy bin can induce severe noise contamination in the reconstructed image. Fortunately, the notable low-rank prior inherent in PCCT images can guide the reconstruction toward a denoised outcome. To fully exploit this intrinsic low-rankness, we propose a novel reconstruction algorithm based on quaternion representation (QR), called low-rank quaternion reconstruction (LOQUAT). First, we organize a group of nonlocal similar patches into a quaternion matrix. Then, an adjusted weighted Schatten-p norm (AWSN) is introduced and imposed on the matrix to enforce its low-rank nature. Subsequently, we formulate an AWSN-regularized model and devise an alternating direction method of multipliers (ADMM) framework to solve it. Experiments on simulated and real-world data substantiate the superiority of LOQUAT over several state-of-the-art competitors in terms of both visual inspection and quantitative metrics. Moreover, our QR-based method exhibits lower computational complexity than some popular tensor-representation (TR) based counterparts, and the global convergence of LOQUAT is theoretically established under a mild condition. These properties bolster the robustness and practicality of LOQUAT, facilitating its application in clinical PCCT scenarios. The source code will be available at https://github.com/linzf23/LOQUAT.
Published as: "LOQUAT: Low-Rank Quaternion Reconstruction for Photon-Counting CT," by Zefan Lin, Guotao Quan, Haixian Qu, Yanfeng Du, and Jun Zhao, IEEE Transactions on Medical Imaging, vol. 44, no. 2, pp. 668-684, 2024-09-03, DOI: 10.1109/TMI.2024.3454174.
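The core proximal step in a weighted Schatten-p regularized ADMM can be illustrated for the p = 1 special case, where it reduces to weighted singular-value thresholding with a closed form. This sketch is not the AWSN operator from LOQUAT (which uses a general p and quaternion matrices); it only shows the shrinkage mechanism on an ordinary real matrix.

```python
import numpy as np

def weighted_svt(M, weights):
    """Weighted singular-value thresholding: the proximal operator of a
    weighted nuclear norm, i.e. the p = 1 special case of a weighted
    Schatten-p penalty. In low-rank denoising, large singular values are
    usually given small weights so dominant structure is preserved while
    small (noise-dominated) singular values are shrunk to zero."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - weights, 0.0)  # soft-threshold each sigma_i
    return U @ np.diag(s_shrunk) @ Vt
```

Inside an ADMM loop, this operator would be applied to each group of nonlocal similar patches at every iteration, alternating with the data-fidelity update.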
In computational pathology, whole-slide image (WSI) classification presents a formidable challenge due to its gigapixel resolution and limited fine-grained annotations. Multiple-instance learning (MIL) offers a weakly supervised solution, yet refining instance-level information from bag-level labels remains challenging. While most conventional MIL methods use attention scores to estimate instance importance scores (IIS), which contribute to the prediction of slide labels, these often lead to skewed attention distributions and inaccuracies in identifying crucial instances. To address these issues, we propose a new approach inspired by cooperative game theory: employing Shapley values to assess each instance's contribution, thereby improving IIS estimation. The computation of the Shapley value is then accelerated using attention, while retaining the enhanced instance identification and prioritization. We further introduce a framework for the progressive assignment of pseudo bags based on the estimated IIS, encouraging more balanced attention distributions in MIL models. Our extensive experiments on the CAMELYON-16, BRACS, TCGA-LUNG, and TCGA-BRCA datasets show our method's superiority over existing state-of-the-art approaches, offering enhanced interpretability and class-wise insights. Our source code is available at https://github.com/RenaoYan/PMIL.
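The Shapley value of an instance averages its marginal contribution to the bag score over orderings of the other instances. A minimal Monte Carlo sketch (not the paper's attention-accelerated computation) makes this concrete for any bag-level scoring function; `bag_score` here is a hypothetical callable standing in for a trained MIL predictor.

```python
import numpy as np

def shapley_iis(instances, bag_score, n_perm=200, seed=0):
    """Monte Carlo estimate of per-instance Shapley values under a
    bag-level scoring function — one way to derive instance importance
    scores (IIS) from bag predictions. `bag_score` maps any subset of
    instances to a scalar (e.g., a tumor-probability logit)."""
    rng = np.random.default_rng(seed)
    n = len(instances)
    phi = np.zeros(n)
    for _ in range(n_perm):
        order = rng.permutation(n)       # random arrival order of instances
        prev, subset = bag_score([]), []
        for i in order:
            subset.append(instances[i])
            cur = bag_score(subset)
            phi[i] += cur - prev         # marginal contribution of instance i
            prev = cur
    return phi / n_perm                  # average over sampled permutations
```

For an additive score (e.g., a sum of per-instance logits) the estimate is exact regardless of the sampled orderings, which is a useful sanity check; the exact computation is exponential in bag size, which is why the paper resorts to an attention-based acceleration.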