A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation.
Pub Date : 2025-12-01 DOI: 10.1109/TMI.2025.3585765
Haibo Jin, Haoxuan Che, Sunan He, Hao Chen
Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) performance in clinical efficacy is unsatisfactory, especially for describing lesion attributes; 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address these challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities but also provides the basis for its predictions. To this end, we propose a framework named chain of diagnosis (CoD), which maintains a chain of diagnostic steps for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with the QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses with generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) an evaluation tool for assessing the accuracy of reports in describing lesion location and severity; 3) extensive experiments demonstrating the effectiveness of CoD: it consistently outperforms both specialist and generalist models on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.
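As a rough illustration of the second stage described above, the following Python sketch formats QA diagnoses into a report-generation prompt for a generic LLM callable. The prompt wording, the `llm` stand-in, and the example QA pairs are assumptions for illustration, not CoD's actual implementation.

```python
# A minimal sketch of QA-conditioned prompting, assuming a generic
# `llm(prompt: str) -> str` callable; everything here is illustrative.

def build_report_prompt(qa_pairs: list[tuple[str, str]]) -> str:
    """Format diagnostic QA pairs into a report-generation prompt."""
    lines = ["You are a radiology assistant. Based on the findings below,",
             "write the Findings section of a chest X-ray report.", ""]
    for i, (question, answer) in enumerate(qa_pairs, start=1):
        lines.append(f"Q{i}: {question}")
        lines.append(f"A{i}: {answer}")
    return "\n".join(lines)

# Hypothetical QA diagnoses extracted by the diagnostic-conversation stage.
qa = [
    ("Is there any opacity in the lungs?",
     "Yes, a patchy opacity in the left lower lobe."),
    ("Is the cardiac silhouette enlarged?",
     "No, the heart size is within normal limits."),
]
print(build_report_prompt(qa))
# report = llm(build_report_prompt(qa))  # `llm` stands in for any text-generation API
```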
{"title":"A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation.","authors":"Haibo Jin, Haoxuan Che, Sunan He, Hao Chen","doi":"10.1109/TMI.2025.3585765","DOIUrl":"10.1109/TMI.2025.3585765","url":null,"abstract":"<p><p>Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) The performances in clinical efficacy are unsatisfactory, especially for lesion attributes description; 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address the challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities, but also provides basis of its predictions. To this end, we propose a framework named chain of diagnosis (CoD), which maintains a chain of diagnostic process for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses and generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) a evaluation tool for assessing the accuracy of reports in describing lesion location and severity; 3) extensive experiments to demonstrate the effectiveness of CoD, where it outperforms both specialist and generalist models consistently on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":"4986-4997"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144562423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mutualistic Multi-Network Noisy Label Learning (MMNNLL) Method and Its Application to Transdiagnostic Classification of Bipolar Disorder and Schizophrenia.
Pub Date : 2025-12-01 DOI: 10.1109/TMI.2025.3585880
Yuhui Du, Zheng Wang, Ju Niu, Yulong Wang, Godfrey D Pearlson, Vince D Calhoun
The subjective nature of diagnosing mental disorders complicates achieving accurate diagnoses. The complex relationships among disorders further exacerbate this issue, particularly in clinical practice, where conditions like bipolar disorder (BP) and schizophrenia (SZ) can present similar clinical symptoms and cognitive impairments. To address these challenges, this paper proposes a mutualistic multi-network noisy label learning (MMNNLL) method, which aims to enhance diagnostic accuracy by leveraging neuroimaging data in the presence of potential clinical diagnostic bias or errors. MMNNLL effectively utilizes multiple deep neural networks (DNNs) to learn from data with noisy labels by maximizing the consistency among DNNs in identifying and utilizing samples with clean and noisy labels. Experimental results on the public CIFAR-10 and PathMNIST datasets demonstrate the effectiveness of our method in classifying independent test data across various types and levels of label noise. Additionally, our MMNNLL method significantly outperforms state-of-the-art noisy label learning methods. When applied to brain functional connectivity data from BP and SZ patients, our method identifies two biotypes that show more pronounced group differences and improved classification accuracy compared with the original clinical categories, using both traditional machine learning and advanced deep learning techniques. In summary, our method effectively addresses possible inaccuracy in the nosology of mental disorders and achieves transdiagnostic classification through robust noisy label learning via multi-network collaboration and competition.
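The abstract does not specify the exact consistency mechanism, so the sketch below shows one generic multi-network scheme in the same spirit: two networks each flag likely-clean samples by the small-loss criterion, and only consensus samples drive the update. The keep ratio, toy linear models, and selection rule are illustrative assumptions, not the paper's method.

```python
# A minimal co-training sketch in the spirit of multi-network noisy-label
# learning; the specific selection rule and models are assumptions.
import torch
import torch.nn as nn

def small_loss_mask(logits, labels, keep_ratio=0.5):
    """Mark the keep_ratio fraction of samples with the smallest loss as clean."""
    losses = nn.functional.cross_entropy(logits, labels, reduction="none")
    k = max(1, int(keep_ratio * len(losses)))
    mask = torch.zeros(len(losses), dtype=torch.bool)
    mask[torch.topk(-losses, k).indices] = True
    return mask

net_a, net_b = nn.Linear(16, 4), nn.Linear(16, 4)   # toy stand-ins for DNNs
opt = torch.optim.SGD(list(net_a.parameters()) + list(net_b.parameters()), lr=0.1)

x, y = torch.randn(32, 16), torch.randint(0, 4, (32,))  # batch with noisy labels
logits_a, logits_b = net_a(x), net_b(x)
agree = small_loss_mask(logits_a, y) & small_loss_mask(logits_b, y)  # consensus
if agree.any():  # train both networks only on mutually trusted samples
    loss = (nn.functional.cross_entropy(logits_a[agree], y[agree])
            + nn.functional.cross_entropy(logits_b[agree], y[agree]))
    opt.zero_grad(); loss.backward(); opt.step()
```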
{"title":"Mutualistic Multi-Network Noisy Label Learning (MMNNLL) Method and Its Application to Transdiagnostic Classification of Bipolar Disorder and Schizophrenia.","authors":"Yuhui Du, Zheng Wang, Ju Niu, Yulong Wang, Godfrey D Pearlson, Vince D Calhoun","doi":"10.1109/TMI.2025.3585880","DOIUrl":"10.1109/TMI.2025.3585880","url":null,"abstract":"<p><p>The subjective nature of diagnosing mental disorders complicates achieving accurate diagnoses. The complex relationship among disorders further exacerbates this issue, particularly in clinical practice where conditions like bipolar disorder (BP) and schizophrenia (SZ) can present similar clinical symptoms and cognitive impairments. To address these challenges, this paper proposes a mutualistic multi-network noisy label learning (MMNNLL) method, which aims to enhance diagnostic accuracy by leveraging neuroimaging data under the presence of potential clinical diagnosis bias or errors. MMNNLL effectively utilizes multiple deep neural networks (DNNs) for learning from data with noisy labels by maximizing the consistency among DNNs in identifying and utilizing samples with clean and noisy labels. Experimental results on public CIFAR-10 and PathMNIST datasets demonstrate the effectiveness of our method in classifying independent test data across various types and levels of label noise. Additionally, our MMNNLL method significantly outperforms state-of-the-art noisy label learning methods. When applied to brain functional connectivity data from BP and SZ patients, our method identifies two biotypes that show more pronounced group differences, and improved classification accuracy compared to the original clinical categories, using both traditional machine learning and advanced deep learning techniques. In summary, our method effectively addresses the possible inaccuracy in nosology of mental disorders and achieves transdiagnostic classification through robust noisy label learning via multi-network collaboration and competition.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":"5014-5026"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12812316/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144565572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint Shape Reconstruction and Registration via a Shared Hybrid Diffeomorphic Flow.
Pub Date : 2025-12-01 DOI: 10.1109/TMI.2025.3585560
Hengxiang Shi, Ping Wang, Shouhui Zhang, Xiuyang Zhao, Bo Yang, Caiming Zhang
Deep implicit functions (DIFs) effectively represent shapes by using a neural network to map 3D spatial coordinates to scalar values that encode the shape's geometry, but it is difficult to establish correspondences between shapes directly, limiting their use in medical image registration. Recently proposed deformation-field-based methods learn implicit templates by combining template field learning with DIFs and deformation field learning, establishing shape correspondence through deformation fields. Although these approaches enable joint learning of shape representation and shape correspondence, the decoupled optimization of the template field and deformation field, caused by the absence of deformation annotations, leads to a relatively accurate template field but an underoptimized deformation field. In this paper, we propose a novel implicit template learning framework based on a shared hybrid diffeomorphic flow (SHDF), which enables shared optimization of the deformation and the template, contributing to better deformations and shape representation. Specifically, we formulate the signed distance function (SDF, a type of DIF) as a one-dimensional (1D) integral, unifying dimensions to match the form of the ordinary differential equation (ODE) solved in deformation field learning. The SDF in 1D integral form is then integrated seamlessly into deformation field learning. Using a recurrent learning strategy, we frame shape representation and deformation as solving different initial value problems of the same ODE. We also introduce a global smoothness regularization to handle local optima due to limited outside-of-shape data. Experiments on medical datasets show that SHDF outperforms state-of-the-art methods in shape representation and registration.
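To make the dimensional unification concrete, here is one schematic rendering of the two quantities as initial value problems over a shared auxiliary variable; the paper's exact parameterization may differ, and v, g, and s_0 are placeholder networks.

```latex
% Schematic only: v, g, and s_0 are placeholders, not the paper's exact terms.
% Eq. (1): diffeomorphic deformation as the flow of a velocity field v.
% Eq. (2): the SDF written as a 1D integral over the same "time" axis, so
% both quantities are solutions of initial value problems of ODE form.
\begin{align}
  \frac{\partial \phi(x,t)}{\partial t} &= v\bigl(\phi(x,t),\, t\bigr),
    \qquad \phi(x,0) = x, \\
  s(x) &= s_0(x) + \int_0^1 g(x,t)\, dt .
\end{align}
```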
{"title":"Joint Shape Reconstruction and Registration via a Shared Hybrid Diffeomorphic Flow.","authors":"Hengxiang Shi, Ping Wang, Shouhui Zhang, Xiuyang Zhao, Bo Yang, Caiming Zhang","doi":"10.1109/TMI.2025.3585560","DOIUrl":"10.1109/TMI.2025.3585560","url":null,"abstract":"<p><p>Deep implicit functions (DIFs) effectively represent shapes by using a neural network to map 3D spatial coordinates to scalar values that encode the shape's geometry, but it is difficult to establish correspondences between shapes directly, limiting their use in medical image registration. The recently presented deformation field-based methods achieve implicit templates learning via template field learning with DIFs and deformation field learning, establishing shape correspondence through deformation fields. Although these approaches enable joint learning of shape representation and shape correspondence, the decoupled optimization for template field and deformation field, caused by the absence of deformation annotations lead to a relatively accurate template field but an underoptimized deformation field. In this paper, we propose a novel implicit template learning framework via a shared hybrid diffeomorphic flow (SHDF), which enables shared optimization for deformation and template, contributing to better deformations and shape representation. Specifically, we formulate the signed distance function (SDF, a type of DIFs) as a one-dimensional (1D) integral, unifying dimensions to match the form used in solving ordinary differential equation (ODE) for deformation field learning. Then, SDF in 1D integral form is integrated seamlessly into the deformation field learning. Using a recurrent learning strategy, we frame shape representations and deformations as solving different initial value problems of the same ODE. We also introduce a global smoothness regularization to handle local optima due to limited outside-of-shape data. Experiments on medical datasets show that SHDF outperforms state-of-the-art methods in shape representation and registration.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":"4998-5013"},"PeriodicalIF":0.0,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144562424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guest Editorial Special Issue on Advancements in Foundation Models for Medical Imaging
Pub Date : 2025-10-27 DOI: 10.1109/TMI.2025.3613074
Tianming Liu;Dinggang Shen;Jong Chul Ye;Marleen de Bruijne;Wei Liu
Pretrained on massive datasets, Foundation Models (FMs) are revolutionizing medical imaging by offering scalable and generalizable solutions to longstanding challenges. This Special Issue on Advancements in Foundation Models for Medical Imaging presents FM-related works that explore the potential of FMs to address data scarcity, domain shifts, and multimodal integration across a wide range of medical imaging tasks, including segmentation, diagnosis, reconstruction, and prognosis. The included papers also examine critical concerns such as interpretability, efficiency, benchmarking, and ethics in the adoption of FMs for medical imaging. Collectively, the articles in this Special Issue mark a significant step toward establishing FMs as a cornerstone of next-generation medical imaging AI.
{"title":"Guest Editorial Special Issue on Advancements in Foundation Models for Medical Imaging","authors":"Tianming Liu;Dinggang Shen;Jong Chul Ye;Marleen de Bruijne;Wei Liu","doi":"10.1109/TMI.2025.3613074","DOIUrl":"https://doi.org/10.1109/TMI.2025.3613074","url":null,"abstract":"Pretrained on massive datasets, Foundation Models (FMs) are revolutionizing medical imaging by offering scalable and generalizable solutions to longstanding challenges. This Special Issue on Advancements in Foundation Models for Medical Imaging presents FM-related works that explore the potential of FMs to address data scarcity, domain shifts, and multimodal integration across a wide range of medical imaging tasks, including segmentation, diagnosis, reconstruction, and prognosis. The included papers also examine critical concerns such as interpretability, efficiency, benchmarking, and ethics in the adoption of FMs for medical imaging. Collectively, the articles in this Special Issue mark a significant step toward establishing FMs as a cornerstone of next-generation medical imaging AI.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 10","pages":"3894-3897"},"PeriodicalIF":0.0,"publicationDate":"2025-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11218696","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145371487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging Diffusion Model and Image Foundation Model for Improved Correspondence Matching in Coronary Angiography.
Pub Date : 2025-10-20 DOI: 10.1109/TMI.2025.3623507
Lin Zhao, Xin Yu, Yikang Liu, Xiao Chen, Eric Z Chen, Terrence Chen, Shanhui Sun
Accurate correspondence matching in coronary angiography images is crucial for reconstructing 3D coronary artery structures, which is essential for precise diagnosis and treatment planning of coronary artery disease (CAD). Traditional matching methods for natural images often fail to generalize to X-ray images due to inherent differences such as lack of texture, lower contrast, and overlapping structures, compounded by insufficient training data. To address these challenges, we propose a novel pipeline that generates realistic paired coronary angiography images using a diffusion model conditioned on 2D projections of 3D reconstructed meshes from Coronary Computed Tomography Angiography (CCTA), providing high-quality synthetic data for training. Additionally, we employ large-scale image foundation models to guide feature aggregation, enhancing correspondence matching accuracy by focusing on semantically relevant regions and keypoints. Our approach demonstrates superior matching performance on synthetic datasets and effectively generalizes to real-world datasets, offering a practical solution for this task. Furthermore, our work investigates the efficacy of different foundation models in correspondence matching, providing novel insights into leveraging advanced image foundation models for medical imaging applications.
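As a minimal, self-contained illustration of descriptor-based correspondence matching (not the paper's full pipeline, which adds diffusion-synthesized training pairs and foundation-model-guided feature aggregation), the sketch below performs mutual nearest-neighbor matching on L2-normalized features; the random toy descriptors stand in for real foundation-model features of two views.

```python
# Mutual nearest-neighbor matching on dense descriptors; illustrative only.
import torch

def mutual_nn_matches(feat_a: torch.Tensor, feat_b: torch.Tensor):
    """feat_a: (N, C), feat_b: (M, C) L2-normalized descriptors -> index pairs."""
    sim = feat_a @ feat_b.t()                  # cosine similarity (N, M)
    nn_ab = sim.argmax(dim=1)                  # best match in B for each A
    nn_ba = sim.argmax(dim=0)                  # best match in A for each B
    idx_a = torch.arange(feat_a.shape[0])
    mutual = nn_ba[nn_ab] == idx_a             # keep cycle-consistent pairs only
    return idx_a[mutual], nn_ab[mutual]

# Toy descriptors standing in for foundation-model features of two angiograms.
fa = torch.nn.functional.normalize(torch.randn(100, 64), dim=1)
fb = torch.nn.functional.normalize(torch.randn(120, 64), dim=1)
ia, ib = mutual_nn_matches(fa, fb)
print(f"{len(ia)} mutual matches")
```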
{"title":"Leveraging Diffusion Model and Image Foundation Model for Improved Correspondence Matching in Coronary Angiography.","authors":"Lin Zhao, Xin Yu, Yikang Liu, Xiao Chen, Eric Z Chen, Terrence Chen, Shanhui Sun","doi":"10.1109/TMI.2025.3623507","DOIUrl":"https://doi.org/10.1109/TMI.2025.3623507","url":null,"abstract":"<p><p>Accurate correspondence matching in coronary angiography images is crucial for reconstructing 3D coronary artery structures, which is essential for precise diagnosis and treatment planning of coronary artery disease (CAD). Traditional matching methods for natural images often fail to generalize to X-ray images due to inherent differences such as lack of texture, lower contrast, and overlapping structures, compounded by insufficient training data. To address these challenges, we propose a novel pipeline that generates realistic paired coronary angiography images using a diffusion model conditioned on 2D projections of 3D reconstructed meshes from Coronary Computed Tomography Angiography (CCTA), providing high-quality synthetic data for training. Additionally, we employ large-scale image foundation models to guide feature aggregation, enhancing correspondence matching accuracy by focusing on semantically relevant regions and keypoints. Our approach demonstrates superior matching performance on synthetic datasets and effectively generalizes to real-world datasets, offering a practical solution for this task. Furthermore, our work investigates the efficacy of different foundation models in correspondence matching, providing novel insights into leveraging advanced image foundation models for medical imaging applications.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145338362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FairFedMed: Benchmarking Group Fairness in Federated Medical Imaging with FairLoRA.
Pub Date : 2025-10-16 DOI: 10.1109/TMI.2025.3622522
Minghan Li, Congcong Wen, Yu Tian, Min Shi, Yan Luo, Hao Huang, Yi Fang, Mengyu Wang
Fairness remains a critical concern in healthcare, where unequal access to services and treatment outcomes can adversely affect patient health. While Federated Learning (FL) presents a collaborative and privacy-preserving approach to model training, ensuring fairness is challenging due to heterogeneous data across institutions, and current research primarily addresses non-medical applications. To fill this gap, we establish the first experimental benchmark for fairness in medical FL, evaluating six representative FL methods across diverse demographic attributes and imaging modalities. We introduce FairFedMed, the first medical FL dataset specifically designed to study group fairness (i.e., consistent performance across demographic groups). It comprises two parts: FairFedMed-Oph, featuring 2D fundus and 3D OCT ophthalmology samples with six demographic attributes; and FairFedMed-Chest, which simulates real cross-institutional FL using subsets of CheXpert and MIMIC-CXR. Together, they support both simulated and real-world FL across diverse medical modalities and demographic groups. Existing FL models often underperform on medical images and overlook fairness across demographic groups. To address this, we propose FairLoRA, a fairness-aware FL framework based on SVD-based low-rank approximation. It customizes singular value matrices per demographic group while sharing singular vectors, ensuring both fairness and efficiency. Experimental results on the FairFedMed dataset demonstrate that FairLoRA not only achieves state-of-the-art performance in medical image classification but also significantly improves fairness across diverse populations. Our code and dataset are available at https://github.com/Harvard-AI-and-Robotics-Lab/FairFedMed.
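A minimal sketch of the low-rank idea as the abstract states it: singular vectors shared across demographic groups, singular values customized per group. The module below is an assumption-level reading (shapes, initialization, and the frozen base layer are illustrative), not the released FairLoRA code.

```python
# Group-conditioned low-rank update: W x + U diag(s_g) V^T x; illustrative.
import torch
import torch.nn as nn

class GroupLoRALinear(nn.Module):
    def __init__(self, d_in, d_out, rank, num_groups):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)  # frozen backbone weight
        self.base.weight.requires_grad_(False)
        self.U = nn.Parameter(torch.randn(d_out, rank) * 0.01)  # shared across groups
        self.V = nn.Parameter(torch.randn(d_in, rank) * 0.01)   # shared across groups
        self.s = nn.Parameter(torch.zeros(num_groups, rank))    # per-group singular values

    def forward(self, x, group: int):
        delta = self.U @ torch.diag(self.s[group]) @ self.V.t()  # U diag(s_g) V^T
        return self.base(x) + x @ delta.t()

layer = GroupLoRALinear(d_in=32, d_out=16, rank=4, num_groups=3)
out = layer(torch.randn(8, 32), group=1)   # a batch from demographic group 1
print(out.shape)                           # torch.Size([8, 16])
```

Sharing U and V keeps most adaptation parameters common to all groups, so the per-group capacity is only the rank-sized singular-value vector, which is one way to balance fairness against efficiency.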
{"title":"FairFedMed: Benchmarking Group Fairness in Federated Medical Imaging with FairLoRA.","authors":"Minghan Li, Congcong Wen, Yu Tian, Min Shi, Yan Luo, Hao Huang, Yi Fang, Mengyu Wang","doi":"10.1109/TMI.2025.3622522","DOIUrl":"10.1109/TMI.2025.3622522","url":null,"abstract":"<p><p>Fairness remains a critical concern in healthcare, where unequal access to services and treatment outcomes can adversely affect patient health. While Federated Learning (FL) presents a collaborative and privacy-preserving approach to model training, ensuring fairness is challenging due to heterogeneous data across institutions, and current research primarily addresses non-medical applications. To fill this gap, we establish the first experimental benchmark for fairness in medical FL, evaluating six representative FL methods across diverse demographic attributes and imaging modalities. We introduce FairFedMed, the first medical FL dataset specifically designed to study group fairness (i.e., consistent performance across demographic groups). It comprises two parts: FairFedMed-Oph, featuring 2D fundus and 3D OCT ophthalmology samples with six demographic attributes; and FairFedMed-Chest, which simulates real cross-institutional FL using subsets of CheXpert and MIMIC-CXR. Together, they support both simulated and real-world FL across diverse medical modalities and demographic groups. Existing FL models often underperform on medical images and overlook fairness across demographic groups. To address this, we propose FairLoRA, a fairness-aware FL framework based on SVD-based low-rank approximation. It customizes singular value matrices per demographic group while sharing singular vectors, ensuring both fairness and efficiency. Experimental results on the FairFedMed dataset demonstrate that FairLoRA not only achieves state-of-the-art performance in medical image classification but also significantly improves fairness across diverse populations. Our code and dataset can be accessible via GitHub link: https://github.com/Harvard-AI-and-Robotics-Lab/FairFedMed.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145310409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MoE-Morph: Lightweight Pyramid Model With Heterogeneous Mixture of Experts for Deformable Medical Image Registration
Pub Date : 2025-10-14 DOI: 10.1109/TMI.2025.3620406
Hao Lin;Yonghong Song;You Su;Yunfei Ma
Deformable image registration aims to achieve nonlinear alignment of image spaces by estimating dense displacement fields. It is widely used in clinical tasks such as surgical planning, assisted diagnosis, and surgical navigation. While efficient, deep learning registration methods often struggle with large, complex displacements. Pyramid-based approaches address this with a coarse-to-fine strategy, but their single-feature processing can lead to error accumulation. In this paper, we introduce a dense Mixture of Experts (MoE) pyramid registration model that uses routing schemes and multiple heterogeneous experts to increase the width and flexibility of feature processing within a single layer. The collaboration among heterogeneous experts enables the model to retain more precise details and maintain greater feature freedom when dealing with complex displacements. We use deformation fields alone as the information-transmission paradigm between levels, with deformation-field interactions between layers, which encourages the model to focus on the feature location matching process and perform registration in the correct direction. We do not use any complex mechanisms such as attention or ViT, keeping the model in its simplest form. The powerful deformation capability allows the model to perform volume registration directly and accurately without the need for affine pre-registration. Experimental results show that the model achieves outstanding performance across four public datasets, including brain registration, lung registration, and abdominal multi-modal registration. The code will be published at https://github.com/Darlinglinlinlin/MOE_Morph
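To illustrate the deformation-fields-only transmission between pyramid levels, the sketch below upsamples a coarse displacement field and composes it with a finer-level residual field. The MoE expert routing itself is omitted, and the 2D toy shapes are assumptions; the paper works with 3D volumes.

```python
# Coarse-to-fine composition of displacement fields; illustrative 2D toy.
import torch
import torch.nn.functional as F

def warp_flow(flow, by):
    """Resample `flow` at positions displaced by `by` (displacements in pixels)."""
    b, _, h, w = flow.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + by
    # normalize pixel coordinates to [-1, 1] as required by grid_sample
    gx = 2.0 * grid[:, 0] / (w - 1) - 1.0
    gy = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(flow, torch.stack((gx, gy), dim=-1), align_corners=True)

def compose_fields(coarse_flow, fine_residual):
    """coarse_flow: (B, 2, H/2, W/2); fine_residual: (B, 2, H, W)."""
    up = 2.0 * F.interpolate(coarse_flow, scale_factor=2,
                             mode="bilinear", align_corners=True)
    # total(x) = up(x + residual(x)) + residual(x): composition of displacements
    return warp_flow(up, fine_residual) + fine_residual

coarse = torch.zeros(1, 2, 32, 32)
residual = torch.zeros(1, 2, 64, 64)
print(compose_fields(coarse, residual).shape)  # torch.Size([1, 2, 64, 64])
```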
{"title":"MoE-Morph: Lightweight Pyramid Model With Heterogeneous Mixture of Experts for Deformable Medical Image Registration","authors":"Hao Lin;Yonghong Song;You Su;Yunfei Ma","doi":"10.1109/TMI.2025.3620406","DOIUrl":"10.1109/TMI.2025.3620406","url":null,"abstract":"Deformable image registration aims to achieve nonlinear alignment of image spaces by estimating dense displacement fields. It is widely used in clinical tasks such as surgical planning, assisted diagnosis, and surgical navigation. While efficient, deep learning registration methods often struggle with large, complex displacements. Pyramid-based approaches address this with a coarse-to-fine strategy, but their single-feature processing can lead to error accumulation. In this paper, we introduce a dense Mixture of Experts (MoE) pyramid registration model, using routing schemes and multiple heterogeneous experts to increase the width and flexibility of feature processing within a single layer. The collaboration among heterogeneous experts enables the model to retain more precise details and maintain greater feature freedom when dealing with complex displacements. We use only deformation fields as the information transmission paradigm between different levels, with deformation field interactions between layers, which encourages the model to focus on the feature location matching process and perform registration in the correct direction. We do not utilize any complex mechanisms such as attention or ViT, keeping the model at its simplest form. The powerful deformable capability allows the model to perform volume registration directly and accurately without the need for affine registration. Experimental results show that the model achieves outstanding performance across four public datasets, including brain registration, lung registration, and abdominal multi-modal registration. The code will be published at <uri>https://github.com/Darlinglinlinlin/MOE_Morph</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"45 3","pages":"1251-1264"},"PeriodicalIF":0.0,"publicationDate":"2025-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145288378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Uncertainty-Guided Prototype Reliability Enhancement Network for Few-Shot Medical Image Segmentation
Pub Date : 2025-10-14 DOI: 10.1109/TMI.2025.3621452
Junfei Hu;Tao Zhou;Kaiwen Huang;Yi Zhou;Haofeng Zhang;Boqiang Fan;Huazhu Fu
Few-Shot Learning (FSL) has garnered increasing attention for data-scarce scenarios, particularly in medical segmentation tasks where only a few labeled data points are available. Existing few-shot segmentation methods typically learn prototypes from support images and employ nearest-neighbor searching to segment query images. Despite notable progress, effectively learning prototypes for each class remains challenging. In this paper, we propose an Uncertainty-guided Prototype Reliability Enhancement Network (UPRE-Net) for few-shot medical image segmentation. Specifically, we present a dual-support branch to maximize the extraction of information from support images through augmentation techniques. To enhance the reliability of prototypes, we propose an Uncertainty-guided Prototype Generation (UPG) module. Within the UPG module, we first extract both global and local prototypes for each class and then apply uncertainty measures to select the most informative prototypes. Additionally, to effectively combine the prediction results from the dual-support branch, we present a Reliable Dynamic Fusion (RDF) module, which dynamically integrates the two prediction results to generate a more reliable output. Furthermore, we present an Uncertainty-induced Weighted Loss (UWL) to ensure that the model pays more attention to regions with high uncertainty. Experiments on four benchmark medical image datasets demonstrate that our proposed model significantly outperforms state-of-the-art methods. The code will be released at https://github.com/taozh2017/UPRENet
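For context, the generic prototype pipeline that the abstract builds on can be sketched as masked average pooling over support features followed by cosine-similarity scoring of query features; UPRE-Net's uncertainty-guided prototype selection and dual-branch fusion are not reproduced here, and the feature shapes and temperature are assumptions.

```python
# Prototype-based few-shot segmentation: masked average pooling + cosine scoring.
import torch
import torch.nn.functional as F

def masked_avg_prototype(support_feat, support_mask):
    """support_feat: (C, H, W); support_mask: (H, W) in {0, 1} -> prototype (C,)."""
    masked = support_feat * support_mask.unsqueeze(0)
    return masked.sum(dim=(1, 2)) / support_mask.sum().clamp(min=1.0)

def cosine_segmentation(query_feat, prototype, temperature=20.0):
    """query_feat: (C, H, W) -> foreground logits (H, W)."""
    q = F.normalize(query_feat, dim=0)
    p = F.normalize(prototype, dim=0)
    return temperature * torch.einsum("chw,c->hw", q, p)

feat_s = torch.randn(64, 32, 32)              # support features from a backbone
mask_s = (torch.rand(32, 32) > 0.7).float()   # support annotation (toy)
proto = masked_avg_prototype(feat_s, mask_s)
logits = cosine_segmentation(torch.randn(64, 32, 32), proto)
pred = (logits > 0).float()                   # hypothetical decision threshold
```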
{"title":"Uncertainty-Guided Prototype Reliability Enhancement Network for Few-Shot Medical Image Segmentation","authors":"Junfei Hu;Tao Zhou;Kaiwen Huang;Yi Zhou;Haofeng Zhang;Boqiang Fan;Huazhu Fu","doi":"10.1109/TMI.2025.3621452","DOIUrl":"10.1109/TMI.2025.3621452","url":null,"abstract":"Few-Shot Learning (FSL) has garnered increasing attention for data-scarce scenarios, particularly in medical segmentation tasks where only a few labeled data points are available. Existing few-shot segmentation methods typically learn prototypes from support images and employ nearest-neighbor searching to segment query images. Despite notable progress, effectively learning prototypes for each class remains a challenging task to achieve promising results. In this paper, we propose an Uncertainty-guided Prototype Reliability Enhancement Network (UPRE-Net) for few-shot medical image segmentation. Specifically, we present a dual-support branch to maximize the extraction of information from support images through augmentation techniques. To enhance the reliability of prototypes, we propose an Uncertainty-guided Prototype Generation (UPG) module. Within the UPG module, we first extract both global and local prototypes for each class and then apply uncertainty measures to select the most informative prototypes. Additionally, to effectively combine the prediction results from the dual-support branch, we present a Reliable Dynamic Fusion (RDF) module. This module dynamically integrates the two prediction results to generate a more reliable output. Furthermore, we present an Uncertainty-induced Weighted Loss (UWL) to ensure that the model pays more attention to these regions with high uncertainty. Experiments on four benchmark medical image datasets demonstrate that our proposed model significantly outperforms state-of-the-art methods. The code will be released at <uri>https://github.com/taozh2017/UPRENet</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"45 3","pages":"1279-1290"},"PeriodicalIF":0.0,"publicationDate":"2025-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145288544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PET Head Motion Estimation Using Supervised Deep Learning With Attention
Pub Date : 2025-10-13 DOI: 10.1109/TMI.2025.3620714
Zhuotong Cai;Tianyi Zeng;Jiazhen Zhang;Eléonore V. Lieffrig;Kathryn Fontaine;Chenyu You;Enette Mae Revilla;James S. Duncan;Jingmin Xin;Yihuan Lu;John A. Onofrey
Head movement poses a significant challenge in brain positron emission tomography (PET) imaging, resulting in image artifacts and tracer uptake quantification inaccuracies. Effective head motion estimation and correction are crucial for precise quantitative image analysis and accurate diagnosis of neurological disorders. Hardware-based motion tracking (HMT) has limited applicability in real-world clinical practice. To overcome this limitation, we propose a deep-learning head motion correction approach with cross-attention (DL-HMC++) to predict rigid head motion from one-second 3D PET raw data. DL-HMC++ is trained in a supervised manner by leveraging existing dynamic PET scans with gold-standard motion measurements from external HMT. We evaluate DL-HMC++ on two PET scanners (HRRT and mCT) and four radiotracers (18F-FDG, 18F-FPEB, 11C-UCB-J, and 11C-LSN3172176) to demonstrate the effectiveness and generalization of the approach in large cohort PET studies. Quantitative and qualitative results demonstrate that DL-HMC++ consistently outperforms state-of-the-art data-driven motion estimation methods, producing motion-free images with clear delineation of brain structures and reduced motion artifacts that are indistinguishable from gold-standard HMT. Brain region-of-interest standard uptake value analysis shows average difference ratios between DL-HMC++ and gold-standard HMT of 1.2 ± 0.5% for HRRT and 0.5 ± 0.2% for mCT. DL-HMC++ demonstrates the potential for data-driven PET head motion correction to remove the burden of HMT, making motion correction accessible to clinical populations beyond research settings. The code is available at https://github.com/maxxxxxxcai/DL-HMC-TMI
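The ROI analysis quoted above can be read as a per-region relative SUV difference averaged over regions; since the abstract does not give the exact protocol, the metric below is an assumption-level sketch with hypothetical values.

```python
# Per-ROI relative SUV difference vs. gold-standard HMT; protocol is assumed.
import numpy as np

def suv_difference_ratio(suv_method: np.ndarray, suv_reference: np.ndarray):
    """Percent difference per ROI between a method and the reference correction."""
    ratio = 100.0 * np.abs(suv_method - suv_reference) / suv_reference
    return ratio.mean(), ratio.std()

# Hypothetical mean SUVs for five brain ROIs under the two corrections.
dl_hmc = np.array([1.52, 2.10, 0.98, 1.75, 1.31])
hmt    = np.array([1.50, 2.12, 0.97, 1.77, 1.30])
mean, std = suv_difference_ratio(dl_hmc, hmt)
print(f"difference ratio: {mean:.1f} ± {std:.1f} %")
```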
{"title":"PET Head Motion Estimation Using Supervised Deep Learning With Attention","authors":"Zhuotong Cai;Tianyi Zeng;Jiazhen Zhang;Eléonore V. Lieffrig;Kathryn Fontaine;Chenyu You;Enette Mae Revilla;James S. Duncan;Jingmin Xin;Yihuan Lu;John A. Onofrey","doi":"10.1109/TMI.2025.3620714","DOIUrl":"10.1109/TMI.2025.3620714","url":null,"abstract":"Head movement poses a significant challenge in brain positron emission tomography (PET) imaging, resulting in image artifacts and tracer uptake quantification inaccuracies. Effective head motion estimation and correction are crucial for precise quantitative image analysis and accurate diagnosis of neurological disorders. Hardware-based motion tracking (HMT) has limited applicability in real-world clinical practice. To overcome this limitation, we propose a deep-learning head motion correction approach with cross-attention (DL-HMC++) to predict rigid head motion from one-second 3D PET raw data. DL-HMC++ is trained in a supervised manner by leveraging existing dynamic PET scans with gold-standard motion measurements from external HMT. We evaluate DL-HMC++ on two PET scanners (HRRT and mCT) and four radiotracers (<sup>18</sup>F-FDG, <sup>18</sup>F-FPEB, <sup>11</sup>C-UCB-J, and <sup>11</sup>C-LSN3172176) to demonstrate the effectiveness and generalization of the approach in large cohort PET studies. Quantitative and qualitative results demonstrate that DL-HMC++ consistently outperforms state-of-the-art data-driven motion estimation methods, producing motion-free images with clear delineation of brain structures and reduced motion artifacts that are indistinguishable from gold-standard HMT. Brain region of interest standard uptake value analysis exhibits average difference ratios between DL-HMC++ and gold-standard HMT to be <inline-formula> <tex-math>$1.2pm 0.5$ </tex-math></inline-formula>% for HRRT and <inline-formula> <tex-math>$0.5pm 0.2$ </tex-math></inline-formula>% for mCT. DL-HMC++ demonstrates the potential for data-driven PET head motion correction to remove the burden of HMT, making motion correction accessible to clinical populations beyond research settings. The code is available at <uri>https://github.com/maxxxxxxcai/DL-HMC-TMI</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"45 3","pages":"1265-1278"},"PeriodicalIF":0.0,"publicationDate":"2025-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145282738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unsupervised High-Order Implicit Neural Representation With Line Attention for Metal Artifact Reduction
Pub Date : 2025-10-10 DOI: 10.1109/TMI.2025.3620222
Hongyu Chen;Shaoguang Huang;Wei He;Guangyi Yang;Hongyan Zhang
The presence of metallic implants introduces bright and dark streaks in computed tomography (CT) images, degrading image quality and interfering with medical diagnosis. To reduce these artifacts, deep learning approaches have been applied to restore metal-corrupted images, which usually requires a large number of simulated degraded-clean pairs for training. To achieve metal artifact reduction (MAR) without reference images, implicit neural representation (INR) has emerged and shown its capability for image restoration in an unsupervised manner. However, existing INR methods for MAR usually treat spatial coordinates independently and ignore their correlation, resulting in detail loss and residual artifacts. In this paper, we propose an INR-based unsupervised MAR framework and design a High-order Line Attention Network to capture local contextual and geometric representations from X-rays, which maps spatial coordinates to the discrete linear attenuation coefficients of imaged objects for artifact-free CT image reconstruction. The second-order feature interaction effectively alleviates the spectral bias problem and fits the low- and high-frequency details of real signals well. The proposed line-attention module, with linear complexity, establishes global relationships among the spatial point tokens of sampled rays. To provide more local contextual information, a multiple local adjacent ray sampling strategy composes several context-rich sub-fan beams into each training batch. With these components, the unsupervised MAR framework approximates the implicit continuous function to estimate measurements and generate artifact-free CT images. Simulated and real experiments indicate that the proposed approach achieves superior MAR performance compared with other state-of-the-art methods.
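A minimal sketch of the core INR formulation described above: a coordinate MLP maps 2D points to nonnegative attenuation coefficients, and a projection is estimated by summing attenuation along sampled ray points so the network can be fitted to measurements without clean references. The architecture, ray sampling, and loss are illustrative assumptions; the high-order line-attention module is omitted.

```python
# Coordinate network fitted to X-ray measurements via a discretized line integral.
import torch
import torch.nn as nn

class AttenuationINR(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Softplus(),   # attenuation is nonnegative
        )

    def forward(self, xy):                         # xy: (num_points, 2)
        return self.net(xy).squeeze(-1)

model = AttenuationINR()
# One ray through the image domain, discretized into sample points.
t = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
ray_points = torch.cat([t, 0.3 * torch.ones_like(t)], dim=1)  # a horizontal line
step = 2.0 / 255                                   # spacing between samples
projection = (model(ray_points) * step).sum()      # line-integral estimate
measured = torch.tensor(1.7)                       # hypothetical sinogram value
loss = (projection - measured) ** 2                # fit INR to raw measurements
loss.backward()
```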
{"title":"Unsupervised High-Order Implicit Neural Representation With Line Attention for Metal Artifact Reduction","authors":"Hongyu Chen;Shaoguang Huang;Wei He;Guangyi Yang;Hongyan Zhang","doi":"10.1109/TMI.2025.3620222","DOIUrl":"10.1109/TMI.2025.3620222","url":null,"abstract":"The presence of metallic implants introduces bright and dark streaks that appear in computed tomography (CT) images, degrading image quality and interfering with medical diagnosis. To reduce these artifacts, deep learning approaches have been applied for metal-corrupted restoration, which usually requires a large amount of simulated degraded-clean pairs for training. To achieve metal artifact reduction (MAR) without reference images, implicit neural representation (INR) has emerged and shown capabilities for image restoration in an unsupervised manner. However, existing INR methods for MAR usually treat the spatial coordinates independently and ignore their correlation, resulting in detail loss and artifacts remaining. In this paper, we propose an INR-based unsupervised MAR framework and design a High-order Line Attention Network to capture local contextual and geometric representations from X-rays, which maps the spatial coordinates into discrete linear attenuation coefficients of imaged objects for artifact-free CT image reconstruction. The second-order feature interaction can effectively improve the spectral bias problems and fit low and high-frequency details of real signals well. The proposed line-attention module with linear complexity can establish global relationships among spatial point tokens from sampled rays. To provide more local contextual information, a multiple local adjacent ray sampling strategy is adopted to compose several sub-fan beams with more context as a training batch. With the help of these components, the unsupervised MAR framework can approximate the implicit continuous function to estimate measurements and generate artifact-free CT images. Simulated and real experiments indicated that the proposed approach achieved superior MAR performance compared with other state-of-the-art methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"45 3","pages":"1237-1250"},"PeriodicalIF":0.0,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145260753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}