
IEEE Transactions on Medical Imaging: Latest Articles

RibSeg v2: A Large-scale Benchmark for Rib Labeling and Anatomical Centerline Extraction
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-10-18 DOI: 10.48550/arXiv.2210.09309
L. Jin, Shi Gu, D. Wei, Kaiming Kuang, H. Pfister, Bingbing Ni, Jiancheng Yang, Ming Li
Automatic rib labeling and anatomical centerline extraction are common prerequisites for various clinical applications. Prior studies either use in-house datasets that are inaccessible to communities, or focus on rib segmentation that neglects the clinical significance of rib labeling. To address these issues, we extend our prior dataset (RibSeg) on the binary rib segmentation task to a comprehensive benchmark, named RibSeg v2, with 660 CT scans (15,466 individual ribs in total) and annotations manually inspected by experts for rib labeling and anatomical centerline extraction. Based on the RibSeg v2, we develop a pipeline including deep learning-based methods for rib labeling, and a skeletonization-based method for centerline extraction. To improve computational efficiency, we propose a sparse point cloud representation of CT scans and compare it with standard dense voxel grids. Moreover, we design and analyze evaluation metrics to address the key challenges of each task. Our dataset, code, and model are available online to facilitate open research at https://github.com/M3DV/RibSeg.
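The sparse point cloud representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that points are obtained by keeping voxels above a bone-like intensity threshold and randomly subsampling them; the function name `ct_to_point_cloud`, the 200 HU threshold, and the toy volume are all hypothetical and are not taken from the RibSeg code.

```python
import random


def ct_to_point_cloud(volume, threshold_hu=200, num_points=4, seed=0):
    """Return up to `num_points` (z, y, x) voxel coordinates at or above the threshold.

    `volume` is a nested list indexed as volume[z][y][x] holding intensity values.
    The threshold and the random subsampling are illustrative assumptions.
    """
    candidates = [
        (z, y, x)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value >= threshold_hu
    ]
    if len(candidates) <= num_points:
        return candidates
    # A seeded generator keeps the subsample reproducible.
    return random.Random(seed).sample(candidates, num_points)


# Toy 2 x 2 x 3 "scan": only a few voxels are bright enough to be kept.
toy_volume = [
    [[-1000, 50, 300], [400, -200, 250]],
    [[900, 10, 0], [220, 1000, -50]],
]
points = ct_to_point_cloud(toy_volume, threshold_hu=200, num_points=4)
print(len(points), "points kept out of", 2 * 2 * 3, "voxels")
```

The point of the sketch is only that a downstream model sees a short list of coordinates rather than every voxel of the dense grid, which is where the efficiency gain would come from.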
Citations: 1
Bidirectional Semi-supervised Dual-branch CNN for Robust 3D Reconstruction of Stereo Endoscopic Images via Adaptive Cross and Parallel Supervisions
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-10-15 DOI: 10.48550/arXiv.2210.08291
Hongkuan Shi, Zhiwei Wang, Ying Zhou, Dun Li, Xin Yang, Qiang Li
Semi-supervised learning via a teacher-student network can train a model effectively on a few labeled samples. It enables a student model to distill knowledge from the teacher's predictions on extra unlabeled data. However, such knowledge flow is typically unidirectional, leaving accuracy vulnerable to the quality of the teacher model. In this paper, we seek robust 3D reconstruction of stereo endoscopic images by proposing a novel form of bidirectional learning between two learners, each of which can play the roles of teacher and student concurrently. Specifically, we introduce two self-supervisions, i.e., Adaptive Cross Supervision (ACS) and Adaptive Parallel Supervision (APS), to learn a dual-branch convolutional neural network. The two branches predict two different disparity probability distributions for the same position, and output their expectations as disparity values. The learned knowledge flows across branches along two directions: a cross direction (disparity guides distribution in ACS) and a parallel direction (disparity guides disparity in APS). Moreover, each branch also learns confidences to dynamically refine the supervision it provides. In ACS, the predicted disparity is softened into a unimodal distribution, and the lower the confidence, the smoother the distribution. In APS, incorrect predictions are suppressed by lowering the weights of those with low confidence. With the adaptive bidirectional learning, the two branches enjoy well-tuned mutual supervision, and eventually converge on a consistent and more accurate disparity estimation. The experimental results on four public datasets demonstrate superior accuracy over other state-of-the-art methods, with a relative decrease in averaged disparity error of at least 9.76%.
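Two operations described in this abstract lend themselves to a small sketch: taking the expectation of a disparity probability distribution, and softening a scalar disparity into a unimodal distribution whose width grows as confidence drops. The abstract does not specify the distribution family or the confidence-to-width mapping, so the Laplacian-like kernel, the `min_scale` and `max_scale` parameters, and the function names below are assumptions made purely for illustration.

```python
import math


def expected_disparity(probs):
    """Expectation of a discrete disparity distribution over levels 0..len(probs)-1."""
    return sum(level * p for level, p in enumerate(probs))


def soften_disparity(disparity, confidence, num_levels, min_scale=0.5, max_scale=4.0):
    """Turn a scalar disparity into a unimodal distribution centred on it.

    Assumed mapping: confidence 1.0 gives the narrowest kernel (min_scale),
    confidence 0.0 the widest (max_scale). Both the kernel and the mapping
    are hypothetical, not taken from the paper.
    """
    confidence = min(1.0, max(0.0, confidence))
    scale = max_scale - confidence * (max_scale - min_scale)
    weights = [math.exp(-abs(level - disparity) / scale) for level in range(num_levels)]
    total = sum(weights)
    return [w / total for w in weights]


sharp = soften_disparity(3.0, confidence=1.0, num_levels=8)
smooth = soften_disparity(3.0, confidence=0.0, num_levels=8)
print("peak probability, confident:", round(max(sharp), 3))
print("peak probability, uncertain:", round(max(smooth), 3))
```

Under these assumptions, a confident branch hands its peer a sharply peaked target, while an uncertain branch hands over a flatter one that constrains the peer less.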
Citations: 0
A Dual-Attention Learning Network with Word and Sentence Embedding for Medical Visual Question Answering
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-10-01 DOI: 10.48550/arXiv.2210.00220
Xiaofei Huang, Hongfang Gong
Research in medical visual question answering (MVQA) can contribute to the development of computer-aided diagnosis. MVQA is a task that aims to predict accurate and convincing answers based on given medical images and associated natural language questions. This task requires extracting feature content rich in medical knowledge and understanding it at a fine-grained level. Therefore, constructing an effective feature extraction and understanding scheme is key to modeling. Existing MVQA question extraction schemes mainly focus on word information, ignoring medical information in the text, such as medical concepts and domain-specific terms. Meanwhile, some visual and textual feature understanding schemes cannot effectively capture the correlation between regions and keywords for reasonable visual reasoning. In this study, a dual-attention learning network with word and sentence embedding (DALNet-WSE) is proposed. We design a module, transformer with sentence embedding (TSE), to extract a double embedding representation of questions containing keywords and medical information. A dual-attention learning (DAL) module consisting of self-attention and guided attention is proposed to model intensive intramodal and intermodal interactions. With multiple DAL modules (DALs), learning visual and textual co-attention can increase the granularity of understanding and improve visual reasoning. Experimental results on the ImageCLEF 2019 VQA-MED (VQA-MED 2019) and VQA-RAD datasets demonstrate that our proposed method outperforms previous state-of-the-art methods. According to the ablation studies and Grad-CAM maps, DALNet-WSE can extract rich textual information and has strong visual reasoning ability.
Citations: 0
Unsupervised Medical Image Translation with Adversarial Diffusion Models
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-07-17 DOI: 10.48550/arXiv.2207.08208
Muzaffer Ozbey, S. Dar, H. A. Bedel, Onat Dalmaz, Şaban Ozturk, Alper Gungor, Tolga Çukur
Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols. A pervasive approach for synthesizing target images involves one-shot mapping through generative adversarial networks (GAN). Yet, GAN models that implicitly characterize the image distribution can suffer from limited sample fidelity. Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation. To capture a direct correlate of the image distribution, SynDiff leverages a conditional diffusion process that progressively maps noise and source images onto the target image. For fast and accurate image sampling during inference, large diffusion steps are taken with adversarial projections in the reverse diffusion direction. To enable training on unpaired datasets, a cycle-consistent architecture is devised with coupled diffusive and non-diffusive modules that bilaterally translate between two modalities. Extensive assessments are reported on the utility of SynDiff against competing GAN and diffusion models in multi-contrast MRI and MRI-CT translation. Our demonstrations indicate that SynDiff offers quantitatively and qualitatively superior performance against competing baselines.
Citations: 72
LViT: Language meets Vision Transformer in Medical Image Segmentation
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-06-29 DOI: 10.48550/arXiv.2206.14718
Zihan Li, Yunxiang Li, Qingde Li, You Zhang, Puyang Wang, Dazhou Guo, Le Lu, D. Jin, Qingqi Hong
Deep learning has been widely used in medical image segmentation and related tasks. However, the performance of existing medical image segmentation models has been limited by the difficulty of obtaining sufficient high-quality labeled data, owing to the prohibitive cost of data annotation. To alleviate this limitation, we propose a new text-augmented medical image segmentation model, LViT (Language meets Vision Transformer). In our LViT model, medical text annotation is incorporated to compensate for the quality deficiency in image data. In addition, the text information can guide the generation of higher-quality pseudo labels in semi-supervised learning. We also propose an Exponential Pseudo label Iteration mechanism (EPI) to help the Pixel-Level Attention Module (PLAM) preserve local image features in the semi-supervised LViT setting. In our model, an LV (Language-Vision) loss is designed to supervise the training of unlabeled images using text information directly. For evaluation, we construct three multimodal medical segmentation datasets (image + text) containing X-rays and CT images. Experimental results show that our proposed LViT has superior segmentation performance in both fully-supervised and semi-supervised settings. The code and datasets are available at https://github.com/HUANGLIZI/LViT.
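The abstract names an Exponential Pseudo label Iteration mechanism (EPI) without giving its update rule. One plausible reading is an exponential moving average that blends the previous pseudo label with the current prediction, sketched below. The formula, the function name `epi_update`, and the momentum value `beta` are assumptions for illustration and may differ from what the LViT code actually does.

```python
def epi_update(previous_label, current_prediction, beta=0.99):
    """Blend the previous pseudo label with the current prediction, element by element.

    Assumed rule: new = beta * previous + (1 - beta) * current.
    A beta close to 1 makes the pseudo label change slowly between iterations.
    """
    return [
        beta * prev + (1.0 - beta) * curr
        for prev, curr in zip(previous_label, current_prediction)
    ]


# Flattened per-pixel foreground probabilities for a toy two-pixel image.
pseudo_label = [1.0, 0.0]
prediction = [0.0, 1.0]
pseudo_label = epi_update(pseudo_label, prediction, beta=0.75)
print(pseudo_label)
```

With this kind of update, a single noisy prediction cannot overwrite a pseudo label that has been stable over many iterations, which is the usual motivation for momentum-style label refinement.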
Citations: 22
Pseudo-Data based Self-Supervised Federated Learning for Classification of Histopathological Images
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-05-31 DOI: 10.48550/arXiv.2205.15530
Jun Shi, Yuan-Yang Zhang, Zheng Li, Xiangmin Han, Saisai Ding, Jun Wang, Shihui Ying
Computer-aided diagnosis (CAD) can help pathologists improve diagnostic accuracy together with consistency and repeatability for cancers. However, CAD models trained with histopathological images from only a single center (hospital) generally suffer from a generalization problem due to the staining inconsistencies among different centers. In this work, we propose a pseudo-data based self-supervised federated learning (FL) framework, named SSL-FL-BT, to improve both the diagnostic accuracy and generalization of CAD models. Specifically, pseudo histopathological images are generated from each center; these contain both the inherent and the specific properties of the real images in that center, but do not include private information. These pseudo images are then shared in the central server for self-supervised learning (SSL) to pre-train the backbone of the global model. A multi-task SSL is then designed to effectively learn both the center-specific information and the common inherent representation according to the data characteristics. Moreover, a novel Barlow Twins based FL (FL-BT) algorithm is proposed to improve the local training of the CAD models in each center by conducting model contrastive learning, which benefits the optimization of the global model in the FL procedure. The experimental results on four public histopathological image datasets indicate the effectiveness of the proposed SSL-FL-BT in both diagnostic accuracy and generalization.
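For readers unfamiliar with the Barlow Twins objective that FL-BT builds on, the sketch below computes the standard loss: embeddings from two sources are standardized per feature, their cross-correlation matrix is formed, and the loss pushes the diagonal toward 1 and the off-diagonal toward 0. How the paper pairs local and global model outputs is not stated in the abstract, so treating `z_a` and `z_b` as embeddings of the same batch from two models is an assumption, and the function name and toy data are hypothetical.

```python
import math


def barlow_twins_loss(z_a, z_b, lambda_offdiag=0.005):
    """Standard Barlow Twins loss for two (batch x features) embedding lists."""
    n, d = len(z_a), len(z_a[0])

    def standardized_columns(z):
        # Return one zero-mean, unit-variance list per feature dimension.
        columns = []
        for j in range(d):
            col = [row[j] for row in z]
            mean = sum(col) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
            columns.append([(v - mean) / std for v in col])
        return columns

    a, b = standardized_columns(z_a), standardized_columns(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(x * y for x, y in zip(a[i], b[j])) / n  # cross-correlation entry
            if i == j:
                loss += (1.0 - c_ij) ** 2          # invariance term
            else:
                loss += lambda_offdiag * c_ij ** 2  # redundancy-reduction term
    return loss


# Two decorrelated features; identical embeddings give a perfect identity correlation.
embeddings = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
flipped = [[-v for v in row] for row in embeddings]
print("identical:", barlow_twins_loss(embeddings, embeddings))
print("sign-flipped:", barlow_twins_loss(embeddings, flipped))
```

The loss is lowest when the two sets of embeddings agree feature by feature while different features stay decorrelated, which is the property a contrastive local-training step would exploit.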
Citations: 1
Multi-modal learning for predicting the genotype of glioma
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-03-21 DOI: 10.48550/arXiv.2203.10852
Yiran Wei, Xi Chen, Lei Zhu, Lipei Zhang, C. Schonlieb, S. Price, C. Li
The isocitrate dehydrogenase (IDH) gene mutation is an essential biomarker for the diagnosis and prognosis of glioma. It is promising to better predict glioma genotype by integrating focal tumor image and geometric features with brain network features derived from MRI. Convolutional neural networks show reasonable performance in predicting IDH mutation, which, however, cannot learn from non-Euclidean data, e.g., geometric and network data. In this study, we propose a multi-modal learning framework using three separate encoders to extract features of focal tumor image, tumor geometrics and global brain networks. To mitigate the limited availability of diffusion MRI, we develop a self-supervised approach to generate brain networks from anatomical multi-sequence MRI. Moreover, to extract tumor-related features from the brain network, we design a hierarchical attention module for the brain network encoder. Further, we design a bi-level multi-modal contrastive loss to align the multi-modal features and tackle the domain gap at the focal tumor and global brain. Finally, we propose a weighted population graph to integrate the multi-modal features for genotype prediction. Experimental results on the testing set show that the proposed model outperforms the baseline deep learning models. The ablation experiments validate the performance of different components of the framework. The visualized interpretation corresponds to clinical knowledge with further validation. In conclusion, the proposed learning framework provides a novel approach for predicting the genotype of glioma.
Citations: 4
Concurrent Ischemic Lesion Age Estimation and Segmentation of CT Brain Using a Transformer-Based Network
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2022-01-01 DOI: 10.1007/978-3-031-17899-3_6
Adam Marcus, Paul Bentley, D. Rueckert
Citations: 0
Learning-Based Regularization for Cardiac Strain Analysis via Domain Adaptation
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2021-09-01 DOI: 10.1109/TMI.2021.3074033
Allen Lu, Shawn S Ahn, Kevinminh Ta, Nripesh Parajuli, John C Stendahl, Zhao Liu, Nabil E Boutagy, Geng-Shi Jeng, Lawrence H Staib, Matthew O'Donnell, Albert J Sinusas, James S Duncan
Reliable motion estimation and strain analysis using 3D+time echocardiography (4DE) for localization and characterization of myocardial injury is valuable for early detection and targeted interventions. However, motion estimation is difficult because of the low SNR inherent to 4DE images, and intelligent regularization is critical for producing reliable motion estimates. In this work, we incorporate the notion of domain adaptation into a supervised neural network regularization framework. We first propose a semi-supervised Multi-Layered Perceptron (MLP) network with biomechanical constraints for learning a latent representation that is shown to yield more physiologically plausible displacements. We then extend this framework to include a supervised loss term on synthetic data and show the effects of biomechanical constraints on the network’s ability to perform domain adaptation. We validate the semi-supervised regularization method on in vivo data with implanted sonomicrometers. Finally, we show that our semi-supervised learning regularization approach can identify infarct regions using estimated regional strain maps, in good agreement with manually traced infarct regions from postmortem excised hearts.
Citations: 7
A Method for Electrical Property Tomography Based on a Three-Dimensional Integral Representation of the Electric Field
IF 10.6 Zone 1 (Medicine) Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date: 2021-08-13 DOI: 10.36227/techrxiv.15153579.v1
Naohiro Eda, Motofumi Fushimi, Keisuke Hasegawa, T. Nara
Magnetic resonance electrical properties tomography (MREPT) noninvasively reconstructs high-resolution electrical property (EP) maps using MRI scanners and is useful for diagnosing cancerous tissues. However, conventional MREPT methods have limitations: sensitivity to noise in the numerical Laplacian operation, difficulty in reconstructing three-dimensional (3D) EPs, and no guarantee of convergence in the iterative process. We propose a novel, iterative 3D reconstruction MREPT method without a numerical Laplacian operation. We derive an integral representation of the electric field using its Helmholtz decomposition with Maxwell’s equations, under the assumption that the EPs are known on the boundary of the region of interest and with the approximation that the unmeasurable magnetic field components are zero. Then, we solve the simultaneous equations composed of the integral representation and Ampere’s law using a convex projection algorithm whose convergence is theoretically guaranteed. The efficacy of the proposed method was validated through numerical simulations and a phantom experiment. The results showed that this method is effective in reconstructing 3D EPs and is robust to noise. It was also shown that our proposed method with the unmeasurable component $H^{-}$ enhances the accuracy of the EPs in the background, and that using all the components of the magnetic field reduces the artifacts at the center of the slices, except when all the components of the electric field are close to zero.
Citations: 3