
Latest publications: Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention

SegNetr: Rethinking the local-global interactions and skip connections in U-shaped networks
Junlong Cheng, Chengrui Gao, Fengjie Wang, Min Zhu
Recently, U-shaped networks have dominated the field of medical image segmentation due to their simple and easily tuned structure. However, existing U-shaped segmentation networks: 1) mostly focus on designing complex self-attention modules to compensate for the lack of long-range dependence in convolution operations, which increases the overall number of parameters and the computational complexity of the network; 2) simply fuse the features of the encoder and decoder, ignoring the connection between their spatial locations. In this paper, we rethink the above problems and build a lightweight medical image segmentation network, called SegNetr. Specifically, we introduce a novel SegNetr block that can perform local-global interactions dynamically at any stage and with only linear complexity. At the same time, we design a general information retention skip connection (IRSC) to preserve the spatial location information of encoder features and achieve accurate fusion with the decoder features. We validate the effectiveness of SegNetr on four mainstream medical image segmentation datasets, with 59% and 76% fewer parameters and GFLOPs than vanilla U-Net, while achieving segmentation performance comparable to state-of-the-art methods. Notably, the components proposed in this paper can be applied to other U-shaped networks to improve their segmentation performance.
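The gap between quadratic-cost global self-attention and the linear complexity claimed for the SegNetr block can be illustrated with a back-of-envelope operation count. The functions below are a hypothetical sketch of the asymptotics only, not the paper's actual block:

```python
# Rough operation counts for attention over N patches of dimension d.
# Global self-attention scales as O(N^2 * d); a windowed (local) scheme
# with fixed window size w scales as O(N * w * d), i.e. linear in N.
def global_attention_ops(n_patches: int, dim: int) -> int:
    return n_patches * n_patches * dim

def windowed_attention_ops(n_patches: int, window: int, dim: int) -> int:
    return n_patches * window * dim

if __name__ == "__main__":
    n, d, w = 1024, 64, 16
    g = global_attention_ops(n, d)       # 1024 * 1024 * 64 = 67,108,864
    l = windowed_attention_ops(n, w, d)  # 1024 * 16 * 64   =  1,048,576
    print(g // l)  # global attention costs 64x more at this resolution
```

Doubling the number of patches doubles the windowed cost but quadruples the global one, which is why linear-complexity interaction modules matter for high-resolution medical images.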
DOI: 10.48550/arXiv.2307.02953 | Pages: 64-74 | Published: 2023-07-06
Citations: 1
Semi-supervised Domain Adaptive Medical Image Segmentation through Consistency Regularized Disentangled Contrastive Learning
Hritam Basak, Zhaozheng Yin
Although unsupervised domain adaptation (UDA) is a promising direction for alleviating domain shift, UDA methods fall short of their supervised counterparts. In this work, we investigate the relatively less explored semi-supervised domain adaptation (SSDA) for medical image segmentation, where access to a few labeled target samples can improve the adaptation performance substantially. Specifically, we propose a two-stage training process. First, an encoder is pre-trained in a self-learning paradigm using a novel domain-content disentangled contrastive learning (CL) along with a pixel-level feature consistency constraint. The proposed CL enforces the encoder to learn discriminative, content-specific but domain-invariant semantics on a global scale from the source and target images, whereas consistency regularization enforces the mining of local pixel-level information by maintaining spatial sensitivity. This pre-trained encoder, along with a decoder, is further fine-tuned for the downstream task (i.e., pixel-level segmentation) in a semi-supervised setting. Furthermore, we experimentally validate that our proposed method can easily be extended to UDA settings, adding to the superiority of the proposed strategy. Upon evaluation on two domain-adaptive image segmentation tasks, our proposed method outperforms the SoTA methods in both SSDA and UDA settings. Code is available at https://github.com/hritam-98/GFDA-disentangled
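The pixel-level feature consistency constraint mentioned above can be sketched as a mean-squared difference between the feature maps of two augmented views of the same image. This is a minimal illustration of the idea; the paper's actual constraint may differ in detail:

```python
# Pixel-level consistency loss: mean squared difference between the
# (flattened) feature maps produced from two augmented views of the
# same input. Identical features give zero loss.
def consistency_loss(feat_a, feat_b):
    assert len(feat_a) == len(feat_b)
    n = len(feat_a)
    return sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)) / n

if __name__ == "__main__":
    view1 = [0.2, 0.5, 0.9]
    view2 = [0.2, 0.5, 0.9]
    print(consistency_loss(view1, view2))  # 0.0 for identical features
```

Minimizing this term pushes the encoder to produce spatially aligned features regardless of the augmentation applied, which is what preserves the local spatial sensitivity the abstract refers to.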
DOI: 10.48550/arXiv.2307.02798 | Pages: 260-270 | Published: 2023-07-06
Citations: 1
A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications
M. Pennisi, Federica Proietto Salanitri, G. Bellitto, S. Palazzo, Ulas Bagci, C. Spampinato
Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution. However, from a privacy perspective, using GANs as a proxy for data sharing is not a safe solution, as they tend to embed near-duplicates of real samples in the latent space. Recent works, inspired by k-anonymity principles, address this issue through sample aggregation in the latent space, with the drawback of reducing the dataset by a factor of k. Our work aims to mitigate this problem by proposing a latent space navigation strategy able to generate diverse synthetic samples that may support effective training of deep models, while addressing privacy concerns in a principled way. Our approach leverages an auxiliary identity classifier as a guide to non-linearly walk between points in the latent space, minimizing the risk of collision with near-duplicates of real samples. We empirically demonstrate that, given any random pair of points in the latent space, our walking strategy is safer than linear interpolation. We then test our path-finding strategy combined with k-same methods and demonstrate, on two benchmarks for tuberculosis and diabetic retinopathy classification, that training a model on samples generated by our approach mitigates drops in performance while preserving privacy.
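Why linear interpolation is risky can be illustrated by checking how close a straight-line latent path comes to the embedding of a real sample. The sketch below is purely illustrative (the guided, non-linear walk in the paper is learned, not a fixed path):

```python
import math

# Sample points along the straight line between two latents and report
# the minimum distance to any embedded real sample. A small minimum
# means the path passes dangerously close to a near-duplicate.
def lerp(a, b, t):
    return [x + t * (y - x) for x, y in zip(a, b)]

def min_distance_on_path(start, end, real_points, steps=10):
    best = math.inf
    for i in range(steps + 1):
        p = lerp(start, end, i / steps)
        for r in real_points:
            best = min(best, math.dist(p, r))
    return best

if __name__ == "__main__":
    real = [[0.5, 0.5]]  # embedding of a real sample (illustrative)
    d = min_distance_on_path([0, 0], [1, 1], real)
    print(d)  # 0.0: the straight line passes through the real sample
```

A guided walk would instead route around such regions, trading path straightness for a guaranteed margin from real-sample embeddings.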
DOI: 10.48550/arXiv.2307.02984 | Pages: 422-431 | Published: 2023-07-06
Citations: 0
The Role of Subgroup Separability in Group-Fair Medical Image Classification
Charles Jones, Mélanie Roschewitz, Ben Glocker
We investigate performance disparities in deep classifiers. We find that the ability of classifiers to separate individuals into subgroups varies substantially across medical imaging modalities and protected characteristics; crucially, we show that this property is predictive of algorithmic bias. Through theoretical analysis and extensive empirical evaluation, we find a relationship between subgroup separability, subgroup disparities, and performance degradation when models are trained on data with systematic bias such as underdiagnosis. Our findings shed new light on the question of how models become biased, providing important insights for the development of fair medical imaging AI.
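The performance disparities studied here are typically surfaced by computing a metric per protected subgroup rather than over the whole test set. A toy illustration (names and data are hypothetical, not from the paper):

```python
# Group-wise accuracy: aggregate correct/total counts per subgroup tag,
# then report accuracy for each group. A large gap between groups is
# the kind of disparity the paper links to subgroup separability.
def subgroup_accuracies(preds, labels, groups):
    stats = {}
    for p, y, g in zip(preds, labels, groups):
        correct, total = stats.get(g, (0, 0))
        stats[g] = (correct + (p == y), total + 1)
    return {g: c / t for g, (c, t) in stats.items()}

if __name__ == "__main__":
    preds  = [1, 1, 0, 0, 1, 0]
    labels = [1, 1, 0, 1, 0, 1]
    groups = ["A", "A", "A", "B", "B", "B"]
    print(subgroup_accuracies(preds, labels, groups))  # {'A': 1.0, 'B': 0.0}
```

Reporting per-group metrics like this is a precondition for detecting the systematic-bias effects (e.g. underdiagnosis of one group) that the abstract describes.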
DOI: 10.48550/arXiv.2307.02791 | Pages: 179-188 | Published: 2023-07-06
Citations: 3
DisAsymNet: Disentanglement of Asymmetrical Abnormality on Bilateral Mammograms using Self-adversarial Learning
Xin Wang, T. Tan, Yuan Gao, Luyi Han, Tianyu Zhang, Chun-Fang Lu, R. Beets-Tan, Ruisheng Su, R. Mann
Asymmetry is a crucial characteristic of bilateral mammograms (Bi-MG) when abnormalities are developing. It is widely utilized by radiologists for diagnosis. The question of 'what the symmetrical Bi-MG would look like when the asymmetrical abnormalities have been removed?' has not yet received strong attention in the development of algorithms on mammograms. Addressing this question could provide valuable insights into mammographic anatomy and aid in diagnostic interpretation. Hence, we propose a novel framework, DisAsymNet, which utilizes asymmetrical-abnormality-transformer-guided self-adversarial learning for disentangling abnormalities and symmetric Bi-MG. At the same time, our proposed method is partially guided by randomly synthesized abnormalities. We conduct experiments on three public datasets and one in-house dataset, and demonstrate that our method outperforms existing methods in abnormality classification, segmentation, and localization tasks. Additionally, reconstructed normal mammograms can provide insights toward better interpretable visual cues for clinical diagnosis. The code will be accessible to the public.
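The core bilateral-asymmetry cue can be sketched with a trivial baseline: mirror the right view and take the absolute difference against the left, so nonzero entries mark asymmetric regions. This is purely illustrative and not the DisAsymNet model:

```python
# Asymmetry map for a pair of bilateral images stored as 2D lists:
# horizontally flip the right image, then take per-pixel |left - right|.
def asymmetry_map(left, right):
    mirrored = [row[::-1] for row in right]  # horizontal flip
    return [[abs(a - b) for a, b in zip(lr, mr)]
            for lr, mr in zip(left, mirrored)]

if __name__ == "__main__":
    left  = [[0, 1, 2],
             [0, 1, 2]]
    right = [[2, 1, 0],   # a mirror image of `left`...
             [2, 1, 9]]   # ...except one abnormal pixel
    print(asymmetry_map(left, right))  # [[0, 0, 0], [9, 0, 0]]
```

Real mammogram pairs are never perfectly registered, which is why the paper learns the disentanglement rather than relying on a raw pixel difference like this one.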
DOI: 10.48550/arXiv.2307.02935 | Pages: 57-67 | Published: 2023-07-06
Citations: 0
Self-supervised learning via inter-modal reconstruction and feature projection networks for label-efficient 3D-to-2D segmentation
José Morano, Guilherme Aresta, D. Lachinov, Julia Mai, U. Schmidt-Erfurth, Hrvoje Bogunovi'c
Deep learning has become a valuable tool for the automation of certain medical image segmentation tasks, significantly relieving the workload of medical specialists. Some of these tasks require segmentation to be performed on a subset of the input dimensions, the most common case being 3D-to-2D. However, the performance of existing methods is strongly conditioned by the amount of labeled data available, as there is currently no data-efficient method, e.g., transfer learning, that has been validated on these tasks. In this work, we propose a novel convolutional neural network (CNN) and self-supervised learning (SSL) method for label-efficient 3D-to-2D segmentation. The CNN is composed of a 3D encoder and a 2D decoder connected by novel 3D-to-2D blocks. The SSL method consists of reconstructing image pairs of modalities with different dimensionality. The approach has been validated on two tasks with clinical relevance: the en-face segmentation of geographic atrophy and reticular pseudodrusen in optical coherence tomography. Results on different datasets demonstrate that the proposed CNN significantly improves the state of the art in scenarios with limited labeled data, by up to 8% in Dice score. Moreover, the proposed SSL method allows further improvement of this performance by up to 23%, and we show that the SSL is beneficial regardless of the network architecture.
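The essence of a 3D-to-2D connection is collapsing the depth axis of a 3D feature volume into a 2D map. A fixed max-pooling over depth is the simplest possible sketch of this (the paper's 3D-to-2D blocks are learned, not a fixed pooling):

```python
# Collapse the depth axis of a volume[d][h][w] into a 2D map[h][w]
# by taking the maximum activation along depth at each spatial site.
def collapse_depth(volume):
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[d][h][w] for d in range(depth))
             for w in range(width)]
            for h in range(height)]

if __name__ == "__main__":
    vol = [[[1, 2], [3, 4]],
           [[5, 0], [1, 9]]]   # two depth slices of a 2x2 map
    print(collapse_depth(vol))  # [[5, 2], [3, 9]]
```

En-face OCT segmentation has exactly this shape: the input is a 3D scan, but the target label map lives on the 2D projection plane.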
DOI: 10.48550/arXiv.2307.03008 | Pages: 589-599 | Published: 2023-07-06
Citations: 0
Multi-Scale Prototypical Transformer for Whole Slide Image Classification
Saisai Ding, Jun Wang, Juncheng Li, Jun Shi
Whole slide image (WSI) classification is an essential task in computational pathology. Despite the recent advances in multiple instance learning (MIL) for WSI classification, accurate classification of WSIs remains challenging due to the extreme imbalance between the positive and negative instances in bags, and the complicated pre-processing needed to fuse multi-scale information of WSI. To this end, we propose a novel multi-scale prototypical Transformer (MSPT) for WSI classification, which includes a prototypical Transformer (PT) module and a multi-scale feature fusion module (MFFM). The PT is developed to reduce redundant instances in bags by integrating prototypical learning into the Transformer architecture. It substitutes all instances with cluster prototypes, which are then re-calibrated through the self-attention mechanism of the Transformer. Thereafter, an MFFM is proposed to fuse the clustered prototypes of different scales, which employs MLP-Mixer to enhance the information communication between prototypes. The experimental results on two public WSI datasets demonstrate that the proposed MSPT outperforms all the compared algorithms, suggesting its potential applications.
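The instance-to-prototype substitution can be sketched as one nearest-centroid step: assign each instance embedding to its closest cluster centre and represent the bag by the per-cluster means. This is a plain clustering sketch, not the full PT module with self-attention re-calibration:

```python
import math

# Replace a bag of instance embeddings with cluster prototypes:
# assign each instance to its nearest centre, then average each bucket.
def assign_and_average(instances, centres):
    buckets = {i: [] for i in range(len(centres))}
    for x in instances:
        nearest = min(range(len(centres)),
                      key=lambda i: math.dist(x, centres[i]))
        buckets[nearest].append(x)
    prototypes = []
    for i, members in buckets.items():
        if members:
            dim = len(members[0])
            prototypes.append([sum(m[d] for m in members) / len(members)
                               for d in range(dim)])
        else:
            prototypes.append(list(centres[i]))  # keep empty centres as-is
    return prototypes

if __name__ == "__main__":
    inst = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0]]
    print(assign_and_average(inst, [[0.0, 0.0], [1.0, 1.0]]))
```

Since a WSI bag can contain tens of thousands of patches, shrinking it to a handful of prototypes is what makes subsequent Transformer self-attention over the bag tractable.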
DOI: 10.48550/arXiv.2307.02308 | Pages: 602-611 | Published: 2023-07-05
Citations: 1
LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion
Long Bai, Tong Chen, Yanan Wu, An-Chi Wang, Mobarakol Islam, Hongliang Ren
Wireless capsule endoscopy (WCE) is a painless and non-invasive diagnostic tool for gastrointestinal (GI) diseases. However, due to GI anatomical constraints and hardware manufacturing limitations, WCE vision signals may suffer from insufficient illumination, leading to complicated screening and examination procedures. Deep learning-based low-light image enhancement (LLIE) in the medical field has gradually attracted researchers' attention. Given the rapid development of the denoising diffusion probabilistic model (DDPM) in computer vision, we introduce a WCE LLIE framework based on a multi-scale convolutional neural network (CNN) and a reverse diffusion process. The multi-scale design allows models to preserve high-resolution representations along with context information from low resolutions, while the curved wavelet attention (CWA) block is proposed for high-frequency and local feature learning. Furthermore, we combine the reverse diffusion procedure to further optimize the shallow output and generate the most realistic image. The proposed method is compared with ten state-of-the-art (SOTA) LLIE methods and significantly outperforms them both quantitatively and qualitatively. The superior performance on GI disease segmentation further demonstrates the clinical potential of our proposed model. Our code is publicly accessible.
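For context, the simplest classical low-light baseline that learned LLIE methods are measured against is gamma correction on normalized intensities. This is NOT the LLCaps diffusion model, only an illustration of the problem setting:

```python
# Gamma correction on pixel intensities normalized to [0, 1].
# A gamma below 1 brightens dark regions proportionally more than
# bright ones, a crude stand-in for learned low-light enhancement.
def gamma_correct(pixels, gamma=0.5):
    return [round(p ** gamma, 4) for p in pixels]

if __name__ == "__main__":
    dark = [0.04, 0.25, 0.81]
    print(gamma_correct(dark))  # [0.2, 0.5, 0.9]
```

Such global curves cannot recover texture lost in noise, which is the gap that attention-based and diffusion-based enhancers like LLCaps aim to close.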
DOI: 10.48550/arXiv.2307.02452 | Pages: 34-44 | Published: 2023-07-05
Citations: 3
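The LLCaps abstract above couples a multi-scale CNN with the DDPM reverse diffusion process to refine the enhanced ("shallow") output. As a rough illustration of what one reverse step looks like, here is a minimal NumPy sketch of the standard DDPM posterior update; the function name, the linear beta schedule, and the shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def ddpm_reverse_step(x_t, eps_pred, t, betas, rng=None):
    """One reverse-diffusion (denoising) step of a standard DDPM.

    Given the current noisy image x_t and the noise eps_pred predicted by a
    network, estimate x_{t-1} using the DDPM posterior mean, adding fresh
    noise scaled by sigma_t for all but the final step.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    a_t, ab_t = alphas[t], alpha_bars[t]
    # Posterior mean: remove the predicted noise, then rescale.
    mean = (x_t - (1.0 - a_t) / np.sqrt(1.0 - ab_t) * eps_pred) / np.sqrt(a_t)
    if t == 0:
        return mean  # the final step is deterministic
    sigma_t = np.sqrt(betas[t])
    z = (rng or np.random.default_rng()).standard_normal(x_t.shape)
    return mean + sigma_t * z

# Toy usage: a short linear schedule and a zero noise prediction.
betas = np.linspace(1e-4, 0.02, 10)
x = np.ones((4, 4))
x_prev = ddpm_reverse_step(x, np.zeros((4, 4)), 0, betas)
```

With `eps_pred = 0` and `t = 0` the update reduces to `x / sqrt(1 - beta_0)`, which is an easy sanity check on the arithmetic.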
MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets
Siyi Du, Nourhan Bayasi, G. Hamarneh, Rafeef Garbi
Despite its clinical utility, medical image segmentation (MIS) remains a daunting task due to images' inherent complexity and variability. Vision transformers (ViTs) have recently emerged as a promising solution to improve MIS; however, they require larger training datasets than convolutional neural networks. To overcome this obstacle, data-efficient ViTs were proposed, but they are typically trained using a single source of data, which overlooks the valuable knowledge that could be leveraged from other available datasets. Naively combining datasets from different domains can result in negative knowledge transfer (NKT), i.e., a decrease in model performance on some domains with non-negligible inter-domain heterogeneity. In this paper, we propose MDViT, the first multi-domain ViT that includes domain adapters to mitigate data-hunger and combat NKT by adaptively exploiting knowledge in multiple small data resources (domains). Further, to enhance representation learning across domains, we integrate a mutual knowledge distillation paradigm that transfers knowledge between a universal network (spanning all the domains) and auxiliary domain-specific branches. Experiments on 4 skin lesion segmentation datasets show that MDViT outperforms state-of-the-art algorithms, with superior segmentation performance and a fixed model size, at inference time, even as more domains are added. Our code is available at https://github.com/siyi-wind/MDViT.
DOI: 10.48550/arXiv.2307.02100 · Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 36, no. 1, pp. 448-458, published 2023-07-05.
Citations: 1
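MDViT's mutual knowledge distillation transfers knowledge between a universal network and auxiliary domain-specific branches. A common way to realize such two-way distillation is a symmetric, temperature-softened KL term between the two networks' logits; the sketch below is a generic NumPy illustration under that assumption, not MDViT's exact loss.

```python
import numpy as np

def softmax(z, T=1.0, axis=-1):
    """Temperature-softened softmax, numerically stabilized."""
    z = z / T
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl(p, q, eps=1e-12):
    """Mean KL divergence KL(p || q) over a batch of distributions."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean()

def mutual_distillation_loss(logits_universal, logits_branch, T=2.0):
    """Symmetric KL between a universal network and a domain branch.

    The T**2 factor is the conventional rescaling for temperature-softened
    distillation so that gradient magnitudes stay comparable across T.
    """
    p_u = softmax(logits_universal, T)
    p_b = softmax(logits_branch, T)
    return (T ** 2) * 0.5 * (kl(p_u, p_b) + kl(p_b, p_u))
```

The loss is zero when the two networks agree exactly and grows as their softened predictions diverge, pulling each network toward the other during joint training.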
Dual Arbitrary Scale Super-Resolution for Multi-Contrast MRI
Jiamiao Zhang, Yichen Chi, Jun Lyu, Wenming Yang, Yapeng Tian
Limited by imaging systems, the reconstruction of Magnetic Resonance Imaging (MRI) images from partial measurement is essential to medical imaging research. Benefiting from the diverse and complementary information of multi-contrast MR images in different imaging modalities, multi-contrast Super-Resolution (SR) reconstruction is promising to yield SR images with higher quality. In the medical scenario, to fully visualize the lesion, radiologists are accustomed to zooming the MR images at arbitrary scales rather than using a fixed scale, as used by most MRI SR methods. In addition, existing multi-contrast MRI SR methods often require a fixed resolution for the reference image, which makes acquiring reference images difficult and imposes limitations on arbitrary scale SR tasks. To address these issues, we propose an implicit neural representation-based dual-arbitrary multi-contrast MRI super-resolution method, called Dual-ArbNet. First, we decouple the resolution of the target and reference images by a feature encoder, enabling the network to input target and reference images at arbitrary scales. Then, an implicit fusion decoder fuses the multi-contrast features and uses an Implicit Decoding Function (IDF) to obtain the final MRI SR results. Furthermore, we introduce a curriculum learning strategy to train our network, which improves the generalization and performance of our Dual-ArbNet. Extensive experiments on two public MRI datasets demonstrate that our method outperforms state-of-the-art approaches under different scale factors and has great potential in clinical practice.
DOI: 10.48550/arXiv.2307.02334 · Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 38, no. 1, pp. 282-292, published 2023-07-05.
Citations: 0
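Dual-ArbNet's Implicit Decoding Function maps encoder features plus continuous coordinates to intensities, which is what makes arbitrary-scale output possible. The NumPy sketch below illustrates the general implicit-neural-representation decoding pattern (continuous coordinate grid, feature sampling, per-pixel decoder); the nearest-neighbour sampling and the toy "MLP" are simplifications for illustration, not the paper's architecture.

```python
import numpy as np

def make_coord_grid(h_out, w_out):
    """Continuous pixel-centre coordinates in [-1, 1] for an arbitrary output size."""
    ys = (np.arange(h_out) + 0.5) / h_out * 2 - 1
    xs = (np.arange(w_out) + 0.5) / w_out * 2 - 1
    return np.stack(np.meshgrid(ys, xs, indexing="ij"), axis=-1)  # (h_out, w_out, 2)

def implicit_decode(feat, h_out, w_out, mlp):
    """Query an implicit decoding function at an arbitrary output resolution.

    feat: (H, W, C) latent feature map from the encoder.
    mlp:  maps concatenated [feature, coordinate] vectors to an intensity.
    """
    H, W, C = feat.shape
    coords = make_coord_grid(h_out, w_out)                        # (h_out, w_out, 2)
    # Nearest-neighbour sampling of the feature map at each query coordinate
    # (real implementations typically use bilinear interpolation).
    iy = np.clip(((coords[..., 0] + 1) / 2 * H).astype(int), 0, H - 1)
    ix = np.clip(((coords[..., 1] + 1) / 2 * W).astype(int), 0, W - 1)
    sampled = feat[iy, ix]                                        # (h_out, w_out, C)
    inp = np.concatenate([sampled, coords], axis=-1)              # (h_out, w_out, C + 2)
    return mlp(inp)

# A stand-in "MLP": average the feature channels. A real model learns this mapping.
toy_mlp = lambda x: x[..., :-2].mean(axis=-1)
```

Because the decoder is queried per coordinate, the same trained network can render the output at 13x17, 256x256, or any other size without retraining, which is the core of arbitrary-scale SR.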