Lagrange Duality and Compound Multi-Attention Transformer for Semi-Supervised Medical Image Segmentation

arXiv - CS - Artificial Intelligence | Pub Date: 2024-09-12 | DOI: arxiv-2409.07793

Fuchen Zheng, Quanjun Li, Weixuan Li, Xuhang Chen, Yihang Dong, Guoheng Huang, Chi-Man Pun, Shoujun Zhou



Abstract

Medical image segmentation, a critical application of semantic segmentation in healthcare, has advanced significantly through specialized computer vision techniques. Although deep learning-based medical image segmentation is essential for assisting medical diagnosis, the lack of diverse training data causes a long-tail problem. Moreover, most previous hybrid CNN-ViT architectures have limited ability to combine different kinds of attention across the layers of the convolutional neural network. To address these issues, we propose a Lagrange Duality Consistency (LDC) Loss, integrated with a Boundary-Aware Contrastive Loss, as the overall training objective for semi-supervised learning to mitigate the long-tail problem. Additionally, we introduce CMAformer, a novel network that combines the strengths of ResUNet and Transformer. The cross-attention block in CMAformer integrates spatial attention and channel attention for multi-scale feature fusion. Overall, our results indicate that CMAformer, combined with the feature fusion framework and the new consistency loss, demonstrates strong complementarity in semi-supervised learning ensembles. We achieve state-of-the-art results on multiple public medical image datasets. Example code is available at https://github.com/lzeeorno/Lagrange-Duality-and-CMAformer.
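The abstract does not give the form of the LDC Loss, so the following is only a minimal sketch of the general pattern that a Lagrangian-style consistency objective suggests, not the authors' formulation. It assumes that the consistency between two predictions on the same unlabeled input is treated as a constraint (at most `eps`) and that a multiplier `lam` is updated by dual ascent. Every name here (`cross_entropy`, `consistency`, `lagrangian`, `dual_ascent_step`, `eps`, `lam`) is hypothetical and chosen for illustration.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one labeled pixel."""
    return -math.log(max(probs[label], 1e-12))

def consistency(p, q):
    """Mean squared difference between two class-probability vectors
    (assumed measure; the paper may use a different one)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

def lagrangian(sup_loss, cons_loss, lam, eps):
    """Primal objective: supervised loss plus multiplier-weighted
    constraint violation, L = sup + lam * (cons - eps)."""
    return sup_loss + lam * (cons_loss - eps)

def dual_ascent_step(lam, cons_loss, eps, lr):
    """Increase lam while the constraint cons <= eps is violated,
    decrease it otherwise, and keep it non-negative."""
    return max(0.0, lam + lr * (cons_loss - eps))

# Toy data: one labeled pixel and two predictions (e.g. from two
# perturbed views) for one unlabeled pixel, over three classes.
labeled_pred, label = [0.7, 0.2, 0.1], 0
view_a, view_b = [0.6, 0.3, 0.1], [0.2, 0.7, 0.1]

lam, eps = 0.5, 0.01                      # hypothetical starting values
sup = cross_entropy(labeled_pred, label)
cons = consistency(view_a, view_b)
total = lagrangian(sup, cons, lam, eps)   # value minimized w.r.t. the model
new_lam = dual_ascent_step(lam, cons, eps, lr=1.0)

print(f"supervised={sup:.4f} consistency={cons:.4f} "
      f"total={total:.4f} lam: {lam} -> {new_lam:.4f}")
```

In this toy case the two views disagree by more than `eps`, so the multiplier grows and the next primal step would weight consistency more heavily. A fixed or ramped-up weight, as in common consistency-regularization schemes, would replace `dual_ascent_step` with a schedule.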


