Lagrange Duality and Compound Multi-Attention Transformer for Semi-Supervised Medical Image Segmentation
Authors: Fuchen Zheng, Quanjun Li, Weixuan Li, Xuhang Chen, Yihang Dong, Guoheng Huang, Chi-Man Pun, Shoujun Zhou
Venue: arXiv - CS - Artificial Intelligence
Published: 2024-09-12
Identifier: arxiv-2409.07793
Citations: 0
Abstract
Medical image segmentation, a critical application of semantic segmentation
in healthcare, has seen significant advancements through specialized computer
vision techniques. While deep learning-based medical image segmentation is
essential for assisting in medical diagnosis, the lack of diverse training data
causes the long-tail problem. Moreover, most previous hybrid CNN-ViT
architectures have limited ability to combine different types of attention
across the layers of the Convolutional Neural Network. To address these issues, we propose
a Lagrange Duality Consistency (LDC) Loss, integrated with Boundary-Aware
Contrastive Loss, as the overall training objective for semi-supervised
learning to mitigate the long-tail problem. Additionally, we introduce
CMAformer, a novel network that synergizes the strengths of ResUNet and
Transformer. The cross-attention block in CMAformer effectively integrates
spatial attention and channel attention for multi-scale feature fusion.
Overall, our results indicate that CMAformer, combined with the feature fusion
framework and the new consistency loss, demonstrates strong complementarity in
semi-supervised learning ensembles. We achieve state-of-the-art results on
multiple public medical image datasets. Example code is available at:
https://github.com/lzeeorno/Lagrange-Duality-and-CMAformer.
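
The abstract does not spell out the form of the LDC loss, so the following is only a minimal sketch of the general idea a Lagrange-duality treatment of a consistency term usually rests on, not the authors' formulation. It assumes a constrained problem (minimize a supervised loss subject to a consistency loss staying below a tolerance), relaxes it into a Lagrangian, and updates the multiplier by projected dual ascent. All names (`lagrangian_objective`, `dual_ascent_step`, `mean_squared_disagreement`, `eps`, `step_size`) are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: a generic Lagrangian relaxation of a
# consistency constraint. This is NOT the paper's LDC loss; every name
# and hyperparameter here is an assumption made for the example.


def mean_squared_disagreement(pred_a, pred_b):
    """Consistency term: mean squared difference between two predictions
    for the same unlabeled input (e.g. from two perturbed views)."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)


def lagrangian_objective(sup_loss, cons_loss, lam, eps):
    """Lagrangian of: minimize sup_loss subject to cons_loss <= eps.
    L = sup_loss + lam * (cons_loss - eps), with multiplier lam >= 0."""
    return sup_loss + lam * (cons_loss - eps)


def dual_ascent_step(lam, cons_loss, eps, step_size):
    """Projected gradient ascent on the multiplier: lam grows while the
    constraint is violated, shrinks otherwise, and is clipped at zero."""
    return max(0.0, lam + step_size * (cons_loss - eps))


# Usage: one update on toy predictions for three pixels.
pred_a = [0.9, 0.2, 0.7]
pred_b = [0.8, 0.4, 0.7]
cons = mean_squared_disagreement(pred_a, pred_b)
lam = dual_ascent_step(lam=0.5, cons_loss=cons, eps=0.01, step_size=1.0)
total = lagrangian_objective(sup_loss=0.3, cons_loss=cons, lam=lam, eps=0.01)
```

In a scheme like this the weight on the consistency term is adapted from how badly the constraint is violated instead of being hand-tuned; whether the paper's loss works this way should be checked against the linked repository.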