{"title":"Residual SwinV2 transformer coordinate attention network for image super resolution","authors":"Yushi Lei, Zhengwei Zhu, Yilin Qin, Chenyang Zhu, Yanping Zhu","doi":"10.3233/aic-230340","DOIUrl":null,"url":null,"abstract":"Swin Transformers have been designed and used in various image super-resolution (SR) applications. One of the recent image restoration methods is RSTCANet, which combines Swin Transformer with Channel Attention. However, for some channels of images that may carry less useful information or noise, Channel Attention cannot automatically learn the insignificance of these channels. Instead, it tries to enhance their expression capability by adjusting the weights. It may lead to excessive focus on noise information while neglecting more essential features. In this paper, we propose a new image SR method, RSVTCANet, based on an extension of Swin2SR. Specifically, to effectively gather global information for the channel of images, we modify the Residual SwinV2 Transformer blocks in Swin2SR by introducing the coordinate attention for each two successive SwinV2 Transformer Layers (S2TL) and replacing Multi-head Self-Attention (MSA) with Efficient Multi-head Self-Attention version 2 (EMSAv2) to employ the resulting residual SwinV2 Transformer coordinate attention blocks (RSVTCABs) for feature extraction. Additionally, to improve the generalization of RSVTCANet during training, we apply an optimized RandAugment for data augmentation on the training dataset. Extensive experimental results show that RSVTCANet outperforms the recent image SR method regarding visual quality and measures such as PSNR and SSIM.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":null,"pages":null},"PeriodicalIF":1.4000,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Communications","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/aic-230340","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Swin Transformers have been designed and used in various image super-resolution (SR) applications. One of the recent image restoration methods is RSTCANet, which combines Swin Transformer with Channel Attention. However, for some channels of images that may carry less useful information or noise, Channel Attention cannot automatically learn the insignificance of these channels. Instead, it tries to enhance their expression capability by adjusting the weights. It may lead to excessive focus on noise information while neglecting more essential features. In this paper, we propose a new image SR method, RSVTCANet, based on an extension of Swin2SR. Specifically, to effectively gather global information for the channel of images, we modify the Residual SwinV2 Transformer blocks in Swin2SR by introducing the coordinate attention for each two successive SwinV2 Transformer Layers (S2TL) and replacing Multi-head Self-Attention (MSA) with Efficient Multi-head Self-Attention version 2 (EMSAv2) to employ the resulting residual SwinV2 Transformer coordinate attention blocks (RSVTCABs) for feature extraction. Additionally, to improve the generalization of RSVTCANet during training, we apply an optimized RandAugment for data augmentation on the training dataset. Extensive experimental results show that RSVTCANet outperforms the recent image SR method regarding visual quality and measures such as PSNR and SSIM.
期刊介绍:
AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: Scientific institutions as well as commercial and industrial companies.
AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition it contains high-level background material, both at the technical level as well as the level of opinions, policies and news.