Xiang Gao, Sining Wu, Ying Zhou, Fan Wang, Xiaopeng Hu
DOI: 10.1007/s00530-024-01435-4 · Published: 2024-08-01
LCFormer: linear complexity transformer for efficient image super-resolution
Recently, Transformer-based methods have made significant breakthroughs in single image super-resolution (SISR), but at considerable computational cost. In this paper, we propose a novel Linear Complexity Transformer (LCFormer) for efficient image super-resolution. Specifically, since vanilla self-attention (SA) has quadratic complexity and often ignores potential correlations among different data samples, External Attention (EA) is introduced into the Transformer to reduce the complexity from quadratic to linear while implicitly capturing correlations across the whole dataset. To improve training speed and performance, Root Mean Square Layer Normalization (RMSNorm) is adopted in the Transformer layers. Moreover, an Efficient Gated Depth-wise-conv Feed-forward Network (EGDFN), built from a gating mechanism and depth-wise convolutions, is designed for efficient feature representation. The proposed LCFormer achieves performance comparable or superior to existing Transformer-based methods while dramatically reducing computational complexity and GPU memory consumption. Extensive experiments demonstrate that LCFormer achieves competitive accuracy and visual quality against other state-of-the-art methods and strikes a good trade-off between model performance and computation cost.
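To make the two core operations concrete, here is a minimal NumPy sketch of External Attention (following the standard EA formulation, where two small learnable memories replace the N×N self-attention map) and of RMSNorm. This is an illustrative sketch only, not LCFormer's actual implementation; the memory size `S` and all array shapes are assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def external_attention(F, Mk, Mv):
    """External Attention: F is (N, d) token features; Mk, Mv are (S, d)
    learnable external memories shared across the dataset. Cost is
    O(N * S * d), i.e. linear in the number of tokens N, instead of the
    O(N^2 * d) of vanilla self-attention."""
    attn = F @ Mk.T                                          # (N, S) similarity to memory units
    attn = softmax(attn, axis=0)                             # double normalization, step 1: over tokens
    attn = attn / (attn.sum(axis=1, keepdims=True) + 1e-9)   # step 2: l1-normalize over memory units
    return attn @ Mv                                         # (N, d) aggregated output

def rmsnorm(x, g, eps=1e-6):
    """RMSNorm: rescale by the root mean square of the features,
    with a learnable gain g and no mean subtraction (cheaper than LayerNorm)."""
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps) * g

# Example: 16 tokens of dimension 8, memory of size 4 (hypothetical sizes).
rng = np.random.default_rng(0)
F = rng.standard_normal((16, 8))
Mk, Mv = rng.standard_normal((4, 8)), rng.standard_normal((4, 8))
out = external_attention(rmsnorm(F, np.ones(8)), Mk, Mv)     # (16, 8)
```

Because `Mk` and `Mv` are shared parameters rather than per-sample projections, EA implicitly couples all samples seen during training, which is the dataset-level correlation the abstract refers to.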