Swin Transformer with Local Aggregation

Lu Chen, Yang Bai, Q. Cheng, Mei Wu
2022 3rd International Conference on Information Science, Parallel and Distributed Systems (ISPDS)
DOI: 10.1109/ISPDS56360.2022.9874052
Published: 2022-07-22
Citations: 2

Abstract

Despite the many advantages of Convolutional Neural Networks (CNNs), their receptive fields are usually small, which hinders capturing global features. In contrast, the Transformer can capture long-range dependencies and obtain global information about an image through self-attention. To combine the advantages of CNNs and Transformers, we propose integrating a Local Aggregation module into the Swin Transformer architecture. The Local Aggregation module consists of lightweight Depthwise Convolution and Pointwise Convolution, and it captures local information from the feature maps at each stage of the Swin Transformer. Our experiments demonstrate that this integrated model improves accuracy. On the CIFAR-10 dataset, the Top-1 accuracy reaches 87.74%, which is 3.32% higher than Swin, and the Top-5 accuracy reaches 99.54%; on the Mini-ImageNet dataset, the Top-1 accuracy reaches 79.1%, which is 7.68% higher than Swin, and the Top-5 accuracy reaches 94.02%, which is 3.25% higher than Swin.
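The abstract does not include the authors' code, but the core of the Local Aggregation module — a depthwise convolution (one spatial filter per channel) followed by a pointwise 1×1 convolution that mixes channels — can be sketched in plain NumPy. This is a minimal illustrative sketch, not the paper's implementation; the 3×3 kernel size, same-padding, and function names are assumptions.

```python
import numpy as np

def depthwise_conv(x, kernels):
    # Depthwise convolution: each channel is filtered by its own kernel.
    # x: (C, H, W); kernels: (C, k, k); same-padding keeps spatial size.
    C, H, W = x.shape
    k = kernels.shape[1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * kernels[c])
    return out

def pointwise_conv(x, weight):
    # Pointwise (1x1) convolution: a linear mix across channels at each
    # spatial position. weight: (C_out, C_in) -> output (C_out, H, W).
    return np.tensordot(weight, x, axes=([1], [0]))

def local_aggregation(x, dw_kernels, pw_weight):
    # Depthwise then pointwise, the depthwise-separable pattern the
    # abstract describes for capturing local feature-map information.
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weight)
```

In the actual model this block would operate on the token feature maps at each Swin stage (reshaped to C×H×W), with learned kernels and weights; here the functions just demonstrate the data flow.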