DHT: Dynamic Vision Transformer Using Hybrid Window Attention for Industrial Defect Images Classification

IF 1.6 | Zone 4, Engineering & Technology | Q3, ENGINEERING, ELECTRICAL & ELECTRONIC | IEEE Instrumentation & Measurement Magazine | Pub Date: 2023-04-01 | DOI: 10.1109/MIM.2023.10083000
Chao Ding, Donglin Teng, Xianghua Zheng, Qiang Wang, Yuanyuan He, Zhang Long
Journal: IEEE Instrumentation & Measurement Magazine, volume 26 1, pages 19-28. Publication type: Journal Article. Open access: no. Source platform: Semantic Scholar. URL: https://doi.org/10.1109/MIM.2023.10083000
Citations: 0

Abstract

DHT: Dynamic Vision Transformer Using Hybrid Window Attention for Industrial Defect Images Classification
Industrial defect detection is gaining importance in industrial product quality control. Because industrial defect types are complex and variable, highly accurate and efficient defect detection is an interesting but challenging problem. Vision transformers have been highly successful in a variety of computer vision tasks, owing to their ability to capture global information in images. Nevertheless, capturing only global information is problematic. On the one hand, because transformers lack the inductive biases of Convolutional Neural Networks (CNNs), they have difficulty focusing on the local features of defects in industrial defect image inspection tasks. On the other hand, global computation leads to excessive memory and computational cost. To mitigate these issues, we propose a new vision transformer architecture that contains Hybrid Window Attention (HWA) and Dynamic Token Normalization (DTN). HWA, which combines pooling attention and window attention, reduces computational complexity and improves efficiency. DTN enables transformers to focus on both the global information and the local features of defects, thus improving the accuracy of industrial surface defect detection. Extensive experiments demonstrate that our Dynamic Vision Transformer (DHT) achieves 96.8% and 98.5% classification accuracy on the NEU dataset and the DAGM dataset, respectively, with low computational complexity.
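The abstract attributes the efficiency gain to replacing global attention with window attention and pooling attention, but it does not say how HWA combines the two. The following is only a minimal NumPy sketch of the two ingredients in their generic form, not the authors' implementation. The function names, the identity query/key/value projections, the single attention head, and the parameters `w` (window size) and `p` (pooling size) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention; works on 2-D or batched 3-D inputs."""
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def window_partition(x, w):
    """Split an (H, W, C) feature map into non-overlapping w x w windows.

    Returns an array of shape (num_windows, w*w, C). H and W must be
    divisible by w.
    """
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, w * w, C)

def window_attention(x, w):
    """Self-attention restricted to each window (identity projections)."""
    win = window_partition(x, w)
    return attention(win, win, win)

def pooled_attention(x, p):
    """Every token attends to keys/values average-pooled over p x p blocks."""
    H, W, C = x.shape
    q = x.reshape(H * W, C)
    kv = x.reshape(H // p, p, W // p, p, C).mean(axis=(1, 3)).reshape(-1, C)
    return attention(q, kv, kv)

def score_counts(H, W, w, p):
    """Number of attention-score entries computed by each variant."""
    n = H * W
    return {
        "global": n * n,                       # every token vs. every token
        "window": n * w * w,                   # every token vs. its window
        "pooled": n * (H // p) * (W // p),     # every token vs. pooled tokens
    }

if __name__ == "__main__":
    x = np.random.default_rng(0).normal(size=(56, 56, 32))
    print(window_attention(x, 7).shape)   # one output per window token
    print(pooled_attention(x, 7).shape)   # one output per original token
    print(score_counts(56, 56, 7, 7))
```

For a 56 x 56 token grid with `w = p = 7`, global attention computes about 9.8 million score entries, while the windowed and pooled variants each compute a few hundred thousand. Window attention keeps the computation local, whereas pooled keys and values give every token a coarse view of the whole image, which is one plausible reason to pair them.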
Source journal
IEEE Instrumentation & Measurement Magazine (Engineering & Technology: Engineering, Electrical & Electronic)
CiteScore: 4.20
Self-citation rate: 4.80%
Articles published: 147
Review time: >12 weeks
Journal description: IEEE Instrumentation & Measurement Magazine is a bimonthly publication. It publishes in February, April, June, August, October, and December of each year. The magazine covers a wide variety of topics in instrumentation, measurement, and systems that measure or instrument equipment or other systems. The magazine has the goal of providing readable introductions and overviews of technology in instrumentation and measurement to a wide engineering audience. It does this through articles, tutorials, columns, and departments. Its goal is to cross disciplines to encourage further research and development in instrumentation and measurement.