ETDNet: Efficient Transformer-Based Detection Network for Surface Defect Detection
Hantao Zhou; Rui Yang; Runze Hu; Chang Shu; Xiaochu Tang; Xiu Li
DOI: 10.1109/TIM.2023.3307753
https://ieeexplore.ieee.org/document/10227321/
Published: 2023
Citations: 1
Abstract
Deep learning (DL)-based surface defect detectors play a crucial role in ensuring product quality during inspection. However, detecting defects accurately and efficiently remains challenging because of characteristics inherent in defect images: a high degree of foreground–background similarity, scale variation, and shape variation. To address this challenge, we propose an efficient transformer-based detection network, ETDNet, built on three novel designs. First, ETDNet uses a lightweight vision transformer (ViT) to extract representative global features, which allows defects to be characterized accurately even against similar backgrounds. Second, a channel-modulated feature pyramid network (CM-FPN) is devised to fuse multilevel features while preserving the critical information from each level. Finally, a novel task-oriented decoupled (TOD) head is introduced to tackle the inconsistent representations required by the classification and regression tasks. The TOD head employs a local feature representation (LFR) module to learn object-aware local features and a global feature representation (GFR) module, based on the attention mechanism, to learn content-aware global features. With these two modules integrated into the head, ETDNet can effectively classify and localize defects of varying shapes and scales. Extensive experiments on several defect detection datasets demonstrate the effectiveness of the proposed ETDNet. For instance, on NEU-DET it achieves an AP of 46.7% (versus 45.9%) and an $\mathrm{AP}_{50}$ of 80.2% (versus 79.1%) at 49 frames/s. The code is available at https://github.com/zht8506/ETDNet.
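To put the reported NEU-DET figures in perspective, the short Python sketch below converts them into absolute gains and an average per-frame time. The numbers are taken from the abstract; the helper names are hypothetical, the abstract does not identify the baseline behind the "versus" values, and the latency figure assumes frames are processed one at a time, which the abstract does not state.

```python
def gain_points(new: float, baseline: float) -> float:
    """Absolute improvement in percentage points, rounded to one decimal."""
    return round(new - baseline, 1)


def latency_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput.

    Assumes frames are processed sequentially (an assumption, not a
    detail given in the abstract).
    """
    return 1000.0 / fps


# Figures reported in the abstract for NEU-DET.
ap_gain = gain_points(46.7, 45.9)      # AP: ETDNet vs. the compared result
ap50_gain = gain_points(80.2, 79.1)    # AP50: ETDNet vs. the compared result
per_frame = latency_ms(49)             # 49 frames/s

print(f"AP gain:   {ap_gain} points")
print(f"AP50 gain: {ap50_gain} points")
print(f"Per-frame time: {per_frame:.1f} ms")
```

Under these assumptions, the improvement is about one percentage point on each metric, at roughly 20 ms per image.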