Yike He , Baotong Wu , Xiao Liu , Baicun Wang , Jianzhong Fu , Songyu Hu
AEGLR-Net: Attention enhanced global–local refined network for accurate detection of car body surface defects
Robotics and Computer-integrated Manufacturing, volume 90, Article 102806. Journal Article, published 2024-06-17.
DOI: 10.1016/j.rcim.2024.102806
URL: https://www.sciencedirect.com/science/article/pii/S0736584524000930
Impact factor: 9.1 (JCR Q1, Computer Science, Interdisciplinary Applications)
Citations: 0
Abstract
The complex background of a car body surface, such as its orange-peel-like texture and shiny metallic powder, poses a considerable challenge to automated defect detection. Two mainstream families of methods are currently used to tackle this challenge: global information-based methods and attention mechanism-based methods. However, these methods cannot integrate valuable global-to-local information or explore deeper distinguishing features, which limits overall detection performance. To address this issue, we propose a novel attention enhanced global–local refined detection network (AEGLR-Net), which performs effective global-to-local refined feature extraction and fusion. First, we design an adaptive Transformer–CNN tandem backbone (ATCT-backbone) that dynamically perceives valuable global information and integrates local details, so that it comprehensively extracts the features that distinguish defects from complex backgrounds. Then, we propose a novel refined cross-dimensional aggregation (RCDA) attention that facilitates point-to-point interaction of multidimensional information, effectively emphasizing the representation of deeper discriminative defect features. Finally, we construct an attention-embedded flexible feature pyramid network (AE-FFPN), which incorporates the RCDA attention to guide the feature pyramid network toward targeted feature fusion, thereby improving the efficiency of feature fusion in the detection model. Extensive comparative experiments demonstrate that AEGLR-Net outperforms state-of-the-art approaches, attaining 89.2% mAP (mean average precision) at 85.5 FPS (frames per second).
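For readers unfamiliar with the two reported metrics, the conventional definitions can be written as below. This is only a sketch of the standard formulas: the abstract does not state the IoU threshold, the number of defect classes, or the hardware used for timing, so the symbols $C$ (number of classes), $p_c(r)$ (precision of class $c$ at recall $r$), $N$ (number of images processed), and $T$ (total inference time in seconds) are assumptions introduced here for illustration.

```latex
\mathrm{AP}_c = \int_0^1 p_c(r)\,\mathrm{d}r,
\qquad
\mathrm{mAP} = \frac{1}{C}\sum_{c=1}^{C} \mathrm{AP}_c,
\qquad
\mathrm{FPS} = \frac{N}{T}
```

Under these definitions, 89.2% mAP is the per-class average precision averaged over all defect classes, and 85.5 FPS corresponds to roughly 11.7 ms of processing per image.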
About the journal:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.