Title: Effective Bi-decoding networks for rail-surface defect detection by knowledge distillation
Authors: Wujie Zhou, Yue Wu, Weiwei Qiu, Caie Xu, Fangfang Qiang
Journal: Applied Soft Computing, volume 167, Article 112422 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.2)
Published: 2024-10-29
DOI: 10.1016/j.asoc.2024.112422
URL: https://www.sciencedirect.com/science/article/pii/S1568494624011967
Citations: 0
Abstract
No-service rail-surface defect detection is a crucial method for assessing the quality of railroad tracks. However, the low contrast and dark tones of track-surface textures pose challenges to current defect-monitoring techniques. Real-time, on-site inspection is important for safe railway operation, yet most complex models for no-service inspection are difficult to deploy on mobile devices. To address these challenges and overcome the detection difficulties associated with complex scenes, we designed a knowledge distillation-based double decoding-layer refinement network (EBDNet-KD). The first decoding process is guided by a bimodal high-level semantic feature map, obtained by extending attention-based graph convolution, to incrementally enhance the dual-stream features and obtain an image restoration prior. A divide-and-conquer decoder is designed to distinguish features using different decoding layers. The prior is then used in the second decoding layer, which enables the bimodal features to interact fully and produces the final prediction map. We introduce a knowledge distillation strategy that enables a lightweight, compact student network to learn a complex teacher network’s feature extraction process. This facilitates pixel-consistent learning of the knowledge within the bi-decoder layer, as well as bidirectional learning of the focused contextual response knowledge, to optimize the model. EBDNet-KD significantly reduces computational cost while maintaining performance, with a parameter count of only 28 M. It outperformed 15 state-of-the-art methods in experiments on NEU RSDDS-AUG, an industrial RGB-depth dataset. We assessed the generalizability of EBDNet-KD by evaluating it on three additional public datasets, where it yielded competitive results. The source code and results are available at https://github.com/Wuyue15/EBDNet.
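The abstract does not give the distillation losses, so the following is only a minimal sketch of what pixel-level teacher-to-student distillation can look like in general. It is not the authors' formulation. The function name `pixel_distillation_loss`, the `temperature` parameter, and the choice of a per-pixel binary KL divergence are all assumptions made for illustration; prediction maps are represented as plain nested lists of logits.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def pixel_distillation_loss(student_logits, teacher_logits,
                            temperature=2.0, eps=1e-7):
    """Mean per-pixel binary KL(teacher || student) on softened maps.

    Hypothetical illustration only: both inputs are 2-D lists of logits
    with the same shape. The temperature softens both maps before they
    are compared, as in standard knowledge distillation.
    """
    if len(student_logits) != len(teacher_logits):
        raise ValueError("maps must have the same number of rows")
    total, count = 0.0, 0
    for s_row, t_row in zip(student_logits, teacher_logits):
        if len(s_row) != len(t_row):
            raise ValueError("rows must have the same length")
        for s, t in zip(s_row, t_row):
            # Softened foreground probabilities, clamped away from 0 and 1.
            p_t = min(max(sigmoid(t / temperature), eps), 1.0 - eps)
            p_s = min(max(sigmoid(s / temperature), eps), 1.0 - eps)
            total += p_t * math.log(p_t / p_s)
            total += (1.0 - p_t) * math.log((1.0 - p_t) / (1.0 - p_s))
            count += 1
    if count == 0:
        raise ValueError("maps must not be empty")
    return total / count


if __name__ == "__main__":
    teacher = [[2.0, -1.0], [0.5, 3.0]]   # made-up teacher logits
    student = [[1.0, -0.5], [0.0, 1.5]]   # made-up student logits
    print(pixel_distillation_loss(student, teacher))
    print(pixel_distillation_loss(teacher, teacher))
```

In this sketch the loss is zero when the student reproduces the teacher's map exactly and grows as the two maps diverge. In practice a term like this would be added to the ordinary supervised loss on the ground-truth defect masks.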
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest-quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.