{"title":"Asymmetrical Contrastive Learning Network via Knowledge Distillation for No-Service Rail Surface Defect Detection","authors":"Wujie Zhou;Xinyu Sun;Xiaohong Qian;Meixin Fang","doi":"10.1109/TNNLS.2024.3479453","DOIUrl":null,"url":null,"abstract":"Owing to extensive research on deep learning, significant progress has recently been made in trackless surface defect detection (SDD). Nevertheless, existing algorithms face two main challenges. First, while depth features contain rich spatial structure features, most models only accept red-green–blue (RGB) features as input, which severely constrains performance. Thus, this study proposes a dual-stream teacher model termed the asymmetrical contrastive learning network (ACLNet-T), which extracts both RGB and depth features to achieve high performance. Second, the introduction of the dual-stream model facilitates an exponential increase in the number of parameters. As a solution, we designed a single-stream student model (ACLNet-S) that extracted RGB features. We leveraged a contrastive distillation loss via knowledge distillation (KD) techniques to transfer rich multimodal features from the ACLNet-T to the ACLNet-S pixel by pixel and channel by channel. Furthermore, to compensate for the lack of contrastive distillation loss that focuses exclusively on local features, we employed multiscale graph mapping to establish long-range dependencies and transfer global features to the ACLNet-S through multiscale graph mapping distillation loss. Finally, an attentional distillation loss based on the adaptive attention decoder (AAD) was designed to further improve the performance of the ACLNet-S. Consequently, we obtained the ACLNet-S*, which achieved performance similar to that of ACLNet-T, despite having a nearly eightfold parameter count gap. Through comprehensive experimentation using the industrial RGB-D dataset NEU RSDDS-AUG, the ACLNet-S* (ACLNet-S with KD) was confirmed to outperform 16 state-of-the-art methods. 
Moreover, to showcase the generalization capacity of ACLNet-S*, the proposed network was evaluated on three additional public datasets, and ACLNet-S* achieved comparable results. The code is available at <uri>https://github.com/Yuride0404127/ACLNet-KD</uri>.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 7","pages":"12469-12482"},"PeriodicalIF":8.9000,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10737882/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Owing to extensive research on deep learning, significant progress has recently been made in no-service rail surface defect detection (SDD). Nevertheless, existing algorithms face two main challenges. First, although depth features contain rich spatial structure information, most models accept only red-green-blue (RGB) features as input, which severely constrains performance. Thus, this study proposes a dual-stream teacher model, termed the asymmetrical contrastive learning network (ACLNet-T), which extracts both RGB and depth features to achieve high performance. Second, the dual-stream design substantially increases the number of parameters. As a solution, we designed a single-stream student model (ACLNet-S) that extracts only RGB features. Using knowledge distillation (KD) techniques, we leveraged a contrastive distillation loss to transfer rich multimodal features from ACLNet-T to ACLNet-S pixel by pixel and channel by channel. Furthermore, because the contrastive distillation loss focuses exclusively on local features, we employed multiscale graph mapping to establish long-range dependencies and transferred global features to ACLNet-S through a multiscale graph mapping distillation loss. Finally, an attentional distillation loss based on an adaptive attention decoder (AAD) was designed to further improve the performance of ACLNet-S. Consequently, we obtained ACLNet-S* (ACLNet-S with KD), which achieves performance similar to that of ACLNet-T despite having nearly eight times fewer parameters. Comprehensive experiments on the industrial RGB-D dataset NEU RSDDS-AUG confirmed that ACLNet-S* outperforms 16 state-of-the-art methods. Moreover, to demonstrate its generalization capacity, the proposed network was evaluated on three additional public datasets, where ACLNet-S* achieved comparable results. The code is available at https://github.com/Yuride0404127/ACLNet-KD.
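The abstract does not give the exact form of the pixel-by-pixel and channel-by-channel contrastive distillation loss, but objectives of this kind are commonly built on InfoNCE: each student feature vector treats the teacher's corresponding feature as its positive and all other positions as negatives. The sketch below is an illustrative approximation under that assumption, not the paper's implementation; the function names, shapes, and temperature value are hypothetical.

```python
import numpy as np

def _logsumexp(x, axis=-1):
    # Numerically stable log-sum-exp along the given axis.
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def info_nce(anchor, positive, tau=0.1):
    """InfoNCE loss: row i of `anchor` should match row i of `positive`."""
    a = anchor / (np.linalg.norm(anchor, axis=1, keepdims=True) + 1e-8)
    p = positive / (np.linalg.norm(positive, axis=1, keepdims=True) + 1e-8)
    logits = a @ p.T / tau                          # (N, N) cosine similarities
    log_prob = logits - _logsumexp(logits, axis=1)  # row-wise log-softmax
    return -np.mean(np.diag(log_prob))              # diagonal = positive pairs

def contrastive_distill_loss(f_teacher, f_student, tau=0.1):
    """Pixel-wise + channel-wise contrastive distillation between (C, H, W) maps."""
    c, h, w = f_teacher.shape
    # Pixel-wise: each of the H*W locations is a C-dimensional vector.
    pix_t, pix_s = f_teacher.reshape(c, -1).T, f_student.reshape(c, -1).T
    # Channel-wise: each of the C channels is an (H*W)-dimensional vector.
    ch_t, ch_s = f_teacher.reshape(c, -1), f_student.reshape(c, -1)
    return info_nce(pix_s, pix_t, tau) + info_nce(ch_s, ch_t, tau)
```

Under this formulation, a student whose features align with the teacher's at every location and channel drives the loss toward zero, which matches the abstract's goal of transferring the teacher's multimodal features to the RGB-only student.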
About the Journal
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.