RailTrack-DaViT: A Vision Transformer-Based Approach for Automated Railway Track Defect Detection

Journal of Imaging (IF 2.7; JCR Q3, Imaging Science & Photographic Technology). Pub Date: 2024-08-07. DOI: 10.3390/jimaging10080192

Aniwat Phaphuangwittayakul, Napat Harnpornchai, Fangli Ying, Jinming Zhang


Citations: 0

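Abstract

Railway track defects pose significant safety risks and can lead to accidents, economic losses, and loss of life. Traditional manual inspection methods are time-consuming, costly, or prone to human error. This paper proposes RailTrack-DaViT, a novel vision transformer-based approach for railway track defect classification. By leveraging the Dual Attention Vision Transformer (DaViT) architecture, RailTrack-DaViT captures both global and local information, enabling accurate defect detection. The model is trained and evaluated on multiple datasets: rail, fastener and fishplate, multi-faults, and ThaiRailTrack. The paper provides a comprehensive analysis of the model's performance, including confusion matrices, training visualizations, and classification metrics. RailTrack-DaViT outperforms state-of-the-art CNN-based methods, achieving the highest accuracies: 96.9% on the rail dataset, 98.9% on the fastener and fishplate dataset, and 98.8% on the multi-faults dataset. Moreover, RailTrack-DaViT outperforms baselines on the ThaiRailTrack dataset with 99.2% accuracy, quickly adapts to unseen images, and shows better model stability during fine-tuning. This capability can significantly reduce the time required to apply the model to novel datasets in practical applications.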
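The abstract reports accuracy alongside confusion matrices and classification metrics. As a minimal sketch of how such metrics are typically derived from a confusion matrix, the following Python computes accuracy and per-class precision, recall, and F1. The function name `classification_metrics`, the class labels, and the counts are hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, object]:
    """Compute accuracy and per-class precision/recall/F1.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    per_class = {}
    for i, name in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # true class i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp   # predicted i, true class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[name] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = correct / total if total else 0.0
    return {"accuracy": accuracy, "per_class": per_class}


# Hypothetical two-class confusion matrix (illustrative counts only,
# NOT results from the paper).
labels = ["defective", "non_defective"]
cm = [
    [48, 2],   # true defective: 48 correct, 2 missed
    [1, 49],   # true non_defective: 1 false alarm, 49 correct
]

metrics = classification_metrics(cm, labels)
print(f"accuracy = {metrics['accuracy']:.3f}")
for name, m in metrics["per_class"].items():
    print(f"{name}: precision={m['precision']:.3f} recall={m['recall']:.3f} f1={m['f1']:.3f}")
```

For a safety-critical task such as track inspection, per-class recall on the defective class is often worth reading alongside overall accuracy, since a missed defect and a false alarm carry different costs.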


