RailTrack-DaViT: A Vision Transformer-Based Approach for Automated Railway Track Defect Detection.

Journal of Imaging (IF 2.7, Q3, Imaging Science & Photographic Technology). Pub Date: 2024-08-07. DOI: 10.3390/jimaging10080192
Aniwat Phaphuangwittayakul, Napat Harnpornchai, Fangli Ying, Jinming Zhang
Citation count: 0

Abstract

Railway track defects pose significant safety risks and can lead to accidents, economic losses, and loss of life. Traditional manual inspection methods are time-consuming, costly, and prone to human error. This paper proposes RailTrack-DaViT, a novel vision transformer-based approach for railway track defect classification. By leveraging the Dual Attention Vision Transformer (DaViT) architecture, RailTrack-DaViT captures both global and local information, enabling accurate defect detection. The model is trained and evaluated on multiple datasets: rail, fastener and fishplate, multi-faults, and ThaiRailTrack. A comprehensive analysis of the model's performance is provided, including confusion matrices, training visualizations, and classification metrics. RailTrack-DaViT outperforms state-of-the-art CNN-based methods, achieving the highest accuracies: 96.9% on the rail dataset, 98.9% on the fastener and fishplate dataset, and 98.8% on the multi-faults dataset. Moreover, RailTrack-DaViT outperforms baselines on the ThaiRailTrack dataset with 99.2% accuracy, quickly adapts to unseen images, and shows better model stability during fine-tuning. This capability can significantly reduce the time needed to apply the model to novel datasets in practical applications.
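The abstract reports accuracy alongside confusion matrices and classification metrics. As a minimal sketch of how such figures relate, the snippet below derives accuracy and per-class precision and recall from a confusion matrix. The function name `classification_metrics` and the two-class counts are hypothetical and chosen only for illustration; they are not taken from the paper or its datasets.

```python
from typing import Dict, List


def classification_metrics(confusion: List[List[int]]) -> Dict[str, object]:
    """Derive accuracy and per-class precision/recall from a confusion matrix.

    Assumed convention: rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    precision: List[float] = []
    recall: List[float] = []
    for i in range(n):
        predicted_i = sum(confusion[r][i] for r in range(n))  # column sum
        actual_i = sum(confusion[i])                          # row sum
        precision.append(confusion[i][i] / predicted_i if predicted_i else 0.0)
        recall.append(confusion[i][i] / actual_i if actual_i else 0.0)

    accuracy = correct / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts (NOT results from the paper): two classes,
# e.g. "defective" and "non-defective", with 50 test images each.
example = [
    [48, 2],  # true class 0: 48 correct, 2 misclassified as class 1
    [1, 49],  # true class 1: 1 misclassified as class 0, 49 correct
]
metrics = classification_metrics(example)
print(metrics["accuracy"])
```

Accuracy alone can hide weak performance on a rare defect class, which is presumably why the analysis pairs it with confusion matrices and per-class metrics.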

Source journal: Journal of Imaging (subject area: Medicine, Radiology, Nuclear Medicine and Imaging)
CiteScore: 5.90
Self-citation rate: 6.20%
Articles published: 303
Review time: 7 weeks