{"title":"Redefining Real-Time Road Quality Analysis With Vision Transformers on Edge Devices","authors":"Tasnim Ahmed;Naveed Ejaz;Salimur Choudhury","doi":"10.1109/TAI.2024.3394797","DOIUrl":null,"url":null,"abstract":"Road infrastructure is essential for transportation safety and efficiency. However, the current methods for assessing road conditions, crucial for effective planning and maintenance, suffer from high costs, time-intensive procedures, infrequent data collection, and limited real-time capabilities. This article presents an efficient lightweight system to analyze road quality from video feeds in real time. The backbone of the system is EdgeFusionViT, a novel vision transformer (ViT)-based architecture that uses an attention-based late fusion mechanism. The proposed architecture outperforms lightweight convolutional neural network (CNN)-based and ViT-based models. Its practicality is demonstrated by its deployment on an edge device, the Nvidia Jetson Orin Nano, enabling real-time road analysis at 12 frames per second. EdgeFusionViT outperforms existing benchmarks, achieving an impressive accuracy of 89.76% on the road surface condition dataset (RSCD). Notably, the model maintains a commendable accuracy of 76.89% even when trained with only 2% of the dataset, demonstrating its robustness and efficiency. These findings highlight the system's potential in road infrastructure management. It aids in creating safer, more efficient transport systems through timely, accurate road condition assessments. The study sets a new benchmark and opens up possibilities for advanced machine learning in infrastructure management.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"5 10","pages":"4972-4983"},"PeriodicalIF":0.0000,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10510402/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Road infrastructure is essential for transportation safety and efficiency. However, the current methods for assessing road conditions, crucial for effective planning and maintenance, suffer from high costs, time-intensive procedures, infrequent data collection, and limited real-time capabilities. This article presents an efficient lightweight system to analyze road quality from video feeds in real time. The backbone of the system is EdgeFusionViT, a novel vision transformer (ViT)-based architecture that uses an attention-based late fusion mechanism. The proposed architecture outperforms lightweight convolutional neural network (CNN)-based and ViT-based models. Its practicality is demonstrated by its deployment on an edge device, the Nvidia Jetson Orin Nano, enabling real-time road analysis at 12 frames per second. EdgeFusionViT outperforms existing benchmarks, achieving an impressive accuracy of 89.76% on the road surface condition dataset (RSCD). Notably, the model maintains a commendable accuracy of 76.89% even when trained with only 2% of the dataset, demonstrating its robustness and efficiency. These findings highlight the system's potential in road infrastructure management. It aids in creating safer, more efficient transport systems through timely, accurate road condition assessments. The study sets a new benchmark and opens up possibilities for advanced machine learning in infrastructure management.