In this paper, we investigate the use of Vision Transformers for processing and understanding visual data in an autonomous driving setting. Specifically, we explore the use of Vision Transformers for semantic segmentation and monocular depth estimation using only a single image as input. We present state-of-the-art Vision Transformers for these tasks and combine them into a multitask model. Through multiple experiments on four different street image datasets, we demonstrate that the multitask approach significantly reduces inference time while maintaining high accuracy for both tasks. Additionally, we show that changing the size of the Transformer-based backbone can be used as a trade-off between inference speed and accuracy. Furthermore, we investigate the use of synthetic data for pre-training and show that it effectively increases the accuracy of the model when real-world data is limited.
Durga Prasad Bavirisetti; Herman Ryen Martinsen; Gabriel Hanssen Kiss; Frank Lindseth. "A Multi-Task Vision Transformer for Segmentation and Monocular Depth Estimation for Autonomous Vehicles." IEEE Open Journal of Intelligent Transportation Systems, vol. 4, pp. 909-928. Pub Date: 2023-11-28. DOI: 10.1109/OJITS.2023.3335648. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10330677
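The multitask idea in the abstract above can be illustrated with a toy sketch. It assumes the common design of one shared backbone feeding two task-specific heads; the abstract does not spell out the architecture, so this layout is an assumption. The names `CountingBackbone`, `segmentation_head`, `depth_head`, and `MultiTaskModel` are hypothetical stand-ins, not the authors' code, and the arithmetic inside them is a placeholder for real Transformer layers.

```python
class CountingBackbone:
    """Toy stand-in for a Transformer backbone that counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, image):
        self.calls += 1
        # Placeholder "features": each pixel value doubled.
        return [[2.0 * pixel for pixel in row] for row in image]


def segmentation_head(features):
    """Toy head: label a pixel 1 if its feature exceeds 1.0, else 0."""
    return [[1 if value > 1.0 else 0 for value in row] for row in features]


def depth_head(features):
    """Toy head: map each non-negative feature to a pseudo-depth in (0, 1]."""
    return [[1.0 / (1.0 + value) for value in row] for row in features]


class MultiTaskModel:
    """Runs the (assumed) shared backbone once and feeds every head from it."""

    def __init__(self, backbone, heads):
        self.backbone = backbone
        self.heads = heads

    def __call__(self, image):
        features = self.backbone(image)  # shared work, done once per image
        return {name: head(features) for name, head in self.heads.items()}


image = [[0.2, 0.8], [0.6, 0.1]]
backbone = CountingBackbone()
model = MultiTaskModel(
    backbone, {"segmentation": segmentation_head, "depth": depth_head}
)
outputs = model(image)
print(backbone.calls, sorted(outputs))
```

Two separate single-task models would each run their own backbone, so the backbone cost would be paid twice per image. Under the shared-backbone assumption it is paid once, which is one plausible source of the inference-time reduction the abstract reports.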
Pub Date: 2023-11-28 DOI: 10.1109/OJITS.2023.3335303
Pushkin Kachroo;Shaurya Agarwal;Animesh Biswas;Archie J. Huang
Nonlocal calculus-based macroscopic traffic models overcome limitations of classical local models in capturing traffic flow dynamics. These models incorporate “nonlocal” elements by treating the speed as a function of a weighted mean of downstream traffic density, which aligns more closely with realistic driving behavior. This research makes several contributions. First, we choose a nonlocal LWR model with the Greenshields fundamental diagram and prove that the resulting traffic flow model is well-posed. Second, we prove that the model's states remain bounded, laying the groundwork for numerically stable schemes. Third, we develop a stable numerical scheme and evaluate the model through extensive field validation using real traffic data from the NGSIM dataset. The validation results show that the nonlocal model captures traffic characteristics and reproduces complex traffic behavior more accurately than its local counterpart. This research therefore both expands the theoretical constructs within the field and substantiates their practical applicability.
Pushkin Kachroo; Shaurya Agarwal; Animesh Biswas; Archie J. Huang. "Nonlocal Calculus-Based Macroscopic Traffic Model: Development, Analysis, and Validation." IEEE Open Journal of Intelligent Transportation Systems, vol. 4, pp. 900-908. Pub Date: 2023-11-28. DOI: 10.1109/OJITS.2023.3335303. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10330738
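To make the nonlocal idea concrete, here is a minimal sketch of one time step of a nonlocal LWR model with the Greenshields speed law. It is not the authors' scheme: it assumes a uniform weighting kernel over a fixed number of downstream cells, a periodic road, and a simple upwind flux. The function names and parameters (`greenshields_speed`, `nonlocal_lwr_step`, `look_ahead_cells`) are hypothetical.

```python
def greenshields_speed(density, v_max, rho_max):
    """Greenshields law: speed falls linearly from v_max to 0 as density rises."""
    return v_max * (1.0 - density / rho_max)


def nonlocal_lwr_step(rho, dx, dt, v_max, rho_max, look_ahead_cells):
    """Advance cell densities by one step on a periodic road (assumed setup).

    The speed in cell j depends on the mean density of the next
    `look_ahead_cells` cells downstream (uniform kernel, an assumption),
    rather than on the local density alone.
    """
    n = len(rho)
    flux = []
    for j in range(n):
        downstream = [rho[(j + k) % n] for k in range(1, look_ahead_cells + 1)]
        mean_density = sum(downstream) / look_ahead_cells
        flux.append(rho[j] * greenshields_speed(mean_density, v_max, rho_max))
    # Upwind update: traffic moves in the +x direction, so cell j gains the
    # flux leaving cell j-1 and loses its own (flux[-1] wraps periodically).
    return [rho[j] - dt / dx * (flux[j] - flux[j - 1]) for j in range(n)]


# Example: a dense platoon on an otherwise light road.
rho = [0.8 if 20 <= j < 40 else 0.2 for j in range(100)]
for _ in range(200):
    rho = nonlocal_lwr_step(
        rho, dx=0.01, dt=0.005, v_max=1.0, rho_max=1.0, look_ahead_cells=5
    )
print(min(rho), max(rho))
```

The step sizes in the example satisfy `dt * v_max / dx = 0.5`, a conservative choice for this kind of explicit update. Setting `look_ahead_cells` to 1 shrinks the look-ahead window to a single neighbouring cell, which is the closest this sketch gets to a local model.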
Pub Date: 2023-11-24 DOI: 10.1109/OJITS.2023.3336464
Rainer Trauth;Korbinian Moller;Johannes Betz
Autonomous vehicles face numerous challenges in ensuring safe operation under unpredictable and hazardous conditions. The autonomous driving environment is characterized by high uncertainty, especially in occluded areas where little is known about surrounding obstacles. This work provides a trajectory planner for such unsafe situations. It proposes an approach that combines a visibility model, contextual environmental information, and behavioral planning algorithms to predict the likelihood of occlusions and collision probabilities. This allows us to estimate the potential harm from collisions with pedestrians in occluded situations. The primary goal of our approach is to minimize the risk of hitting pedestrians and to enforce a predefined, adjustable maximum level of harm. We show several practical applications in which a sampling-based trajectory planner is informed about occluded areas to increase safety. To respond to possible high-risk situations, we introduce an adjustable threshold that governs the vehicle’s speed in uncertain situations, together with strategies that maximize the vehicle’s visible area. To evaluate the method, we analyzed several real-world scenarios in a simulation environment. Our results indicate that combining occlusion-aware trajectory planning with harm estimation significantly influences vehicle driving behavior, especially in risky situations. The code used in this research is publicly available as open-source software at https://github.com/TUM-AVS/Frenetix-Motion-Planner
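As a rough illustration of how an adjustable harm threshold could govern speed, the sketch below searches for the highest speed whose expected harm stays under a given limit. It is not taken from the linked repository. The logistic harm curve, its parameters, and the simplification that impact speed equals travel speed (no braking) are all assumptions, and the names `harm_probability`, `expected_harm`, and `max_speed_under_threshold` are hypothetical.

```python
import math


def harm_probability(impact_speed, midpoint=10.0, steepness=0.5):
    """Hypothetical logistic curve: impact speed (m/s) -> probability of harm."""
    return 1.0 / (1.0 + math.exp(-steepness * (impact_speed - midpoint)))


def expected_harm(speed, collision_probability):
    """Expected harm, assuming impact speed equals travel speed (no braking)."""
    return collision_probability * harm_probability(speed)


def max_speed_under_threshold(
    collision_probability, harm_threshold, speed_limit, tolerance=1e-3
):
    """Highest speed in [0, speed_limit] with expected harm <= harm_threshold.

    Uses bisection, which is valid because expected harm increases
    monotonically with speed under the assumed logistic curve.
    """
    if expected_harm(speed_limit, collision_probability) <= harm_threshold:
        return speed_limit  # threshold is not binding
    if expected_harm(0.0, collision_probability) > harm_threshold:
        return 0.0  # even standing still exceeds the limit in this toy model
    low, high = 0.0, speed_limit
    while high - low > tolerance:
        mid = 0.5 * (low + high)
        if expected_harm(mid, collision_probability) <= harm_threshold:
            low = mid
        else:
            high = mid
    return low


# Example: a 30% chance that a pedestrian steps out of an occluded area.
print(max_speed_under_threshold(0.3, harm_threshold=0.05, speed_limit=15.0))
```

Lowering `harm_threshold` or raising `collision_probability` reduces the returned speed, the same qualitative behavior the abstract describes for its adjustable threshold in uncertain situations.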