Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-108
End-to-End Multitask Learning for Driver Gaze and Head Pose Estimation
Mahmoud Ewaisha, Marwa El Shawarby, Hazem M. Abbas, Ibrahim Sobh
Most modern automobile accidents are caused by inattentive driver behavior, which is why driver gaze estimation is becoming a critical component in the automotive industry. Gaze estimation poses many challenges arising from the surrounding environment, such as changes in illumination, the driver’s head motion, partial face occlusion, or eyewear. Previous work in this field includes the explicit extraction of hand-crafted features, such as eye corners and the pupil center, used to estimate gaze, as well as appearance-based methods such as Convolutional Neural Networks, which implicitly extract features from an image and directly map them to the corresponding gaze angle. In this work, a multitask Convolutional Neural Network architecture is proposed to predict the subject’s gaze yaw and pitch angles, along with the head pose as an auxiliary task, making the model robust to head pose variations without any complex preprocessing or hand-crafted feature extraction. The network’s output is then clustered into nine gaze classes relevant to the driving scenario. The model achieves 95.8% accuracy on the test set and 78.2% accuracy in cross-subject testing, demonstrating its generalization capability and robustness to head pose variation.
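The abstract does not give the network internals; below is a minimal PyTorch sketch of the kind of multitask architecture it describes: a shared convolutional trunk with a gaze head (yaw, pitch) and an auxiliary head-pose head. All layer sizes, the loss choice, and the `aux_weight` are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class MultitaskGazeNet(nn.Module):
    """Shared CNN trunk with a gaze head and an auxiliary head-pose head."""
    def __init__(self):
        super().__init__()
        # Shared feature extractor (sizes are illustrative assumptions).
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.gaze_head = nn.Linear(64, 2)  # gaze yaw and pitch
        self.pose_head = nn.Linear(64, 3)  # auxiliary head pose (yaw, pitch, roll)

    def forward(self, x):
        f = self.trunk(x)
        return self.gaze_head(f), self.pose_head(f)

def multitask_loss(gaze_pred, pose_pred, gaze_gt, pose_gt, aux_weight=0.5):
    # The auxiliary head-pose term regularizes the shared features,
    # which is what makes the gaze estimate robust to head pose variation.
    mse = nn.functional.mse_loss
    return mse(gaze_pred, gaze_gt) + aux_weight * mse(pose_pred, pose_gt)

# x = torch.randn(8, 3, 96, 96)          # hypothetical face crops
# gaze, pose = MultitaskGazeNet()(x)
```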
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-150
A tool for semi-automatic ground truth annotation of traffic videos
Florian Groh, Dominik Schörkhuber, M. Gelautz
We have developed a semi-automatic annotation tool – “CVL Annotator” – for bounding box ground truth generation in videos. Our research is particularly motivated by the need for reference annotations of challenging nighttime traffic scenes with highly dynamic lighting conditions due to reflections, headlights, and halos from oncoming traffic. Our tool incorporates a suite of different state-of-the-art tracking algorithms in order to minimize the amount of human input necessary to generate high-quality ground truth data. We designed our user interface around the premise of minimizing user interaction and visualizing all information relevant to the user at a glance. We perform a preliminary user study to measure the amount of time and the number of clicks necessary to produce ground truth annotations of video traffic scenes, and we evaluate the accuracy of the final annotation results.
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-203
Multi-Sensor Fusion in Dynamic Environment using Evidential Grid Mapping
G. Godaliyadda, Vijay Pothukuchi, J. Roh
Grid mapping is widely used to represent the environment surrounding a car or a robot for autonomous navigation. This paper describes an algorithm for evidential occupancy grid (OG) mapping that fuses measurements from different sensors based on Dempster-Shafer theory and is intended for scenes with both stationary and moving (dynamic) objects. Conventional OG mapping algorithms tend to struggle in the presence of moving objects because they do not explicitly distinguish between moving and stationary objects. In contrast, evidential OG mapping allows for dynamic and ambiguous states (e.g., a LIDAR measurement cannot differentiate between moving and stationary objects) that are better aligned with the measurements sensors actually make. In this paper, we present a framework for fusing measurements as they are received from disparate sensors (e.g., radar, camera, and LIDAR) using evidential grid mapping. With this approach, we can form a live map of the environment and also alleviate the problem of having to synchronize sensors in time. We also designed a new inverse sensor model for radar that allows us to extract more information from object-level measurements by incorporating knowledge of the sensor’s characteristics. We have implemented our algorithm in the OpenVX framework to enable seamless integration into embedded platforms. Test results show compelling performance, especially in the presence of moving objects.
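As a concrete illustration of the Dempster-Shafer combination underlying evidential OG mapping, the sketch below fuses two per-cell mass assignments over a small frame of discernment ('F' free, 'O' occupied, 'FO' ignorance) using Dempster's rule. The example sensor masses are invented for illustration; the paper's actual frame additionally separates moving from stationary objects.

```python
# Dempster's rule of combination for one grid cell.
# Hypotheses: 'F' (free), 'O' (occupied), 'FO' (ignorance, F or O).

def combine(m1, m2):
    def intersect(a, b):
        return ''.join(sorted(set(a) & set(b)))
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            c = intersect(a, b)
            if c:
                combined[c] = combined.get(c, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to the empty set
    # Normalize by (1 - K), with K the total conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical masses: a radar return and a LIDAR return for the same cell.
m_radar = {'O': 0.6, 'FO': 0.4}            # radar: probably occupied
m_lidar = {'O': 0.5, 'F': 0.2, 'FO': 0.3}  # lidar: occupied, some free evidence
print(combine(m_radar, m_lidar))           # fused belief per hypothesis
```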
Pub Date: 2020-01-26 | DOI: 10.2352/J.IMAGINGSCI.TECHNOL.2019.63.6.060405
VisibilityNet: Camera visibility detection and image restoration for autonomous driving
Michal Uřičář, Hazem Rashed, Adithya Ranga, Ashok Dahal, S. Yogamani
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-088
Multiple pedestrian tracking using Siamese random forests and shallow Convolutional Neural Networks
Jimi Lee, J. Nam, ByoungChul Ko
In this study, we propose a new multi-pedestrian tracking (MPT) method that tracks pedestrians quickly and efficiently in real-time systems. The proposed method combines shallow convolutional neural networks (CNNs) with an ensemble learning method, Siamese random forests (SRFs). Unlike conventional methods, a feature transformation is applied to promote the robustness of the ensemble: shallow networks are exploited on the appearance of still images to extract rich features. We formulate the MPT problem in a structured learning framework based on the SRF. Each forest learns the differences of random feature pairs extracted in the preceding step, enhancing robustness to circumstances that commonly arise in a moving vehicle. Compared with conventional tracking algorithms, the proposed SRF-based approach has the advantages of being lightweight and efficient. The proposed lightweight multiple-pedestrian tracker was successfully applied to benchmark datasets and yielded performance similar to or better than that of state-of-the-art methods.
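The pair-difference formulation can be illustrated with a toy sketch: a forest trained on absolute differences of feature pairs, scoring whether two appearance features belong to the same pedestrian. Here scikit-learn's random forest stands in for the SRF and random identity-dependent vectors stand in for the shallow-CNN features; this is an illustration of the formulation, not the authors' tracker.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy appearance features: 200 detections of 20 identities, 32-D each.
ids = rng.integers(0, 20, size=200)
feats = rng.normal(size=(200, 32)) + ids[:, None]  # identity-dependent offset

# Pair samples: the input is a feature difference, the label is same/different.
i = rng.integers(0, 200, size=2000)
j = rng.integers(0, 200, size=2000)
X = np.abs(feats[i] - feats[j])
y = (ids[i] == ids[j]).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# At tracking time, the forest scores a detection against a track template.
score = forest.predict_proba(np.abs(feats[0] - feats[1])[None])[0, 1]
print(f"same-identity probability: {score:.2f}")
```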
Pub Date: 2020-01-26 | DOI: 10.2352/issn.2470-1173.2020.16.avm-041
Object Detection Using an Ideal Observer Model
O. Skorka, P. Kane
Many of the metrics developed for informational imaging are useful in automotive imaging, since many of the tasks – for example, object detection and identification – are similar. This work discusses sensor characterization parameters for the Ideal Observer SNR model and elaborates on the noise power spectrum. It presents cross-correlation analysis results for matched-filter detection of a tribar pattern in sets of resolution-target images that were captured with three image sensors over a range of illumination levels. Lastly, the work compares the cross-correlation data to predictions made by the Ideal Observer model and demonstrates good agreement between the two methods in the relative evaluation of detection capabilities.
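Matched-filter detection of this kind amounts to cross-correlating the image with the known target template and locating the response peak. The sketch below is a generic illustration using SciPy with a synthetic tribar-like template; the noise level and embedding location are arbitrary assumptions.

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(1)

# Synthetic tribar-like template: three vertical bars.
template = np.zeros((15, 15))
template[:, 2:4] = template[:, 7:9] = template[:, 12:14] = 1.0

# Scene: the template embedded at (40, 60) plus sensor noise.
scene = rng.normal(0.0, 0.5, size=(100, 100))
scene[40:55, 60:75] += template

# Matched filter: correlate the scene with the zero-mean template.
t = template - template.mean()
response = correlate2d(scene - scene.mean(), t, mode='same')
peak = np.unravel_index(np.argmax(response), response.shape)
print("detected at", peak)  # response peak lands near the embedded target
```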
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-080
Single image haze removal using multiple scattering model for road scenes
Minsub Kim, Soonyoung Hong, M. Kang
Haze is one of the sources of image degradation. It affects the contrast and saturation not only of real-world images in general but also of road scenes. Most haze removal algorithms use an atmospheric scattering model to remove the effect of haze, and most are based on the single scattering model, which does not account for the blur in a hazy image. In this paper, a novel haze removal algorithm using a multiple scattering model with deconvolution is proposed. The proposed algorithm accounts for the blurring effect in the hazy image. Downsampling of the hazy image is also used to estimate the atmospheric light efficiently. Synthetic road scenes with and without haze are used to evaluate the performance of the proposed method. Experimental results demonstrate that the proposed algorithm restores images affected by haze better, both qualitatively and quantitatively.
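For reference, the single scattering baseline that the paper extends models a hazy image as I(x) = J(x)t(x) + A(1 - t(x)), with scene radiance J, transmission t, and atmospheric light A. A minimal sketch inverting this model with a dark-channel transmission estimate follows; it illustrates only the single scattering baseline, not the proposed multiple scattering model with deconvolution, and all parameter values are assumptions.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dehaze_single_scattering(I, omega=0.95, t_min=0.1, patch=15):
    """Invert I = J*t + A*(1 - t); I is an HxWx3 float image in [0, 1]."""
    # Dark channel: per-pixel channel minimum, then a local patch minimum.
    dark = minimum_filter(I.min(axis=2), size=patch)
    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels.
    idx = np.argsort(dark.ravel())[-int(0.001 * dark.size + 1):]
    A = I.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission estimate from the normalized dark channel.
    t = 1.0 - omega * minimum_filter((I / A).min(axis=2), size=patch)
    t = np.clip(t, t_min, 1.0)[..., None]
    # Recover scene radiance J and clip back to the valid range.
    return np.clip((I - A) / t + A, 0.0, 1.0)

# hazy = plt.imread('road_scene.png')[..., :3]  # hypothetical input
# clear = dehaze_single_scattering(hazy)
```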
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-200
Metrology Impact of Advanced Driver Assistance Systems
P. Iacomussi
Metrological applications in the road environment usually focus on characterizing the road, taking as measurands several characteristics of the road as a whole or the performance of single components, such as the road surface, lighting systems, active and/or passive signaling and, obviously, vehicle equipment. In the current standards approach, driving on the road means navigating “visually” (for a human driver), so the characterizations are mostly photometric performances for given reference conditions and a reference observer (a photometric observer viewing the road from assigned points of view, with a given spectral sensitivity). Considering present and future technological trends and current knowledge of visual performance, however, characterizations based only on photometric quantities under the reference conditions described in the current standards are not fully suitable, even for the visual needs of human drivers. Research on components and systems for advanced driver assistance is evolving along different paths toward different solutions: it is neither possible nor useful to define strict constraints as has been done previously for road application measurements. The paper presents the current situation of the metrological characterization of the road environment and its components, in the laboratory and on site using mobile high-efficiency laboratories, and suggests using ADAS (Advanced Driver Assistance Systems) for diffuse mapping of road characteristics, for a better understanding of the road environment and its maintenance. This suggestion has the additional advantage of minimizing measurement costs; however, for full applicability, the reliability and metrological performance of the installed devices and of the measurements performed by ADAS are a priority.
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-258
Object Tracking Continuity through Track and Trace Method
Haney W. Williams, S. Simske
The demand for object tracking (OT) applications has been increasing for the past few decades in many areas of interest: security, surveillance, intelligence gathering, and reconnaissance. Lately, newly defined requirements for unmanned vehicles have heightened interest in OT. Advances in machine learning, data analytics, and deep learning have facilitated the recognition and tracking of objects of interest; however, continuous tracking is currently a problem of interest to many research projects. This paper presents a system implementing a means to continuously track an object and predict its trajectory based on its previous pathway, even when the object is partially or fully concealed for a period of time. The system is composed of six main subsystems: Image Processing, Detection Algorithm, Image Subtractor, Image Tracking, Tracking Predictor, and Feedback Analyzer. Combined, these subsystems allow for reasonable object continuity in the face of concealment.
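The paper names its own subsystems; as a generic illustration of predicting a trajectory through concealment, the sketch below coasts a constant-velocity Kalman filter across frames with no detections, updating only when a measurement is available. The state layout and noise values are illustrative assumptions, not the paper's Tracking Predictor.

```python
import numpy as np

# Constant-velocity Kalman filter: state [x, y, vx, vy], measurement [x, y].
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
Q = np.eye(4) * 0.01   # process noise (assumed)
R = np.eye(2) * 1.0    # measurement noise (assumed)

x, P = np.zeros(4), np.eye(4) * 10.0

def step(x, P, z=None):
    # Predict: always runs, so the track coasts through occlusion.
    x, P = F @ x, F @ P @ F.T + Q
    if z is not None:  # Update only when a detection is available.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
    return x, P

track = [np.array([i, 0.5 * i], float) for i in range(5)]  # visible frames
for z in track + [None] * 3:                               # then 3 occluded frames
    x, P = step(x, P, z)
    print(np.round(x[:2], 2))  # predicted position continues along the path
```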
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-202
A Study on Training Data Selection for Object Detection in Nighttime Traffic Scenes
A. Unger, M. Gelautz, F. Seitner
With the growing demand for robust object detection algorithms in self-driving systems, it is important to consider the varying lighting and weather conditions in which cars operate all year round. The goal of our work is to gain a deeper understanding of meaningful strategies for selecting and merging training data from currently available databases and self-annotated videos in the context of automotive night scenes. We retrain an existing Convolutional Neural Network (YOLOv3) to study the influence of different training dataset combinations on the final object detection results in nighttime and low-visibility traffic scenes. Our evaluation shows that a suitable selection of training data from the GTSRD, VIPER, and BDD databases, in conjunction with self-recorded night scenes, can achieve an mAP of 63.5% for ten object classes, an improvement of 16.7% compared to the performance of the original YOLOv3 network on the same test set.
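As a reminder of what the reported metric computes, the sketch below evaluates average precision for a single class from scored detections and ground-truth boxes; mAP is the mean of this value over the ten classes. The IoU threshold of 0.5 is an assumption (PASCAL-style matching); the paper does not state its evaluation protocol here.

```python
import numpy as np

def iou(a, b):
    # Boxes as [x1, y1, x2, y2].
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def average_precision(dets, gts, thr=0.5):
    """dets: list of (score, box); gts: list of boxes; one image, one class."""
    dets = sorted(dets, key=lambda d: -d[0])  # highest confidence first
    matched, tp, fp = set(), [], []
    for score, box in dets:
        best = max(range(len(gts)), key=lambda g: iou(box, gts[g]), default=-1)
        if best >= 0 and best not in matched and iou(box, gts[best]) >= thr:
            matched.add(best); tp.append(1); fp.append(0)  # true positive
        else:
            tp.append(0); fp.append(1)                     # false positive
    tp, fp = np.cumsum(tp), np.cumsum(fp)
    recall = tp / max(len(gts), 1)
    precision = tp / np.maximum(tp + fp, 1e-9)
    # Area under the precision-recall curve (step integration, no smoothing).
    return float(np.sum(np.diff(recall, prepend=0.0) * precision))
```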