Sharpness Continuous Path Optimization and Sparsification for Automated Vehicles
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827011
Mohit Kumar, Peter Strauss, Sven Kraus, Ömer Sahin Tas, C. Stiller
We present a path optimization approach that ensures driveability while considering a vehicle’s lateral dynamics. The lateral dynamics are non-holonomic; therefore, a vehicle cannot follow a path with abrupt changes even with infinitely fast steering. The curvature and the sharpness, i.e., the rate of change of curvature with respect to the traveled distance, must be continuous for a defined reference path to be tracked efficiently. Existing path optimization techniques typically limit sharpness but do not enforce sharpness continuity. Sharpness discontinuity is especially problematic for heavy-duty vehicles because their actuator dynamics are even slower than those of cars. We propose an algorithm that constructs a sparsified, sharpness-continuous path for a given reference path while respecting limits on sharpness and its derivative, which in turn addresses the torque restrictions of the actuator. The sharpness-continuous path needs less steering effort and reduces mechanical stress and fatigue in the steering unit. We compare and present the outcomes for three different types of optimized paths. Simulation results demonstrate that the computed sharpness-continuous path profiles reduce lateral jerk, enhancing comfort and driveability.
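The key constraint here is that sharpness (the derivative of curvature with respect to arc length) must itself be continuous and bounded, together with its derivative, which stands in for the actuator's torque limit. As a minimal sketch of what such a profile looks like (not the paper's optimization; all parameter names and values are illustrative), the following builds a curvature ramp from a trapezoidal sharpness profile:

```python
import numpy as np

def sharpness_continuous_curvature(kappa_end, alpha_max, beta_max, ds=0.01):
    """Curvature profile kappa(s) rising from 0 to kappa_end with a trapezoidal
    sharpness profile: |d(kappa)/ds| <= alpha_max, |d2(kappa)/ds2| <= beta_max,
    and sharpness continuous everywhere.  Assumes the full trapezoid fits,
    i.e. kappa_end >= alpha_max**2 / beta_max (illustrative sketch only)."""
    s_ramp = alpha_max / beta_max                  # distance to ramp sharpness up
    k_ramp = 0.5 * alpha_max * s_ramp              # curvature gained on one ramp
    s_cruise = (kappa_end - 2.0 * k_ramp) / alpha_max
    s_total = 2.0 * s_ramp + s_cruise

    s = np.arange(0.0, s_total, ds)
    alpha = np.minimum.reduce([
        beta_max * s,                   # ramp sharpness up at rate beta_max
        np.full_like(s, alpha_max),     # hold peak sharpness
        beta_max * (s_total - s),       # ramp sharpness back down
    ])
    kappa = np.cumsum(alpha) * ds       # integrate sharpness -> curvature
    return s, kappa

# Example: reach kappa = 0.05 1/m with hypothetical bounds on sharpness and torque.
s, kappa = sharpness_continuous_curvature(kappa_end=0.05, alpha_max=0.02, beta_max=0.01)
```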
{"title":"Sharpness Continuous Path optimization and Sparsification for Automated Vehicles","authors":"Mohit Kumar, Peter Strauss, Sven Kraus, Ömer Sahin Tas, C. Stiller","doi":"10.1109/iv51971.2022.9827011","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827011","url":null,"abstract":"We present a path optimization approach that ensures driveability while considering a vehicle’s lateral dynamics. The lateral dynamics are non-holonomic; therefore, a vehicle cannot follow a path with abrupt changes even with infinitely fast steering. The curvature and sharpness, i.e., the rate change of curvature with respect to the traveled distance, must be continuous to track a defined reference path efficiently. Existing path optimization techniques typically include sharpness limitations but not sharpness continuity. The sharpness discontinuity is especially problematic for heavy-duty vehicles because their actuator dynamics are even slower than cars. We propose an algorithm that constructs a sparsified sharpness continuous path for a given reference path considering the limits on sharpness and its derivative, which subsequently addresses the torque restrictions of the actuator. The sharpness continuous path needs less steering effort and reduces mechanical stress and fatigue in the steering unit. We compare and present the outcomes for each of the three different types of optimized paths. Simulation results demonstrate that computed sharpness continuous path profiles reduce lateral jerks, enhancing comfort and driveability.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127696934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Virtual Test Scenarios for ADAS: Distance to Real Scenarios Matters!
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827170
Mohamed El Mostadi, H. Waeselynck, Jean-Marc Gabriel
Testing in virtual road environments is a widespread approach to validating advanced driver assistance systems (ADAS). A number of automated strategies have been proposed to explore dangerous scenarios, such as search-based strategies guided by fitness functions. However, such strategies are likely to produce many uninteresting scenarios, representing driving situations so extreme that fatal accidents are unavoidable irrespective of the action of the ADAS. We propose leveraging datasets from real drives to better align the virtual scenarios with reasonable ones. The alignment is based on a simple distance metric that relates the virtual scenario parameters to the real data. We demonstrate the use of this metric for testing an autonomous emergency braking (AEB) system, taking the highD dataset as a reference for normal situations. We show how search-based testing quickly converges toward very distant scenarios that do not bring much insight into the AEB performance. We then provide an example of a distance-aware strategy that searches for less extreme scenarios that the AEB cannot overcome.
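The abstract only names "a simple distance metric" relating scenario parameters to real data; a plausible minimal realization (an assumption, not the paper's definition) is a nearest-neighbor distance over z-normalized scenario parameters:

```python
import numpy as np

def scenario_distance(candidate, real_params):
    """candidate: (d,) virtual scenario parameter vector; real_params: (n, d)
    matrix of the same parameters extracted from real drives (e.g. highD).
    Returns the z-normalized distance to the closest real observation."""
    mu = real_params.mean(axis=0)
    sigma = real_params.std(axis=0) + 1e-9          # avoid division by zero
    z_real = (real_params - mu) / sigma
    z_cand = (candidate - mu) / sigma
    return float(np.linalg.norm(z_real - z_cand, axis=1).min())

# A distance-aware search can then discard or penalize candidate scenarios
# whose distance exceeds a threshold, keeping the search near plausible drives.
```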
{"title":"Virtual Test Scenarios for ADAS: Distance to Real Scenarios Matters!","authors":"Mohamed El Mostadi, H. Waeselynck, Jean-Marc Gabriel","doi":"10.1109/iv51971.2022.9827170","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827170","url":null,"abstract":"Testing in virtual road environments is a widespread approach to validate advanced driver assistance systems (ADAS). A number of automated strategies have been proposed to explore dangerous scenarios, like search-based strategies guided by fitness functions. However, such strategies are likely to produce many uninteresting scenarios, representing so extreme driving situations that fatal accidents are unavoidable irrespective of the action of the ADAS. We propose leveraging datasets from real drives to better align the virtual scenarios to reasonable ones. The alignment is based on a simple distance metric that relates the virtual scenario parameters to the real data. We demonstrate the use of this metric for testing an autonomous emergency braking (AEB) system, taking the highD dataset as a reference for normal situations. We show how search-based testing quickly converges toward very distant scenarios that do not bring much insight into the AEB performance. We then provide an example of a distance-aware strategy that searches for less extreme scenarios that the AEB cannot overcome.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125783168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Fusion Attention Network for Autonomous Cars Semantic Segmentation
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827377
Chuyao Wang, N. Aouf
Semantic segmentation is vital for autonomous-car scene understanding. It provides more precise subject information than raw RGB images, and this, in turn, boosts the performance of autonomous driving. Recently, self-attention methods have shown great improvements in image semantic segmentation: attention maps help scene parsing by capturing rich relationships between every pair of pixels in an image. However, self-attention is computationally demanding. Moreover, existing works focus either on channel attention, ignoring pixel position, or on spatial attention, disregarding the impact of the channels on each other. To address these problems, we present a Fusion Attention Network based on the self-attention mechanism to harvest rich contextual dependencies. The model consists of two chains: pyramid fusion spatial attention and fusion channel attention. We apply pyramid sampling in the spatial attention module to reduce the computation of the spatial attention maps; channel attention has a similar structure to the spatial attention. We also introduce a fusion technique to calculate contextual dependencies using features from both attention chains. We concatenate the results from the spatial and channel attention modules into an enhanced attention map, leading to better semantic segmentation results. We conduct extensive experiments on popular datasets with different settings, in addition to an ablation study, to prove the efficiency of our approach. Our model achieves better results on Cityscapes [7] compared to state-of-the-art methods and also shows good generalization capability on PASCAL VOC 2012 [9].
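To make the two attention chains concrete, here is a rough PyTorch sketch of pyramid-pooled spatial attention and gram-style channel attention; the module names, pool sizes, and residual fusion are assumptions for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PyramidSpatialAttention(nn.Module):
    """Spatial self-attention whose keys/values are pyramid-pooled, so the
    attention matrix is (HW x m) with m = sum of pooled areas instead of the
    quadratic (HW x HW).  Sketch only; sizes are assumptions."""
    def __init__(self, c, pool_sizes=(1, 3, 6, 8)):
        super().__init__()
        self.q = nn.Conv2d(c, c // 8, 1)
        self.k = nn.Conv2d(c, c // 8, 1)
        self.v = nn.Conv2d(c, c, 1)
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool2d(s) for s in pool_sizes)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)             # (b, hw, c/8)
        k = torch.cat([p(self.k(x)).flatten(2) for p in self.pools], dim=2)
        v = torch.cat([p(self.v(x)).flatten(2) for p in self.pools], dim=2)
        attn = torch.softmax(q @ k, dim=-1)                  # (b, hw, m)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                       # residual fusion

class FusionChannelAttention(nn.Module):
    """Channel self-attention (gram matrix over channels, DANet-style)."""
    def forward(self, x):
        b, c, h, w = x.shape
        f = x.flatten(2)                                     # (b, c, hw)
        attn = torch.softmax(f @ f.transpose(1, 2), dim=-1)  # (b, c, c)
        return x + (attn @ f).reshape(b, c, h, w)
```

The outputs of the two chains would then be concatenated and fused by a final convolution before the segmentation head.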
{"title":"Fusion Attention Network for Autonomous Cars Semantic Segmentation","authors":"Chuyao Wang, N. Aouf","doi":"10.1109/iv51971.2022.9827377","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827377","url":null,"abstract":"Semantic segmentation is vital for autonomous car scene understanding. It provides more precise subject information than raw RGB images and this, in turn, boosts the performance of autonomous driving. Recently, self-attention methods show great improvement in image semantic segmentation. Attention maps help scene parsing with abundant relationships of every pixel in an image. However, it is computationally demanding. Besides, existing works focus either on channel attention, ignoring the pixel position factors, or on spatial attention, disregarding the impacts of the channels on each other. To address these problems, we present Fusion Attention Network based on self-attention mechanism to harvest rich contextual dependencies. This model consists of two chains: pyramid fusion spatial attention and fusion channel attention. We apply pyramid sampling in the spatial attention module to reduce the computation for spatial attention maps. Channel attention has a similar structure to the spatial attention. We also introduce a fusion technique to calculate contextual dependencies using features from both attention chains. We concatenate the results from spatial and channel attention modules as the enhanced attention map, leading to better semantic segmentation results. We conduct extensive experiments on popular datasets with different settings in addition to an ablation study to prove the efficiency of our approach. Our model achieves better results, on Cityscapes [7], compared to state-of-the-art methods, and also show good generalization capability on PASCAL VOC 2012 [9].","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130098822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Non-local Evasive Overtaking of Downstream Incidents in Distributed Behavior Planning of Connected Vehicles
Pub Date: 2022-06-05 | DOI: 10.48550/arXiv.2206.14391
Abdul Rahman Kreidieh, Y. Farid, K. Oguchi
The prevalence of high-speed vehicle-to-everything (V2X) communication will likely significantly influence the future of vehicle autonomy. In several autonomous driving applications, however, the role such systems will play is seldom understood. In this paper, we explore the role of communication signals in enhancing the performance of lane change assistance systems in situations where downstream bottlenecks restrict the mobility of a few lanes. Building on prior work on modeling lane change incentives, we design a controller that 1) encourages automated vehicles to avoid lanes in which distant downstream delays are likely to occur, while also 2) ignoring greedy local incentives when staying in a delayed lane is needed to maintain a specific route. Numerical results for different traffic conditions and penetration rates suggest that the model successfully evades a significant portion of the delays brought about by downstream bottlenecks, both globally and from the perspective of the controlled vehicles.
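As an illustration of the two behaviors the controller combines, the toy decision rule below layers a non-local downstream-delay penalty on a MOBIL-style local incentive and suppresses greedy changes when a lane is route-critical; all weights and names are hypothetical, not the paper's model:

```python
def lane_change_incentive(local_gain, downstream_delay, lane_required_for_route,
                          w_local=1.0, w_nonlocal=2.0, threshold=0.1):
    """local_gain: immediate acceleration advantage (m/s^2) of the target lane;
    downstream_delay: expected delay (s) reported ahead in the target lane via
    V2X; lane_required_for_route: True if the current lane must be kept."""
    if lane_required_for_route:
        return False  # ignore greedy local incentives to preserve the route
    incentive = w_local * local_gain - w_nonlocal * downstream_delay
    return incentive > threshold  # change lanes only if the net gain is worthwhile
```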
{"title":"Non-local Evasive Overtaking of Downstream Incidents in Distributed Behavior Planning of Connected Vehicles","authors":"Abdul Rahman Kreidieh, Y. Farid, K. Oguchi","doi":"10.48550/arXiv.2206.14391","DOIUrl":"https://doi.org/10.48550/arXiv.2206.14391","url":null,"abstract":"The prevalence of high-speed vehicle-to-everything (V2X) communication will likely significantly influence the future of vehicle autonomy. In several autonomous driving applications, however, the role such systems will play is seldom understood. In this paper, we explore the role of communication signals in enhancing the performance of lane change assistance systems in situations where downstream bottlenecks restrict the mobility of a few lanes. Building off of prior work on modeling lane change incentives, we design a controller that 1) encourages automated vehicles to subvert lanes in which distant downstream delays are likely to occur, while also 2) ignoring greedy local incentives when such delays are needed to maintain a specific route. Numerical results on different traffic conditions and penetration rates suggest that the model successfully subverts a significant portion of delays brought about by downstream bottlenecks, both globally and from the perspective of the controlled vehicles.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131047772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

DAROD: A Deep Automotive Radar Object Detector on Range-Doppler maps
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827281
Colin Decourt, R. V. Rullen, D. Salle, T. Oberlin
Due to the small number of raw-data automotive radar datasets and the low resolution of such radar sensors, automotive radar object detection has been little explored with deep learning models compared to camera- and lidar-based approaches. However, radars are low-cost sensors able to accurately sense surrounding object characteristics (e.g., distance, radial velocity, direction of arrival, radar cross-section) regardless of weather conditions (e.g., rain, snow, fog). Recent open-source datasets such as CARRADA, RADDet, or CRUW have opened up research on several topics, ranging from object classification to object detection and segmentation. In this paper, we present DAROD, an adaptation of the Faster R-CNN object detector for automotive radar on range-Doppler spectra. We propose a light architecture for feature extraction, which shows increased performance compared to heavier vision-based backbone architectures. Our models reach an mAP@0.5 of 55.83 and 46.57 on the CARRADA and RADDet datasets, respectively, outperforming competing methods.
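The "light architecture" is not specified in the abstract; the sketch below only illustrates the general idea of a few-block convolutional backbone for range-Doppler maps feeding a standard Faster R-CNN head (channel counts and depth are guesses, not the DAROD architecture):

```python
import torch.nn as nn

def light_rd_backbone(in_ch=1):
    """A small convolutional feature extractor for range-Doppler maps; RD
    spectra carry far less texture than natural images, so a few conv blocks
    can replace a heavy vision backbone such as ResNet-50.  The output map
    would feed a standard Faster R-CNN RPN and detection head."""
    def block(cin, cout):
        return nn.Sequential(
            nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout),
            nn.ReLU(inplace=True), nn.MaxPool2d(2))
    return nn.Sequential(block(in_ch, 32), block(32, 64), block(64, 128))
```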
{"title":"DAROD: A Deep Automotive Radar Object Detector on Range-Doppler maps","authors":"Colin Decourt, R. V. Rullen, D. Salle, T. Oberlin","doi":"10.1109/iv51971.2022.9827281","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827281","url":null,"abstract":"Due to the small number of raw data automotive radar datasets and the low resolution of such radar sensors, automotive radar object detection has been little explored with deep learning models in comparison to camera and lidar-based approaches. However, radars are low-cost sensors able to accurately sense surrounding object characteristics (e.g., distance, radial velocity, direction of arrival, radar cross-section) regardless of weather conditions (e.g., rain, snow, fog). Recent open-source datasets such as CARRADA, RADDet or CRUW have opened up research on several topics ranging from object classification to object detection and segmentation. In this paper, we present DAROD, an adaptation of Faster R-CNN object detector for automotive radar on the range-Doppler spectra. We propose a light architecture for features extraction, which shows an increased performance compare to heavier vision-based backbone architectures. Our models reach respectively an mAP@0.5 of 55.83 and 46.57 on CARRADA and RADDet datasets, outperforming competing methods.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130748161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Fair Division meets Vehicle Routing: Fairness for Drivers with Monotone Profits
Pub Date: 2022-06-05 | DOI: 10.1109/IV51971.2022.9827432
M. Aleksandrov
We propose a new model for fair division and vehicle routing, where drivers have monotone profit preferences for customer requests and their vehicles have feasibility constraints. For this model, we design two new axiomatic fairness notions for drivers: FEQ1 and FEF1. FEQ1 encodes pairwise bounded equitability between drivers; FEF1 encodes pairwise bounded envy-freeness. We compare FEQ1 and FEF1 with popular fair division notions such as EQ1 and EF1, and we give algorithms for guaranteeing FEQ1 and FEF1, respectively.
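For readers unfamiliar with these axioms, the sketch below checks the classical EF1 condition (envy-freeness up to one item) over driver bundles; FEF1 additionally restricts comparisons by vehicle feasibility, which this toy check omits:

```python
def is_ef1(bundles, profit):
    """bundles: dict driver -> set of customer requests; profit(i, S) is
    driver i's (monotone) profit for serving request set S."""
    for i, b_i in bundles.items():
        for j, b_j in bundles.items():
            if i == j or profit(i, b_i) >= profit(i, b_j):
                continue  # no envy toward j as allocated
            # envy must vanish after removing some single request from j's bundle
            if not any(profit(i, b_i) >= profit(i, b_j - {r}) for r in b_j):
                return False
    return True
```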
{"title":"Fair Division meets Vehicle Routing: Fairness for Drivers with Monotone Profits","authors":"M. Aleksandrov","doi":"10.1109/IV51971.2022.9827432","DOIUrl":"https://doi.org/10.1109/IV51971.2022.9827432","url":null,"abstract":"We propose a new model for fair division and vehicle routing, where drivers have monotone profit preferences, and their vehicles have feasibility constraints, for customer requests. For this model, we design two new axiomatic notions for fairness for drivers: FEQ1 and FEF1. FEQ1 encodes driver pairwise bounded equitability. FEF1 encodes driver pairwise bounded envy freeness. We compare FEQ1 and FEF1 with popular fair division notions such as EQ1 and EF1. We also give algorithms for guaranteeing FEQ1 and FEF1, respectively.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":" 40","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133021411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Deep Learning-Based Radar Detector for Complex Automotive Scenarios
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827045
Roberto Franceschi, D. Rachkov
Recent research has explored the advantages of applying learning-based methods to the radar target detection problem, though mainly for the single point-target case. This work extends those studies to complex automotive scenarios. We propose a Convolutional Neural Network-based model able to detect and locate targets in the multi-dimensional space of range, velocity, azimuth, and elevation. Due to the lack of publicly available datasets containing raw radar data (i.e., after the analog-to-digital converter), we simulated a dataset comprising more than 17000 frames of automotive scenarios with various road objects, including (but not limited to) cars, pedestrians, cyclists, trees, and guardrails. The proposed model was trained exclusively on simulated data, and its performance was compared to that of a conventional radar detection and angle estimation pipeline. In unseen simulated scenarios, our model outperformed the conventional CFAR-based methods, improving the Dice score in the range-Doppler domain by 14.5%. Our model was also qualitatively evaluated on unseen real-world radar recordings, achieving more detection points per object than the conventional processing.
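The Dice score the paper reports is the standard overlap measure; over binary detection masks on the range-Doppler grid it can be computed as follows (a generic metric, not code from the paper):

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-9):
    """pred_mask, gt_mask: boolean arrays marking detected cells on the
    range-Doppler grid.  Returns 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred_mask, gt_mask).sum()
    return 2.0 * inter / (pred_mask.sum() + gt_mask.sum() + eps)
```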
{"title":"Deep Learning-Based Radar Detector for Complex Automotive Scenarios","authors":"Roberto Franceschi, D. Rachkov","doi":"10.1109/iv51971.2022.9827045","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827045","url":null,"abstract":"Recent research explored advantages of applying a learning-based method to the radar target detection problem. A single point target case was mainly considered, though. This work extends those studies to complex automotive scenarios. We propose a Convolutional Neural Networks-based model able to detect and locate targets in multi-dimensional space of range, velocity, azimuth, and elevation. Due to the lack of publicly available datasets containing raw radar data (after analog-to-digital converter), we simulated a dataset comprising more than 17000 frames of automotive scenarios and various road objects including (but not limited to) cars, pedestrians, cyclists, trees, and guardrails. The proposed model was trained exclusively on simulated data and its performance was compared to that of conventional radar detection and angle estimation pipeline. In unseen simulated scenarios, our model outperformed the conventional CFAR-based methods, improving by 14.5% the dice score in range-Doppler domain. Our model was also qualitatively evaluated on unseen real-world radar recordings, achieving more detection points per object than the conventional processing.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133780148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Efficient Radar Deep Temporal Detection in Urban Traffic Scenes
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827053
Zuyuan Guo, Haoran Wang, Wei Yi, Jiahao Zhang
This paper explores object detection on radar range-Doppler maps. Most radar processing algorithms detect objects without classifying them, and they neglect useful information available in the temporal domain. To address these problems, we propose an online radar deep temporal detection framework based on frame-to-frame prediction and association with low computational cost. The core idea is that once an object is detected, its location and class can be predicted in future frames to improve detection results. The experimental results show that this method achieves better detection and classification performance and demonstrate the usability of radar data for traffic scenes.
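A minimal sketch of the frame-to-frame idea, assuming each detection carries range, radial velocity, a score, and a class: predict each previous detection forward under constant velocity, gate new detections against the prediction, and reuse the matched track's class and confidence. The gates and boosting rule below are illustrative assumptions, not the paper's method:

```python
def associate(prev_dets, new_dets, dt, gate_range=2.0, gate_vel=1.0):
    """Greedy one-pass association.  Each detection is a dict with keys
    'r' (range, m), 'v' (radial velocity, m/s), 'score', and 'cls'."""
    matched = []
    for p in prev_dets:
        r_pred = p['r'] + p['v'] * dt  # constant-velocity range prediction
        best = min(new_dets,
                   key=lambda d: abs(d['r'] - r_pred) + abs(d['v'] - p['v']),
                   default=None)
        if (best is not None and abs(best['r'] - r_pred) < gate_range
                and abs(best['v'] - p['v']) < gate_vel):
            best['score'] = max(best['score'], p['score'])  # temporal confidence boost
            best['cls'] = best['cls'] or p['cls']           # reuse class if missing
            matched.append(best)
    return matched
```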
{"title":"Efficient Radar Deep Temporal Detection in Urban Traffic Scenes","authors":"Zuyuan Guo, Haoran Wang, Wei Yi, Jiahao Zhang","doi":"10.1109/iv51971.2022.9827053","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827053","url":null,"abstract":"This paper explores object detection on radar range-Doppler map. Most of the radar processing algorithms are proposed for detecting objects without classifying. Meanwhile, these approaches neglect the useful information available in the temporal domain. To address these problems, we propose an online radar deep temporal detection framework by frame-to-frame prediction and association with low computation. The core idea is that once an object is detected, its location and class can be predicted in the future frame to improve detection results. The experiment results illustrate this method achieves better detection and classification performance, and shows the usability of radar data for traffic scenes.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114965858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A Monte Carlo particle filter formulation for mapless-based localization
Pub Date: 2022-06-05 | DOI: 10.1109/iv51971.2022.9827064
André Przewodowski, F. Osório
In this paper, we extend the Monte Carlo Localization formulation for more efficient global localization using coarse digital maps (for instance, OpenStreetMap maps). The proposed formulation uses the map constraints to reduce the state dimension, which is ideal for a Monte Carlo-based particle filter. We also propose adding to the data association process the matching of traffic signal information to road properties, so that their exact positions do not need to be mapped in advance to update the filter. The proposed approach requires neither low-level point cloud mapping nor LIDAR data. The experiments were conducted using a dataset collected by the CARINA II intelligent vehicle, and the results suggest that the method is adequate for a localization pipeline. The dataset is available online, and the code is available on GitHub.
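The dimension reduction can be pictured as follows: a particle is a (road edge, arc-length offset) pair rather than a free (x, y, heading) pose, and observed sign or road information is scored against the edge's map properties. The sketch below (data structures, weights, and tag format are assumptions, and edge transitions at junctions are omitted) illustrates the two filter steps:

```python
import numpy as np

def predict(particles, odom_ds, noise=0.5):
    """Motion update: advance each particle's arc-length offset along its
    road edge by the odometry increment plus Gaussian noise."""
    particles['s'] += odom_ds + np.random.normal(0.0, noise, particles['s'].shape)

def update(particles, edge_tags, observed_tag):
    """Measurement update: reweight and resample particles by whether an
    observed road property (e.g. 'maxspeed=50' from a detected sign)
    matches the OSM-style tags of the particle's edge."""
    w = np.array([1.0 if observed_tag in edge_tags[e] else 0.1
                  for e in particles['edge']])
    w /= w.sum()
    idx = np.random.choice(len(w), size=len(w), p=w)   # resampling
    particles['s'] = particles['s'][idx]
    particles['edge'] = particles['edge'][idx]
```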
{"title":"A Monte Carlo particle filter formulation for mapless-based localization","authors":"André Przewodowski, F. Osório","doi":"10.1109/iv51971.2022.9827064","DOIUrl":"https://doi.org/10.1109/iv51971.2022.9827064","url":null,"abstract":"In this paper, we extend the Monte Carlo Localization formulation for a more efficient global localization using coarse digital maps (for instance, the OpenStreetMap maps). The proposed formulation uses the map constraints in order to reduce the state dimension, which is ideal for a Monte Carlo-based particle filter. Also, we propose including to the data association process the matching of the traffic signals’ information to the road properties, so that their exact position do not need to be previously mapped for updating the filter. In the proposed approach, no low-level point cloud mapping was required and neither the use of LIDAR data. The experiments were conducted using a dataset collected by the CARINA II intelligent vehicle and the results suggest that the method is adequate for a localization pipeline. The dataset is available online and the code is available on GitHub.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123717165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Object-Level Targeted Selection via Deep Template Matching
Pub Date: 2022-06-05 | DOI: 10.48550/arXiv.2207.01778
S. Kothawade
Retrieving images containing objects that are semantically similar to objects of interest (OOI) in a query image has many practical use cases. A few examples include fixing failures, such as false negatives/positives of a learned model, or mitigating class imbalance in a dataset. The targeted selection task requires finding the relevant data in a large-scale pool of unlabeled data; manual mining at this scale is infeasible. Further, the OOI are often small, occupying less than 1% of the image area, are occluded, and co-exist with many semantically different objects in cluttered scenes. Existing semantic image retrieval methods often focus on mining for larger-sized geographical landmarks and/or require extra labeled data, such as images or image pairs with similar objects, to mine images with generic objects. We propose a fast and robust template matching algorithm in the DNN feature space that retrieves semantically similar images at the object level from a large unlabeled pool of data. We project the region(s) around the OOI in the query image into the DNN feature space for use as the template. This enables our method to focus on the semantics of the OOI without requiring extra labeled data. In the context of autonomous driving, we evaluate our system for targeted selection by using failure cases of object detectors as the OOI. We demonstrate its efficacy on a large unlabeled dataset with 2.2M images and show high recall in mining for images with small-sized OOI. We compare our method against a well-known semantic image retrieval method that also does not require extra labeled data. Lastly, we show that our method is flexible and seamlessly retrieves images with one or more semantically different co-occurring OOI.
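The core operation, matching a feature-space template anywhere in a pool image, can be sketched as a convolution of channel-normalized feature maps; `backbone`, the box convention, and the scoring rule are assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def score_pool_image(backbone, query_img, ooi_box, pool_img):
    """Score how well the feature-space template around the OOI in the query
    matches anywhere in the pool image.  ooi_box = (x0, y0, x1, y1) in
    feature-map coordinates; backbone is any frozen CNN giving (1, C, H, W)."""
    with torch.no_grad():
        fq = F.normalize(backbone(query_img), dim=1)  # unit channel vectors
        fp = F.normalize(backbone(pool_img), dim=1)
        x0, y0, x1, y1 = ooi_box
        template = fq[:, :, y0:y1, x0:x1]             # (1, C, h, w) kernel
        sim = F.conv2d(fp, template)                  # windowed cosine similarity
    return sim.max().item()                           # best-matching location

# Pool images can then be ranked by this score and the top-k selected for
# labeling, e.g. to mine more examples of a detector's failure cases.
```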
{"title":"Object-Level Targeted Selection via Deep Template Matching","authors":"S. Kothawade","doi":"10.48550/arXiv.2207.01778","DOIUrl":"https://doi.org/10.48550/arXiv.2207.01778","url":null,"abstract":"Retrieving images with objects that are semantically similar to objects of interest (OOI) in a query image has many practical use cases. A few examples include fixing failures like false negatives/positives of a learned model or mitigating class imbalance in a dataset. The targeted selection task requires finding the relevant data from a large-scale pool of unlabeled data. Manual mining at this scale is infeasible. Further, the OOI are often small and occupy less than 1% of image area, are occluded, and co-exist with many semantically different objects in cluttered scenes. Existing semantic image retrieval methods often focus on mining for larger sized geographical landmarks, and/or require extra labeled data, such as images/image-pairs with similar objects, for mining images with generic objects. We propose a fast and robust template matching algorithm in the DNN feature space, that retrieves semantically similar images at the object-level from a large unlabeled pool of data. We project the region(s) around the OOI in the query image to the DNN feature space for use as the template. This enables our method to focus on the semantics of the OOI without requiring extra labeled data. In the context of autonomous driving, we evaluate our system for targeted selection by using failure cases of object detectors as OOI. We demonstrate its efficacy on a large unlabeled dataset with 2.2M images and show high recall in mining for images with small-sized OOI. We compare our method against a well-known semantic image retrieval method, which also does not require extra labeled data. Lastly, we show that our method is flexible and retrieves images with one or more semantically different co-occurring OOI seamlessly.","PeriodicalId":184622,"journal":{"name":"2022 IEEE Intelligent Vehicles Symposium (IV)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129939063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}