Unify The View of Camera Mesh Network to a Common Coordinate System
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-175
Haney W. Williams, S. Simske, Fr. Gregory Bishay
The demand for object tracking (OT) applications has been increasing for the past few decades in many areas of interest, including security, surveillance, intelligence gathering, and reconnaissance. Lately, newly defined requirements for unmanned vehicles have heightened interest in OT. Advances in machine learning, data analytics, and AI/deep learning have improved the recognition and tracking of objects of interest; however, continuous tracking remains an open problem in many research projects [1]. In our past research, we proposed a system that continuously tracks an object and predicts its trajectory from its previous pathway, even when the object is partially or fully concealed for a period of time. The second phase of this system proposed developing common knowledge among a mesh of fixed cameras, akin to a real-time panorama. This paper discusses the method for unifying the cameras' views into a common frame of reference so that the object's location is known to all participants in the network.
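A minimal sketch of the coordinate-unification idea, assuming each fixed camera has been calibrated offline with a homography mapping its image plane to a shared ground-plane frame; the camera names, matrix values, and function below are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

# Hypothetical per-camera homographies (image plane -> shared ground-plane frame),
# e.g. estimated offline from surveyed reference points visible to each camera.
HOMOGRAPHIES = {
    "cam_A": np.array([[0.020, 0.000, -5.0],
                       [0.000, 0.020, -3.0],
                       [0.000, 0.000,  1.0]]),
    "cam_B": np.array([[0.018, 0.001,  4.0],
                       [-0.001, 0.019, -2.5],
                       [0.000, 0.000,  1.0]]),
}

def to_common_frame(camera_id: str, pixel_xy: tuple) -> tuple:
    """Map an object's pixel location in one camera to the shared coordinate frame."""
    H = HOMOGRAPHIES[camera_id]
    x, y = pixel_xy
    p = H @ np.array([x, y, 1.0])
    return (p[0] / p[2], p[1] / p[2])  # normalize homogeneous coordinates

# Any camera that detects the object can broadcast its common-frame position to the mesh.
print(to_common_frame("cam_A", (640.0, 360.0)))
```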
{"title":"Unify The View of Camera Mesh Network to a Common Coordinate System","authors":"Haney W. Williams, S. Simske, Fr. Gregory Bishay","doi":"10.2352/issn.2470-1173.2021.17.avm-175","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-175","url":null,"abstract":"\u0000 The demand for object tracking (OT) applications has been increasing for the past few decades in many areas of interest, including security, surveillance, intelligence gathering, and reconnaissance. Lately, newly-defined requirements for unmanned vehicles have enhanced the interest\u0000 in OT. Advancements in machine learning, data analytics, and AI/deep learning have facilitated the improved recognition and tracking of objects of interest; however, continuous tracking is currently a problem of interest in many research projects. [1] In our past research, we proposed a system\u0000 that implements the means to continuously track an object and predict its trajectory based on its previous pathway, even when the object is partially or fully concealed for a period of time. The second phase of this system proposed developing a common knowledge among a mesh of fixed cameras,\u0000 akin to a real-time panorama. This paper discusses the method to coordinate the cameras' view to a common frame of reference so that the object location is known by all participants in the network.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114055588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
End-to-End Imaging System Optimization for Computer Vision in Driving Automation
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-173
Korbinian Weikl, Damien Schroeder, Daniel Blau, Zhenyi Liu, W. Stechele
Full driving automation imposes as-yet unmet performance requirements on camera and computer vision systems if they are to replace the visual system of a human driver under any conditions. So far, the individual components of an automotive camera have mostly been optimized independently, or without taking into account their effect on the computer vision applications. We propose an end-to-end optimization of the imaging system in software, from the generation of radiometric input data, through physically based camera component models, to the output of a computer vision system. Specifically, we present an optimization framework which extends the ISETCam and ISET3d toolboxes to create synthetic spectral data of high dynamic range, and which models a state-of-the-art automotive camera in more detail. It includes a state-of-the-art object detection system as a benchmark application. We highlight how the framework approximates the physical image formation process. As a result, we provide guidelines for optimization experiments involving modification of the model parameters, and show how these apply to a first experiment on high dynamic range imaging.
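A schematic sketch of the end-to-end loop described above, with the camera component chain reduced to blur, noise, and quantization stages; `detector_score` is only a placeholder for a real detection benchmark, and the parameter values are illustrative assumptions rather than the framework's actual settings (the real framework uses the ISETCam/ISET3d toolboxes, not this simplified model).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def camera_model(radiance: np.ndarray, blur_sigma: float, read_noise: float, bits: int) -> np.ndarray:
    """Very reduced camera component chain: optics blur -> shot/read noise -> ADC quantization."""
    img = gaussian_filter(radiance, blur_sigma)                       # lens blur
    img = rng.poisson(img) + rng.normal(0.0, read_noise, img.shape)   # photon shot + read noise
    img = np.clip(img / img.max(), 0.0, 1.0)
    return np.round(img * (2 ** bits - 1)) / (2 ** bits - 1)          # bit-depth quantization

def detector_score(img: np.ndarray) -> float:
    """Placeholder for a CV benchmark (e.g. detector mAP) evaluated on the rendered frame."""
    return float(img.std())  # stand-in: retained contrast as a crude proxy for detectability

scene = rng.uniform(50, 2000, size=(256, 256))  # stand-in for synthetic HDR radiance data
for bits in (8, 12, 16):
    frame = camera_model(scene, blur_sigma=1.2, read_noise=2.0, bits=bits)
    print(bits, "bit:", round(detector_score(frame), 4))
```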
{"title":"End-to-End Imaging System Optimization for Computer Vision in Driving Automation","authors":"Korbinian Weikl, Damien Schroeder, Daniel Blau, Zhenyi Liu, W. Stechele","doi":"10.2352/issn.2470-1173.2021.17.avm-173","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-173","url":null,"abstract":"\u0000 Full driving automation imposes to date unmet performance requirements on camera and computer vision systems, in order to replace the visual system of a human driver in any conditions. So far, the individual components of an automotive camera hav mostly been optimized independently,\u0000 or without taking into account the effect on the computer vision applications. We propose an end-to-end optimization of the imaging system in software, from generation of radiometric input data over physically based camera component models to the output of a computer vision system. Specifically,\u0000 we present an optimization framework which extends the ISETCam and ISET3d toolboxes to create synthetic spectral data of high dynamic range, and which models a stateof-the-art automotive camera in more detail. It includes a stateof-the-art object detection system as benchmark application.\u0000 We highlight in which way the framework approximates the physical image formation process. As a result, we provide guidelines for optimization experiments involving modification of the model parameters, and show how these apply to a first experiment on high dynamic range imaging.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127071281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Boosting computer vision performance by enhancing camera ISP
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-174
P. V. Beek, Chyuan-Tyng Wu, B. Chaudhury, T. Gardos
Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. In [1], Wu et al. proposed a set of methods, termed VisionISP, to enhance and optimize the ISP for computer vision purposes. The blocks in VisionISP are simple, content-aware, and trainable using existing machine learning methods. VisionISP significantly reduces the data transmission and power consumption requirements by reducing image bit-depth and resolution, while mitigating the loss of relevant information. In this paper, we show that VisionISP boosts the performance of subsequent computer vision algorithms in the context of multiple tasks, including object detection, face recognition, and stereo disparity estimation. The results demonstrate the benefits of VisionISP for a variety of computer vision applications, CNN model sizes, and benchmark datasets.
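The data-reduction idea can be illustrated with a trivial front-end stage that tone-maps a high-bit-depth frame to 8 bits and halves the resolution before the vision network. This is only a stand-in for VisionISP's trainable, content-aware blocks; the gamma value and scale factor are arbitrary assumptions.

```python
import numpy as np

def reduce_for_vision(raw12: np.ndarray, gamma: float = 1.0 / 2.2, scale: int = 2) -> np.ndarray:
    """Stand-in front-end: tone-map 12-bit data to 8 bits and downscale by block averaging."""
    norm = raw12.astype(np.float32) / 4095.0              # 12-bit -> [0, 1]
    tone = norm ** gamma                                   # simple gamma tone mapping
    h, w = tone.shape
    small = tone[: h - h % scale, : w - w % scale]
    small = small.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    return (small * 255.0).astype(np.uint8)               # 8-bit, reduced-resolution output

frame = np.random.default_rng(1).integers(0, 4096, size=(720, 1280), dtype=np.uint16)
out = reduce_for_vision(frame)
print(out.shape, out.dtype)  # (360, 640) uint8: ~8x less data sent to the CV network
```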
{"title":"Boosting computer vision performance by enhancing camera ISP","authors":"P. V. Beek, Chyuan-Tyng Wu, B. Chaudhury, T. Gardos","doi":"10.2352/issn.2470-1173.2021.17.avm-174","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-174","url":null,"abstract":"\u0000 Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. In [1], Wu et al. proposed\u0000 a set of methods, termed VisionISP, to enhance and optimize the ISP for computer vision purposes. The blocks in VisionISP are simple, content-aware, and trainable using existing machine learning methods.\u0000 \u0000 VisionISP significantly reduces the data transmission and power consumption\u0000 requirements by reducing image bit-depth and resolution, while mitigating the loss of relevant information. In this paper, we show that VisionISP boosts the performance of subsequent computer vision algorithms in the context of multiple tasks, including object detection, face recognition,\u0000 and stereo disparity estimation. The results demonstrate the benefits of VisionISP for a variety of computer vision applications, CNN model sizes, and benchmark datasets.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"31 11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122858542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GG-Net: Gaze Guided Network for Self-driving Cars
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-171
M. Abdelkarim, M. Abbas, Alaa Osama, Dalia Anwar, Mostafa Azzam, M. Abdelalim, H. Mostafa, Samah El-Tantawy, Ibrahim Sobh
Imitation learning is widely used in autonomous driving to train networks that predict steering commands from frames, using annotated data collected by an expert driver. Assuming that frames from a front-facing camera fully mimic the driver's eyes raises the question of how the eyes, and the attention mechanisms of the complex human visual system, actually perceive the scene. This paper proposes incorporating eye gaze information along with the frames into an end-to-end deep neural network for the lane-following task. The proposed novel architecture, GG-Net, is composed of a spatial transformer network (STN) and a multitask network that predicts the steering angle as well as the gaze map for the input frame. The experimental results of this architecture show a 36% improvement in steering-angle prediction accuracy over the baseline, with an inference time of 0.015 seconds per frame (66 fps) on an NVIDIA K80 GPU, enabling the proposed model to operate in real time. We argue that incorporating gaze maps enhances the model's ability to generalize to unseen environments. Additionally, a novel course-to-steering angle conversion algorithm with an accompanying mathematical proof is proposed.
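A minimal PyTorch sketch of the multitask idea: a shared convolutional backbone feeding both a steering-angle regressor and a gaze-map decoder. The layer sizes, loss weighting, and the omission of the spatial transformer network are simplifying assumptions, not GG-Net's actual architecture.

```python
import torch
import torch.nn as nn

class MultiTaskDriver(nn.Module):
    """Shared backbone with two heads: scalar steering angle and a coarse gaze map."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.steer_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1))
        self.gaze_head = nn.Conv2d(64, 1, 1)  # 1-channel gaze-map logits at 1/8 resolution

    def forward(self, x):
        feats = self.backbone(x)
        return self.steer_head(feats), self.gaze_head(feats)

model = MultiTaskDriver()
frames = torch.randn(4, 3, 128, 256)                  # batch of front-camera frames
steer, gaze = model(frames)
# Joint loss: steering regression plus (weighted) gaze-map supervision from eye-tracking data.
loss = nn.functional.mse_loss(steer, torch.zeros(4, 1)) \
     + 0.5 * nn.functional.binary_cross_entropy_with_logits(gaze, torch.rand(4, 1, 16, 32))
print(steer.shape, gaze.shape, float(loss))
```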
{"title":"GG-Net: Gaze Guided Network for Self-driving Cars","authors":"M. Abdelkarim, M. Abbas, Alaa Osama, Dalia Anwar, Mostafa Azzam, M. Abdelalim, H. Mostafa, Samah El-Tantawy, Ibrahim Sobh","doi":"10.2352/issn.2470-1173.2021.17.avm-171","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-171","url":null,"abstract":"\u0000 Imitation learning is used massively in autonomous driving for training networks to predict steering commands from frames using annotated data collected by an expert driver. Believing that the frames taken from a front-facing camera are completely mimicking the driver’s eyes\u0000 raises the question of how eyes and the complex human vision system attention mechanisms perceive the scene. This paper proposes the idea of incorporating eye gaze information with the frames into an end-to-end deep neural network in the lane-following task. The proposed novel architecture,\u0000 GG-Net, is composed of a spatial transformer network (STN), and a multitask network to predict steering angle as well as the gaze map for the input frame. The experimental results of this architecture show a great improvement in steering angle prediction accuracy of 36% over the baseline with\u0000 inference time of 0.015 seconds per frame (66 fps) using NVIDIA K80 GPU enabling the proposed model to operate in real-time. We argue that incorporating gaze maps enhances the model generalization capability to the unseen environments. Additionally, a novel course-steering angle conversion\u0000 algorithm with a complementing mathematical proof is proposed.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128569471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Simulating tests to test simulation
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-148
Patrick Mueller, M. Lehmann, Alexander Braun
Simulation is an established tool for developing and validating camera systems. The goal of autonomous driving is pushing simulation into a more important and fundamental role for safety, validation, and coverage of billions of miles. Realistic camera models are moving more and more into focus, as simulations need to be more than photo-realistic; they need to be physically realistic, representing the actual camera system onboard the self-driving vehicle in all relevant physical aspects (and this is true not only for cameras, but also for radar and lidar). But as camera simulations become more and more realistic, how is this realism tested? Actual, physical camera samples are tested in laboratories following standards like ISO 12233, EMVA 1288, or the developing IEEE P2020, with test charts like dead leaves, slanted edge, or OECF charts. In this article we propose to validate the realism of camera simulations by simulating the physical test bench setup, and then comparing the synthetic simulation result with physical results from the real-world test bench using the established normative metrics and KPIs. While this procedure is used sporadically in industrial settings, we are not aware of a rigorous presentation of these ideas in the context of realistic camera models for autonomous driving. After describing the process, we give concrete examples for several different measurement setups using MTF and SFR, and show how these can be used to characterize the quality of different camera models.
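A simplified sketch of the proposed comparison, assuming an edge ROI cropped from both the real and the simulated chart captures: derive a line-spread function from the edge profile, take its Fourier transform as an approximate MTF, and compare MTF50 values between the two. A normative ISO 12233 slanted-edge implementation adds oversampling and edge-angle estimation that are omitted here; the ROIs below are synthetic stand-ins.

```python
import numpy as np

def mtf50_from_edge(roi: np.ndarray) -> float:
    """Approximate MTF50 (cycles/pixel) from an edge ROI: ESF -> LSF -> |FFT| -> 50% point."""
    esf = roi.mean(axis=0)                        # average rows -> edge spread function
    lsf = np.diff(esf)                            # derivative -> line spread function
    mtf = np.abs(np.fft.rfft(lsf * np.hanning(lsf.size)))
    mtf /= mtf[0]                                 # normalize to DC
    freqs = np.fft.rfftfreq(lsf.size)             # cycles per pixel
    below = np.where(mtf < 0.5)[0]
    return float(freqs[below[0]]) if below.size else float(freqs[-1])

# Illustrative ROIs: a "real" capture and a "simulated" one with slightly more blur.
x = np.linspace(-8, 8, 64)
real_roi = np.tile(1.0 / (1.0 + np.exp(-x / 1.0)), (32, 1))
sim_roi  = np.tile(1.0 / (1.0 + np.exp(-x / 1.6)), (32, 1))
print("real MTF50:", round(mtf50_from_edge(real_roi), 3),
      "sim MTF50:", round(mtf50_from_edge(sim_roi), 3))
```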
{"title":"Simulating tests to test simulation","authors":"Patrick Mueller, M. Lehmann, Alexander Braun","doi":"10.2352/ISSN.2470-1173.2020.16.AVM-148","DOIUrl":"https://doi.org/10.2352/ISSN.2470-1173.2020.16.AVM-148","url":null,"abstract":"\u0000 Simulation is an established tool to develop and validate camera systems. The goal of autonomous driving is pushing simulation into a more important and fundamental role for safety, validation and coverage of billions of miles. Realistic camera models are moving more and more into\u0000 focus, as simulations need to be more then photo-realistic, they need to be physical-realistic, representing the actual camera system onboard the self-driving vehicle in all relevant physical aspects – and this is not only true for cameras, but also for radar and lidar. But when the\u0000 camera simulations are becoming more and more realistic, how is this realism tested? Actual, physical camera samples are tested in laboratories following norms like ISO12233, EMVA1288 or the developing P2020, with test charts like dead leaves, slanted edge or OECF-charts. In this article we\u0000 propose to validate the realism of camera simulations by simulating the physical test bench setup, and then comparing the synthetical simulation result with physical results from the real-world test bench using the established normative metrics and KPIs. While this procedure is used sporadically\u0000 in industrial settings we are not aware of a rigorous presentation of these ideas in the context of realistic camera models for autonomous driving. After the description of the process we give concrete examples for several different measurement setups using MTF and SFR, and show how these\u0000 can be used to characterize the quality of different camera models.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121488036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Let The Sunshine in: Sun Glare Detection on Automotive Surround-view Cameras
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-079
Lucie Yahiaoui, Michal Uřičář, Arindam Das, S. Yogamani
Sun glare is a commonly encountered problem in both manual and automated driving. Sun glare causes over-exposure in the image and significantly impacts visual perception algorithms. For higher levels of automated driving, it is essential for the system to understand that there is sun glare which can cause system degradation. There is very limited literature on detecting sun glare for automated driving; existing work is primarily based on finding saturated brightness areas and extracting regions via image processing heuristics. From the perspective of a safety system, a highly robust algorithm is necessary. Thus, we designed two complementary algorithms: one using classical image processing techniques and one using a CNN which can learn global context. We also discuss how a sun glare detection algorithm can efficiently fit into a typical automated driving system. As there is no public dataset, we created our own and will release it publicly via the WoodScape project [1] to encourage further research in this area.
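A rough sketch of the classical branch described above: glare candidates are bright, low-saturation regions that exceed a minimum size. The threshold values and minimum blob area are assumptions chosen for illustration, not the tuned values from the paper.

```python
import cv2
import numpy as np

def detect_glare_regions(bgr: np.ndarray, v_thresh: int = 245, s_thresh: int = 40,
                         min_area: int = 500):
    """Return bounding boxes of saturated, nearly colorless blobs - likely sun glare candidates."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = ((hsv[..., 2] >= v_thresh) & (hsv[..., 1] <= s_thresh)).astype(np.uint8) * 255
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((9, 9), np.uint8))  # merge fragments
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

frame = cv2.imread("surround_view_frame.png")  # hypothetical surround-view camera frame
if frame is not None:
    print(detect_glare_regions(frame))
```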
{"title":"Let The Sunshine in: Sun Glare Detection on Automotive Surround-view Cameras","authors":"Lucie Yahiaoui, Michal Uřičář, Arindam Das, S. Yogamani","doi":"10.2352/ISSN.2470-1173.2020.16.AVM-079","DOIUrl":"https://doi.org/10.2352/ISSN.2470-1173.2020.16.AVM-079","url":null,"abstract":"Sun glare is a commonly encountered problem in both manual and automated driving. Sun glare causes over-exposure in the image and significantly impacts visual perception algorithms. For higher levels of automated driving, it is essential for the system to understand that there is sun glare which can cause system degradation. There is very limited literature on detecting sun glare for automated driving. It is primarily based on finding saturated brightness areas and extracting regions via image processing heuristics. From the perspective of a safety system, it is necessary to have a highly robust algorithm. Thus we designed two complementary algorithms using classical image processing techniques and CNN which can learn global context. We also discuss how sun glare detection algorithm will efficiently fit into a typical automated driving system. As there is no public dataset, we created our own and will release it publicly via theWoodScape project [1] to encourage further research in this area.","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131245952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Describing and Sampling the LED Flicker Signal
Pub Date: 2020-01-26 | DOI: 10.2352/issn.2470-1173.2020.16.avm-038
Robert C. Sumner
High-frequency flickering light sources such as pulse-width modulated LEDs can cause image sensors to record incorrect levels. We describe a model with a loose set of assumptions (encompassing multi-exposure HDR schemes) which can be used to define the Flicker Signal, a continuous function of time based on the phase relationship between the light source and exposure window. Analysis of the shape of this signal yields a characterization of the camera's response to a flickering light source (typically seen as an undesirable susceptibility) under a given set of parameters. Flicker Signal calculations are made on discrete samplings measured from image data. Sampling the signal is difficult, however, because it is a function of many parameters, including properties of the light source (frequency, duty cycle, intensity) and properties of the imaging system (exposure scheme, frame rate, row readout time). Moreover, there are degenerate scenarios where sufficient sampling is difficult to obtain. We present a computational approach for determining the evidence (region of interest, duration of test video) necessary to get coverage of this signal sufficient for characterization from a practical test lab setup.
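A small numeric sketch of the flicker signal as described: for a PWM source with a given frequency and duty cycle and an exposure window of length t_exp, the recorded level is proportional to the fraction of the exposure during which the LED is on, evaluated as a function of the phase offset between source and exposure. The parameter values below are arbitrary examples, not from the paper.

```python
import numpy as np

def flicker_signal(phase: np.ndarray, f_led: float, duty: float, t_exp: float,
                   n_sub: int = 4096) -> np.ndarray:
    """Fraction of the exposure window during which the PWM LED is on, per phase offset (s)."""
    t = np.linspace(0.0, t_exp, n_sub, endpoint=False)      # sub-samples inside one exposure
    out = np.empty_like(phase)
    for i, p in enumerate(phase):
        on = ((f_led * (t + p)) % 1.0) < duty                # LED on/off state over the exposure
        out[i] = on.mean()
    return out

f_led, duty, t_exp = 90.0, 0.25, 1.0 / 1000.0                # 90 Hz PWM, 25% duty, 1 ms exposure
phases = np.linspace(0.0, 1.0 / f_led, 200)                  # sweep one full source period
sig = flicker_signal(phases, f_led, duty, t_exp)
# The spread between min and max shows how strongly the recorded level depends on phase.
print("min/max recorded fraction:", round(sig.min(), 3), round(sig.max(), 3))
```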
{"title":"Describing and Sampling the LED Flicker Signal","authors":"RobertC Sumner","doi":"10.2352/issn.2470-1173.2020.16.avm-038","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2020.16.avm-038","url":null,"abstract":"\u0000 High-frequency flickering light sources such as pulse-width modulated LEDs can cause image sensors to record incorrect levels. We describe a model with a loose set of assumptions (encompassing multi-exposure HDR schemes) which can be used to define the Flicker Signal, a continuous\u0000 function of time based on the phase relationship between the light source and exposure window. Analysis of the shape of this signal yields a characterization of the camera’s response to a flickering light source–typically seen as an undesirable susceptibility–under a given\u0000 set of parameters. Flicker Signal calculations are made on discrete samplings measured from image data. Sampling the signal is difficult, however, because it is a function of many parameters, including properties of the light source (frequency, duty cycle, intensity) and properties of the\u0000 imaging system (exposure scheme, frame rate, row readout time). Moreover, there are degenerate scenarios where sufficient sampling is difficult to obtain. We present a computational approach for determining the evidence (region of interest, duration of test video) necessary to get coverage\u0000 of this signal sufficient for characterization from a practical test lab setup.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128360275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast Prediction of Contrast Detection Probability
Pub Date: 2020-01-26 | DOI: 10.2352/issn.2470-1173.2020.16.avm-040
R. Jenkin
Contrast detection probability (CDP) is proposed as an IEEE P2020 metric to predict camera performance for computer vision tasks in autonomous vehicles. Its calculation involves comparing combinations of pixel values between imaged patches. Computation of CDP for all meaningful combinations of m patches involves approximately (3/2)(m² − m)·n⁴ operations, where n is the length of one side of the patch in pixels. This work presents a method to estimate Weber-contrast-based CDP from individual patch statistics, which reduces the computation to approximately 4n²m calculations. For 180 patches of 10×10 pixels this is a reduction of approximately 6500 times, and for 180 patches of 25×25 pixels, approximately 41000 times. The absolute error in the estimated CDP is less than 0.04, or 5%, where the noise is well described by Gaussian statistics. Results are compared for simulated patches between the full calculation and the fast estimate. Basing the estimate of CDP on individual patch statistics, rather than on a pixel-to-pixel comparison, facilitates the prediction of CDP values from a physical model of exposure and camera conditions. This allows Weber CDP behavior to be investigated for a wide variety of conditions and leads to the discovery that, for the case where contrast is increased by decreasing the tone value of one patch (and therefore increasing noise as contrast increases), there exists a maximum which yields identical Weber CDP values for patches of different nominal contrast. This means Weber CDP predicts the same detection performance for patches of different contrast.
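An illustrative sketch of the cost difference, assuming Weber contrast (I_t − I_b)/I_b, a ±50% acceptance window around the nominal contrast, and Gaussian patch noise. The "fast" path here is a Monte Carlo estimate driven only by per-patch means and standard deviations; it mirrors the spirit of a statistics-based estimate rather than reproducing the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def cdp_full(target: np.ndarray, background: np.ndarray, tol: float = 0.5) -> float:
    """Full pairwise Weber CDP: every target pixel compared against every background pixel."""
    c = (target.reshape(-1, 1) - background.reshape(1, -1)) / background.reshape(1, -1)
    c0 = (target.mean() - background.mean()) / background.mean()   # nominal Weber contrast
    return float(np.mean(np.abs(c - c0) <= tol * abs(c0)))

def cdp_fast(mu_t, sd_t, mu_b, sd_b, tol: float = 0.5, n: int = 20000) -> float:
    """Estimate from patch statistics only, assuming Gaussian noise in each patch."""
    t = rng.normal(mu_t, sd_t, n)
    b = rng.normal(mu_b, sd_b, n)
    c0 = (mu_t - mu_b) / mu_b
    return float(np.mean(np.abs((t - b) / b - c0) <= tol * abs(c0)))

tgt = rng.normal(140.0, 8.0, (25, 25))   # simulated 25x25 target patch
bkg = rng.normal(100.0, 8.0, (25, 25))   # simulated 25x25 background patch
print("full:", round(cdp_full(tgt, bkg), 3),
      "fast:", round(cdp_fast(tgt.mean(), tgt.std(), bkg.mean(), bkg.std()), 3))
```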
{"title":"Fast Prediction of Contrast Detection Probability","authors":"R. Jenkin","doi":"10.2352/issn.2470-1173.2020.16.avm-040","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2020.16.avm-040","url":null,"abstract":"\u0000 Contrast detection probability (CDP) is proposed as an IEEE P2020 metric to predict camera performance intended for computer vision tasks for autonomous vehicles. Its calculation involves comparing combinations of pixel values between imaged patches. Computation of CDP for all meaningful\u0000 combinations of m patches involves approximately 3/2(m2-m).n4 operations, where n is the length of one side of the patch in pixels. This work presents a method to estimate Weber contrast based CDP based on individual patch statistics and thus reduces to computation to approximately 4n2m calculations.\u0000 For 180 patches of 10×10 pixels this is a reduction of approximately 6500 times and for 180 25×25 pixel patches, approximately 41000. The absolute error in the estimated CDP is less than 0.04 or 5% where the noise is well described by Gaussian statistics.\u0000 \u0000 Results are\u0000 compared for simulated patches between the full calculation and the fast estimate. Basing the estimate of CDP on individual patch statistics, rather than by a pixel-to-pixel comparison facilitates the prediction of CDP values from a physical model of exposure and camera conditions. This allows\u0000 Weber CDP behavior to be investigated for a wide variety of conditions and leads to the discovery that, for the case where contrast is increased by decreasing the tone value of one patch and therefore increasing noise as contrast increases, there exists a maxima which yields identical Weber\u0000 CDP values for patches of different nominal contrast. This means Weber CDP is predicting the same detection performance for patches of different contrast.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134164412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automotive Image Quality Concepts for the next SAE levels: Color Separation and Contrast Detection Probability
Pub Date: 2020-01-26 | DOI: 10.2352/issn.2470-1173.2020.16.avm-019
M. Geese
In this paper, we present an overview of automotive image quality challenges and link them to the physical properties of image acquisition. This process shows that detection-probability-based KPIs are a helpful tool for linking image quality to SAE-classified assisted and automated driving tasks. We develop questions around the challenges of automotive image quality and show that color separation probability (CSP) and contrast detection probability (CDP) in particular are key enablers for understanding and structuring the image quality optimization problem. Next, we introduce a proposal for color separation probability as a new KPI, based on the random effects of photon shot noise and the properties of light spectra that cause color metamerism. This allows us to demonstrate the image quality influences related to color at different stages of the image generation pipeline. As a second part, we investigate the previously presented KPI, contrast detection probability, and show how it links to different automotive imaging metrics such as HDR, low-light performance, and the detectivity of an object. In conclusion, this paper summarizes the standardization status of these detection-probability-based KPIs within IEEE P2020 and outlines the next steps for these work packages.
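One way to make the CSP idea concrete (this is an illustrative interpretation under simple assumptions, not the KPI definition proposed in the paper): simulate photon shot noise on the mean per-channel electron counts of two patches and estimate how often noisy samples are still assigned to the correct patch.

```python
import numpy as np

rng = np.random.default_rng(0)

def color_separation_probability(mean_rgb_a, mean_rgb_b, n: int = 50000) -> float:
    """Probability that shot-noise-perturbed samples stay closer to their own patch mean."""
    a = rng.poisson(mean_rgb_a, size=(n, 3)).astype(float)    # photon shot noise on patch A
    b = rng.poisson(mean_rgb_b, size=(n, 3)).astype(float)    # photon shot noise on patch B
    mean_a, mean_b = np.asarray(mean_rgb_a, float), np.asarray(mean_rgb_b, float)
    ok_a = np.linalg.norm(a - mean_a, axis=1) < np.linalg.norm(a - mean_b, axis=1)
    ok_b = np.linalg.norm(b - mean_b, axis=1) < np.linalg.norm(b - mean_a, axis=1)
    return float(np.concatenate([ok_a, ok_b]).mean())

# Two patches with similar mean electron counts per channel: nearly metameric under this sensor,
# so shot noise makes them hard to separate and the probability drops toward 0.5.
print(round(color_separation_probability([520, 300, 180], [500, 310, 185]), 3))
```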
{"title":"Automotive Image Quality Concepts for the next SAE levels: Color Separation and Contrast Detection Probability","authors":"M. Geese","doi":"10.2352/issn.2470-1173.2020.16.avm-019","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2020.16.avm-019","url":null,"abstract":"\u0000 In this paper, we present an overview of automotive image quality challenges and link them to the physical properties of image acquisition. This process shows that the detection probability based KPIs are a helpful tool to link image quality to the tasks of the SAE classified supported\u0000 and automated driving tasks. We develop questions around the challenges of the automotive image quality and show that especially color separation probability (CSP) and contrast detection probability (CDP) are a key enabler to improve the knowhow and overview of the image quality optimization\u0000 problem. Next we introduce a proposal for color separation probability as a new KPI which is based on the random effects of photon shot noise and the properties of light spectra that cause color metamerism. This allows us to demonstrate the image quality influences related to color at different\u0000 stages of the image generation pipeline. As a second part we investigated the already presented KPI Contrast Detection Probability and show how it links to different metrics of automotive imaging such as HDR, low light performance and detectivity of an object. As conclusion, this paper summarizes\u0000 the status of the standardization status within IEEE P2020 of these detection probability based KPIs and outlines the next steps for these work packages.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130324462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LiDAR-Camera Fusion for 3D Object Detection
Pub Date: 2020-01-26 | DOI: 10.2352/ISSN.2470-1173.2020.16.AVM-255
D. Bhanushali, R. Relyea, Karan Manghi, Abhishek Vashist, C. Hochgraf, A. Ganguly, Andres Kwasinski, M. Kuhl, R. Ptucha
The performance of autonomous agents in both commercial and consumer applications increases along with their situational awareness. Tasks such as obstacle avoidance, agent-to-agent interaction, and path planning depend directly on the ability to convert sensor readings into scene understanding. Central to this is the ability to detect and recognize objects. Many object detection methodologies operate on a single modality such as vision or LiDAR. Camera-based object detection models benefit from an abundance of feature-rich information for classifying different types of objects. LiDAR-based object detection models use sparse point clouds, where each point contains the accurate 3D position of an object surface. Camera-based methods lack accurate object-to-lens distance measurements, while LiDAR-based methods lack dense, feature-rich details. By utilizing information from both camera and LiDAR sensors, advanced object detection and identification is possible. In this work, we introduce a deep learning framework that fuses these modalities to produce a robust, real-time 3D bounding-box object detection network. We demonstrate qualitative and quantitative analysis of the proposed fusion model on the popular KITTI dataset.
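A small sketch of the geometric step that makes camera-LiDAR fusion possible, assuming KITTI-style calibration (a 3x4 camera projection matrix and a 4x4 LiDAR-to-camera transform). The matrices and point values below are placeholders, not actual KITTI calibration data, and the fusion network itself is not shown.

```python
import numpy as np

def project_lidar_to_image(points_xyz: np.ndarray, T_velo_to_cam: np.ndarray,
                           P_cam: np.ndarray) -> np.ndarray:
    """Project Nx3 LiDAR points into pixel coordinates; returns Nx3 rows of (u, v, depth)."""
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])   # homogeneous Nx4
    cam = T_velo_to_cam @ pts_h.T                                        # 4xN in camera frame
    img = P_cam @ cam                                                    # 3xN on image plane
    uv = img[:2] / img[2]                                                # perspective divide
    return np.vstack([uv, cam[2]]).T

# Placeholder calibration: identity extrinsics, simple pinhole intrinsics.
T_velo_to_cam = np.eye(4)
P_cam = np.array([[700.0,   0.0, 640.0, 0.0],
                  [  0.0, 700.0, 360.0, 0.0],
                  [  0.0,   0.0,   1.0, 0.0]])
lidar = np.array([[1.5, -0.2, 10.0], [-3.0, 0.5, 25.0]])  # points ahead of the camera (z forward here)
print(project_lidar_to_image(lidar, T_velo_to_cam, P_cam))
# Points landing inside a 2D detection box can then be pooled into a 3D bounding-box proposal.
```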
{"title":"LiDAR-Camera Fusion for 3D Object Detection","authors":"D. Bhanushali, R. Relyea, Karan Manghi, Abhishek Vashist, C. Hochgraf, A. Ganguly, Andres Kwasinski, M. Kuhl, R. Ptucha","doi":"10.2352/ISSN.2470-1173.2020.16.AVM-255","DOIUrl":"https://doi.org/10.2352/ISSN.2470-1173.2020.16.AVM-255","url":null,"abstract":"\u0000 The performance of autonomous agents in both commercial and consumer applications increases along with their situational awareness. Tasks such as obstacle avoidance, agent to agent interaction, and path planning are directly dependent upon their ability to convert sensor readings\u0000 into scene understanding. Central to this is the ability to detect and recognize objects. Many object detection methodologies operate on a single modality such as vision or LiDAR. Camera-based object detection models benefit from an abundance of feature-rich information for classifying different\u0000 types of objects. LiDAR-based object detection models use sparse point clouds, where each point contains accurate 3D position of object surfaces. Camera-based methods lack accurate object to lens distance measurements, while LiDAR-based methods lack dense feature-rich details. By utilizing\u0000 information from both camera and LiDAR sensors, advanced object detection and identification is possible. In this work, we introduce a deep learning framework for fusing these modalities and produce a robust real-time 3D bounding box object detection network. We demonstrate qualitative and\u0000 quantitative analysis of the proposed fusion model on the popular KITTI dataset.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117262062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}