Data Collection Through Translation Network Based on End-to-End Deep Learning for Autonomous Driving
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-115
Zelin Zhang, J. Ohya
To avoid the manual collection of the huge amount of labeled image data needed to train autonomous driving models, this paper proposes a novel automatic method for collecting annotated image data for autonomous driving through a translation network that transforms simulation CG images into real-world images. The translation network has an end-to-end structure containing two encoder-decoder networks. The front part of the translation network represents the structure of the original simulation CG image as a semantic segmentation; the rear part of the network then translates the segmentation into a real-world image by applying a cGAN. After training, the translation network learns a mapping from simulation CG pixels to real-world image pixels. To confirm the validity of the proposed system, we conducted three experiments under different learning policies, evaluating the MSE of the steering angle and vehicle speed. The first experiment demonstrates that the L1+cGAN loss performs best among all loss functions tested for the translation network. The second experiment, conducted under different learning policies, shows that the ResNet architecture works best. The third experiment demonstrates that a model trained with the real-world images generated by the translation network still performs well in the real world. All the experimental results demonstrate the validity of the proposed method.
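As a rough illustration of the objective described above, the sketch below shows how an L1+cGAN generator loss is commonly combined in image-to-image translation; the weighting factor and tensor names are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical loss terms for the translation network's generator; the paper's
# exact network and weighting are not reproduced here.
bce = nn.BCEWithLogitsLoss()   # adversarial (cGAN) term
l1 = nn.L1Loss()               # pixel-wise reconstruction term
lambda_l1 = 100.0              # assumed weight of the L1 term (pix2pix-style)

def generator_loss(disc_fake_logits, fake_image, real_image):
    """Combined L1 + cGAN objective for the generator."""
    adv = bce(disc_fake_logits, torch.ones_like(disc_fake_logits))  # fool the discriminator
    rec = l1(fake_image, real_image)                                # stay close to the target image
    return adv + lambda_l1 * rec
```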
{"title":"Data Collection Through Translation Network Based on End-to-End Deep Learning for Autonomous Driving","authors":"Zelin Zhang, J. Ohya","doi":"10.2352/issn.2470-1173.2021.17.avm-115","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-115","url":null,"abstract":"\u0000 To avoid manual collections of a huge amount of labeled image data needed for training autonomous driving models, this paperproposes a novel automatic method for collecting image data with annotation for autonomous driving through a translation network that can transform the simulation\u0000 CG images to real-world images. The translation network is designed in an end-to-end structure that contains two encoder-decoder networks. The forepart of the translation network is designed to represent the structure of the original simulation CG image with a semantic segmentation. Then the\u0000 rear part of the network translates the segmentation to a realworld image by applying cGAN. After the training, the translation network can learn a mapping from simulation CG pixels to the realworld image pixels. To confirm the validity of the proposed system, we conducted three experiments\u0000 under different learning policies by evaluating the MSE of the steering angle and vehicle speed. The first experiment demonstrates that the L1+cGAN performs best above all loss functions in the translation network. As a result of the second experiment conducted under different learning policies,\u0000 it turns out that the ResNet architecture works best. The third experiment demonstrates that the model trained with the real-world images generated by the translation network can still work great in the real world. All the experimental results demonstrate the validity of our proposed method.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"05 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129578682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluation of semi-frozen semi-fixed neural network for efficient computer vision inference
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-213
Chyuan-Tyng Wu, P. V. Beek, Phillip Schmidt, Joao Peralta Moreira, T. Gardos
Deep neural networks have been utilized in an increasing number of computer vision tasks, demonstrating superior performance. Much research has focused on making deep networks more suitable for efficient hardware implementation, targeting low-power and low-latency real-time applications. In [1], Isikdogan et al. introduced a deep neural network design that provides an effective trade-off between flexibility and hardware efficiency. The proposed solution consists of fixed-topology hardware blocks, with partially frozen/partially trainable weights, that can be configured into a full network. Initial results on a few computer vision tasks were presented in [1]. In this paper, we further evaluate this network design by applying it to several additional computer vision use cases and comparing it to other hardware-friendly networks. The experimental results presented here show that the proposed semi-fixed, semi-frozen design achieves competitive performance on a variety of benchmarks while maintaining very high hardware efficiency.
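To make the partially frozen / partially trainable idea concrete, here is a minimal PyTorch-style sketch of a block in which one layer's weights are excluded from training; the layer sizes and the choice of which layer is frozen are assumptions for illustration, not the configuration from [1].

```python
import torch
import torch.nn as nn

# A toy "semi-frozen" block: a fixed convolution whose weights stay frozen,
# followed by a small trainable layer that adapts the block to the task.
block = nn.Sequential(
    nn.Conv2d(16, 16, kernel_size=3, padding=1),  # frozen part
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=1),             # trainable part
)

for p in block[0].parameters():
    p.requires_grad = False  # exclude frozen weights from gradient updates

trainable = [p for p in block.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)  # only trainable weights are optimized
```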
{"title":"Evaluation of semi-frozen semi-fixed neural network for efficient computer vision inference","authors":"Chyuan-Tyng Wu, P. V. Beek, Phillip Schmidt, Joao Peralta Moreira, T. Gardos","doi":"10.2352/issn.2470-1173.2021.17.avm-213","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-213","url":null,"abstract":"\u0000 Deep neural networks have been utilized in an increasing number of computer vision tasks, demonstrating superior performance. Much research has been focused on making deep networks more suitable for efficient hardware implementation, for low-power and low-latency real-time applications.\u0000 In [1], Isikdogan et al. introduced a deep neural network design that provides an effective trade-off between flexibility and hardware efficiency. The proposed solution consists of fixed-topology hardware blocks, with partially frozen/partially trainable weights, that can be configured into\u0000 a full network. Initial results in a few computer vision tasks were presented in [1]. In this paper, we further evaluate this network design by applying it to several additional computer vision use cases and comparing it to other hardware-friendly networks. The experimental results presented\u0000 here show that the proposed semi-fixed semi-frozen design achieves competitive performanc on a variety of benchmarks, while maintaining very high hardware efficiency.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122174209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantitative study of vehicle-pedestrian interactions: Towards pedestrian-adapted lighting communication functions for autonomous vehicles
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-172
Guoqin Zang, Shéhérazade Azouigui, S. Saudrais, Olivier Peyricot, M. Hébert
This paper reports the main conclusions of a field observation of vehicle-pedestrian interactions at urban crosswalks, describing the types, sequences, spatial distributions, and probabilities of occurrence of vehicle and pedestrian behaviors. This study was motivated by the fact that, in the near future, with the introduction of autonomous vehicles (AVs), human drivers will become mere passengers, no longer able to participate in traffic interactions. To recreate the necessary interactions, AVs strongly need new communication abilities to express their status and intentions, especially to pedestrians, who are the most vulnerable road users. As pedestrians rely heavily on the current behavioral mechanism to interact with vehicles, it seems preferable to take this mechanism into account in the design of new communication functions. In this study, based on more than one hundred video-recorded vehicle-pedestrian interaction scenes at urban crosswalks, eight scenarios were classified according to the different behavioral sequences. Based on the measured position of pedestrians relative to the vehicle at the time of the significant behaviors, quantitative analysis shows that distinct patterns exist for pedestrian gaze behavior and vehicle slowing-down behavior as a function of Vehicle-to-Pedestrian (V2P) distance and angle.
{"title":"Quantitative study of vehicle-pedestrian interactions: Towards pedestrian-adapted lighting communication functions for autonomous vehicles","authors":"Guoqin Zang, Shéhérazade Azouigui, S. Saudrais, Olivier Peyricot, M. Hébert","doi":"10.2352/issn.2470-1173.2021.17.avm-172","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-172","url":null,"abstract":"\u0000 This paper reports the main conclusions of a fielding observation of vehicle-pedestrian interactions at urban crosswalks, by describing the types, sequences, spatial distributions and probabilities of occurrence of the vehicle and pedestrian behaviors. This study was motivated by\u0000 the fact that in a near future, with the introduction of autonomous vehicles (AVs), human drivers will become mere passengers, no longer being able to participate into the traffic interactions. With the purpose of recreating the necessary interactions, there is a strong need of new communication\u0000 abilities for AVs to express their status and intentions, especially to pedestrians who constitute the most vulnerable road users. As pedestrians highly rely on the actual behavioral mechanism to interact with vehicles, it looks preferable to take into account this mechanism in the design\u0000 of new communication functions. In this study, through more than one hundred of video-recorded vehicle-pedestrian interaction scenes at urban crosswalks, eight scenarios were classified with respect to the different behavioral sequences. Based on the measured position of pedestrians relative\u0000 to the vehicle at the time of the significant behaviors, quantitative analysis shows that distinct patterns exist for the pedestrian gaze behavior and the vehicle slowing down behavior as a function of Vehicle-to-Pedestrian (V2P) distance and angle.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132228272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An analytic-numerical image flicker study to test novel flicker metrics
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-183
Christian Wittpahl, B. Deegan, Bob Black, Alexander Braun
The IEEE P2020 Automotive Image Quality working group is proposing new metrics and test protocols to measure image flicker, so a comprehensive validation activity is required. Light source flicker (often LED flicker), as captured in a camera output, is a product of camera exposure time, sensitivity, full well capacity, readout timing, and HDR scheme, and of the light source frequency, duty cycle, intensity, waveform, and spectrum. The proposed LED flicker metrics have to be tested and validated for a sufficient number of combinations of these camera and lighting configurations. The test space of combinations of camera and lighting parameters is unfeasibly large to cover with physical cameras and lighting setups, so a numerical simulation study to validate the proposed metrics has been performed. To model flicker, a representative pixel model has been implemented in code. The pixel model incorporates exposure time, sensitivity, full well capacity, and representative readout timings. The implemented light source model comprises a hybrid analytic-numerical approach that allows efficient generation of complex temporal lighting profiles. It simulates full- and half-wave rectified sinusoidal waveforms, representative of AC lighting, as well as pulse-width modulated lighting with variable frequency, duty cycle, intensity, and complex edge rise/fall time behaviour. In this article, both initial results from the flicker simulation model and an evaluation of the proposed IEEE metrics are presented.
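The following minimal sketch, assuming illustrative values for the PWM frequency, duty cycle, exposure time, and frame rate, shows the core mechanism such a pixel and light source model has to reproduce: integrating a pulsed source over successive exposure windows and observing the resulting frame-to-frame modulation.

```python
import numpy as np

def pwm_waveform(t, freq_hz=90.0, duty=0.25, intensity=1.0):
    """Idealized PWM light source: on for 'duty' fraction of each period."""
    phase = (t * freq_hz) % 1.0
    return np.where(phase < duty, intensity, 0.0)

def pixel_signal(start_times, exposure_s=0.001, dt=1e-6):
    """Integrate the light source over each exposure window (one per frame)."""
    signals = []
    for t0 in start_times:
        t = np.arange(t0, t0 + exposure_s, dt)
        signals.append(pwm_waveform(t).sum() * dt)
    return np.array(signals)

frame_rate = 30.0
starts = np.arange(0.0, 1.0, 1.0 / frame_rate)      # 30 exposures over one second
sig = pixel_signal(starts)
modulation = (sig.max() - sig.min()) / (sig.max() + sig.min())  # simple flicker measure
print(modulation)
```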
{"title":"An analytic-numerical image flicker study to test novel flicker metrics","authors":"Christian Wittpahl, B. Deegan, Bob Black, Alexander Braun","doi":"10.2352/issn.2470-1173.2021.17.avm-183","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-183","url":null,"abstract":"\u0000 The IEEE P2020 Automotive Image Quality working group is proposing new metrics and test protocols to measure image flicker. A comprehensive validation activity is therefore required. Light source flicker (often LED flicker), as captured in a camera output, is a product of camera\u0000 exposure time, sensitivity, full well capacity, readout timing, HDR scheme, and the light source frequency, duty cycle, intensity, waveform and spectrum. The proposed LED flicker metrics have to be tested and validated for a sufficient number of combinations of these camera and lighting configurations.\u0000 The test space of the combinations of camera and lighting parameters is unfeasibly large to test with physical cameras and lighting setups. A numerical simulation study to validate the proposed metrics has therefore been performed. To model flicker, a representative pixel model has been implemented\u0000 in code. The pixel model incorporates exposure time, sensitivity, full well capacity, and representative readout timings. The implemented light source model comprises an hybrid analyticnumerical approach that allows for efficient generation of complex temporal lighting profiles. It simulates\u0000 full and half wave rectified sinusoidal waveforms, representative of AC lighting, as well as pulse width modulated lighting with variable frequency, duty cycle, intensity, and complex edge rise/fall time behaviour. In this article, both initial results from the flicker simulation model, and\u0000 evaluation of proposed IEEE metrics, are presented.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114176876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DRAM Bandwidth Optimal Perspective Transform Engine
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-114
Mihir Mody, Rajasekhar Allu, Gang Hua, Brijesh Jadav, Niraj Nandan, Ankur Ankur, Mayank Mangla
The perspective transform (or homography) is a commonly used algorithm in ADAS and automated driving systems. It appears in multiple use cases, e.g., viewpoint change, fisheye lens distortion correction, chromatic aberration correction, and stereo image pair rectification. The algorithm needs high external DRAM memory bandwidth due to its inherent scaling, which leads to non-aligned two-dimensional memory burst accesses and a large degradation in system performance and latency. In this paper, we propose a novel perspective transform engine that reduces external DRAM memory bandwidth to alleviate this problem. The proposed solution slices the input video frame into multiple regions, with the block size tuned for each region. The paper also gives an algorithm for finding optimal region boundaries with a correspondingly tuned block size for each region. The proposed solution achieves an average bandwidth reduction of 67% compared to a traditional implementation and reaches a clock of up to 720 MHz with an output pixel throughput of 1 cycle/pixel in a 16 nm FinFET process node.
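A rough sketch of the underlying bandwidth problem is given below: for each output block, the inverse homography determines how large an input region must be fetched from DRAM, so the block size directly affects how much (potentially non-aligned) data is read per burst. The homography matrix and block sizes are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Illustrative homography (output -> input mapping uses its inverse).
H = np.array([[1.2,  0.1,   5.0],
              [0.05, 0.9,  -3.0],
              [1e-4, 2e-4,  1.0]])
H_inv = np.linalg.inv(H)

def input_footprint(x0, y0, bw, bh):
    """Area of the input region (in pixels) needed for one output block."""
    corners = np.array([[x0, y0, 1], [x0 + bw, y0, 1],
                        [x0, y0 + bh, 1], [x0 + bw, y0 + bh, 1]], dtype=float)
    src = (H_inv @ corners.T).T
    src = src[:, :2] / src[:, 2:3]            # perspective divide
    w = src[:, 0].max() - src[:, 0].min()
    h = src[:, 1].max() - src[:, 1].min()
    return w * h                               # proxy for bytes fetched per block

# Smaller blocks in strongly scaled regions fetch less redundant data per burst.
print(input_footprint(0, 0, 64, 64), input_footprint(0, 0, 16, 16))
```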
{"title":"DRAM Bandwidth Optimal Perspective Transform Engine","authors":"Mihir Mody, Rajasekhar Allu, Gang Hua, Brijesh Jadav, Niraj Nandan, Ankur Ankur, Mayank Mangla","doi":"10.2352/issn.2470-1173.2021.17.avm-114","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-114","url":null,"abstract":"\u0000 Perspective transform (or Homography) is commonly used algorithms in ADAS and Automated Driving System. Perspective transform is used in multiple use-cases e.g. viewpoint change, fisheye lens distortion correction, chromatic aberration correction, stereo image pair rectification,\u0000 This algorithm needs high external DRAM memory bandwidth due to inherent scaling, resulting in nonaligned two dimensional memory burst accesses, resulting in large degradation in system performance and latencies. In this paper, we propose a novel perspective transform engine to reduce external\u0000 memory DRAM bandwidth to alleviate this problem. The proposed solution consists of multiple regions slicing of input video frame with block size tuned for each region. The paper also gives an algorithm for finding optimal region boundaries with corresponding block size tuned for each region.\u0000 The proposed solution enables average BW reduction of 67% compared to traditional implementation and achieves clock up-to 720 MHz with output pixel throughput of 1 cycle/pixel in 16nm FinFET process node.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129487074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design, Implementation, and Evaluation of a Semi-Autonomous, Vision-based, Modular Unmanned Ground Vehicle Prototype
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-214
Doncey Albin, S. Simske
In some traditional development processes, engineering teams communicate their subsystem interfaces without much overlap of their respective disciplines and processes. In a systems engineering-driven design, however, a holistic, multidisciplined approach is implemented from the ground up, with considerable overlap between the teams in every phase of the project. Approaching a system from a holistic perspective, rather than an isolated subsystem perspective, is a fundamental component of rapid prototype development and successful system integration. It is also required for project-level concerns such as data, security, safety, and sustainability operations. This paper presents the development of a prototype modular unmanned ground vehicle (UGV) used for fire detection and elimination. Taking a systems engineering approach, the mechatronics and control systems designs are performed first, then the system and its important subsystems are built and tested, and finally the evaluation results are fed back into the next prototype iteration. The goal of this paper is to give engineering students and professionals an example of the process behind the holistic development of a semi-autonomous UGV and to provide an inexpensive, readily modified platform for engineers to build upon.
{"title":"Design, Implementation, and Evaluation of a Semi-Autonomous, Vision-based, Modular Unmanned Ground Vehicle Prototype","authors":"Doncey Albin, S. Simske","doi":"10.2352/issn.2470-1173.2021.17.avm-214","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-214","url":null,"abstract":"\u0000 In some traditional development processes, engineering teams communicate their subsystem interfaces without much overlap o their respective disciplines and processes. However, for a systems engineering-driven design, a holistic, multidisciplined approach is implemented from the ground\u0000 up, with considerable overlap between the teams in every phase of the project. Approaching a system from a holistic perspective, rather than an isolated subsystem perspective, is a fundamental component to rapid prototype development and successful system integration. It is also required for\u0000 full project-level concerns such as the data, security, safety, and sustainability operations. This paper presents the development of a prototype modular unmanned ground vehicle (UGV) used for fire detection and elimination. Taking a systems engineering approach, the mechatronics and control\u0000 systems designs are performed first, then the system and the important subsystems are built and tested, and finally, the evaluation results are fed back for the next prototype iteration. The goal of this paper is to give engineering students and professionals an example of the process behind\u0000 holistic development of a semi-autonomous UGV and to begin an inexpensive, readilymodified platform for engineers to build upon.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126449198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contrast Signal to Noise Ratio
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-186
R. Jenkin
The detection and recognition of objects is essential for the operation of autonomous vehicles and robots. Designing and predicting the performance of camera systems intended to supply information to neural networks and vision algorithms is nontrivial: optimization has to occur across many parameters, such as focal length, f-number, pixel and sensor size, exposure regime, and transmission scheme. As such, numerous metrics are being explored to assist with these design choices. The detectability index (SNRI) is derived from signal detection theory as applied to imaging systems and is used to estimate the ability of a system to statistically distinguish objects [1], most notably in the medical imaging and defense fields [2]. A new metric is proposed, Contrast Signal to Noise Ratio (CSNR), which is calculated simply as the mean contrast divided by the standard deviation of the contrast. This is distinct from the contrast-to-noise ratio, which uses the noise of the image as the denominator [3,4]. It is shown mathematically that the metric is proportional to the idealized observer for a cobblestone target, and a constant may be calculated to estimate SNRI from CSNR, accounting for target size. Results are further compared to Contrast Detection Probability (CDP), a relatively new objective image quality metric proposed within IEEE P2020 to rank the performance of camera systems intended for use in autonomous vehicles [5]. CSNR is shown to provide information in illumination and contrast conditions where CDP saturates, and it can further be modified to provide CDP-like results.
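A minimal sketch of how CSNR could be computed from repeated patch measurements is shown below; the Michelson-style contrast definition and the simulated data are assumptions for illustration, since the paper's exact measurement procedure is not reproduced here.

```python
import numpy as np

def csnr(patch_a, patch_b):
    """CSNR = mean contrast / standard deviation of the contrast.

    patch_a, patch_b: arrays of per-frame mean signals of two patches.
    """
    contrast = (patch_a - patch_b) / (patch_a + patch_b)   # assumed contrast definition
    return contrast.mean() / contrast.std(ddof=1)

rng = np.random.default_rng(0)
bright = rng.normal(200.0, 5.0, size=100)   # simulated per-frame means of a bright patch
dark = rng.normal(150.0, 5.0, size=100)     # simulated per-frame means of a dark patch
print(csnr(bright, dark))
```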
{"title":"Contrast Signal to Noise Ratio","authors":"R. Jenkin","doi":"10.2352/issn.2470-1173.2021.17.avm-186","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-186","url":null,"abstract":"\u0000 The detection and recognition of objects is essential for the operation of autonomous vehicles and robots. Designing and predicting the performance of camera systems intended to supply information to neural networks and vision algorithms is nontrivial. Optimization has to occur across\u0000 many parameters, such as focal length, f-number, pixel and sensor size, exposure regime and transmission schemes. As such numerous metrics are being explored to assist with these design choices. Detectability index (SNRI) is derived from signal detection theory as applied to imaging systems\u0000 and is used to estimate the ability of a system to statistically distinguish objects [1], most notably in the medical imaging and defense fields [2].\u0000 \u0000 A new metric is proposed, Contrast Signal to Noise Ratio (CSNR), which is calculated simply as mean contrast divided by the standard\u0000 deviation of the contrast. This is distinct from contrast to noise ratio which uses the noise of the image as the denominator [3,4]. It is shown mathematically that the metric is proportional to the idealized observer for a cobblestone target and a constant may be calculated to estimate SNRI\u0000 from CSNR, accounting for target size. Results are further compared to Contrast Detection Probability (CDP), which is a relatively new objective image quality metric proposed within IEEE P2020 to rank the performance of camera systems intended for use in autonomous vehicles [5]. CSNR is shown\u0000 to generate information in illumination and contrast conditions where CDP saturates and further can be modified to provide CDP-like results.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114805934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data driven degradation of automotive sensors and effect analysis
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-180
S. Fleck, B. May, Gwen Daniel, C. Davies
Autonomous driving plays a crucial role in preventing accidents, and modern vehicles are equipped with multimodal sensor systems and AI-driven perception and sensor fusion. These features are, however, not stable over a vehicle's lifetime due to various forms of degradation. This introduces an inherent, yet unaddressed risk: once vehicles are in the field, their individual exposure to environmental effects leads to unpredictable behavior. The goal of this paper is to raise awareness of automotive sensor degradation. Various effects exist which, in combination, may have a severe impact on AI-based processing and ultimately on the customer domain. Failure mode and effects analysis (FMEA)-type approaches are used to structure complete coverage of the relevant automotive degradation effects. The sensors considered include cameras, RADARs, LiDARs, and other modalities, both outside and in-cabin. Sensor robustness alone is a well-known topic addressed by DV/PV; however, this is not sufficient, and we look at various degradations that go significantly beyond currently tested environmental stress scenarios. In addition, the combination of sensor degradation and its impact on AI processing is identified as a validation gap. An outlook on future analysis and on ways to detect relevant sensor degradations is also presented.
{"title":"Data driven degradation of automotive sensors and effect analysis","authors":"S. Fleck, B. May, Gwen Daniel, C. Davies","doi":"10.2352/issn.2470-1173.2021.17.avm-180","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-180","url":null,"abstract":"\u0000 Autonomous driving plays a crucial role to prevent accidents and modern vehicles are equipped with multimodal sensor systems and AI-driven perception and sensor fusion. These features are however not stable during a vehicle’s lifetime due to various means of degradation. This\u0000 introduces an inherent, yet unaddressed risk: once vehicles are in the field, their individual exposure to environmental effects lead to unpredictable behavior. The goal of this paper is to raise awareness of automotive sensor degradation. Various effects exist, which in combination may have\u0000 a severe impact on the AI-based processing and ultimately on the customer domain. Failure mode and effects analysis (FMEA) type approaches are used to structure a complete coverage of relevant automotive degradation effects. Sensors include cameras, RADARs, LiDARs and other modalities, both\u0000 outside and in-cabin. Sensor robustness alone is a well-known topic which is addressed by DV/PV. However, this is not sufficient and various degradations will be looked at which go significantly beyond currently tested environmental stress scenarios. In addition, the combination of sensor\u0000 degradation and its impact on AI processing is identified as a validation gap. An outlook to future analysis and ways to detect relevant sensor degradations is also presented.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"149 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121397575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Radiometry and Photometry for Autonomous Vehicles and Machines - Fundamental Performance Limits
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-211
R. Jenkin, Cheng Zhao
As autonomous vehicles and machines, such as self-driving cars, agricultural drones, and industrial robots, become ubiquitous, there is an increasing need to understand the objective performance of the cameras that support these functions. Images go beyond aesthetic and subjective roles as they assume increasing aspects of control, safety, and diagnostic capabilities. Radiometry and photometry are fundamental to describing the behavior of light and modeling the signal chain for imaging systems, and as such are crucial for establishing objective behavior.

As an engineer or scientist, having an intuitive feel for the magnitude of units and the physical behavior of components or systems in any field improves development capabilities and guards against rudimentary errors. Back-of-the-envelope estimations provide comparisons against which detailed calculations may be tested, and will urge a developer to "try again" if, for example, the order of magnitude is off. They also provide a quick check of the feasibility of ideas, a "giggle" or "straight-face" test as it is sometimes known.

This paper is a response to the authors' observation that, among newcomers to the imaging field and existing image scientists alike, there is a general deficit of intuition about the units and orders of magnitude of signals in typical cameras for autonomous vehicles and the conditions within which they operate. Further, a number of misconceptions persist regarding general radiometric and photometric behavior; confusion between the inverse square law as applied to illumination and the consistency of image luminance versus distance is a common example.

The authors detail a radiometric and photometric model for an imaging system, using it to clarify vocabulary, units, and behaviors. The model is then used to estimate the number of quanta expected in the pixels of typical imaging systems for each patch of a MacBeth color checker under a wide variety of illumination conditions. These results form the basis for establishing the fundamental performance limits of passive camera systems, based both solely on camera geometry and additionally considering the quantum efficiencies typically available at present. Further, a mental model is given that allows users to quickly estimate the number of photoelectrons in a pixel.
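In the spirit of the back-of-the-envelope estimates advocated above, the sketch below chains the camera equation and an approximate lux-to-photon conversion to estimate photoelectrons per pixel; all numeric values (f-number, exposure time, pixel pitch, quantum efficiency, lens transmission, and the roughly 4e15 photons/s/m^2 per lux ballpark for 555 nm-weighted light) are illustrative assumptions rather than the paper's figures.

```python
import math

def photoelectrons(scene_lum_cd_m2, f_number=2.0, exposure_s=0.01,
                   pixel_pitch_m=3e-6, qe=0.7, t_lens=0.9):
    # Camera equation: sensor-plane illuminance (lux) from scene luminance.
    e_sensor = math.pi * t_lens * scene_lum_cd_m2 / (4.0 * f_number ** 2)
    # Ballpark conversion: 1 lux of ~555 nm light is about 4e15 photons/s/m^2.
    photon_flux = e_sensor * 4e15
    photons = photon_flux * exposure_s * pixel_pitch_m ** 2
    return qe * photons                      # photoelectrons collected per pixel

print(f"{photoelectrons(100.0):.0f} e- for a 100 cd/m^2 patch")
```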
{"title":"Radiometry and Photometry for Autonomous Vehicles and Machines - Fundamental Performance Limits","authors":"R. Jenkin, Cheng Zhao","doi":"10.2352/issn.2470-1173.2021.17.avm-211","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-211","url":null,"abstract":"\u0000 As autonomous vehicles and machines, such as self-driving cars, agricultural drones and industrial robots, become ubiquitous, there is an increasing need to understand the objective performance of cameras to support these functions. Images go beyond aesthetic and subjective roles\u0000 as they assume increasing aspects of control, safety, and diagnostic capabilities. Radiometry and photometry are fundamental to describing the behavior of light and modeling the signal chain for imaging systems, and as such, are crucial for establishing objective behavior.\u0000 \u0000 As an\u0000 engineer or scientist, having an intuitive feel for the magnitude of units and the physical behavior of components or systems in any field improves development capabilities and guards against rudimentary errors. Back-of-the-envelope estimations provide comparisons against which detailed calculations\u0000 may be tested and will urge a developer to “try again” if the order of magnitude is off for example. They also provide a quick check for the feasibility of ideas, a “giggle” or “straight-face” test as it is sometimes known.\u0000 \u0000 This paper is a response\u0000 to the observation of the authors that, amongst participants that are newly relying on the imaging field and existing image scientists alike, there is a general deficit of intuition around the units and order of magnitude of signals in typical cameras for autonomous vehicles and the conditions\u0000 within which they operate. Further, there persists a number of misconceptions regarding general radiometric and photometric behavior. Confusion between the inverse square law as applied to illumination and consistency of image luminance versus distance is a common example.\u0000 \u0000 The authors\u0000 detail radiometric and photometric model for an imaging system, using it to clarify vocabulary, units and behaviors. The model is then used to estimate the number of quanta expected in pixels for typical imaging systems for each of the patches of a MacBeth color checker under a wide variety\u0000 of illumination conditions. These results form the basis to establish the fundamental limits of performance for passive camera systems based both solely on camera geometry and additionally considering typical quantum efficiencies available presently. Further a mental model is given which will\u0000 quickly allow user to estimate numbers of photoelectrons in pixel.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"254 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134266443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RoadEdgeNet: Road Edge Detection System Using Surround View Camera Images
Pub Date: 2021-01-18 | DOI: 10.2352/issn.2470-1173.2021.17.avm-210
Ashok Dahal, Eric Golab, Rajender Garlapati, Varun Ravi Kumar, S. Yogamani
A road edge is defined as the borderline where the road surface changes to a non-road surface. Most currently existing solutions for road edge detection use only a single front camera to capture the input image; hence, the system's performance and robustness suffer. Our efficient CNN, trained on a very diverse dataset, yields more than 98% semantic segmentation accuracy for the road surface, which is then used to obtain road edge segments for the individual camera images. Afterward, the raw road edges from the multiple cameras are transformed into world coordinates, and RANSAC curve fitting is used to obtain the final road edges on both sides of the vehicle for driving assistance. The road edge extraction process is also very computationally efficient, as it reuses the same generic road segmentation output that is computed along with the other semantic segmentation classes for driving assistance and autonomous driving. The RoadEdgeNet algorithm is designed for automated driving in series production, and we discuss the various challenges and limitations of the current algorithm.
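As an illustration of the final fitting stage, the sketch below applies RANSAC to candidate edge points in world coordinates using a quadratic model; the thresholds, iteration count, and synthetic points are assumptions for illustration, not the production RoadEdgeNet parameters.

```python
import numpy as np

def ransac_poly(points, degree=2, iters=200, thresh=0.2, seed=0):
    """Fit x = poly(y) to (x, y) edge candidates with a simple RANSAC loop."""
    rng = np.random.default_rng(seed)
    x, y = points[:, 0], points[:, 1]
    best_coeffs, best_inliers = None, 0
    for _ in range(iters):
        idx = rng.choice(len(points), size=degree + 1, replace=False)
        coeffs = np.polyfit(y[idx], x[idx], degree)        # fit a minimal sample
        residuals = np.abs(np.polyval(coeffs, y) - x)
        inliers = int((residuals < thresh).sum())
        if inliers > best_inliers:
            best_coeffs, best_inliers = coeffs, inliers
    return best_coeffs

# Synthetic left road edge in vehicle coordinates (metres), with a few outliers.
y = np.linspace(0.0, 30.0, 60)
x = 0.002 * y ** 2 + 3.5
pts = np.column_stack([x, y])
pts[::7, 0] += 1.5                      # inject outliers
print(ransac_poly(pts))                 # roughly [0.002, 0.0, 3.5]
```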
{"title":"RoadEdgeNet: Road Edge Detection System Using Surround View Camera Images","authors":"Ashok Dahal, Eric Golab, Rajender Garlapati, Varun Ravi Kumar, S. Yogamani","doi":"10.2352/issn.2470-1173.2021.17.avm-210","DOIUrl":"https://doi.org/10.2352/issn.2470-1173.2021.17.avm-210","url":null,"abstract":"\u0000 Road Edge is defined as the borderline where there is a change from the road surface to the non-road surface. Most of the currently existing solutions for Road Edge Detection use only a single front camera to capture the input image; hence, the system’s performance and robustness\u0000 suffer. Our efficient CNN trained on a very diverse dataset yields more than 98% semantic segmentation for the road surface, which is then used to obtain road edge segments for individual camera images. Afterward, the multi-cameras raw road edges are transformed into world coordinates, and\u0000 RANSAC curve fitting is used to get the final road edges on both sides of the vehicle for driving assistance. The process of road edge extraction is also very computationally efficient as we can use the same generic road segmentation output, which is computed along with other semantic segmentation\u0000 for driving assistance and autonomous driving. RoadEdgeNet algorithm is designed for automated driving in series production, and we discuss the various challenges and limitations of the current algorithm.\u0000","PeriodicalId":177462,"journal":{"name":"Autonomous Vehicles and Machines","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123474379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}