CNN-LIDAR pedestrian classification: combining range and reflectance data

2018 IEEE International Conference on Vehicular Electronics and Safety (ICVES). Pub Date: 2018-09-01. DOI: 10.1109/ICVES.2018.8519497

Gledson Melotti, A. Asvadi, C. Premebida


Citations: 11

Abstract

The use of multiple sensors in perception systems is becoming the consensus approach in the automotive and robotics industries. Cameras are the most popular technology; however, radar and LIDAR are increasingly being adopted in protection and safety systems for object/obstacle detection. In this paper, we explore the LIDAR sensor as an inter-modality technology that provides two types of data, range (distance) and reflectance (intensity return), and study the influence of high-resolution distance/depth maps (DM) and reflectance maps (RM) on pedestrian classification using a deep Convolutional Neural Network (CNN). Pedestrian protection is critical for advanced driver assistance systems (ADAS) and autonomous driving, and it has regained particular attention recently for known reasons. In this work, CNN-LIDAR based pedestrian classification is studied in three distinct cases: (i) using a single modality as input to the CNN, (ii) combining distance and reflectance measurements at the CNN input level (early fusion), and (iii) combining output scores from two single-modal CNNs (late fusion). Raw distance and intensity (reflectance) data from the LIDAR are transformed into high-resolution (dense) maps, which can be fed directly to CNNs as either single-channel or multi-channel inputs (the latter being the early fusion approach). For late fusion, the outputs from the individual CNNs are combined by means of non-learning rules, such as minimum, maximum, average, and product. Pedestrian classification is evaluated on a 'binary classification' dataset created from the KITTI Vision Benchmark Suite, and results are shown for the three cases.
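The two fusion strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `early_fusion_input` and `late_fusion`, the map dimensions, and the example scores are all hypothetical, and the sketch assumes each single-modality CNN outputs one score per class (non-pedestrian, pedestrian). The CNN itself is left out, since the abstract does not describe its architecture.

```python
import numpy as np


def early_fusion_input(depth_map, reflectance_map):
    """Stack a dense depth map (DM) and reflectance map (RM) into one
    two-channel array that a CNN could take as input (early fusion)."""
    depth_map = np.asarray(depth_map, dtype=np.float32)
    reflectance_map = np.asarray(reflectance_map, dtype=np.float32)
    if depth_map.shape != reflectance_map.shape:
        raise ValueError("DM and RM must have the same height and width")
    # Result has shape (H, W, 2): channel 0 is DM, channel 1 is RM.
    return np.stack([depth_map, reflectance_map], axis=-1)


def late_fusion(scores_dm, scores_rm, rule="average"):
    """Combine class scores from two single-modality CNNs with a fixed,
    non-learned rule: minimum, maximum, average, or product."""
    stacked = np.stack([
        np.asarray(scores_dm, dtype=np.float64),
        np.asarray(scores_rm, dtype=np.float64),
    ])
    rules = {
        "minimum": lambda s: s.min(axis=0),
        "maximum": lambda s: s.max(axis=0),
        "average": lambda s: s.mean(axis=0),
        "product": lambda s: s.prod(axis=0),
    }
    if rule not in rules:
        raise ValueError(f"unknown fusion rule: {rule}")
    return rules[rule](stacked)


if __name__ == "__main__":
    # Hypothetical map size, chosen only for illustration.
    dm = np.random.rand(64, 32)
    rm = np.random.rand(64, 32)
    print("early-fusion input shape:", early_fusion_input(dm, rm).shape)

    # Hypothetical scores, ordered [non-pedestrian, pedestrian].
    scores_dm = [0.3, 0.7]
    scores_rm = [0.4, 0.6]
    for rule in ("minimum", "maximum", "average", "product"):
        fused = late_fusion(scores_dm, scores_rm, rule)
        print(rule, fused, "predicted class:", int(np.argmax(fused)))
```

Because the rules are fixed, late fusion in this form needs no extra training beyond the two single-modality networks, whereas early fusion requires training one network on the stacked input.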


