Encoding Crowd Interaction with Deep Neural Network for Pedestrian Trajectory Prediction

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00553

Yanyu Xu, Zhixin Piao, Shenghua Gao

Pages: 5275-5284

Citations: 207

Abstract

Pedestrian trajectory prediction is a challenging task because of the complex nature of human behavior. In this paper, we tackle the problem within a deep learning framework by considering the motion information of each pedestrian and its interaction with the crowd. Specifically, motivated by residual learning in deep learning, we propose to predict the displacement between neighboring frames for each pedestrian sequentially. To predict this displacement, we design a crowd interaction deep neural network (CIDNN) that accounts for the differing importance of different pedestrians to the displacement prediction of a target pedestrian. We use an LSTM to model the motion information of every pedestrian and a multi-layer perceptron to map the location of each pedestrian to a high-dimensional feature space, where the inner product between features measures the spatial affinity between two pedestrians. We then weight the motion features of all pedestrians by their spatial affinity to the target pedestrian to predict the location displacement. Extensive experiments on publicly available datasets validate the effectiveness of our method for trajectory prediction.
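The affinity-weighted displacement step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the random weights, the softmax normalization of the affinities, and the linear output layer are all assumptions made for illustration, and the per-pedestrian motion features are random placeholders standing in for LSTM outputs. The names `mlp`, `softmax`, `predict_displacement`, and `params` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    """Two-layer perceptron with ReLU: maps 2-D locations to feature vectors."""
    return np.maximum(x @ w1, 0.0) @ w2

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_displacement(locations, motion_feats, params):
    """Predict one displacement per pedestrian.

    locations:    (N, 2) current positions
    motion_feats: (N, D) per-pedestrian motion features (placeholder for LSTM outputs)
    returns:      (N, 2) predicted displacements to the next frame
    """
    h = mlp(locations, params["w1"], params["w2"])  # (N, F) location features
    affinity = h @ h.T                              # (N, N) inner-product spatial affinity
    weights = softmax(affinity)                     # assumption: softmax normalization per target
    crowd = weights @ motion_feats                  # (N, D) affinity-weighted motion features
    return crowd @ params["w_out"]                  # assumption: linear map to a 2-D displacement

# Hypothetical sizes: pedestrians, motion feature dim, location feature dim, hidden dim.
N, D, F, H = 4, 8, 16, 32
params = {
    "w1": rng.normal(scale=0.1, size=(2, H)),
    "w2": rng.normal(scale=0.1, size=(H, F)),
    "w_out": rng.normal(scale=0.1, size=(D, 2)),
}
locations = rng.uniform(0.0, 10.0, size=(N, 2))
motion_feats = rng.normal(size=(N, D))  # random stand-in, not a trained LSTM

displacement = predict_displacement(locations, motion_feats, params)
# Residual-style update: the model outputs an offset, not an absolute position.
next_locations = locations + displacement
```

Predicting the offset and adding it to the current location mirrors the residual-learning motivation stated in the abstract; repeating this step frame by frame would yield a sequential trajectory prediction.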

