Small group pedestrian crossing behaviour prediction using temporal angular 2D skeletal pose

Array (IF 4.5, Q2, Computer Science, Theory & Methods). Pub Date: 2024-07-01. Epub Date: 2024-03-05. DOI: 10.1016/j.array.2024.100341

Hanugra Aulia Sidharta, Berlian Al Kindhi, Eko Mulyanto Yuniarno, Mauridhi Hery Purnomo

Array, volume 22, Article 100341. Article page: https://www.sciencedirect.com/science/article/pii/S2590005624000079


Abstract

Pedestrians are classified as Vulnerable Road Users (VRUs) because they lack protective equipment, so an accident involving them can be fatal. Accidents can happen whenever a pedestrian is on the road, especially while crossing. To ensure pedestrian safety, it is necessary to understand and predict pedestrian behaviour when crossing the road. We propose pedestrian intention prediction using a 2D pose estimation approach with temporal angle as a feature. Based on visual observation of the Joint Attention in Autonomous Driving (JAAD) dataset, we found that pedestrians tend to walk together in small groups while waiting to cross, and that these groups disband on the opposite side of the road. We therefore propose to perform prediction on small groups of pedestrians; based on pedestrian statistical data, we define a small group as consisting of 4 pedestrians. A further problem is that 2D pose estimation processes each pedestrian individually, so the index assigned to a given pedestrian is ambiguous across consecutive frames. We propose a Multi Input Single Output (MISO) model, which can process multiple pedestrians together and uses a summation layer at the end of the model to resolve the ambiguous pedestrian index problem without tracking each pedestrian. Our proposed model achieves a model accuracy of 0.9306 with a prediction performance of 0.8317.
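The two ideas in the abstract, angle features tracked over time and a summation layer that removes the dependence on pedestrian index, can be illustrated with a minimal sketch. This is not the authors' model. It assumes that a "temporal angle" is a joint angle computed per frame together with its frame-to-frame change, and that every pedestrian passes through the same (shared-weight) branch before the sum. The joint names, the `weights` vector, and all function names are hypothetical.

```python
import math

def joint_angle(a, b, c):
    """Angle in radians at joint b formed by the 2D points a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.atan2(cross, dot))

def temporal_angles(frames, triplets):
    """Per-frame joint angles followed by their frame-to-frame changes.

    frames: list of poses, each a dict mapping joint name -> (x, y).
    triplets: list of (a, b, c) joint names; the angle is measured at b.
    """
    per_frame = [[joint_angle(p[a], p[b], p[c]) for a, b, c in triplets]
                 for p in frames]
    deltas = [[cur - prev for prev, cur in zip(f0, f1)]
              for f0, f1 in zip(per_frame, per_frame[1:])]
    return [x for row in per_frame + deltas for x in row]

def shared_branch(features, weights):
    """Stand-in for a learned per-pedestrian branch (assumed shared weights)."""
    return sum(w * x for w, x in zip(weights, features))

def group_score(group_features, weights):
    """Sum the branch outputs; the result does not depend on pedestrian order."""
    return math.fsum(shared_branch(f, weights) for f in group_features)

# Hypothetical two-frame poses: one pedestrian bends a knee, one stands still.
KNEE = [("hip", "knee", "ankle")]
walker = [
    {"hip": (0, 0), "knee": (0, 1), "ankle": (0, 2)},  # leg straight
    {"hip": (0, 0), "knee": (0, 1), "ankle": (1, 1)},  # knee bent
]
stander = [walker[0], walker[0]]

# A small group of 4 pedestrians, as defined in the abstract.
group = [temporal_angles(p, KNEE) for p in (walker, stander, stander, walker)]
weights = [0.1, 0.1, 1.0]  # illustrative values only, not trained

score_a = group_score(group, weights)
score_b = group_score(list(reversed(group)), weights)  # indices shuffled
print(math.isclose(score_a, score_b))
```

Because addition is commutative, swapping which pedestrian occupies which input slot leaves the summed output unchanged, which is why no per-pedestrian tracking is needed in this sketch. The invariance relies on the shared-branch assumption; with separately weighted branches, the sum alone would not guarantee it.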
