Spatio-temporal deep learning framework for pedestrian intention prediction in urban traffic scenes

IF 1.4; Zone 4 (Computer Science); JCR Q4 (Computer Science, Artificial Intelligence); AI Communications; Pub Date: 2024-05-28; DOI: 10.3233/aic-230053

Monika, Pardeep Singh, Satish Chand


Citations: 0

Abstract




Pedestrian intent prediction is an essential task for ensuring the safety of pedestrians and vehicles on the road. The task is to predict whether a pedestrian intends to cross the road, based on their behavior and the surrounding environment. Previous studies have explored feature-based machine learning and vision-based deep learning models for this task, but these methods are limited in capturing the global spatio-temporal context and in fusing different data features effectively. To address these issues, we propose a novel hybrid framework, HSTGCN, for pedestrian intent prediction that combines spatio-temporal graph convolutional neural networks (STGCN) and long short-term memory (LSTM) networks. The proposed framework utilizes the strengths of both models by fusing multiple features, including skeleton pose, trajectory, height, orientation, and ego-vehicle speed, to predict pedestrians' intentions accurately. The framework’s performance has been evaluated on the JAAD benchmark dataset, and the results show that it outperforms state-of-the-art methods. The proposed framework has potential applications in developing intelligent transportation systems, autonomous vehicles, and pedestrian safety technologies. The utilization of multiple features can significantly improve the performance of the pedestrian intent prediction task.
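The abstract names the building blocks (graph convolution over skeleton pose, an LSTM, and feature fusion) but not how they are wired together. The following is a minimal, standard-library-only sketch of one plausible arrangement, not the authors' HSTGCN. Everything specific in it is an assumption made for illustration: the function names, the layer sizes, the untrained random weights, the mean pooling that stands in for STGCN's temporal convolutions, and late fusion by concatenation.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(edges, n):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def graph_conv(a_hat, x, w):
    """One spatial graph convolution with ReLU: relu(A_hat X W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, x), w)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, w):
    """One LSTM step; w maps gate name -> (weight matrix, bias) over [x, h]."""
    z = list(x) + list(h)  # concatenate input and previous hidden state
    def gate(name, act):
        weights, bias = w[name]
        return [act(sum(wi * zi for wi, zi in zip(row, z)) + b)
                for row, b in zip(weights, bias)]
    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    g = gate("g", math.tanh)
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def predict_crossing(pose_seq, context_seq, edges, seed=0):
    """pose_seq: T frames x J joints x 2 coords; context_seq: T frames x F features."""
    rng = random.Random(seed)  # untrained weights, fixed seed (assumption)
    n_joints, hidden, gc_out = len(pose_seq[0]), 4, 3  # hypothetical sizes
    a_hat = normalized_adjacency(edges, n_joints)
    w_gc = rand_matrix(rng, 2, gc_out)
    # Pose branch: graph convolution per frame, then mean over joints and frames
    # (a simplification standing in for the temporal convolutions of an STGCN).
    pose_emb = [0.0] * gc_out
    for frame in pose_seq:
        for row in graph_conv(a_hat, frame, w_gc):
            for k, v in enumerate(row):
                pose_emb[k] += v / (n_joints * len(pose_seq))
    # Context branch: LSTM over per-frame trajectory, height, orientation, ego speed.
    n_feat = len(context_seq[0])
    w_lstm = {g: (rand_matrix(rng, hidden, n_feat + hidden), [0.0] * hidden)
              for g in "ifog"}
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in context_seq:
        h, c = lstm_step(x, h, c, w_lstm)
    # Late fusion: concatenate both embeddings, then a logistic output layer.
    fused = pose_emb + h
    w_out = [rng.uniform(-0.5, 0.5) for _ in fused]
    return sigmoid(sum(w * v for w, v in zip(w_out, fused)))

# Toy input: 5 frames, a 3-joint chain skeleton, 4 context features per frame.
edges = [(0, 1), (1, 2)]
pose_seq = [[[0.1 * t, 0.2], [0.1 * t, 0.5], [0.1 * t, 0.9]] for t in range(5)]
context_seq = [[0.1 * t, 1.7, 0.0, 0.3] for t in range(5)]
p = predict_crossing(pose_seq, context_seq, edges)
print(f"crossing probability (untrained weights): {p:.3f}")
```

Because the weights are random, the printed probability carries no meaning; the sketch only shows the data flow from the two feature groups to a single crossing score. A real model would learn the weights on a dataset such as JAAD and would likely use a tensor library rather than nested lists.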


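Source journal

AI Communications (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 2.30

Self-citation rate: 12.50%

Articles published:

Review time: 4.5 months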

About the journal: AI Communications is a journal on artificial intelligence (AI) which has a close relationship to EurAI (European Association for Artificial Intelligence, formerly ECCAI). It covers the whole AI community: scientific institutions as well as commercial and industrial companies. AI Communications aims to enhance contacts and information exchange between AI researchers and developers, and to provide supranational information to those concerned with AI and advanced information processing. AI Communications publishes refereed articles concerning scientific and technical AI procedures, provided they are of sufficient interest to a large readership of both scientific and practical background. In addition, it contains high-level background material, both at the technical level and at the level of opinions, policies and news.