Yaping Liao , Guizhen Yu , Peng Chen , Bin Zhou , Han Li
Transportmetrica A-Transport Science, 20(1), published 2024-01-02. DOI: 10.1080/23249935.2022.2035846. Impact factor 3.6 (JCR Q2, Transportation). Source: https://www.sciencedirect.com/org/science/article/pii/S2324993522006728
Modelling personalised car-following behaviour: a memory-based deep reinforcement learning approach
To adapt to human driving habits, this study develops a personalised car-following model via a memory-based deep reinforcement learning approach. Specifically, Twin Delayed Deep Deterministic Policy Gradients (TD3) is integrated with a long short-term memory (LSTM) network; the combination is abbreviated as LSTM-TD3. Using the NGSIM dataset, unsupervised learning-based clustering and data feature analyses are performed. The driving characteristics related to safety, efficiency and comfort are extracted for three driving styles: aggressive, common and conservative. Reward functions are then constructed for each driving style by incorporating its driving characteristics. Using the TD3 policy within a recurrent actor–critic framework, LSTM-TD3 optimises the car-following behaviour via trial-and-error interactions according to the reward functions. Results show that, compared with LSTM-DDPG and DDPG, LSTM-TD3 reproduces personalised car-following behaviour with desirable convergence speed and reward. The results also indicate that LSTM-TD3 can reflect the essential differences in safety, efficiency and comfort requirements among different driving styles.
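The abstract does not spell out the update rule, so the following is only a minimal sketch of the critic target used by standard TD3 (clipped double-Q learning with target policy smoothing), not the authors' implementation. The function name `td3_target`, the callables passed in, the hyperparameter defaults and the acceleration bounds are all illustrative assumptions; the LSTM encoding of the observation history is omitted, so `next_state` stands in for whatever state representation the recurrent networks would produce.

```python
import numpy as np


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               rng, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=-3.0, act_high=3.0):
    """Critic target of standard TD3 for a batch of transitions.

    All defaults are assumptions for illustration; act_low/act_high are
    hypothetical acceleration limits in m/s^2, not values from the paper.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = target_actor(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, act_low, act_high)

    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))

    # Bootstrapped target; terminal transitions (done == 1) keep only the reward.
    return reward + gamma * (1.0 - done) * q_min
```

In a driving-style setting, `reward` would come from the style-specific reward function (aggressive, common or conservative), so the same target computation serves all three styles.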
About the journal:
Transportmetrica A provides a forum for original discourse in transport science. The international journal's focus is on the scientific approach to transport research methodology and empirical analysis of moving people and goods. Papers related to all aspects of transportation are welcome. A rigorous peer review that involves editor screening and anonymous refereeing for submitted articles facilitates quality output.