Hybrid lane change strategy of autonomous vehicles based on SOAR cognitive architecture and deep reinforcement learning
Rongliang Zhou, Haotian Cao, Jiakun Huang, Xiaolin Song, Jing Huang, Zhi Huang
Neurocomputing (impact factor 5.5; JCR Q1, Computer Science, Artificial Intelligence), published 2024-10-05
DOI: 10.1016/j.neucom.2024.128669
https://www.sciencedirect.com/science/article/pii/S0925231224014401
Citations: 0
Abstract
Research on lane change strategies for autonomous vehicles is important for optimizing traffic flow efficiency, enhancing driving safety, and adapting to complex traffic environments. Although numerous rule-based and machine-learning approaches have been explored for highway lane changes, they frequently exhibit limited performance owing to the complexity of driving environments. This study proposes a novel lane change strategy for autonomous vehicles that uses a hybrid framework integrating the SOAR cognitive architecture and deep reinforcement learning (DRL). First, we introduce RuleCOSI+, a rule extraction algorithm based on tree ensembles, to extract concise lane change rules from large-scale human driving data. These rules, together with traffic regulations and safety rules, constitute the long-term memory of the SOAR cognitive architecture, enabling transparent decision-making. Next, by analyzing the clipping mechanism of the proximal policy optimization (PPO) algorithm, we propose an Adaptive Clipping PPO (ACPPO) algorithm based on sample importance. This algorithm adopts different clipping strategies for SOAR samples and ACPPO samples during training, enabling it to make more effective use of samples with different levels of importance. Then, we propose a hybrid decision-making algorithm, SOAR-ACPPO, which combines the SOAR cognitive architecture with the ACPPO algorithm and leverages SOAR’s prior knowledge to guide agent learning effectively and safely. Finally, by selecting an appropriate intervention probability and weaning strategy, the system avoids inappropriate knowledge intervention and ensures adequate exploration of the environment. Simulation experiments in the CARLA simulator show that the proposed strategy not only improves learning efficiency but also enhances driving efficiency and safety. It also exhibits a degree of human-like behavior and interpretability.
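The abstract does not state the actual clipping rule or weaning schedule, so the following Python sketch only illustrates the general idea under stated assumptions. It uses the standard PPO clipped surrogate but picks the clip range per sample according to whether the action came from the rule-based guide, and it decays the intervention probability linearly. The function names, the clip ranges `eps_soar` and `eps_policy`, and the parameters `p0` and `wean_steps` are all hypothetical and are not taken from the paper.

```python
import random


def clipped_surrogate(ratios, advantages, from_soar,
                      eps_soar=0.3, eps_policy=0.2):
    """Mean PPO clipped surrogate with a per-sample clip range.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    from_soar:  True where the action was supplied by the rule-based guide
    eps_soar, eps_policy: hypothetical clip ranges (assumed, not from the paper)
    """
    total = 0.0
    for ratio, adv, soar in zip(ratios, advantages, from_soar):
        eps = eps_soar if soar else eps_policy
        clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
        # Standard PPO objective: take the more pessimistic of the two terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)


def intervention_probability(step, p0=0.5, wean_steps=10_000):
    """Assumed weaning schedule: linear decay from p0 to zero."""
    return p0 * max(0.0, 1.0 - step / wean_steps)


def choose_action(step, soar_action, policy_action, rng=random):
    """Return (action, from_soar); the flag later selects the clip range."""
    use_soar = (soar_action is not None
                and rng.random() < intervention_probability(step))
    return (soar_action, True) if use_soar else (policy_action, False)
```

In this sketch, a sample flagged `from_soar` is allowed a wider policy update than a self-generated one, and once `step` exceeds `wean_steps` the agent always acts on its own policy. Whether the paper widens or narrows the range for guided samples, and how it schedules weaning, would need to be checked against the full text.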
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.