Authors: Xiaohan Huang; Yuhu Cheng; Qiang Yu; Xuesong Wang
Journal: IEEE Transactions on Cognitive and Developmental Systems, vol. 16, no. 6, pp. 2070-2084 (impact factor 5.0; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-03-30
DOI: 10.1109/TCDS.2024.3405896
URL: https://ieeexplore.ieee.org/document/10542087/
Deep Reinforcement Learning for Autonomous Driving Based on Safety Experience Replay
Safety has always been a top priority in autonomous driving, and it has become more pressing in recent years as deep reinforcement learning (DRL) is increasingly applied to driving tasks. Reinforcement learning (RL) learns by trial-and-error interaction with the environment, so without safety constraints it may produce unsafe driving behavior. Such behavior can cause path deviation or even collisions, leading to catastrophic accidents. This article therefore proposes an RL algorithm based on a safety experience replay mechanism, aimed primarily at improving the safety of RL in autonomous driving. First, the ego vehicle performs a preliminary exploration of the environment to collect data. Based on the task-completion performance observed in each trajectory, safety labels of different levels are assigned to all state-action pairs, which establishes a safety experience buffer. Next, a safety-critic network is constructed and trained on random samples drawn from the safety experience buffer. This enables the network to evaluate the safety of driving actions quantitatively, so that the ego vehicle can drive safely. Experimental results indicate that the proposed method effectively reduces driving risk and improves task success rates compared with conventional RL algorithms.
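The pipeline described in the abstract (label trajectories by outcome, store the labeled state-action pairs in a safety experience buffer, then train a safety critic on random samples) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three outcome names and their numeric risk labels, the linear model standing in for the safety-critic network, the squared-error update, and the `safest_action` helper that picks the lowest-risk candidate action.

```python
import random
from dataclasses import dataclass

# Hypothetical outcome-to-label mapping; the abstract only states that
# safety labels have "different levels".
SAFETY_LABELS = {"success": 0.0, "deviation": 0.5, "collision": 1.0}


@dataclass
class Transition:
    state: tuple   # observation of the ego vehicle
    action: float  # driving action (a scalar here for simplicity)
    risk: float    # safety label inherited from the trajectory outcome


def build_safety_buffer(trajectories):
    """Assign each state-action pair the label of its trajectory's outcome.

    trajectories: list of (outcome, [(state, action), ...]) tuples.
    """
    buffer = []
    for outcome, pairs in trajectories:
        risk = SAFETY_LABELS[outcome]
        for state, action in pairs:
            buffer.append(Transition(tuple(state), action, risk))
    return buffer


class LinearSafetyCritic:
    """Linear stand-in for the safety-critic network (an assumption)."""

    def __init__(self, state_dim, lr=0.05):
        # One weight per state feature, one for the action, one bias term.
        self.w = [0.0] * (state_dim + 2)
        self.lr = lr

    def _features(self, state, action):
        return list(state) + [action, 1.0]

    def predict(self, state, action):
        """Estimated risk of taking `action` in `state`."""
        x = self._features(state, action)
        return sum(w * xi for w, xi in zip(self.w, x))

    def update(self, batch):
        """One stochastic gradient step per sample on the squared error."""
        for t in batch:
            x = self._features(t.state, t.action)
            err = self.predict(t.state, t.action) - t.risk
            self.w = [w - self.lr * err * xi for w, xi in zip(self.w, x)]


def train(critic, buffer, steps=2000, batch_size=8, seed=0):
    """Train the critic on random mini-batches drawn from the buffer."""
    rng = random.Random(seed)
    k = min(batch_size, len(buffer))
    for _ in range(steps):
        critic.update(rng.sample(buffer, k))


def safest_action(critic, state, candidates):
    """Hypothetical use of the critic: choose the lowest-risk candidate."""
    return min(candidates, key=lambda a: critic.predict(state, a))
```

In this toy setup a state could be the normalized gap to the lead vehicle and an action a normalized acceleration; after training on exploration data, `safest_action(critic, state, [-1.0, 0.0, 1.0])` would return the candidate the critic scores as least risky. How the paper actually combines the safety critic with the driving policy is not stated in the abstract.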
About the journal:
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.