Learning Stall Recovery Policies using a Soft Actor-Critic Algorithm with Smooth Reward Functions

Junqiu Wang, Jianmei Tan, Peng Lin, Chenguang Xing, Bo Liu

2023 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1-6, December 4, 2023
DOI: 10.1109/ROBIO58561.2023.10354940
Abstract
We propose an effective stall recovery learning approach based on a soft actor-critic algorithm with smooth reward functions. Stalls are extremely dangerous for aircraft and unmanned aerial vehicles (UAVs) because the resulting loss of altitude can cause fatal accidents. Stall recovery policies execute appropriate control sequences to recover aircraft from such lethal situations. Learning these policies with reinforcement learning is attractive because they can be acquired automatically; however, training is challenging because the interplay between an aircraft and its environment is highly complex. Since reward functions are critical to the convergence of policy learning, we apply smooth reward functions to the learning process, and we further improve performance by applying reward scaling to the soft actor-critic algorithm with automatic entropy learning. Experimental results demonstrate that stalls can be successfully recovered using the learned policies, and comparisons show that our method outperforms previous algorithms.
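The two ingredients the abstract names, smooth reward shaping and a soft actor-critic (SAC) update with reward scaling and automatic entropy adjustment, can be illustrated concretely. Below is a minimal sketch, not the authors' implementation: the paper does not publish its reward terms, so the angle-of-attack and sink-rate shaping, the REWARD_SCALE constant, and the action dimension are illustrative assumptions; only the temperature loss follows the standard SAC formulation with automatic entropy tuning (Haarnoja et al., 2018).

```python
# Sketch of (a) a smooth, bounded stall-recovery reward and (b) SAC's
# automatic entropy-temperature update. Reward terms and constants are
# assumptions for illustration, not the paper's published design.

import math
import torch

REWARD_SCALE = 5.0                 # assumed reward-scaling hyperparameter
ALPHA_STALL = math.radians(15.0)   # assumed stall angle of attack


def smooth_reward(aoa, altitude_rate, scale=REWARD_SCALE):
    """Smooth, bounded shaping reward: Gaussian terms instead of
    discontinuous penalties keep policy-gradient targets well-behaved."""
    r_aoa = math.exp(-((aoa / ALPHA_STALL) ** 2))        # -> 1 as AoA -> 0
    r_sink = math.exp(-(min(altitude_rate, 0.0) ** 2))   # penalize descent only
    return scale * (0.5 * r_aoa + 0.5 * r_sink)


# SAC automatic entropy tuning: learn log(alpha) by minimizing
# J(alpha) = E[-alpha * (log pi(a|s) + target_entropy)].
action_dim = 4                       # e.g. elevator/aileron/rudder/throttle (assumed)
target_entropy = -float(action_dim)  # standard heuristic: -|A|
log_alpha = torch.zeros(1, requires_grad=True)
alpha_opt = torch.optim.Adam([log_alpha], lr=3e-4)


def update_temperature(log_probs):
    """One gradient step on the temperature, given a batch of log pi(a|s)."""
    loss = -(log_alpha.exp() * (log_probs + target_entropy).detach()).mean()
    alpha_opt.zero_grad()
    loss.backward()
    alpha_opt.step()
    return log_alpha.exp().item()


if __name__ == "__main__":
    # Descending past the stall angle of attack yields a low reward.
    print(smooth_reward(aoa=math.radians(18.0), altitude_rate=-12.0))
    print(update_temperature(torch.randn(256, 1)))
```

The point of the Gaussian terms is that the reward stays bounded and differentiable everywhere, which the abstract argues is critical for convergence; reward scaling then sets the relative weight of the reward against the entropy term in SAC's maximum-entropy objective.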