Reinforcement learning control for a flapping-wing micro aerial vehicle with output constraint

IF 1.9 | CAS Zone 4, Computer Science | JCR Q3, AUTOMATION & CONTROL SYSTEMS | Assembly Automation | Pub Date: 2022-10-27 | DOI: 10.1108/aa-05-2022-0140

Haifeng Huang, Xiaoyan Wu, Tingting Wang, Yongbin Sun, Qiang Fu


Citations: 1

Abstract

Purpose: This paper studies the application of reinforcement learning (RL) to the control of an output-constrained flapping-wing micro aerial vehicle (FWMAV) with system uncertainty.

Design/methodology/approach: A six-degrees-of-freedom hummingbird model is used, without consideration of the inertial effects of the wings. An RL algorithm based on the actor–critic framework is applied, which consists of an actor network with an unknown policy gradient and a critic network with an unknown value function. Given the good performance of neural networks (NNs) in fitting nonlinearities and their optimization properties, an actor–critic NN optimization algorithm is designed, in which the actor NN generates a policy and the critic NN approximates the cost function. In addition, to ensure the safe and stable flight of the FWMAV, a barrier Lyapunov function is used to keep the flight states within predefined regions. The stability of the system is analyzed using Lyapunov stability theory, and finally, the feasibility of RL for the control of an FWMAV is verified through simulation.

Findings: The proposed RL control scheme works well in ensuring trajectory tracking of the FWMAV in the presence of an output constraint and system uncertainty.

Originality/value: A novel RL algorithm based on the actor–critic framework is applied to the control of an FWMAV with system uncertainty. For the stable and safe flight of the FWMAV, the output constraint problem is considered and solved by barrier Lyapunov function-based control.
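The abstract does not state which barrier Lyapunov function the authors use. As an illustration only, the sketch below assumes the log-type form that is common in the output-constraint literature, V(e) = 0.5 * ln(k_b^2 / (k_b^2 - e^2)), where e is a tracking error and k_b is the constraint bound; both symbols and both function names are hypothetical and not taken from the paper. It shows the property that makes such a function useful: V is zero at e = 0 and grows without bound as |e| approaches k_b, so keeping V bounded keeps the error inside the predefined region.

```python
import math


def log_barrier_lyapunov(e: float, k_b: float) -> float:
    """Assumed log-type barrier function V(e) = 0.5 * ln(k_b^2 / (k_b^2 - e^2)).

    Defined only for |e| < k_b; not necessarily the form used in the paper.
    """
    if k_b <= 0:
        raise ValueError("constraint bound k_b must be positive")
    if abs(e) >= k_b:
        raise ValueError("tracking error violates the constraint |e| < k_b")
    return 0.5 * math.log(k_b**2 / (k_b**2 - e**2))


def log_barrier_gradient(e: float, k_b: float) -> float:
    """Derivative dV/de = e / (k_b^2 - e^2) of the assumed barrier function."""
    if k_b <= 0:
        raise ValueError("constraint bound k_b must be positive")
    if abs(e) >= k_b:
        raise ValueError("tracking error violates the constraint |e| < k_b")
    return e / (k_b**2 - e**2)


if __name__ == "__main__":
    # The barrier value rises sharply as the error nears the bound k_b = 1.
    for error in (0.0, 0.5, 0.9, 0.99):
        value = log_barrier_lyapunov(error, 1.0)
        slope = log_barrier_gradient(error, 1.0)
        print(f"e={error:4.2f}  V={value:8.4f}  dV/de={slope:9.4f}")
```

In a controller built on this idea, the gradient term typically replaces the plain error e in the feedback law, so the corrective action becomes stronger as the error approaches the bound.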




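Source journal

Assembly Automation (Engineering & Technology, Engineering: Manufacturing)

CiteScore: 4.30
Self-citation rate: 14.30%
Articles published: (not listed)
Review time: 3.3 months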

Journal description: Assembly Automation publishes peer-reviewed research articles, technology reviews and specially commissioned case studies. Each issue includes high-quality content covering all aspects of assembly technology and automation, and reflecting the most interesting and strategically important research and development activities from around the world. Because of this, readers can stay at the very forefront of industry developments. All research articles undergo rigorous double-blind peer review, and the journal’s policy of not publishing work that has only been tested in simulation means that only the very best and most practical research articles are included. This ensures that the material that is published has real relevance and value for commercial manufacturing and research organizations.