Neural network position and orientation control of an inverted pendulum on wheels

2019 19th International Conference on Advanced Robotics (ICAR). Pub Date: 2019-12-01. DOI: 10.1109/ICAR46387.2019.8981659

Christian Dengler, B. Lohmann

Pages: 350-355


Abstract

In this contribution, we develop a feedback controller for a wheeled inverted pendulum in the form of a neural network that not only stabilizes the unstable system but also allows the wheeled robot to drive to arbitrary positions within a certain radius and take a desired orientation, without the need to compute a feasible trajectory to the desired position online. While some techniques from the reinforcement learning community, such as policy gradient methods, can be used to optimize the parameters of a general feedback controller, the method used in this work is related to imitation learning, or learning from demonstration. The demonstration data, however, do not come from, for example, a human demonstrator but are a set of precomputed optimal trajectories, and the neural network is trained to imitate the behavior of those trajectories. We show that a good choice of initial states and a large number of training targets can alleviate a known problem of imitation learning, namely deviation from the training trajectories, and we demonstrate results in simulation as well as on the physical system.
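As a rough illustration of the idea of training a network on precomputed optimal trajectories, the following sketch fits a small neural network to state-action pairs by supervised regression. Everything specific in it is an assumption made for illustration and is not taken from the paper: the plant is a toy double integrator rather than the wheeled pendulum, the "optimal" demonstrations come from a discrete-time LQR gain as a stand-in for whatever trajectory optimization the authors used, and the network size, learning rate, and names such as `lqr_gain` and `demo_trajectory` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in plant (NOT the wheeled pendulum): discrete-time double integrator.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])

def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete Riccati recursion and return the feedback gain K."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Assumed "expert": an LQR law standing in for precomputed optimal trajectories.
K = lqr_gain(A, B, np.eye(2), np.array([[1.0]]))

def demo_trajectory(x0, steps=50):
    """Roll out the expert from x0 and record (state, action) pairs."""
    xs, us = [], []
    x = x0
    for _ in range(steps):
        u = -K @ x
        xs.append(x)
        us.append(u)
        x = A @ x + B @ u
    return np.array(xs), np.array(us)

# Many initial states spread over the region of interest, so the training set
# also covers states that lie off any single demonstration.
starts = rng.uniform(-1.0, 1.0, size=(200, 2))
data = [demo_trajectory(x0) for x0 in starts]
X = np.concatenate([d[0] for d in data])  # states,  shape (10000, 2)
U = np.concatenate([d[1] for d in data])  # actions, shape (10000, 1)

# One-hidden-layer tanh network, trained by full-batch gradient descent on MSE.
W1 = rng.normal(0.0, 0.5, (2, 16))
b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1))
b2 = np.zeros(1)

def policy(x):
    """Learned feedback law: maps state(s) to control input(s)."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def mse():
    return float(np.mean((policy(X) - U) ** 2))

loss_before = mse()
lr = 0.05
for _ in range(2000):
    H = np.tanh(X @ W1 + b1)                   # hidden activations
    err = (H @ W2 + b2 - U) * (2.0 / len(X))   # d(MSE)/d(prediction)
    gW2, gb2 = H.T @ err, err.sum(axis=0)
    dH = (err @ W2.T) * (1.0 - H ** 2)         # backprop through tanh
    gW1, gb1 = X.T @ dH, dH.sum(axis=0)
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2
loss_after = mse()

print(f"imitation MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Sampling many start states is the part of the sketch that loosely mirrors the abstract's remark about initial states and training targets: the more of the state space the demonstrations cover, the less often the learned controller ends up in states it never saw during training.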
