Dynamic Bipedal Turning through Sim-to-Real Reinforcement Learning

Fangzhou Yu, Ryan Batke, Jeremy Dao, J. Hurst, Kevin R. Green, Alan Fern
Published in: 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids), 2022-11-28
DOI: https://doi.org/10.1109/Humanoids53995.2022.10000225
Citations: 1

Abstract

For legged robots to match the athletic capabilities of humans and animals, they must not only produce robust periodic walking and running, but also seamlessly switch between nominal locomotion gaits and more specialized transient maneuvers. Despite recent advances in the control of bipedal robots, there has been little focus on producing highly dynamic behaviors. Recent work using reinforcement learning to produce control policies for legged robots has demonstrated success in producing robust walking behaviors. However, these learned policies have difficulty expressing a multitude of different behaviors on a single network. Inspired by conventional optimization-based control techniques for legged robots, this work applies a recurrent policy, trained using reference data generated from optimized single rigid body model trajectories, to execute four-step, 90° turns. We present a training framework using epilogue terminal rewards for learning specific behaviors from pre-computed trajectory data and demonstrate a successful transfer to hardware on the bipedal robot Cassie.