SAR: generalization of physiological agility and dexterity via synergistic action representation

Autonomous Robots | Pub Date: 2024-11-14 | DOI: 10.1007/s10514-024-10182-4 | IF 3.7 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence)
Cameron Berg, Vittorio Caggiano, Vikash Kumar
Autonomous Robots, volume 48, issue 8. Full text: https://link.springer.com/article/10.1007/s10514-024-10182-4

Abstract

Learning effective continuous control policies in high-dimensional systems, including musculoskeletal agents, remains a significant challenge. Over the course of biological evolution, organisms have developed robust mechanisms for overcoming this complexity to learn highly sophisticated strategies for motor control. What accounts for this robust behavioral flexibility? Modular control via muscle synergies, i.e., coordinated muscle co-contractions, is considered to be one putative mechanism that enables organisms to learn muscle control in a simplified and generalizable action space. Drawing inspiration from this evolved motor control strategy, we use physiologically accurate human hand and leg models as a testbed for determining the extent to which a Synergistic Action Representation (SAR) acquired from simpler tasks facilitates learning and generalization on more complex tasks. We find in both cases that SAR-exploiting policies significantly outperform end-to-end reinforcement learning. Policies trained with SAR were able to achieve robust locomotion on a diverse set of terrains (e.g., stairs, hills) with state-of-the-art sample efficiency (4M total steps), while baseline approaches failed to learn any meaningful behaviors under the same training regime. Additionally, policies trained with SAR on an in-hand 100-object manipulation task significantly outperformed (>70% success) baseline approaches (<20% success). Both SAR-exploiting policies were also found to generalize zero-shot to out-of-domain environmental conditions, while policies that did not adopt SAR failed to generalize. Finally, using a simulated robotic hand and humanoid agent, we establish the generality of SAR on broader high-dimensional control problems, solving tasks with greatly improved sample efficiency. To the best of our knowledge, this investigation is the first of its kind to present an end-to-end pipeline for discovering synergies and using this representation to learn high-dimensional continuous control across a wide diversity of tasks. Project website: https://sites.google.com/view/sar-rl
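The abstract does not say how synergies are extracted or how the policy consumes them, so the following is only an illustrative sketch of the general idea of a low-dimensional synergy action space. It assumes synergies are taken as the leading principal components of logged muscle activations, which is a stand-in technique and not necessarily the authors' pipeline. The function names `fit_synergies` and `synergy_to_muscle`, the 20-muscle dimension, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_synergies(activations, n_synergies):
    """Estimate a synergy basis from a (timesteps, n_muscles) activation log.

    Assumption: plain PCA (via SVD) stands in for synergy discovery.
    Returns the mean activation and a (n_synergies, n_muscles) basis.
    """
    mean = activations.mean(axis=0)
    centered = activations - mean
    # Rows of vt are orthonormal principal directions in muscle space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_synergies]


def synergy_to_muscle(z, mean, basis):
    """Map a low-dimensional synergy action z to muscle activations in [0, 1]."""
    return np.clip(mean + z @ basis, 0.0, 1.0)


rng = np.random.default_rng(0)
n_muscles, n_latent = 20, 3  # arbitrary illustrative sizes

# Synthetic stand-in for activations logged on a simpler task:
# every muscle is driven by a few shared co-contraction patterns.
latent = rng.normal(size=(500, n_latent))
mixing = rng.normal(size=(n_latent, n_muscles))
activations = np.clip(0.5 + 0.1 * latent @ mixing, 0.0, 1.0)

mean, basis = fit_synergies(activations, n_synergies=n_latent)

# A policy would output the 3-dimensional z instead of 20 muscle commands.
muscle_cmd = synergy_to_muscle(np.array([0.2, -0.1, 0.05]), mean, basis)
```

In this sketch the policy's action dimension shrinks from the number of muscles to the number of synergies, which is the sense in which such a representation simplifies the action space.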

Source journal
Autonomous Robots (Engineering and Technology: Robotics)
CiteScore: 7.90
Self-citation rate: 5.70%
Articles published: 46
Review time: 3 months
Journal description: Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. Coverage includes: control of autonomous robots · real-time vision · autonomous wheeled and tracked vehicles · legged vehicles · computational architectures for autonomous systems · distributed architectures for learning, control and adaptation · studies of autonomous robot systems · sensor fusion · theory of autonomous systems · terrain mapping and recognition · self-calibration and self-repair for robots · self-reproducing intelligent structures · genetic algorithms as models for robot development. The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.