Carlo Alessi, Diego Bianchi, Gianni Stano, Matteo Cianchetti, Egidio Falotico
Advanced intelligent systems (Weinheim an der Bergstrasse, Germany), volume 6, issue 8. Published 2024-07-08. Journal Article. Impact factor: 6.8. JCR: Q1 (Automation & Control Systems).
DOI: 10.1002/aisy.202300899
https://onlinelibrary.wiley.com/doi/10.1002/aisy.202300899
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aisy.202300899
Citations: 0
Pushing with Soft Robotic Arms via Deep Reinforcement Learning
Soft robots can adaptively interact with unstructured environments. However, nonlinear soft material properties challenge modeling and control. Learning-based controllers that leverage efficient mechanical models are promising for solving complex interaction tasks. This article develops a closed-loop pose/force controller for a dexterous soft manipulator, enabling dynamic pushing tasks using deep reinforcement learning. Force tests investigate the mechanical properties of a soft robot module, resulting in orthogonal forces of 9 to 13 N. Then, the policy is trained in simulation leveraging a dynamic Cosserat rod model of the soft robot. Domain randomization mitigated the sim-to-real gap, while careful reward engineering induced pose and force control even without explicit force inputs. Despite the approximate simulation, the sim-to-real transfer achieved an average reaching distance of 34 ± 14 mm (8.1% L ± 3.4% L), an average orientation error of 0.40 ± 0.29 rad (23° ± 17°), and applied pushing forces up to 3 N. Such performance is reasonable for the intended assistive tasks of the manipulator. The experiments uncovered that the soft robot interacting with the environment exhibited torsional and counter-balancing movements. Although not explicitly enforced, they emerged from the mechanical intelligence of the manipulator. The results demonstrate the potential of soft robotic manipulation via reinforcement learning.
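The abstract reports each error in two units: 34 ± 14 mm alongside 8.1% L ± 3.4% L, and 0.40 ± 0.29 rad alongside 23° ± 17°. A short check of that arithmetic is sketched below. The variable names are illustrative, and the length scale L is not stated in the abstract, so the value computed here is only what the rounded figures imply, not a reported measurement.

```python
import math

# Orientation error reported in the abstract, in radians.
orientation_mean_rad = 0.40
orientation_std_rad = 0.29

# Convert to degrees and round to the nearest whole degree.
orientation_mean_deg = round(math.degrees(orientation_mean_rad))
orientation_std_deg = round(math.degrees(orientation_std_rad))

# Reaching distance reported in the abstract, in millimetres
# and as a fraction of the length scale L.
reach_mean_mm, reach_mean_frac = 34.0, 0.081
reach_std_mm, reach_std_frac = 14.0, 0.034

# Assumption: L is inferred by dividing each millimetre value by its
# fraction. Both inputs are rounded, so the two estimates differ slightly.
implied_L_from_mean_mm = reach_mean_mm / reach_mean_frac
implied_L_from_std_mm = reach_std_mm / reach_std_frac

print(f"orientation error: {orientation_mean_deg} deg +/- {orientation_std_deg} deg")
print(f"implied L from mean: {implied_L_from_mean_mm:.0f} mm")
print(f"implied L from std:  {implied_L_from_std_mm:.0f} mm")
```

Both estimates of L land a little above 0.4 m, so the percentage and millimetre figures are mutually consistent up to rounding.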