Incremental learning of reach-to-grasp behavior: A PSO-based Inverse optimal control approach

Haitham El-Hussieny, Samy F. M. Assal, A. Abouelsoud, S. M. Megahed, T. Ogasawara
{"title":"触手可及行为的增量学习:一种基于粒子群的逆最优控制方法","authors":"Haitham El-Hussieny, Samy F. M. Assal, A. Abouelsoud, S. M. Megahed, T. Ogasawara","doi":"10.1109/SOCPAR.2015.7492796","DOIUrl":null,"url":null,"abstract":"In recent years, there has been an increasing interest in modeling natural human movements. The main question to be addressed is: what is the optimality criteria that human has optimized to achieve a certain movement. One of the most significant current discussions is the modeling of the reach-to-grasp movements that human naturally perform while approaching a certain object for grasping. Recent advances in Inverse Reinforcement Learning (IRL) approaches have facilitated investigation of reach-to-grasp movements in terms of the optimal control theory. IRL aims to learn the cost function that best describes the demonstrated human reach-to-grasp movements. Thus far, gradient-based techniques have been used to obtain the parameters of the underlying cost function. Such approaches, however, have failed to find the global optimal parameters since they are limited by locating only local optimum values. In this research, learning of the cost function for the reach-to-grasp movements is addressed as an Inverse Linear Quadratic Regulator (ILQR) problem, where linear dynamic equations and a quadratic cost are assumed. An efficient evolutionary optimization technique, Particle Swarm Optimization (PSO), is used to obtain the unknown cost for the reach-to-grasp movements under consideration. Moreover, an incremental-ILQR Algorithm is proposed to adjust the learned cost once new untrained demonstrations exist to overcome the over-fitting issue. The obtained results are encouraging and show harmony with those in neuroscience literature.","PeriodicalId":409493,"journal":{"name":"2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Incremental learning of reach-to-grasp behavior: A PSO-based Inverse optimal control approach\",\"authors\":\"Haitham El-Hussieny, Samy F. M. Assal, A. Abouelsoud, S. M. Megahed, T. Ogasawara\",\"doi\":\"10.1109/SOCPAR.2015.7492796\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, there has been an increasing interest in modeling natural human movements. The main question to be addressed is: what is the optimality criteria that human has optimized to achieve a certain movement. One of the most significant current discussions is the modeling of the reach-to-grasp movements that human naturally perform while approaching a certain object for grasping. Recent advances in Inverse Reinforcement Learning (IRL) approaches have facilitated investigation of reach-to-grasp movements in terms of the optimal control theory. IRL aims to learn the cost function that best describes the demonstrated human reach-to-grasp movements. Thus far, gradient-based techniques have been used to obtain the parameters of the underlying cost function. Such approaches, however, have failed to find the global optimal parameters since they are limited by locating only local optimum values. In this research, learning of the cost function for the reach-to-grasp movements is addressed as an Inverse Linear Quadratic Regulator (ILQR) problem, where linear dynamic equations and a quadratic cost are assumed. 
An efficient evolutionary optimization technique, Particle Swarm Optimization (PSO), is used to obtain the unknown cost for the reach-to-grasp movements under consideration. Moreover, an incremental-ILQR Algorithm is proposed to adjust the learned cost once new untrained demonstrations exist to overcome the over-fitting issue. The obtained results are encouraging and show harmony with those in neuroscience literature.\",\"PeriodicalId\":409493,\"journal\":{\"name\":\"2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR)\",\"volume\":\"5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SOCPAR.2015.7492796\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SOCPAR.2015.7492796","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

In recent years, there has been increasing interest in modeling natural human movements. The main question to be addressed is: what optimality criterion does a human optimize to achieve a certain movement? One of the most significant current discussions is the modeling of the reach-to-grasp movements that humans naturally perform while approaching an object to grasp it. Recent advances in Inverse Reinforcement Learning (IRL) have facilitated the investigation of reach-to-grasp movements in terms of optimal control theory. IRL aims to learn the cost function that best describes demonstrated human reach-to-grasp movements. Thus far, gradient-based techniques have been used to obtain the parameters of the underlying cost function. Such approaches, however, may fail to find the globally optimal parameters, since they can become trapped in local optima. In this research, learning the cost function for reach-to-grasp movements is addressed as an Inverse Linear Quadratic Regulator (ILQR) problem, in which linear dynamics and a quadratic cost are assumed. An efficient evolutionary optimization technique, Particle Swarm Optimization (PSO), is used to obtain the unknown cost for the reach-to-grasp movements under consideration. Moreover, an incremental-ILQR algorithm is proposed to adjust the learned cost when new, previously unseen demonstrations become available, in order to overcome the over-fitting issue. The obtained results are encouraging and agree with findings in the neuroscience literature.
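The abstract does not include the authors' implementation, so the following is only a minimal sketch of the general idea it describes: inverse LQR by searching cost weights with Particle Swarm Optimization. The toy double-integrator dynamics (standing in for a 1-D reaching motion), the fixed control weight R, the PSO hyper-parameters, and the helper names lqr_trajectory, fitness, and pso_inverse_lqr are all assumptions made for this illustration, not the paper's setup. A candidate diagonal state-weight matrix Q is scored by how closely its finite-horizon LQR roll-out, which minimizes J = sum_k (x_k' Q x_k + u_k' R u_k), reproduces a demonstrated trajectory, and the swarm searches over those weights.

```python
# Illustrative sketch (not the authors' implementation): inverse LQR via PSO.
# A particle is a vector of candidate diagonal Q weights; its fitness is the
# squared mismatch between the LQR roll-out under those weights and a demo.
import numpy as np

def lqr_trajectory(A, B, Q, R, x0, horizon):
    """Finite-horizon LQR roll-out via backward Riccati recursion."""
    P = Q.copy()
    gains = []
    for _ in range(horizon):                      # backward pass
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()
    xs, x = [x0], x0
    for K in gains:                               # forward roll-out of x_{k+1} = (A - B K_k) x_k
        x = (A - B @ K) @ x
        xs.append(x)
    return np.array(xs)

def fitness(weights, A, B, R, x0, horizon, demo):
    """Trajectory mismatch between the demo and the roll-out under candidate weights."""
    Q = np.diag(np.maximum(weights, 1e-6))        # keep Q positive definite
    return np.sum((lqr_trajectory(A, B, Q, R, x0, horizon) - demo) ** 2)

def pso_inverse_lqr(A, B, R, x0, horizon, demo,
                    n_particles=30, n_iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for diagonal Q weights whose LQR trajectory reproduces the demonstration."""
    rng = np.random.default_rng(seed)
    dim = A.shape[0]
    pos = rng.uniform(0.1, 10.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([fitness(p, A, B, R, x0, horizon, demo) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 1e-3, 100.0)
        vals = np.array([fitness(p, A, B, R, x0, horizon, demo) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest

if __name__ == "__main__":
    dt = 0.05
    A = np.array([[1.0, dt], [0.0, 1.0]])         # state = [position, velocity]
    B = np.array([[0.0], [dt]])
    R = np.array([[0.1]])                         # fixed control weight pins the cost scale
    x0 = np.array([1.0, 0.0])                     # start 1 m from the target, at rest
    horizon = 60
    true_Q = np.diag([4.0, 0.5])                  # "ground-truth" cost used to fake a demo
    demo = lqr_trajectory(A, B, true_Q, R, x0, horizon)
    print("recovered diagonal Q weights:", pso_inverse_lqr(A, B, R, x0, horizon, demo))
```

Unlike a gradient-based fit, the swarm evaluates many weight vectors in parallel, which is the property the paper relies on to escape local optima. An incremental update as described in the abstract is not shown here; one plausible (but not necessarily the authors') variant would warm-start such a search from the previously learned weights whenever new demonstrations arrive.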