Title: Impedance Control without Environment Model by Reinforcement Learning
Authors: Adolfo Perrusquía, Wen Yu, Xiaoou Li
Venue: 2019 Tenth International Conference on Intelligent Control and Information Processing (ICICIP)
Publication date: 2019-12-01
DOI: https://doi.org/10.1109/ICICIP47338.2019.9012210
Citations: 1
Abstract
To balance learning accuracy and learning time, this paper proposes a hybrid reinforcement learning method that operates in both the discrete and continuous domains. The action-state space is divided into two domains: discrete-time learning is fast but less precise, whereas continuous-time learning is slower but more precise. The hybrid method learns the optimal contact force while minimizing the position error in an unknown environment. Convergence of the learning is proven. Real-time experiments on a two-degree-of-freedom (DOF) spin-and-tilt robot equipped with a 6-DOF force/torque sensor verify the method.