Hamad Ud Din, Wasif Muhammad, N. Siddique, M. J. Irshad, Ali Asghar, M. W. Jabbar
Title: Development of Visual Smooth Pursuit Model Using Inverse Reinforcement Learning For Humanoid Robots
DOI: 10.1109/ICEPECC57281.2023.10209527
Published in: 2023 International Conference on Energy, Power, Environment, Control, and Computing (ICEPECC)
Publication date: 2023-03-08
Citations: 0
Abstract
Research on smooth pursuit began early in the 20th century. Today it appears in everything from small robots to sophisticated automation projects. Many studies exist in this area, but they are all conventionally reward-based, which is not biologically plausible. In these techniques, the robot performs an action, and the agent chooses its next action based on the outcome and some form of positive or negative reward. In this work, the reward is derived from the sensory space rather than the action space, which enables the robot to predict the reward without any predefined reward function. PC/BC-DIM, a new Deep Inverse Reinforcement Learning (DIRL) technique, is presented. Rather than relying on previously specified rewards, PC/BC-DIM assesses the prediction error between its inputs and its reconstruction of them, and uses that error to decide whether to update the weights. The gaze was controlled independently and successfully reached the target location, yielding satisfactory results. The iCub humanoid robot simulator is used to evaluate the performance of the proposed system.
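The abstract only names the PC/BC-DIM mechanism, so as a rough illustration, here is a minimal sketch of the standard PC/BC-DIM inference loop (in the style of Spratling's formulation): error neurons compute a divisive prediction error between the input and its top-down reconstruction, and prediction neurons are updated multiplicatively by that error. The weight matrix, dimensions, and constants below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def pcbc_dim(x, W, iters=50, eps1=1e-6, eps2=1e-4):
    """Minimal PC/BC-DIM inference sketch (assumed Spratling-style update).

    x : input vector (n_inputs,)
    W : feedforward weights, one row per prediction neuron (n_neurons, n_inputs)
    Returns prediction-neuron activations y and residual errors e.
    """
    # Feedback weights: rows of W rescaled so each row's maximum is 1.
    V = W / np.maximum(W.max(axis=1, keepdims=True), 1e-9)
    y = np.full(W.shape[0], 1e-2)          # small positive initial activations
    e = np.zeros_like(x)
    for _ in range(iters):
        r = V.T @ y                        # top-down reconstruction of the input
        e = x / (eps2 + r)                 # divisive prediction error (~1 when matched)
        y = (eps1 + y) * (W @ e)           # multiplicative update of predictions
    return y, e

# Toy example: two prediction neurons with orthogonal preferred inputs.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
x = np.array([1.0, 0.0, 0.0])              # input matching neuron 0's weights
y, e = pcbc_dim(x, W)
```

When the reconstruction matches the input, the error elements approach 1 and the winning neuron's activation stabilizes, which is the steady state at which no further weight change would be signaled.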