An unsupervised autonomous learning framework for goal-directed behaviours in dynamic contexts
Chinedu Pascal Ezenkwu, Andrew Starkey. Advances in Computational Intelligence 2(3), published 2022-06-02. DOI: 10.1007/s43674-022-00037-9. https://link.springer.com/article/10.1007/s43674-022-00037-9
Because they depend on a task-specific reward function, reinforcement learning agents respond poorly to a dynamic goal or environment. This paper seeks to overcome this limitation of traditional reinforcement learning through a task-agnostic, self-organising autonomous agent framework. The proposed algorithm is a hybrid of TMGWR, for self-adaptive learning of sensorimotor maps, and value iteration, for goal-directed planning. TMGWR has previously been shown to overcome problems associated with competing sensorimotor techniques such as SOM, GNG, and GWR. These problems include difficulty in setting a suitable number of neurons for a task, inflexibility, the inability to cope with non-Markovian environments, challenges with noise, and inappropriate joint representation of sensory observations and actions. However, the binary sensorimotor-link implementation in the original TMGWR leads to catastrophic forgetting when the agent experiences changes in the task, and it is therefore not suitable for self-adaptive learning. This paper presents a new sensorimotor-link update rule that enables the sensorimotor map to adapt to new experiences. The paper demonstrates that the TMGWR-based algorithm has better sample efficiency than model-free reinforcement learning and better self-adaptivity than both model-free and traditional model-based reinforcement learning algorithms. Moreover, the algorithm is shown to give the lowest overall computational cost when compared with traditional reinforcement learning algorithms.
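The planning half of the hybrid can be illustrated with a minimal sketch of standard value iteration over an already-learned map. This is not the authors' implementation: the map is reduced to a hypothetical dictionary of action-labelled links between node indices, and the reward of 1 for entering the goal node, the discount factor, and the names `Links`, `value_iteration`, and `greedy_action` are all assumptions made for illustration. The TMGWR learning step and the new link update rule are not modelled here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for a learned sensorimotor map:
# links[(node, action)] = node reached by taking `action` in `node`.
Links = Dict[Tuple[int, str], int]


def value_iteration(links: Links, goal: int, gamma: float = 0.9,
                    tol: float = 1e-9, max_iters: int = 1000) -> Dict[int, float]:
    """Standard value iteration with an assumed reward of 1 for entering `goal`."""
    nodes = {src for (src, _) in links} | set(links.values())
    values = {n: 0.0 for n in nodes}
    for _ in range(max_iters):
        delta = 0.0
        for n in nodes:
            if n == goal:
                continue  # the goal is treated as terminal, so its value stays 0
            candidates = [
                (1.0 if nxt == goal else 0.0) + gamma * values[nxt]
                for (src, _), nxt in links.items() if src == n
            ]
            if not candidates:
                continue  # node with no outgoing links keeps value 0
            best = max(candidates)
            delta = max(delta, abs(best - values[n]))
            values[n] = best
        if delta < tol:
            break
    return values


def greedy_action(links: Links, values: Dict[int, float], node: int,
                  goal: int, gamma: float = 0.9) -> Optional[str]:
    """Pick the action whose link leads to the highest one-step backed-up value."""
    options = {
        action: (1.0 if nxt == goal else 0.0) + gamma * values[nxt]
        for (src, action), nxt in links.items() if src == node
    }
    return max(options, key=options.get) if options else None


# A four-node corridor: 0 <-> 1 <-> 2 <-> 3.
corridor: Links = {
    (0, "right"): 1, (1, "right"): 2, (2, "right"): 3,
    (1, "left"): 0, (2, "left"): 1, (3, "left"): 2,
}

values_goal_3 = value_iteration(corridor, goal=3)
print(greedy_action(corridor, values_goal_3, node=0, goal=3))

# The goal moves to node 0: only planning is repeated, the map is reused.
values_goal_0 = value_iteration(corridor, goal=0)
print(greedy_action(corridor, values_goal_0, node=3, goal=0))
```

In this sketch the goal enters only through the reward used during planning, so a change of goal requires re-running value iteration on the same links rather than relearning the map. That separation is one way to read the task-agnostic claim in the abstract.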