{"title":"使用狄利克雷过程高斯混合模型的强化学习的上下文相关元控制","authors":"Dongjae Kim, Sang Wan Lee","doi":"10.1109/IWW-BCI.2018.8311512","DOIUrl":null,"url":null,"abstract":"Arbitration between model-based (MB) and model-free (MF) reinforcement learning (RL) is key feature of human reinforcement learning. The computational model of arbitration control has been demonstrated to outperform conventional reinforcement learning algorithm, in terms of not only behavioral data but also neural signals. However, this arbitration process does not take full account of contextual changes in environment during learning. By incorporating a Dirichlet process Gaussian mixture model into the arbitration process, we propose a meta-controller for RL that quickly adapts to contextual changes of environment. The proposed model performs better than a conventional model-free RL, model-based RL, and arbitration model.","PeriodicalId":6537,"journal":{"name":"2018 6th International Conference on Brain-Computer Interface (BCI)","volume":"27 1","pages":"1-3"},"PeriodicalIF":0.0000,"publicationDate":"2018-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Context-dependent meta-control for reinforcement learning using a Dirichlet process Gaussian mixture model\",\"authors\":\"Dongjae Kim, Sang Wan Lee\",\"doi\":\"10.1109/IWW-BCI.2018.8311512\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Arbitration between model-based (MB) and model-free (MF) reinforcement learning (RL) is key feature of human reinforcement learning. The computational model of arbitration control has been demonstrated to outperform conventional reinforcement learning algorithm, in terms of not only behavioral data but also neural signals. However, this arbitration process does not take full account of contextual changes in environment during learning. By incorporating a Dirichlet process Gaussian mixture model into the arbitration process, we propose a meta-controller for RL that quickly adapts to contextual changes of environment. The proposed model performs better than a conventional model-free RL, model-based RL, and arbitration model.\",\"PeriodicalId\":6537,\"journal\":{\"name\":\"2018 6th International Conference on Brain-Computer Interface (BCI)\",\"volume\":\"27 1\",\"pages\":\"1-3\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 6th International Conference on Brain-Computer Interface (BCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IWW-BCI.2018.8311512\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 6th International Conference on Brain-Computer Interface (BCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IWW-BCI.2018.8311512","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Context-dependent meta-control for reinforcement learning using a Dirichlet process Gaussian mixture model
Arbitration between model-based (MB) and model-free (MF) reinforcement learning (RL) is a key feature of human reinforcement learning. The computational model of arbitration control has been shown to outperform conventional reinforcement learning algorithms in accounting for not only behavioral data but also neural signals. However, this arbitration process does not fully account for contextual changes in the environment during learning. By incorporating a Dirichlet process Gaussian mixture model into the arbitration process, we propose a meta-controller for RL that quickly adapts to contextual changes in the environment. The proposed model performs better than conventional model-free RL, model-based RL, and the arbitration model.
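To make the idea concrete, below is a minimal sketch of how a Dirichlet process Gaussian mixture could gate arbitration between MB and MF controllers. This is not the authors' implementation: it assumes scikit-learn's BayesianGaussianMixture with a dirichlet_process weight prior as a truncated approximation of the DP mixture, and it uses a running average of negative absolute prediction error as a stand-in for the per-context reliability signal that arbitration models typically track. The class name DPGMMMetaController and all its parameters are hypothetical.

```python
# Hypothetical sketch: DPGMM-based context inference gating MB/MF arbitration.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

class DPGMMMetaController:
    def __init__(self, max_contexts=10, alpha=1.0):
        # Truncated approximation to a DP mixture: components beyond those
        # supported by the data receive near-zero weight.
        self.dpgmm = BayesianGaussianMixture(
            n_components=max_contexts,
            weight_concentration_prior_type="dirichlet_process",
            weight_concentration_prior=alpha,
        )
        # Per-context reliability of each controller: column 0 = MB, 1 = MF.
        self.reliability = np.zeros((max_contexts, 2))
        self.counts = np.zeros((max_contexts, 2))

    def fit_contexts(self, features):
        # features: (n_samples, n_dims) summaries of the environment,
        # e.g., recent transition or reward statistics (assumption).
        self.dpgmm.fit(features)

    def update(self, feature, mb_error, mf_error):
        # Assign the observation to a context, then update the running
        # reliability of each controller from its prediction error.
        c = self.dpgmm.predict(feature.reshape(1, -1))[0]
        for i, err in enumerate((mb_error, mf_error)):
            self.counts[c, i] += 1
            self.reliability[c, i] += (
                -abs(err) - self.reliability[c, i]
            ) / self.counts[c, i]
        return c

    def choose(self, feature):
        # Hand control to whichever system is more reliable in the
        # inferred context.
        c = self.dpgmm.predict(feature.reshape(1, -1))[0]
        return "MB" if self.reliability[c, 0] >= self.reliability[c, 1] else "MF"

# Usage sketch: infer contexts from synthetic features, then arbitrate.
meta = DPGMMMetaController(max_contexts=10)
meta.fit_contexts(np.random.randn(200, 4))
meta.update(np.random.randn(4), mb_error=0.1, mf_error=0.5)
print(meta.choose(np.random.randn(4)))
```

Because the DP prior lets the number of active mixture components grow with the data, such a controller can allocate a new context when the environment's statistics shift, rather than averaging reliabilities across regimes, which is what allows fast adaptation to contextual change.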