Authors: Z. Rahaman, J. Sil
Published in: 2014 International Conference on Electronic Systems, Signal Processing and Computing Technologies (2014-01-09)
DOI: 10.1109/ICESC.2014.80 (https://doi.org/10.1109/ICESC.2014.80)
DE Based Q-Learning Algorithm to Improve Speed of Convergence in Large Search Space Applications
The main drawback of reinforcement learning is that it learns nothing from an episode until the episode is over, so learning is very slow in applications with large search spaces. Differential Evolution (DE) is a population-based evolutionary optimization algorithm that learns the search space iteratively. This paper proposes an improvement to Q-learning that uses DE to introduce guided randomness into the search, resulting in faster convergence. The problem is modeled as a Markov Decision Process (MDP), a mathematical framework, in order to learn the large search space efficiently. The proposed algorithm achieves better results in terms of speed and performance than the basic Q-learning algorithm.
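The abstract does not say how DE and Q-learning are coupled, so the following is only a minimal sketch of the two standard building blocks it names, not the authors' algorithm. It assumes a tabular Q-function stored as a list of lists and the common DE/rand/1/bin variant; the function names `q_update` and `de_step` and the parameters `alpha`, `gamma`, `f`, and `cr` are illustrative choices, not taken from the paper.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def de_step(population, fitness, f=0.5, cr=0.9, rng=random):
    """One generation of DE/rand/1/bin that maximises `fitness`.

    Requires at least four individuals, each a list of floats.
    """
    dim = len(population[0])
    new_population = []
    for i, target in enumerate(population):
        # Pick three distinct individuals other than the target.
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        # Force at least one component to come from the mutant vector.
        forced = rng.randrange(dim)
        trial = [
            a[k] + f * (b[k] - c[k]) if (rng.random() < cr or k == forced)
            else target[k]
            for k in range(dim)
        ]
        # Greedy selection: keep the trial only if it is at least as good.
        new_population.append(trial if fitness(trial) >= fitness(target) else target)
    return new_population
```

In this sketch the "guided randomness" of DE comes from the mutant vector `a + f * (b - c)`, which perturbs candidates along directions already present in the population rather than uniformly at random. One hypothetical way to connect the two pieces would be to let DE individuals encode candidate action choices and score them with Q-values, but that coupling is an assumption and is not described in the abstract.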