{"title":"具有截短历史的深度递归q网络","authors":"Hyunwoo Oh, Tomoyuki Kaneko","doi":"10.1109/TAAI.2018.00017","DOIUrl":null,"url":null,"abstract":"Reinforcement Learning is a kind of machine learning method which learns through agents' interaction with the environment. Deep Q-Network (DQN), which is a model of reinforcement learning based on deep neural networks, succeeded in learning human-level control policies on various kinds of Atari 2600 games with image pixel inputs. Because an input of DQN is the game frames of the last four steps, DQN had difficulty on mastering such games that need to remember events earlier than four steps in the past. To alleviate the problem, Deep Recurrent Q-Network (DRQN) and Deep Attention Recurrent Q-Network (DARQN) were proposed. In DRQN, the first fully-connected layer just after convolutional layers is replaced with an LSTM to incorporate past information. DARQN is a model with visual attention mechanisms on top of DRQN. We propose two new reinforcement learning models: Deep Recurrent Q-Network with Truncated History (T-DRQN) and Deep Attention Recurrent Q-Network with Truncated History (T-DARQN). T-DRQN uses a truncated history so that we can control the length of history to be considered. T-DARQN is a model with visual attention mechanism on top of T-DRQN. Experiments of our models on six games of Atari 2600 shows a level of performance between DQN and D(A) RQN. Furthermore, results show the necessity of using past information with a truncated length, rather than using only the current information or all of the past information.","PeriodicalId":211734,"journal":{"name":"2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Deep Recurrent Q-Network with Truncated History\",\"authors\":\"Hyunwoo Oh, Tomoyuki Kaneko\",\"doi\":\"10.1109/TAAI.2018.00017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reinforcement Learning is a kind of machine learning method which learns through agents' interaction with the environment. Deep Q-Network (DQN), which is a model of reinforcement learning based on deep neural networks, succeeded in learning human-level control policies on various kinds of Atari 2600 games with image pixel inputs. Because an input of DQN is the game frames of the last four steps, DQN had difficulty on mastering such games that need to remember events earlier than four steps in the past. To alleviate the problem, Deep Recurrent Q-Network (DRQN) and Deep Attention Recurrent Q-Network (DARQN) were proposed. In DRQN, the first fully-connected layer just after convolutional layers is replaced with an LSTM to incorporate past information. DARQN is a model with visual attention mechanisms on top of DRQN. We propose two new reinforcement learning models: Deep Recurrent Q-Network with Truncated History (T-DRQN) and Deep Attention Recurrent Q-Network with Truncated History (T-DARQN). T-DRQN uses a truncated history so that we can control the length of history to be considered. T-DARQN is a model with visual attention mechanism on top of T-DRQN. Experiments of our models on six games of Atari 2600 shows a level of performance between DQN and D(A) RQN. 
Furthermore, results show the necessity of using past information with a truncated length, rather than using only the current information or all of the past information.\",\"PeriodicalId\":211734,\"journal\":{\"name\":\"2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI)\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TAAI.2018.00017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAAI.2018.00017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: Reinforcement learning is a class of machine learning methods in which agents learn through interaction with the environment. Deep Q-Network (DQN), a reinforcement learning model based on deep neural networks, succeeded in learning human-level control policies on a wide range of Atari 2600 games from raw image-pixel inputs. Because the input to DQN consists of only the game frames of the last four steps, DQN struggles on games that require remembering events more than four steps in the past. To alleviate this problem, Deep Recurrent Q-Network (DRQN) and Deep Attention Recurrent Q-Network (DARQN) were proposed. In DRQN, the first fully-connected layer after the convolutional layers is replaced with an LSTM to incorporate past information. DARQN adds a visual attention mechanism on top of DRQN. We propose two new reinforcement learning models: Deep Recurrent Q-Network with Truncated History (T-DRQN) and Deep Attention Recurrent Q-Network with Truncated History (T-DARQN). T-DRQN uses a truncated history so that the length of the history to be considered can be controlled. T-DARQN adds a visual attention mechanism on top of T-DRQN. Experiments with our models on six Atari 2600 games show a level of performance between DQN and D(A)RQN. Furthermore, the results show the necessity of using past information of a truncated length, rather than using only the current information or all of the past information.
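For concreteness, the sketch below illustrates one plausible way to realize the idea the abstract describes: a DQN-style convolutional trunk whose first fully-connected layer is replaced by an LSTM (as in DRQN), unrolled over only the last k frames so that the recurrence sees a truncated history. This is a minimal illustration, not the authors' implementation; the class name TruncatedHistoryDRQN, the use of PyTorch, and the layer sizes (the standard DQN Atari architecture) are assumptions.

```python
# Minimal sketch (not the paper's code) of a DRQN-style network with a truncated history.
import torch
import torch.nn as nn

class TruncatedHistoryDRQN(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, num_actions: int, history_len: int = 4):
        super().__init__()
        self.history_len = history_len  # k: number of past frames unrolled through the LSTM
        # Convolutional trunk as in DQN, applied to one 84x84 grayscale frame at a time.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # The first fully-connected layer of DQN is replaced by an LSTM, as in DRQN.
        self.lstm = nn.LSTM(input_size=64 * 7 * 7, hidden_size=512, batch_first=True)
        self.q_head = nn.Linear(512, num_actions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, k, 1, 84, 84) -- only the last k frames are supplied,
        # so the recurrence is truncated to a fixed, controllable history length.
        b, k, c, h, w = frames.shape
        feats = self.conv(frames.view(b * k, c, h, w)).view(b, k, -1)
        out, _ = self.lstm(feats)       # unroll the LSTM over the truncated history
        return self.q_head(out[:, -1])  # Q-values from the final time step

# Usage: Q-values for a batch of 2 states, each a history of the last 4 frames.
net = TruncatedHistoryDRQN(num_actions=6, history_len=4)
q_values = net(torch.zeros(2, 4, 1, 84, 84))
print(q_values.shape)  # torch.Size([2, 6])
```

Setting history_len to 1 would reduce the model to using only the current frame, while a very large value approaches the full-history DRQN setting; the abstract's finding is that an intermediate, truncated length is what matters.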