Non-local Self-attention Structure for Function Approximation in Deep Reinforcement Learning

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date : 2019-05-12 DOI:10.1109/ICASSP.2019.8682832

Z. Wang, Xi Xiao, Guangwu Hu, Yao Yao, Dianyan Zhang, Zhendong Peng, Qing Li, Shutao Xia

Citations: 0

Abstract

Reinforcement learning is a framework for making sequential decisions, and combining it with deep neural networks further improves its capability. Convolutional neural networks make it possible to make sequential decisions directly from raw pixel information, allowing reinforcement learning to achieve satisfying performance on a series of tasks. However, convolutional neural networks still have limitations in representing geometric patterns and long-term dependencies that occur consistently in state inputs. To tackle these limitations, we propose a self-attention architecture to augment the original network. It provides a better balance between the ability to model long-range dependencies and computational efficiency. Experiments on Atari games illustrate that the self-attention structure is significantly effective for function approximation in deep reinforcement learning.
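The abstract does not spell out the layer itself, so the following is only a minimal sketch of a generic non-local self-attention block, assuming the common query/key/value formulation with a residual connection. The function names, the projection matrices `w_query`, `w_key`, `w_value`, and the scale `gamma` are illustrative assumptions, not the authors' implementation. It treats each spatial location of a CNN feature map as one vector and lets every location attend to every other, which is how such a block can capture long-range dependencies that a local convolution kernel misses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(vec, weight):
    """Multiply a length-C vector by a C x D weight matrix (list of rows)."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def self_attention(positions, w_query, w_key, w_value, gamma=1.0):
    """Hypothetical non-local block over flattened feature-map positions.

    positions: list of N feature vectors, one per spatial location (length C).
    w_query, w_key: C x D projection matrices; w_value: C x C.
    Returns N vectors of length C: the input plus gamma-scaled attended features.
    """
    queries = [project(p, w_query) for p in positions]
    keys = [project(p, w_key) for p in positions]
    values = [project(p, w_value) for p in positions]

    output = []
    for q, x in zip(queries, positions):
        # Each position scores all positions, regardless of spatial distance.
        weights = softmax([sum(a * b for a, b in zip(q, k)) for k in keys])
        attended = [sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(x))]
        # Residual connection: the block augments the CNN features
        # rather than replacing them.
        output.append([xi + gamma * ai for xi, ai in zip(x, attended)])
    return output
```

With `gamma=0.0` the block returns its input unchanged, so in this sketch it can be attached to an existing convolutional network without disturbing it at initialization. The pairwise scoring costs O(N^2) in the number of positions, which is why blocks of this kind are usually applied to downsampled feature maps rather than raw frames; a practical version would use a tensor library instead of Python lists.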


