Curious model-building control systems

[Proceedings] 1991 IEEE International Joint Conference on Neural Networks. Pub Date: 1991-11-18. DOI: 10.1109/IJCNN.1991.170605

J. Schmidhuber

Citations: 662

Abstract

A novel curious model-building control system is described which actively tries to provoke situations for which it learned to expect to learn something about the environment. Such a system has been implemented as a four-network system based on Watkins' Q-learning algorithm which can be used to maximize the expectation of the temporal derivative of the adaptive assumed reliability of future predictions. An experiment with an artificial nondeterministic environment demonstrates that the system can be superior to previous model-building control systems, which do not address the problem of modeling the reliability of the world model's predictions in uncertain environments and use ad-hoc methods (like random search) to train the world model.
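
The idea in the abstract can be illustrated with a minimal tabular sketch. This is not the paper's four-network implementation: the tables, the function names (`q_update`, `curiosity_reward`, `curious_step`), the learning rates, and the choice of negative estimated prediction error as the "assumed reliability" are all assumptions made for illustration. The sketch only shows the core mechanism: the reward handed to Q-learning is the change in the estimated reliability of the world model, so actions that lead to learnable improvements are reinforced.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def curiosity_reward(reliability_before, reliability_after):
    """Discrete stand-in for the temporal derivative of the assumed reliability."""
    return reliability_after - reliability_before


def curious_step(q, model, err_est, state, action, outcome, next_state, lr=0.5):
    """Update the toy world model and its error estimate, then reward the change.

    model[action]   : predicted outcome of the action (hypothetical world model)
    err_est[action] : predicted absolute error of that prediction
                      (hypothetical reliability module; reliability = -err_est)
    """
    reliability_before = -err_est[action]
    error = abs(outcome - model[action])
    # Move the prediction toward the observed outcome.
    model[action] += lr * (outcome - model[action])
    # Move the error estimate toward the error that was just observed.
    err_est[action] += lr * (error - err_est[action])
    reward = curiosity_reward(reliability_before, -err_est[action])
    q_update(q, state, action, reward, next_state)
    return reward
```

In this sketch, an action whose outcome is pure noise eventually yields a reward that fluctuates around zero, because its error estimate stops improving, while an action with a learnable outcome yields positive reward for as long as the estimated error keeps shrinking.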
