Global convergence of feedforward networks of learning automata

[Proceedings 1992] IJCNN International Joint Conference on Neural Networks. Pub Date: 1992-06-07. DOI: 10.1109/IJCNN.1992.227089

V. V. Phansalkar, M. Thathachar


Citations: 3

Abstract

A feedforward network composed of units that are teams of parameterized learning automata is considered as a model of a reinforcement learning system. The parameters of each learning automaton are updated by an algorithm consisting of a gradient-following term and a random perturbation term. The algorithm is approximated by the Langevin equation, and it is shown to converge to the global maximum. The algorithm is decentralized, and the units exchange no information during updating. Simulation results on a pattern recognition problem show that reasonable rates of convergence can be obtained.
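The abstract describes an update with a gradient-following term and a random perturbation term, approximated by the Langevin equation. The sketch below illustrates that general idea only. It is not the authors' algorithm: the one-dimensional objective, its exact gradient, the step size, the noise scale, and the clipping bound are all hypothetical choices made for illustration, and the paper's automata would estimate the gradient from a reinforcement signal rather than evaluate it directly.

```python
import math
import random


def objective(x):
    # Hypothetical toy objective (not from the paper): a local maximum
    # near x = -0.96 and a higher, global maximum near x = +1.04.
    return -(x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    # Exact derivative of the toy objective above.
    return -4.0 * x * (x * x - 1.0) + 0.3


def langevin_ascent(x0, steps=20000, step=0.01, sigma=0.5, seed=0, bound=2.0):
    """Euler-Maruyama discretization of dx = f'(x) dt + sigma dW.

    Each iteration adds a gradient-following term (step * gradient) and a
    Gaussian perturbation term (sigma * sqrt(step) * noise). With
    sigma = 0 this reduces to plain gradient ascent.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * gradient(x) + sigma * math.sqrt(step) * noise
        # Keep the iterate in a bounded interval (an assumption of this sketch).
        x = max(-bound, min(bound, x))
    return x


if __name__ == "__main__":
    # Without noise, a start at -1.2 settles at the nearby local maximum.
    print("gradient ascent only:", langevin_ascent(-1.2, sigma=0.0))
    # With noise, the iterate can cross the barrier between the two maxima;
    # where a single finite run ends depends on sigma, steps, and the seed.
    print("with perturbation:   ", langevin_ascent(-1.2, sigma=0.5, seed=1))
```

With `sigma=0.0` the iterate stays at the local maximum nearest its starting point, which is the failure mode the perturbation term is meant to address. The sketch makes no claim about convergence rates or about the conditions under which the paper's global convergence result holds.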

