Learning and decision-making in artificial animals

Claes Strannegård, Nils Svangård, David Lindström, Joscha Bach, Bas R. Steunebrink
{"title":"Learning and decision-making in artificial animals","authors":"Claes Strannegård, Nils Svangård, David Lindström, Joscha Bach, Bas R. Steunebrink","doi":"10.2478/jagi-2018-0002","DOIUrl":null,"url":null,"abstract":"Abstract A computational model for artificial animals (animats) interacting with real or artificial ecosystems is presented. All animats use the same mechanisms for learning and decisionmaking. Each animat has its own set of needs and its own memory structure that undergoes continuous development and constitutes the basis for decision-making. The decision-making mechanism aims at keeping the needs of the animat as satisfied as possible for as long as possible. Reward and punishment are defined in terms of changes to the level of need satisfaction. The learning mechanisms are driven by prediction error relating to reward and punishment and are of two kinds: multi-objective local Q-learning and structural learning that alter the architecture of the memory structures by adding and removing nodes. The animat model has the following key properties: (1) autonomy: it operates in a fully automatic fashion, without any need for interaction with human engineers. In particular, it does not depend on human engineers to provide goals, tasks, or seed knowledge. Still, it can operate either with or without human interaction; (2) generality: it uses the same learning and decision-making mechanisms in all environments, e.g. desert environments and forest environments and for all animats, e.g. frog animats and bee animats; and (3) adequacy: it is able to learn basic forms of animal skills such as eating, drinking, locomotion, and navigation. Eight experiments are presented. The results obtained indicate that (i) dynamic memory structures are strictly more powerful than static; (ii) it is possible to use a fixed generic design to model basic cognitive processes of a wide range of animals and environments; and (iii) the animat framework enables a uniform and gradual approach to AGI, by successively taking on more challenging problems in the form of broader and more complex classes of environments","PeriodicalId":247142,"journal":{"name":"Journal of Artificial General Intelligence","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Artificial General Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2478/jagi-2018-0002","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

A computational model for artificial animals (animats) interacting with real or artificial ecosystems is presented. All animats use the same mechanisms for learning and decision-making. Each animat has its own set of needs and its own memory structure that undergoes continuous development and constitutes the basis for decision-making. The decision-making mechanism aims at keeping the needs of the animat as satisfied as possible for as long as possible. Reward and punishment are defined in terms of changes to the level of need satisfaction. The learning mechanisms are driven by prediction error relating to reward and punishment and are of two kinds: multi-objective local Q-learning, and structural learning that alters the architecture of the memory structure by adding and removing nodes. The animat model has the following key properties: (1) autonomy: it operates in a fully automatic fashion, without any need for interaction with human engineers; in particular, it does not depend on human engineers to provide goals, tasks, or seed knowledge, yet it can operate either with or without human interaction; (2) generality: it uses the same learning and decision-making mechanisms in all environments, e.g. desert and forest environments, and for all animats, e.g. frog and bee animats; and (3) adequacy: it is able to learn basic forms of animal skills such as eating, drinking, locomotion, and navigation. Eight experiments are presented. The results indicate that (i) dynamic memory structures are strictly more powerful than static ones; (ii) it is possible to use a fixed generic design to model basic cognitive processes of a wide range of animals and environments; and (iii) the animat framework enables a uniform and gradual approach to AGI, by successively taking on more challenging problems in the form of broader and more complex classes of environments.
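The abstract defines reward and punishment as changes in need-satisfaction levels and describes learning driven by prediction error via multi-objective local Q-learning. The sketch below is one possible reading of that description, not the authors' implementation: the class name Animat, the per-need Q-tables, the learning-rate and discount parameters, and the "serve the least satisfied need" action rule are all illustrative assumptions.

```python
# A minimal sketch (assumptions, not the paper's code) of reward-as-change-in-need-
# satisfaction and a per-need (multi-objective) Q-learning update.
import random
from collections import defaultdict

class Animat:
    def __init__(self, needs, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.needs = dict(needs)          # e.g. {"energy": 0.8, "water": 0.6}, values in [0, 1]
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # One Q-table per need: q[need][(state, action)] -> estimated value
        self.q = {n: defaultdict(float) for n in self.needs}

    def reward(self, old_needs, new_needs):
        # Reward/punishment per need = change in that need's satisfaction level.
        return {n: new_needs[n] - old_needs[n] for n in self.needs}

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        # Greedy choice: favour the action that best serves the currently least
        # satisfied need (one simple way to combine the per-need Q-values).
        worst = min(self.needs, key=self.needs.get)
        return max(self.actions, key=lambda a: self.q[worst][(state, a)])

    def update(self, state, action, new_state, new_needs):
        r = self.reward(self.needs, new_needs)
        for n in self.needs:
            best_next = max(self.q[n][(new_state, a)] for a in self.actions)
            # Prediction-error-driven update, one objective per need.
            td_error = r[n] + self.gamma * best_next - self.q[n][(state, action)]
            self.q[n][(state, action)] += self.alpha * td_error
        self.needs = dict(new_needs)
```

The second learning mechanism mentioned in the abstract, structural learning that adds and removes nodes in the memory structure, is not shown here; the sketch only illustrates how a need-based reward signal could feed a set of parallel Q-value updates.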