{"title":"The Need for MORE: Need Systems as Non-Linear Multi-Objective Reinforcement Learning","authors":"Matthias Rolf","doi":"10.1109/ICDL-EpiRob48136.2020.9278062","DOIUrl":null,"url":null,"abstract":"Both biological and artificial agents need to coordinate their behavior to suit various needs at the same time. Reconciling conflicts of different needs and contradictory interests such as self-preservation and curiosity is the central difficulty arising in the design and modelling of need and value systems. Current models of multi-objective reinforcement learning do either not provide satisfactory power to describe such conflicts, or lack the power to actually resolve them. This paper aims to promote a clear understanding of these limitations, and to overcome them with a theory-driven approach rather than ad hoc solutions. The first contribution of this paper is the development of an example that demonstrates previous approaches' limitations concisely. The second contribution is a new, non-linear objective function design, MORE, that addresses these and leads to a practical algorithm. Experiments show that standard RL methods fail to grasp the nature of the problem and ad-hoc solutions struggle to describe consistent preferences. MORE consistently learns a highly satisfactory solution that balances contradictory needs based on a consistent notion of optimality.","PeriodicalId":114948,"journal":{"name":"2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDL-EpiRob48136.2020.9278062","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
Both biological and artificial agents need to coordinate their behavior to suit various needs at the same time. Reconciling conflicts between different needs and contradictory interests, such as self-preservation and curiosity, is the central difficulty in the design and modelling of need and value systems. Current models of multi-objective reinforcement learning either do not provide satisfactory power to describe such conflicts or lack the power to actually resolve them. This paper aims to promote a clear understanding of these limitations, and to overcome them with a theory-driven approach rather than ad hoc solutions. The first contribution of this paper is the development of an example that concisely demonstrates the limitations of previous approaches. The second contribution is a new, non-linear objective function design, MORE, that addresses these limitations and leads to a practical algorithm. Experiments show that standard RL methods fail to grasp the nature of the problem, and that ad hoc solutions struggle to describe consistent preferences. MORE consistently learns a highly satisfactory solution that balances contradictory needs based on a consistent notion of optimality.
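The abstract contrasts the descriptive limits of standard (typically linearly scalarized) multi-objective RL with a non-linear objective, but it does not reproduce MORE's actual formula. The sketch below is therefore only an illustration under stated assumptions: the policy return vectors and the minimum-based aggregator are invented here to show why a linear combination of per-need returns can never strictly prefer a balanced policy over the extremes, while even a simple non-linear aggregation can.

```python
# Illustrative sketch only: MORE's objective is defined in the paper and not
# reproduced here. The min-aggregator below is an assumed stand-in for a
# non-linear objective over per-need returns.
import numpy as np

# Hypothetical vector returns of three policies over two needs,
# e.g. (self-preservation, curiosity). Values are made up; the balanced
# policy has a slightly lower sum, as trade-offs typically cost something.
policies = {
    "only_safety":    np.array([1.00, 0.00]),
    "only_curiosity": np.array([0.00, 1.00]),
    "balanced":       np.array([0.45, 0.45]),
}

def linear_scalarization(returns, w):
    """Standard linear scalarization: sum_i w_i * R_i."""
    return float(np.dot(w, returns))

def min_aggregation(returns):
    """A non-linear aggregation: the worst-satisfied need dominates."""
    return float(np.min(returns))

# Under any weights, linear scalarization picks an extreme policy here:
# the balanced policy lies strictly inside the convex hull of the extremes,
# so no linear objective can make it strictly optimal.
for w in [np.array([0.5, 0.5]), np.array([0.7, 0.3])]:
    best = max(policies, key=lambda p: linear_scalarization(policies[p], w))
    print(f"linear, w={w}: best = {best}")

# The min-aggregation selects the balanced policy, because neglecting
# either need drags the objective down to its worst component.
best = max(policies, key=lambda p: min_aggregation(policies[p]))
print(f"min-aggregation: best = {best}")
```

Running this prints an extreme policy as optimal for both weight settings of the linear objective, and the balanced policy for the min-aggregation, mirroring (in a deliberately simplified form) the conflict-of-needs failure mode the abstract attributes to standard approaches.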