Learning Efficient Recursive Numeral Systems via Reinforcement Learning
Jonathan D. Thomas, Andrea Silvi, Devdatt Dubhashi, Emil Carlsson, Moa Johansson
arXiv - CS - Computation and Language, 2024-09-11. DOI: arxiv-2409.07170
Abstract
The emergence of mathematical concepts, such as number systems, is an
understudied area in AI for mathematics and reasoning. It has previously been
shown (Carlsson et al., 2021) that, by using reinforcement learning (RL), agents
can derive simple approximate and exact-restricted numeral systems. However, it
is a major challenge to show how more complex recursive numeral systems,
similar to the one utilised in English, could arise via a simple learning
mechanism such as RL. Here, we introduce an approach towards deriving a
mechanistic explanation of the emergence of recursive number systems, in which
an RL agent directly optimises a lexicon under a given
meta-grammar. Utilising a slightly modified version of the seminal meta-grammar
of Hurford (1975), we demonstrate that our RL agent can effectively modify the
lexicon towards Pareto-optimal configurations which are comparable to those
observed within human numeral systems.