Emergence of chemotactic strategies with multi-agent reinforcement learning

Machine Learning Science and Technology · Pub Date: 2024-08-21 · DOI: 10.1088/2632-2153/ad5f73 · IF 6.3 · CAS Zone 2 (Physics and Astrophysics) · JCR Q1 (Computer Science, Artificial Intelligence)

Samuel Tovey, Christoph Lohrmann, Christian Holm

{"title":"Emergence of chemotactic strategies with multi-agent reinforcement learning","authors":"Samuel Tovey, Christoph Lohrmann, Christian Holm","doi":"10.1088/2632-2153/ad5f73","DOIUrl":null,"url":null,"abstract":"Reinforcement learning (RL) is a flexible and efficient method for programming micro-robots in complex environments. Here we investigate whether RL can provide insights into biological systems when trained to perform chemotaxis. Namely, whether we can learn about how intelligent agents process given information in order to swim towards a target. We run simulations covering a range of agent shapes, sizes, and swim speeds to determine if the physical constraints on biological swimmers, namely Brownian motion, lead to regions where reinforcement learners’ training fails. We find that the RL agents can perform chemotaxis as soon as it is physically possible and, in some cases, even before the active swimming overpowers the stochastic environment. We study the efficiency of the emergent policy and identify convergence in agent size and swim speeds. Finally, we study the strategy adopted by the RL algorithm to explain how the agents perform their tasks. To this end, we identify three emerging dominant strategies and several rare approaches taken. 
These strategies, whilst producing almost identical trajectories in simulation, are distinct and give insight into the possible mechanisms behind which biological agents explore their environment and respond to changing conditions.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":"97 1","pages":""},"PeriodicalIF":6.3000,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad5f73","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Reinforcement learning (RL) is a flexible and efficient method for programming micro-robots in complex environments. Here we investigate whether RL can provide insights into biological systems when trained to perform chemotaxis. Specifically, we ask whether we can learn how intelligent agents process given information in order to swim towards a target. We run simulations covering a range of agent shapes, sizes, and swim speeds to determine whether the physical constraints on biological swimmers, namely Brownian motion, lead to regions where reinforcement learners’ training fails. We find that the RL agents can perform chemotaxis as soon as it is physically possible and, in some cases, even before the active swimming overpowers the stochastic environment. We study the efficiency of the emergent policy and identify convergence in agent size and swim speeds. Finally, we study the strategy adopted by the RL algorithm to explain how the agents perform their tasks. To this end, we identify three emerging dominant strategies and several rare approaches taken. These strategies, whilst producing almost identical trajectories in simulation, are distinct and give insight into the possible mechanisms by which biological agents explore their environment and respond to changing conditions.
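The competition between active swimming and Brownian motion described in the abstract can be illustrated with a rough back-of-the-envelope estimate. The sketch below is not the authors' simulation setup: it assumes a spherical swimmer in water at 300 K, uses the standard Stokes-Einstein relations for translational and rotational diffusion, and adopts one common convention for a Péclet-like number, Pe = v r / D_t. The function names, the default temperature and viscosity, and the example radii and speed are all hypothetical choices made for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def diffusion_coefficients(radius_m, temperature_k=300.0, viscosity_pa_s=1.0e-3):
    """Stokes-Einstein diffusion coefficients for a sphere.

    Assumed defaults (not from the paper): water-like viscosity at 300 K.
    Returns (translational D in m^2/s, rotational D in rad^2/s).
    """
    thermal_energy = K_B * temperature_k
    d_trans = thermal_energy / (6.0 * math.pi * viscosity_pa_s * radius_m)
    d_rot = thermal_energy / (8.0 * math.pi * viscosity_pa_s * radius_m ** 3)
    return d_trans, d_rot


def peclet_number(speed_m_s, radius_m, **fluid_kwargs):
    """Ratio of advective to diffusive transport, Pe = v * r / D_t.

    This is one common convention; other definitions exist.
    """
    d_trans, _ = diffusion_coefficients(radius_m, **fluid_kwargs)
    return speed_m_s * radius_m / d_trans


if __name__ == "__main__":
    speed = 1.0e-6  # hypothetical swim speed: 1 micrometre per second
    for radius_um in (0.1, 0.5, 1.0, 2.0):
        radius = radius_um * 1.0e-6
        d_trans, d_rot = diffusion_coefficients(radius)
        pe = peclet_number(speed, radius)
        print(f"r = {radius_um:4.1f} um  D_t = {d_trans:.3e} m^2/s  "
              f"D_r = {d_rot:.3e} rad^2/s  Pe = {pe:.3f}")
```

Under these assumptions Pe grows with the square of the radius at fixed speed, so small, slow swimmers sit in the diffusion-dominated regime (Pe well below 1), which is the kind of region where one might expect training to become difficult.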


Source journal

Machine Learning Science and Technology (Computer Science: Artificial Intelligence)

CiteScore: 9.10

Self-citation rate: 4.40%

Review time: 5 weeks

Journal description: Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences, or make conceptual, methodological, or theoretical advances in machine learning with applications to, inspiration from, or motivation by scientific problems.