Cumulative Training and Transfer Learning for Multi-Robots Collision-Free Navigation Problems
Trung-Thanh Nguyen, Amartya Hatua, A. Sung
2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), October 2019
DOI: 10.1109/UEMCON47517.2019.8992945
Citations: 4
Abstract
Recently, characteristics such as robot autonomy, decentralized control, collective decision-making, and high fault tolerance have significantly broadened the applications of swarm robotics in targeted material delivery, precision farming, surveillance, defense, and many other areas. In these multi-agent systems, safe collision avoidance is one of the most fundamental and important problems. Different approaches, especially reinforcement learning, have been applied to solve it. This paper introduces a new cumulative learning approach that combines transfer learning with distributed multi-agent reinforcement learning to solve collision-free navigation for swarm robotics. In our method, over learning processes that progress from the least complex scenario to the most complex one, multiple agents improve a shared policy through parameter sharing, reward shaping, and multi-round, multi-step learning. We have adapted two policy gradient algorithms (TRPO and PPO) as the core of our distributed multi-agent reinforcement learning method. The results show that our methodology reduces training time and produces a robust navigation plan that generalizes easily to complex indoor scenarios.
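For context on the policy-optimization core the abstract mentions, below is a minimal sketch of PPO's standard clipped surrogate loss, which the authors adapt for their distributed multi-agent setting. The function name and NumPy formulation are illustrative assumptions, not taken from the paper itself:

```python
import numpy as np

def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Clipped surrogate loss from standard PPO.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled transition
    advantages: estimated advantages for the same transitions
    eps:        clipping range (0.2 is the common default)

    Returns the loss to *minimize*, i.e. the negative of the
    clipped surrogate objective averaged over the batch.
    """
    ratios = np.asarray(ratios, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    unclipped = ratios * advantages
    # Clipping removes the incentive to move the ratio outside [1-eps, 1+eps]
    clipped = np.clip(ratios, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))
```

In a shared-policy, parameter-sharing setup like the one described, every agent's transitions would be pooled into one batch and this single objective optimized for the common policy network; the curriculum then reuses those parameters as training moves from the simplest scenario to the most complex one.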