Research on Intelligent Control Decision of Manipulator Based on Deep Reinforcement Learning
Min Li, Dan Zhu, Jie Shen, Hao Yan, Sufu Li, Kun Wang, Haijun Yuan
2022 4th International Academic Exchange Conference on Science and Technology Innovation (IAECST), published 2022-12-09
DOI: 10.1109/IAECST57965.2022.10061884
https://doi.org/10.1109/IAECST57965.2022.10061884
Citation count: 0
Abstract
General reinforcement learning algorithms need constant trial and error to gather sufficient data, so training a controller for a robot arm takes too long. This paper uses the Unity3D engine to build a simulation environment that includes a UR robot and an intelligent electricity meter. The M-SAC algorithm, which combines a multi-agent setup with the SAC (Soft Actor-Critic) algorithm, speeds up training and shortens the time needed to train the agent. The efficiency of M-SAC is verified by analyzing the average reward after training.
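The abstract does not say how the multiple agents in M-SAC share data, so the following is only a minimal sketch under one common assumption: several copies of the environment are stepped side by side and write into a single shared replay buffer, which is what an off-policy learner such as SAC would sample from. `ToyReachEnv`, `placeholder_policy`, and `collect` are hypothetical names introduced for illustration; the toy 1-D task stands in for the Unity3D scene, and the placeholder policy stands in for the SAC actor, which is not implemented here.

```python
import random
from collections import deque
from statistics import mean


class ToyReachEnv:
    """Hypothetical 1-D stand-in for one simulated arm: move toward a target."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.pos = self.rng.uniform(-1.0, 1.0)
        self.target = self.rng.uniform(-1.0, 1.0)
        return (self.pos, self.target)

    def step(self, action):
        # Clamp the action, move, and reward closeness to the target.
        self.pos += max(-0.1, min(0.1, action))
        dist = abs(self.target - self.pos)
        done = dist < 0.05
        return (self.pos, self.target), -dist, done


def placeholder_policy(state, rng):
    """Noisy proportional controller; a real setup would sample the SAC actor."""
    pos, target = state
    return 0.5 * (target - pos) + rng.gauss(0.0, 0.02)


def collect(num_agents, steps_per_agent, seed=0):
    """Step several environment copies in lockstep into one shared buffer."""
    rng = random.Random(seed)
    envs = [ToyReachEnv(seed + i) for i in range(num_agents)]
    states = [env.reset() for env in envs]
    buffer = deque(maxlen=100_000)  # shared replay buffer (assumed design)
    rewards = []
    for _ in range(steps_per_agent):
        for i, env in enumerate(envs):
            action = placeholder_policy(states[i], rng)
            next_state, reward, done = env.step(action)
            buffer.append((states[i], action, reward, next_state, done))
            rewards.append(reward)
            states[i] = env.reset() if done else next_state
    # The mean reward is the kind of summary the abstract uses for evaluation.
    return buffer, mean(rewards)


if __name__ == "__main__":
    shared, avg_reward = collect(num_agents=4, steps_per_agent=50)
    print(len(shared), avg_reward)
```

With four agents and 50 simulation ticks each, the shared buffer holds 200 transitions, against 50 for a single agent over the same number of ticks. That is the sense in which parallel agents can shorten data collection in this sketch; it makes no claim about the speed-up reported in the paper.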