Deep Visual-guided and Deep Reinforcement Learning Algorithm Based for Multip-Peg-in-Hole Assembly Task of Power Distribution Live-line Operation Robot
Authors: Li Zheng, Jiajun Ai, Yahao Wang, Xuming Tang, Shaolei Wu, Sheng Cheng, Rui Guo, Erbao Dong
Journal: Journal of Intelligent & Robotic Systems (Impact Factor 3.1; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-05-18
DOI: 10.1007/s10846-024-02079-2 (https://doi.org/10.1007/s10846-024-02079-2)
Abstract
The inspection and maintenance of power distribution networks are crucial for delivering electricity to consumers efficiently. Because distribution lines carry high voltage, manual live-line operations are difficult, risky, and inefficient. This paper studies a Power Distribution Network Live-line Operation Robot (PDLOR) with autonomous tool assembly capabilities, intended to replace humans in various high-risk electrical maintenance tasks. To address the challenges of tool assembly in dynamic, unstructured work environments, we propose a framework that combines deep visual-guided coarse localization with a high-precision assembly algorithm, the prior knowledge and fuzzy logic driven deep deterministic policy gradient (PKFD-DPG). First, we propose a multiscale identification and localization network based on YOLOv5, which quickly brings the peg close to the hole and reduces ineffective exploration. Second, we design a combined main-auxiliary reward system in which the main-line reward uses the hindsight experience replay mechanism and the auxiliary reward is based on a fuzzy logic inference mechanism, addressing ineffective exploration and sparse rewards during learning. We validate the effectiveness and advantages of the proposed algorithm through simulations and physical experiments and compare its performance with other assembly algorithms. The experimental results show that, for single-tool assembly tasks, the success rate of PKFD-DPG is 15.2% higher than that of DDPG with functionized reward functions and 51.7% higher than that of the PD force control method; for multi-tool assembly tasks, its success rate is 17% and 53.4% higher than those of the same two methods, respectively.
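To make the main-auxiliary reward idea concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a sparse main reward (0 inside a tolerance, -1 otherwise), the "final" relabeling strategy of hindsight experience replay, and a small zero-order Sugeno-style fuzzy rule base over peg-hole distance and contact force. All function names, membership thresholds, rule consequents, and the auxiliary weight are hypothetical values chosen only for illustration; the abstract does not specify any of them.

```python
import math
from typing import List, Tuple

# One step of an episode: (achieved position in mm, goal position in mm, contact force in N).
Step = Tuple[Tuple[float, ...], Tuple[float, ...], float]


def falling(x: float, lo: float, hi: float) -> float:
    """Membership that is 1 at or below lo, 0 at or above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def fuzzy_aux_reward(dist_mm: float, force_n: float) -> float:
    """Auxiliary reward in [-1, 1] from a hypothetical four-rule fuzzy rule base."""
    near = falling(dist_mm, 1.0, 10.0)   # assumed thresholds, not from the paper
    far = 1.0 - near
    gentle = falling(force_n, 5.0, 20.0)  # assumed thresholds, not from the paper
    harsh = 1.0 - gentle
    # Each rule: (firing strength using min as AND, crisp consequent).
    rules = [
        (min(near, gentle), 1.0),   # near the hole with low force: good
        (min(near, harsh), -0.5),   # near the hole but pushing hard: mildly bad
        (min(far, gentle), -0.2),   # far from the hole, low force: slightly bad
        (min(far, harsh), -1.0),    # far from the hole and pushing hard: bad
    ]
    total = sum(w for w, _ in rules)  # always > 0 because the memberships are complementary
    return sum(w * r for w, r in rules) / total


def combined_reward(step: Step, tol_mm: float = 0.5, aux_weight: float = 0.1) -> float:
    """Sparse main reward plus a down-weighted fuzzy auxiliary reward."""
    achieved, goal, force = step
    dist = math.dist(achieved, goal)
    main = 0.0 if dist <= tol_mm else -1.0
    return main + aux_weight * fuzzy_aux_reward(dist, force)


def her_relabel(episode: List[Step]) -> List[Step]:
    """Final-strategy HER: replay the episode as if the last achieved position were the goal."""
    final_achieved = episode[-1][0]
    return [(achieved, final_achieved, force) for achieved, _, force in episode]


# A failed two-step episode: the peg never reaches the goal at the origin.
episode = [((5.0, 0.0), (0.0, 0.0), 2.0), ((3.0, 0.0), (0.0, 0.0), 2.0)]
original = [combined_reward(s) for s in episode]
relabeled = [combined_reward(s) for s in her_relabel(episode)]
```

In this sketch every original step earns a negative reward, while the relabeled episode ends in a success signal, so a failed trajectory still contributes a useful learning signal. The auxiliary term only shapes the reward between the sparse main-reward events.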
About the journal:
The Journal of Intelligent and Robotic Systems bridges the gap between theory and practice in all areas of intelligent systems and robotics. It publishes original, peer-reviewed contributions spanning initial concept and theory, prototyping, and final product development and commercialization.
On the theoretical side, the journal features papers focusing on intelligent systems engineering, distributed intelligence systems, multi-level systems, intelligent control, multi-robot systems, cooperation and coordination of unmanned vehicle systems, etc.
On the application side, the journal emphasizes autonomous systems, industrial robotic systems, multi-robot systems, aerial vehicles, mobile robot platforms, underwater robots, sensors, sensor fusion, and sensor-based control. Readers will also find papers on real applications of intelligent and robotic systems (e.g., mechatronics, manufacturing, biomedical, underwater, humanoid, mobile/legged robot, and space applications).