Learn to climb: teaching a reinforcement learning agent the single rope ascending technique
Balázs Varga
Micro, vol. 14 1, pp. 000209-000214. Published 2022-11-21.
DOI: 10.1109/CINTI-MACRo57952.2022.10029600 (https://doi.org/10.1109/CINTI-MACRo57952.2022.10029600)
The single rope ascending technique is used in industrial alpinism, forestry, and various leisure activities. This paper presents a multi-body model of the technique comprising an actuated 3D humanoid model, the climbing gear, and the rope, which is modeled as a finite-element object. The model serves as a training environment for reinforcement learning agents that try to mimic human rope climbing. To demonstrate the environment, an agent was trained with a state-of-the-art reinforcement learning algorithm (Soft Actor-Critic). Results suggest that the agent can learn to ascend the rope at a speed comparable to that of real humans. However, the learned technique is not human-like: the artificial agent relies excessively on its arms to climb, which would be too tiring for a human. This is because the environment rewards only ascension and does not penalize the energy used. The learning environment was developed with humanoid robots in mind, which can perform complex tasks while on the rope and can carry much heavier payloads than the climbing robots described in the literature.
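The effect of the reward design noted above can be illustrated with a minimal sketch. The abstract does not give the reward formula, so everything here is an assumption for illustration: the function names, the use of height gained per step as the ascension reward, and the mechanical-work estimate of energy are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def ascent_reward(height_prev: float, height_now: float) -> float:
    """Ascension-only reward: height gained during one step (metres).

    Hypothetical form; the paper's actual reward may differ.
    """
    return height_now - height_prev


def energy_penalized_reward(
    height_prev: float,
    height_now: float,
    torques: Sequence[float],
    joint_velocities: Sequence[float],
    dt: float,
    energy_weight: float = 0.0,
) -> float:
    """Ascension reward minus a weighted estimate of actuator energy.

    Energy is approximated as the sum of |torque * angular velocity|
    over all joints, multiplied by the step length dt. With
    energy_weight = 0 this reduces to the ascension-only reward.
    """
    mech_power = sum(abs(t * w) for t, w in zip(torques, joint_velocities))
    energy = mech_power * dt
    return ascent_reward(height_prev, height_now) - energy_weight * energy


if __name__ == "__main__":
    # Same height gain, with and without an energy penalty (made-up values).
    torques = [10.0, -20.0]
    velocities = [1.0, 0.5]
    print(energy_penalized_reward(1.0, 1.05, torques, velocities, dt=0.1))
    print(energy_penalized_reward(1.0, 1.05, torques, velocities, dt=0.1,
                                  energy_weight=0.01))
```

With a zero weight, an arm-dominated gait and a leg-dominated gait that gain the same height earn the same reward, which is consistent with the behaviour reported in the abstract. A positive weight would make the more costly gait score lower under this hypothetical formulation.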