Mohammed Sharafath Abdul Hameed, Venkata Harshit Koneru, Johannes Poeppelbaum, Andreas Schwung
{"title":"蠕动分选机的课程学习","authors":"Mohammed Sharafath Abdul Hameed, Venkata Harshit Koneru, Johannes Poeppelbaum, Andreas Schwung","doi":"10.1109/INDIN51773.2022.9976094","DOIUrl":null,"url":null,"abstract":"This paper presents a novel approach to train a Reinforcement Learning (RL) agent faster for transportation of parcels in a Peristaltic Sortation Machine (PSM) using curriculum learning (CL). The PSM was developed as a means to transport parcels using an actuator and a flexible film where a RL agent is trained to control the actuator. In a previous paper, training of the actuator was done on a Discrete Element Method (DEM) simulation environment of the PSM developed using an open-source DEM library called LIGGGHTS, which reduced the training time of the transportation task compared to the real machine. But it still took days to train the agent. The objective of this paper is to reduce the training time to hours. To overcome this problem, we developed a faster but lower fidelity python simulation environment (PSE) capable of simulating the transportation task of PSM. And we used it with a curriculum learning approach to accelerate training the agent in the transportation process. The RL agent is trained in two steps in the PSE: 1. with a fixed set of goal positions, 2. with randomized goal positions. Additionally, we also use Gradient Monitoring (GM), a gradient regularization method, which provides additional trust region constraints in the policy updates of the RL agent when switching between tasks. The agent so trained is then deployed and tested in the DEM environment where the agent has not been trained before. The results obtained show that the RL agent trained using CL and PSE successfully completes the tasks in the DEM environment without any loss in performance, while using only a fraction of the training time (1.87%) per episode. 
This will allow for faster prototyping of algorithms to be tested on the PSM in future.","PeriodicalId":359190,"journal":{"name":"2022 IEEE 20th International Conference on Industrial Informatics (INDIN)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Curriculum Learning in Peristaltic Sortation Machine\",\"authors\":\"Mohammed Sharafath Abdul Hameed, Venkata Harshit Koneru, Johannes Poeppelbaum, Andreas Schwung\",\"doi\":\"10.1109/INDIN51773.2022.9976094\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a novel approach to train a Reinforcement Learning (RL) agent faster for transportation of parcels in a Peristaltic Sortation Machine (PSM) using curriculum learning (CL). The PSM was developed as a means to transport parcels using an actuator and a flexible film where a RL agent is trained to control the actuator. In a previous paper, training of the actuator was done on a Discrete Element Method (DEM) simulation environment of the PSM developed using an open-source DEM library called LIGGGHTS, which reduced the training time of the transportation task compared to the real machine. But it still took days to train the agent. The objective of this paper is to reduce the training time to hours. To overcome this problem, we developed a faster but lower fidelity python simulation environment (PSE) capable of simulating the transportation task of PSM. And we used it with a curriculum learning approach to accelerate training the agent in the transportation process. The RL agent is trained in two steps in the PSE: 1. with a fixed set of goal positions, 2. with randomized goal positions. 
Additionally, we also use Gradient Monitoring (GM), a gradient regularization method, which provides additional trust region constraints in the policy updates of the RL agent when switching between tasks. The agent so trained is then deployed and tested in the DEM environment where the agent has not been trained before. The results obtained show that the RL agent trained using CL and PSE successfully completes the tasks in the DEM environment without any loss in performance, while using only a fraction of the training time (1.87%) per episode. This will allow for faster prototyping of algorithms to be tested on the PSM in future.\",\"PeriodicalId\":359190,\"journal\":{\"name\":\"2022 IEEE 20th International Conference on Industrial Informatics (INDIN)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE 20th International Conference on Industrial Informatics (INDIN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/INDIN51773.2022.9976094\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE 20th International Conference on Industrial Informatics (INDIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INDIN51773.2022.9976094","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Curriculum Learning in Peristaltic Sortation Machine
This paper presents a novel approach to train a Reinforcement Learning (RL) agent faster for the transportation of parcels in a Peristaltic Sortation Machine (PSM) using curriculum learning (CL). The PSM was developed to transport parcels using an actuator and a flexible film, where an RL agent is trained to control the actuator. In a previous paper, the actuator was trained in a Discrete Element Method (DEM) simulation environment of the PSM, built with the open-source DEM library LIGGGHTS, which reduced the training time of the transportation task compared to the real machine. However, training the agent still took days; the objective of this paper is to reduce that time to hours. To this end, we developed a faster but lower-fidelity Python simulation environment (PSE) capable of simulating the transportation task of the PSM, and combined it with a curriculum learning approach to accelerate training. The RL agent is trained in the PSE in two steps: (1) with a fixed set of goal positions, and (2) with randomized goal positions. Additionally, we use Gradient Monitoring (GM), a gradient regularization method that imposes additional trust-region constraints on the policy updates of the RL agent when switching between tasks. The agent trained in this way is then deployed and tested in the DEM environment, in which it has never been trained. The results show that the RL agent trained using CL and the PSE successfully completes the tasks in the DEM environment without any loss in performance, while using only a fraction (1.87%) of the training time per episode. This will allow for faster prototyping of algorithms to be tested on the PSM in the future.
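The two-stage curriculum described in the abstract (fixed goal positions first, randomized goals second) can be sketched in outline. The following is a minimal, self-contained illustration only: the toy 1-D environment, the `naive_policy` stand-in for the learned controller, and all names here are hypothetical and not taken from the paper or its PSE implementation.

```python
import random

class ToyTransportEnv:
    """Hypothetical 1-D stand-in for the PSE: drive a parcel toward a goal position."""
    def __init__(self, goal):
        self.goal = goal
        self.pos = 0.0

    def reset(self):
        self.pos = 0.0
        return self.pos

    def step(self, action):
        # Action is a velocity command; reward is the negative distance to the goal.
        self.pos += action
        dist = abs(self.goal - self.pos)
        return self.pos, -dist, dist < 0.05

def run_episode(env, policy, max_steps=50):
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        obs, reward, done = env.step(policy(obs, env.goal))
        total += reward
        if done:
            break
    return total

def naive_policy(obs, goal):
    # Stand-in for the trained RL policy: step halfway toward the goal.
    return 0.5 * (goal - obs)

def curriculum_train(policy, fixed_goals, n_random, seed=0):
    rng = random.Random(seed)
    returns = []
    # Stage 1: episodes over a fixed set of goal positions.
    for g in fixed_goals:
        returns.append(run_episode(ToyTransportEnv(g), policy))
    # Stage 2: episodes over randomized goal positions.
    for _ in range(n_random):
        returns.append(run_episode(ToyTransportEnv(rng.uniform(0.5, 2.0)), policy))
    return returns

returns = curriculum_train(naive_policy, fixed_goals=[1.0, 1.5], n_random=3)
```

In the paper's actual setup, the policy would be updated between and during these stages (with GM constraining the policy updates at the task switch), rather than held fixed as in this sketch.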