Title: HO2RL: A Novel Hybrid Offline-and-Online Reinforcement Learning Method for Active Pantograph Control
Authors: Hui Wang; Zhigang Liu; Zhiwei Han
DOI: 10.1109/TIE.2024.3477002
Journal: IEEE Transactions on Industrial Electronics, vol. 72, no. 6, pp. 6286–6296 (Q1, Automation & Control Systems)
Publication Date: 2024-11-06
URL: https://ieeexplore.ieee.org/document/10746248/
Citations: 0
Abstract
The pantograph–catenary system (PCS) is vital for high-speed trains to collect electrical power; fluctuations in the contact force seriously reduce current collection quality, increase maintenance costs, and affect operational safety. Reinforcement learning (RL) is an attractive approach for learning an active pantograph control policy by trial and error. However, traditional RL methods suffer significant performance degradation, or collapse outright, when deployed to the real world due to the large sim-to-real gap. We propose a hybrid offline-and-online reinforcement learning (HO2RL) algorithm to solve active pantograph control tasks, which elegantly combines RL policy pretraining on offline transitions with performance enhancement through online data collection. The proposed algorithm produces generalized pretrained models by learning an effective behavior policy from offline experiences, and then performs multidomain adaptation through online performance improvement with dynamics-aware policy evaluation. Experimental results demonstrate that the HO2RL algorithm efficiently learns from large and diverse static datasets and achieves steady performance improvement by fine-tuning with online interactions. The proposed method solves active pantograph control tasks in various operation scenarios and demonstrates state-of-the-art (SOTA) performance on the PCS standard benchmark.
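The two-phase training pattern the abstract describes — pretraining a policy on a static dataset of logged transitions, then fine-tuning it through online interaction — can be illustrated with a minimal sketch. Note this is a generic, hypothetical offline-then-online loop built around tabular Q-learning, not the paper's HO2RL algorithm; the class names, the toy `TwoStepEnv` environment, and all hyperparameters are illustrative assumptions.

```python
import random


class HybridQAgent:
    """Toy tabular Q-learning agent illustrating the generic
    offline-pretrain-then-online-finetune pattern (hypothetical sketch,
    NOT the paper's HO2RL algorithm)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions

    def update(self, s, a, r, s_next, done):
        # Standard Q-learning backup, shared by both training phases.
        target = r + (0.0 if done else self.gamma * max(self.q[s_next]))
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def pretrain_offline(self, dataset, epochs=10):
        # Phase 1: learn from a static dataset of logged (s, a, r, s', done)
        # transitions, with no environment interaction.
        for _ in range(epochs):
            for (s, a, r, s_next, done) in dataset:
                self.update(s, a, r, s_next, done)

    def act(self, s):
        # Epsilon-greedy action selection for online exploration.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[s][a])

    def finetune_online(self, env, episodes=50):
        # Phase 2: keep improving the pretrained policy by interacting
        # with the (real or simulated) environment.
        for _ in range(episodes):
            s = env.reset()
            done = False
            while not done:
                a = self.act(s)
                s_next, r, done = env.step(a)
                self.update(s, a, r, s_next, done)
                s = s_next


class TwoStepEnv:
    """Tiny deterministic chain environment (states 0 -> 1 -> 2) used only
    to demonstrate the online fine-tuning phase."""

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):
        # Action 1 advances toward the goal; reward 1 on reaching state 2.
        if a == 1:
            self.s += 1
        done = self.s == 2
        return self.s, (1.0 if done else 0.0), done
```

In this sketch the same `update` rule drives both phases; methods like the one the paper proposes typically add further machinery (e.g., conservatism during the offline phase and dynamics-aware evaluation online) to keep the pretrained policy from degrading when real interactions begin.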
About the Journal:
Journal Name: IEEE Transactions on Industrial Electronics
Publication Frequency: Monthly
Scope: IEEE Transactions on Industrial Electronics encompasses the following areas:
Applications of electronics, controls, and communications in industrial and manufacturing systems and processes.
Power electronics and drive control techniques.
System control and signal processing.
Fault detection and diagnosis.
Power systems.
Instrumentation, measurement, and testing.
Modeling and simulation.
Motion control.
Robotics.
Sensors and actuators.
Implementation of neural networks, fuzzy logic, and artificial intelligence in industrial systems.
Factory automation.
Communication and computer networks.