HO2RL: A Novel Hybrid Offline-and-Online Reinforcement Learning Method for Active Pantograph Control

IEEE Transactions on Industrial Electronics, vol. 72, no. 6, pp. 6286–6296. Published: 2024-11-06. DOI: 10.1109/TIE.2024.3477002. Impact Factor: 7.2, Q1 (Automation & Control Systems), CAS Tier 1 (Engineering & Technology).
Hui Wang; Zhigang Liu; Zhiwei Han
Citations: 0

Abstract

The pantograph–catenary system (PCS) is vital for high-speed trains to collect electrical power; fluctuations in the contact force severely degrade current collection quality, increase maintenance costs, and compromise operational safety. Reinforcement learning (RL) is an attractive approach for learning an active pantograph control policy by trial and error. However, traditional RL methods suffer significant performance degradation, or collapse outright, when deployed in the real world because of the large sim-to-real gap. We propose a hybrid offline-and-online reinforcement learning (HO2RL) algorithm for active pantograph control that combines RL policy pretraining on offline transitions with performance enhancement through online data collection. The algorithm produces generalized pretrained models by learning an effective behavior policy from offline experience, and then performs multidomain adaptation through online performance improvement with dynamics-aware policy evaluation. Experimental results demonstrate that HO2RL learns efficiently from large, diverse static datasets and achieves steady performance gains through fine-tuning with online interactions. The method solves active pantograph control tasks across various operating scenarios and achieves state-of-the-art performance on the PCS standard benchmark.
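The two-phase recipe described in the abstract, pretraining a policy from logged offline transitions and then improving it with online interaction, can be sketched in miniature as follows. This is an illustrative toy only, not the paper's HO2RL algorithm: the one-dimensional contact-force dynamics, the linear state-feedback policy, and the hill-climbing online update are all assumptions introduced for the sketch.

```python
# Minimal offline-to-online RL sketch (illustrative assumptions throughout;
# not the HO2RL algorithm from the paper).
import numpy as np

rng = np.random.default_rng(0)

# Toy pantograph-like environment: state = (contact-force error, its rate);
# the action is an actuator force that should damp the fluctuation.
def step(state, action):
    err, derr = state
    derr = 0.9 * derr - 0.1 * err + 0.05 * action + 0.01 * rng.standard_normal()
    err = err + derr
    reward = -(err ** 2)  # penalize contact-force fluctuation
    return np.array([err, derr]), reward

def rollout(w, horizon=50):
    """Episode return of the linear state-feedback policy a = w @ s."""
    state, total = np.array([1.0, 0.0]), 0.0
    for _ in range(horizon):
        state, r = step(state, float(w @ state))
        total += r
    return total

# Phase 1: offline pretraining by behavior cloning (least squares) on
# logged (state, action) transitions from an assumed behavior controller.
states = rng.standard_normal((500, 2))
behavior_w = np.array([-2.0, -1.0])
actions = states @ behavior_w + 0.1 * rng.standard_normal(500)
w, *_ = np.linalg.lstsq(states, actions, rcond=None)

# Phase 2: online fine-tuning by simple hill climbing on episode return
# (a crude stand-in for the paper's dynamics-aware online improvement).
best_return = rollout(w)
for _ in range(100):
    cand = w + 0.1 * rng.standard_normal(2)
    ret = rollout(cand)
    if ret > best_return:
        w, best_return = cand, ret

print("fine-tuned gains:", w, "return:", best_return)
```

The point of the sketch is the structure, not the numbers: offline data yields a reasonable initial policy without touching the environment, and the online phase then adapts it to the actual dynamics it experiences.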
Source journal

IEEE Transactions on Industrial Electronics (Engineering: Electrical & Electronic). CiteScore: 16.80. Self-citation rate: 9.10%. Articles per year: 1396. Review time: 6.3 months.

About the journal: IEEE Transactions on Industrial Electronics is published monthly. Its scope covers applications of electronics, controls, and communications in industrial and manufacturing systems and processes; power electronics and drive control techniques; system control and signal processing; fault detection and diagnosis; power systems; instrumentation, measurement, and testing; modeling and simulation; motion control; robotics; sensors and actuators; implementation of neural networks, fuzzy logic, and artificial intelligence in industrial systems; factory automation; and communication and computer networks.