A Reinforcement-Learning, Optimal Approach to In Situ Power Hardware-in-the-Loop Interface Control for Testing Inverter-Based Resources: Theory and Application of the Adaptive Dynamic Programming Based on the Hybrid Iteration to Tackle Uncertain Dynamics

IEEE Transactions on Industrial Electronics | IF 7.2 | Zone 1 (Engineering and Technology) | JCR Q1 (Automation & Control Systems) | Pub Date: 2024-11-14 | DOI: 10.1109/TIE.2024.3426038

Masoud Davari;Omar Qasem;Weinan Gao;Frede Blaabjerg;Panos C. Kotsampopoulos;Georg Lauss;Nikos D. Hatziargyriou

IEEE Transactions on Industrial Electronics, vol. 72, no. 6, pp. 5867-5883. Open access: no. https://ieeexplore.ieee.org/document/10753351/

Citations: 0

Abstract

Testing inverter-based resources (IBRs) is of utmost importance. This paper proposes a novel power hardware-in-the-loop (PHIL) interface control (PHIL-IC) employing a reinforcement-learning approach based on adaptive dynamic programming (ADP, also known as approximate dynamic programming) to enhance the PHIL-simulation-based testing of IBRs by virtue of an ADP-based method. It deploys output feedback control because of “unavailable” or “uncertain” dynamics of the entire systems (states and disturbances) linked to IBRs, power amplifiers, all the components associated with the PHIL-simulation-based testing, and their delays; it optimally designs PHIL-IC while considering all uncertainties and unavailable information about all the systems involved. To this end, the proposed ADP-based PHIL-IC utilizes a new hybrid iteration (HI) method, which differs from the traditional ADP strategies; compared with the policy iteration method, the HI algorithm does not require prior knowledge of an admissible control policy. Moreover, with a quadratic rate of convergence, the proposed HI method converges much faster than the value iteration method. Therefore, the proposed HI method saves significant learning time and iterations compared to the value iteration method. Comparing the results of the PHIL-simulation-based testing utilizing the proposed method with those of the proportional-resonant controller (as the conventional PHIL-IC) and the robust PHIL-IC based on $\mu$ synthesis (as the current state-of-the-art PHIL-IC) reveals the effectiveness and practicality of the proposed method. Those comparative results are generated by the ideal transformer model (also known as voltage-type interface) commonly used in the PHIL-simulation-based testing and practical cases of the Thévenin equivalent impedance (resistive, resistive-inductive, and inductive ones) of the model of interest associated with the power networks.
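The abstract contrasts three iteration schemes without giving their details. As a rough illustration only, the sketch below shows the general idea on a small model-based, discrete-time, state-feedback LQR problem: value iteration needs no admissible starting policy but converges slowly, policy iteration converges quickly but needs a stabilizing starting gain, and a hybrid scheme can run value iteration until the greedy gain becomes stabilizing and then switch to policy iteration. This is not the authors' algorithm, which is data-driven and uses output feedback; the system matrices, the switching test, and all function names here are assumptions made for the example.

```python
import numpy as np


def greedy_gain(P, A, B, R):
    """Gain that is greedy with respect to the value matrix P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def is_stabilizing(K, A, B):
    """True if the closed loop A - B K has spectral radius below 1."""
    return np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0


def value_iteration_step(P, A, B, Q, R):
    """One Riccati recursion: Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    K = greedy_gain(P, A, B, R)
    return Q + A.T @ P @ (A - B @ K)


def policy_evaluation(K, A, B, Q, R):
    """Solve P = Q + K'RK + Acl' P Acl for a stabilizing gain K."""
    Acl = A - B @ K
    n = A.shape[0]
    rhs = (Q + K.T @ R @ K).reshape(-1)
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    return np.linalg.solve(lhs, rhs).reshape(n, n)


def hybrid_iteration(A, B, Q, R, tol=1e-10, max_iter=1000):
    """Value iteration until the greedy gain stabilizes, then policy iteration."""
    P = np.zeros_like(Q)
    vi_steps = 0
    # Phase 1: value iteration from P = 0 (no admissible policy required).
    while not is_stabilizing(greedy_gain(P, A, B, R), A, B):
        P = value_iteration_step(P, A, B, Q, R)
        vi_steps += 1
        if vi_steps >= max_iter:
            raise RuntimeError("value iteration found no stabilizing gain")
    # Phase 2: policy iteration from the stabilizing gain found in phase 1.
    K = greedy_gain(P, A, B, R)
    pi_steps = 0
    for pi_steps in range(1, max_iter + 1):
        P_new = policy_evaluation(K, A, B, Q, R)
        K = greedy_gain(P_new, A, B, R)
        converged = np.linalg.norm(P_new - P) < tol
        P = P_new
        if converged:
            break
    return P, K, vi_steps, pi_steps


# Hypothetical open-loop-unstable system (eigenvalue 1.2), so the zero gain
# is not admissible and plain policy iteration could not start from it.
A = np.array([[1.2, 0.5], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K, vi_steps, pi_steps = hybrid_iteration(A, B, Q, R)
print("value-iteration steps before switching:", vi_steps)
print("policy-iteration steps after switching:", pi_steps)
print("closed loop stable:", is_stabilizing(K, A, B))
```

The switching test in this sketch checks closed-loop eigenvalues directly, which requires the model; a data-driven method such as the one the abstract describes would need a different criterion that does not rely on knowing A and B.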




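Source journal

IEEE Transactions on Industrial Electronics (Engineering and Technology; Engineering: Electrical and Electronic)

CiteScore: 16.80
Self-citation rate: 9.10%
Articles published: 1396
Review time: 6.3 months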

Journal description

Journal name: IEEE Transactions on Industrial Electronics
Publication frequency: Monthly
Scope: Applications of electronics, controls, and communications in industrial and manufacturing systems and processes; power electronics and drive control techniques; system control and signal processing; fault detection and diagnosis; power systems; instrumentation, measurement, and testing; modeling and simulation; motion control; robotics; sensors and actuators; implementation of neural networks, fuzzy logic, and artificial intelligence in industrial systems; factory automation; communication and computer networks.