James Queeney;Ioannis Ch. Paschalidis;Christos G. Cassandras
IEEE Transactions on Automatic Control, vol. 70, no. 2, pp. 1236-1243. Publication date: 2024-09-03. DOI: 10.1109/TAC.2024.3454011. URL: https://ieeexplore.ieee.org/document/10663867/
Generalized Policy Improvement Algorithms With Theoretically Supported Sample Reuse
We develop a new class of model-free deep reinforcement learning algorithms for data-driven, learning-based control. Our Generalized Policy Improvement algorithms combine the policy improvement guarantees of on-policy methods with the efficiency of sample reuse, addressing a tradeoff between two important deployment requirements for real-world control: 1) practical performance guarantees; and 2) data efficiency. We demonstrate the benefits of this new class of algorithms through extensive experimental analysis on a broad range of simulated control tasks.
About the journal:
In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered:
1) Papers: Presentation of significant research, development, or application of control concepts.
2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions.
In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.