Connecting stochastic optimal control and reinforcement learning
Authors: J. Quer, Enric Ribera Borrell
Journal of Mathematical Physics, published 2024-08-19
DOI: 10.1063/5.0140665 (https://doi.org/10.1063/5.0140665)
This paper investigates the connection between stochastic optimal control and reinforcement learning. Our main motivation is importance sampling for rare events, which can be reformulated as an optimal control problem. With a parameterised approach, the optimal control problem becomes a stochastic optimization problem, which still leaves open questions about scalability to high-dimensional problems and about handling the intrinsic metastability of the system. To explore new methods, we link the optimal control problem to reinforcement learning, since both share the same underlying framework, namely a Markov Decision Process (MDP). We show how the MDP can be formulated for the optimal control problem, and we discuss how the stochastic optimal control problem can be interpreted in the framework of reinforcement learning. Finally, we apply two different reinforcement learning algorithms to the optimal control problem and compare their advantages and disadvantages.
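The idea of steering a process with a control so that a rare event becomes likely, and then reweighting the samples, can be illustrated with a toy problem. The sketch below is not the authors' setup or code: it assumes a one-dimensional Brownian motion, a constant (not optimal) control `u`, and the hypothetical function name `rare_event_probability`. It estimates P(X_T > a) by simulating the controlled dynamics dX = u dt + dW and correcting each path with its Girsanov weight.

```python
import math
import numpy as np


def rare_event_probability(a=4.0, u=4.0, T=1.0, n_steps=100, n_paths=20000, seed=0):
    """Importance-sampling estimate of P(X_T > a) for dX = dW, X_0 = 0.

    Paths are drawn from the controlled dynamics dX = u dt + dW and
    reweighted so that the estimator stays unbiased for the uncontrolled
    process. The constant control u is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    xi = rng.standard_normal((n_paths, n_steps))

    # Euler-Maruyama endpoint of each controlled path (X_0 = 0).
    x_T = np.sum(u * dt + math.sqrt(dt) * xi, axis=1)

    # Girsanov log-weight: -sum(u * dW) - 0.5 * sum(u^2 * dt), where dW is
    # the Brownian increment that drives the controlled path.
    log_w = np.sum(-u * math.sqrt(dt) * xi - 0.5 * u**2 * dt, axis=1)

    samples = (x_T > a) * np.exp(log_w)
    return samples.mean(), samples.std(ddof=1) / math.sqrt(n_paths)


estimate, std_error = rare_event_probability()
# Closed-form reference for this toy problem: X_1 ~ N(0, 1).
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
print(f"estimate={estimate:.3e}  std_error={std_error:.1e}  exact={exact:.3e}")
```

With `u=0` the same function reduces to plain Monte Carlo, where almost no path reaches the threshold. Choosing the control that minimises the estimator's variance is the optimal control problem the abstract refers to.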
About the journal:
Since 1960, the Journal of Mathematical Physics (JMP) has published some of the best papers from outstanding mathematicians and physicists. JMP was the first journal in the field of mathematical physics and publishes research that applies mathematics to problems in physics and develops mathematical methods for such applications and for the formulation of physical theories.
JMP features content in all areas of mathematical physics. Specifically, the articles focus on research that illustrates the application of mathematics to problems in physics and the development of mathematical methods for such applications and for the formulation of physical theories. The mathematics in the articles is written so that theoretical physicists can understand it. JMP also publishes review articles on mathematical subjects relevant to physics, as well as special issues that combine manuscripts on a topic of current interest to the mathematical physics community.
JMP welcomes original research of the highest quality in all active areas of mathematical physics, including the following:
Partial Differential Equations
Representation Theory and Algebraic Methods
Many Body and Condensed Matter Physics
Quantum Mechanics - General and Nonrelativistic
Quantum Information and Computation
Relativistic Quantum Mechanics, Quantum Field Theory, Quantum Gravity, and String Theory
General Relativity and Gravitation
Dynamical Systems
Classical Mechanics and Classical Fields
Fluids
Statistical Physics
Methods of Mathematical Physics.