Stochastic Optimal Control Matching
Carles Domingo-Enrich, Jiequn Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen
arXiv - CS - Numerical Analysis, 2023-12-04, https://doi.org/arxiv-2312.02027
Abstract
Stochastic optimal control, which has the goal of driving the behavior of
noisy systems, is broadly applicable in science, engineering and artificial
intelligence. Our work introduces Stochastic Optimal Control Matching (SOCM), a
novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal
control that stems from the same philosophy as the conditional score matching
loss for diffusion models. That is, the control is learned via a least squares
problem by trying to fit a matching vector field. The training loss, which is
closely connected to the cross-entropy loss, is optimized with respect to both
the control function and a family of reparameterization matrices which appear
in the matching vector field. The optimization with respect to the
reparameterization matrices aims at minimizing the variance of the matching
vector field. Experimentally, our algorithm achieves lower error than all
existing IDO techniques for stochastic optimal control in four different
control settings. The key idea underlying SOCM is the path-wise
reparameterization trick, a novel technique that is of independent interest,
e.g., for generative modeling.
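
The least-squares idea that the abstract borrows from conditional score matching can be illustrated with a toy example. The sketch below is not SOCM and uses none of the paper's objects (no control problem, no reparameterization matrices). It is a one-dimensional denoising score matching problem chosen here as an assumption, with a hypothetical helper `fit_linear_score`: regressing on a noisy per-sample target recovers the exact vector field, because the least-squares minimizer is the conditional expectation of the target.

```python
import numpy as np


def fit_linear_score(n_samples=200_000, sigma=0.5, seed=0):
    """Fit s(x) = slope * x to noisy conditional-score targets by least squares.

    Toy setting (an assumption for illustration, not taken from the paper):
    clean data x0 ~ N(0, 1), noisy data x = x0 + sigma * eps with eps ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n_samples)
    eps = rng.standard_normal(n_samples)
    x = x0 + sigma * eps

    # Per-sample "matching" target: the score of the Gaussian p(x | x0).
    # It is noisy, but its conditional mean given x is the marginal score.
    target = -(x - x0) / sigma**2

    # Closed-form minimizer of sum_i (slope * x_i - target_i)^2.
    return np.dot(x, target) / np.dot(x, x)


sigma = 0.5
slope = fit_linear_score(sigma=sigma)

# x is distributed as N(0, 1 + sigma^2), so its exact score is -x / (1 + sigma^2).
exact = -1.0 / (1.0 + sigma**2)
print(f"fitted slope: {slope:.4f}, exact slope: {exact:.4f}")
```

The fitted slope approaches the exact one as the sample count grows, and the speed of that convergence depends on the variance of the per-sample target. In the abstract's terms, this variance is what the optimization over reparameterization matrices is meant to reduce for the matching vector field.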