{"title":"Smoothing Methods for Automatic Differentiation Across Conditional Branches","authors":"Justin N. Kreikemeyer, Philipp Andelfinger","doi":"arxiv-2310.03585","DOIUrl":null,"url":null,"abstract":"Programs involving discontinuities introduced by control flow constructs such\nas conditional branches pose challenges to mathematical optimization methods\nthat assume a degree of smoothness in the objective function's response\nsurface. Smooth interpretation (SI) is a form of abstract interpretation that\napproximates the convolution of a program's output with a Gaussian kernel, thus\nsmoothing its output in a principled manner. Here, we combine SI with automatic\ndifferentiation (AD) to efficiently compute gradients of smoothed programs. In\ncontrast to AD across a regular program execution, these gradients also capture\nthe effects of alternative control flow paths. The combination of SI with AD\nenables the direct gradient-based parameter synthesis for branching programs,\nallowing for instance the calibration of simulation models or their combination\nwith neural network models in machine learning pipelines. We detail the effects\nof the approximations made for tractability in SI and propose a novel Monte\nCarlo estimator that avoids the underlying assumptions by estimating the\nsmoothed programs' gradients through a combination of AD and sampling. Using\nDiscoGrad, our tool for automatically translating simple C++ programs to a\nsmooth differentiable form, we perform an extensive evaluation. We compare the\ncombination of SI with AD and our Monte Carlo estimator to existing\ngradient-free and stochastic methods on four non-trivial and originally\ndiscontinuous problems ranging from classical simulation-based optimization to\nneural network-driven control. While the optimization progress with the\nSI-based estimator depends on the complexity of the programs' control flow, our\nMonte Carlo estimator is competitive in all problems, exhibiting the fastest\nconvergence by a substantial margin in our highest-dimensional problem.","PeriodicalId":501256,"journal":{"name":"arXiv - CS - Mathematical Software","volume":"10 5","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Mathematical Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2310.03585","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Programs involving discontinuities introduced by control-flow constructs such as conditional branches pose challenges to mathematical optimization methods that assume a degree of smoothness in the objective function's response surface. Smooth interpretation (SI) is a form of abstract interpretation that approximates the convolution of a program's output with a Gaussian kernel, thus smoothing the output in a principled manner. Here, we combine SI with automatic differentiation (AD) to efficiently compute gradients of smoothed programs. In contrast to AD across a regular program execution, these gradients also capture the effects of alternative control-flow paths. The combination of SI with AD enables direct gradient-based parameter synthesis for branching programs, allowing, for instance, the calibration of simulation models or their combination with neural network models in machine learning pipelines. We detail the effects of the approximations made for tractability in SI and propose a novel Monte Carlo estimator that avoids the underlying assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling. Using DiscoGrad, our tool for automatically translating simple C++ programs to a smooth differentiable form, we perform an extensive evaluation. We compare the combination of SI with AD and our Monte Carlo estimator to existing gradient-free and stochastic methods on four non-trivial and originally discontinuous problems, ranging from classical simulation-based optimization to neural-network-driven control. While the optimization progress with the SI-based estimator depends on the complexity of the programs' control flow, our Monte Carlo estimator is competitive on all problems, exhibiting the fastest convergence by a substantial margin in our highest-dimensional problem.
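To make the smoothing idea concrete, the following minimal C++ sketch illustrates Gaussian smoothing of a branching program by sampling. This is not DiscoGrad's estimator; it uses the generic score-function identity d/dx E[f(x+e)] = E[f(x+e) * e / sigma^2] for e ~ N(0, sigma^2), and all names (f, sigma, n) are illustrative. The point it demonstrates is the one the abstract makes: the plain derivative of a program with a hard branch is zero almost everywhere, while the smoothed gradient is informative.

```cpp
// Minimal Gaussian-smoothing sketch (illustrative, not the paper's method).
#include <cstdio>
#include <random>

// A "program" with a hard conditional branch: discontinuous at x == 0.
double f(double x) { return x < 0.0 ? 0.0 : 1.0; }

int main() {
    const double x = 0.25;     // parameter at which to evaluate
    const double sigma = 0.5;  // Gaussian kernel width
    const int n = 100000;      // Monte Carlo samples

    std::mt19937 rng(42);
    std::normal_distribution<double> noise(0.0, sigma);

    double mean = 0.0, grad = 0.0;
    for (int i = 0; i < n; ++i) {
        const double e = noise(rng);
        const double y = f(x + e);
        mean += y;
        grad += y * e / (sigma * sigma);  // score-function gradient weight
    }
    mean /= n;
    grad /= n;

    // The smoothed output approximates Phi(x/sigma) and the smoothed
    // gradient approximates phi(x/sigma)/sigma: nonzero despite the branch.
    std::printf("smoothed f: %.4f  smoothed df/dx: %.4f\n", mean, grad);
    return 0;
}
```

With these values, the estimates should approach Phi(0.5) ~ 0.69 and phi(0.5)/0.5 ~ 0.70. The paper's contribution goes beyond this naive scheme: SI with AD propagates Gaussian distributions through the control flow itself, and the proposed Monte Carlo estimator combines sampling with AD gradients rather than the high-variance score-function weighting shown here.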