posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms
Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari
arXiv - STAT - Computation, 2024-07-06, https://doi.org/arxiv-2407.04967
Abstract
The generality and robustness of inference algorithms are critical to the
success of widely used probabilistic programming languages such as Stan, PyMC,
Pyro, and Turing.jl. When designing a new general-purpose inference algorithm,
whether it involves Monte Carlo sampling or variational approximation, a
fundamental problem is evaluating its accuracy and efficiency across a
range of representative target models. To solve this problem, we propose
posteriordb, a database of models and data sets defining target densities along
with reference Monte Carlo draws. We further provide a guide to the best
practices in using posteriordb for model evaluation and comparison. Currently,
posteriordb comprises 120 representative models that provide a wide range of
realistic target densities, and it has been instrumental in developing several
general inference algorithms.
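
As a rough illustration of how reference Monte Carlo draws can be used to check an algorithm's accuracy, the sketch below compares per-parameter posterior means against a reference set, scaled by the reference standard deviation. This is a minimal sketch under stated assumptions, not the evaluation procedure from the paper: the function name `standardized_mean_errors`, the dictionary layout, and the synthetic Gaussian draws are all hypothetical stand-ins, and no posteriordb API is used.

```python
import random
import statistics


def standardized_mean_errors(draws, reference_draws):
    """Per-parameter |mean(draws) - mean(reference)| / sd(reference).

    Both arguments map a parameter name to a list of posterior draws.
    (Hypothetical helper for illustration; not part of posteriordb.)
    """
    errors = {}
    for name, ref in reference_draws.items():
        ref_mean = statistics.fmean(ref)
        ref_sd = statistics.stdev(ref)
        errors[name] = abs(statistics.fmean(draws[name]) - ref_mean) / ref_sd
    return errors


# Synthetic stand-ins (assumption): in practice the reference draws would come
# from the database and the other draws from the algorithm under test.
rng = random.Random(0)
reference = {"mu": [rng.gauss(1.0, 2.0) for _ in range(10_000)]}
candidate = {"mu": [rng.gauss(1.0, 2.0) for _ in range(1_000)]}  # same target
biased = {"mu": [rng.gauss(3.0, 2.0) for _ in range(1_000)]}     # shifted mean

print(standardized_mean_errors(candidate, reference))
print(standardized_mean_errors(biased, reference))
```

Draws from the same target should give an error near zero, while the shifted draws should give an error near one reference standard deviation. A real comparison would also need to account for Monte Carlo error in both sets of draws.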