perms: Likelihood-free estimation of marginal likelihoods for binary response data in Python and R

Journal of Computational Science | IF 3.7 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Interdisciplinary Applications) | Pub Date: 2025-01-01 | Epub Date: 2024-11-22 | DOI: 10.1016/j.jocs.2024.102467

Dennis Christensen, Per August Jarval Moen

Journal of Computational Science, vol. 84, Article 102467 (2025). https://www.sciencedirect.com/science/article/pii/S1877750324002606


Abstract

In Bayesian statistics, the marginal likelihood (ML) is the key ingredient needed for model comparison and model averaging. Unfortunately, estimating MLs accurately is notoriously difficult, especially for models where posterior simulation is not possible. The recently introduced idea of permutation counting yields an estimator that can accurately estimate the MLs of models for exchangeable binary responses. Such data arise in a multitude of statistical problems, including binary classification, bioassay and sensitivity testing. Permutation counting is entirely likelihood-free and works for any model from which a random sample can be generated, including nonparametric models. Here we present perms, a package implementing permutation counting. Following optimisation efforts, perms is computationally efficient and can handle large data problems. It is available as both an R package and a Python library. A broad gallery of usage examples is provided, covering both standard parametric binary classification and novel applications of nonparametric models, such as changepoint analysis. We also cover the implementation details of perms and illustrate its computational speed via a simple simulation study.
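To make the idea of permutation counting concrete, here is a minimal brute-force sketch in Python. It is not the perms API and not the optimised algorithm the package implements. It rests on an assumption the abstract does not spell out: each binary response is taken to be y_i = 1 if a latent variable T_i falls at or below a known level x_i, and 0 otherwise, with T_1, ..., T_n exchangeable under the model. Under that assumption, for any sample t drawn from the model, the fraction of the n! permutations of t that reproduce the observed responses is an unbiased estimate of the marginal likelihood, and averaging over many samples reduces the variance. All function names below are hypothetical and chosen for illustration only.

```python
import itertools
import math
import random


def count_consistent_permutations(t, x, y):
    """Count permutations sigma with (t[sigma[i]] <= x[i]) == bool(y[i]) for every i.

    Brute force over all n! permutations, so only usable for very small n.
    """
    n = len(t)
    count = 0
    for sigma in itertools.permutations(range(n)):
        if all((t[sigma[i]] <= x[i]) == bool(y[i]) for i in range(n)):
            count += 1
    return count


def estimate_marginal_likelihood(sampler, x, y, num_samples, rng):
    """Average the fraction of consistent permutations over model samples.

    `sampler(n, rng)` is an assumed interface: it must return one draw of n
    exchangeable latent values from the model. No likelihood is evaluated.
    """
    n = len(x)
    total = 0
    for _ in range(num_samples):
        t = sampler(n, rng)
        total += count_consistent_permutations(t, x, y)
    return total / (num_samples * math.factorial(n))


def uniform_sampler(n, rng):
    """Toy model for illustration: i.i.d. Uniform(0, 1) latent values."""
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    x = [0.2, 0.5, 0.8]  # levels at which the responses were observed
    y = [0, 1, 1]        # observed binary responses
    rng = random.Random(0)
    estimate = estimate_marginal_likelihood(uniform_sampler, x, y, 20000, rng)
    # For this toy model the exact value is P(T > 0.2) * P(T <= 0.5) * P(T <= 0.8).
    print(f"estimate = {estimate:.3f}, exact = {0.8 * 0.5 * 0.8:.3f}")
```

The toy model has a closed-form answer, which makes it easy to sanity-check the estimator. The brute-force loop costs n! per sample, which is why an efficient counting algorithm is needed for data sets of realistic size.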


Source journal

Journal of Computational Science (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS)

CiteScore: 5.50

Self-citation rate: 3.00%

Articles published: 227

Review time: 41 days

Journal description: Computational Science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory. Recent advances in experimental techniques, such as detectors, on-line sensor networks and high-resolution imaging, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation. This new discipline in science combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods. Computational science typically unifies three distinct elements:
• Modeling, Algorithms and Simulations (e.g. numerical and non-numerical, discrete and continuous);
• Software developed to solve science (e.g. biological, physical, and social), engineering, medicine, and humanities problems;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g. problem solving environments).