
arXiv - STAT - Computation: Latest Papers

Incorporating additional evidence as prior information to resolve non-identifiability in Bayesian disease model calibration
Pub Date : 2024-07-18 DOI: arxiv-2407.13451
Daria Semochkina, Cathal Walsh
Background: Statisticians evaluating the impact of policy interventions such as screening or vaccination will need to make use of mathematical and computational models of disease progression and spread. Calibration is the process of identifying the parameters of these models, with a Bayesian framework providing a natural way in which to do this in a probabilistic fashion. Markov Chain Monte Carlo (MCMC) is one of a number of computational tools that is useful in carrying out this calibration. Objective: In the context of complex models in particular, a key problem that arises is one of non-identifiability. In this setting, one approach which can be used is to consider and ensure that appropriately informative priors are specified on the joint parameter space. We give examples of how this arises and may be addressed in practice. Methods: Using a basic SIS model the calibration process and the associated challenge of non-identifiability is discussed. How this problem arises in the context of a larger model for HPV and cervical cancer is also illustrated. Results: The conditions which allow the problem of non-identifiability to be resolved are demonstrated for the SIS model. For the larger HPV model, how this impacts on the calibration process is also discussed.
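To make the non-identifiability issue concrete, here is a minimal sketch. It assumes the textbook deterministic SIS model dI/dt = beta*I*(1-I) - gamma*I for the infected fraction I; the abstract does not state the exact form used in the paper, and the function names, step size and parameter values are illustrative. Under that assumption the endemic prevalence depends on beta and gamma only through their ratio, so equilibrium prevalence data alone cannot separate the two parameters.

```python
def sis_endemic_prevalence(beta: float, gamma: float) -> float:
    """Equilibrium infected fraction of dI/dt = beta*I*(1-I) - gamma*I."""
    if beta <= gamma:
        return 0.0  # beta/gamma <= 1: the infection dies out
    return 1.0 - gamma / beta


def simulate_sis(beta: float, gamma: float, i0: float = 0.01,
                 dt: float = 0.01, steps: int = 100_000) -> float:
    """Forward-Euler integration of the assumed SIS ODE; returns the final I."""
    i = i0
    for _ in range(steps):
        i += dt * (beta * i * (1.0 - i) - gamma * i)
    return i


# Two different (beta, gamma) pairs that share the ratio gamma/beta = 0.5.
slow = simulate_sis(beta=0.5, gamma=0.25)
fast = simulate_sis(beta=2.0, gamma=1.0)
print(slow, fast, sis_endemic_prevalence(0.5, 0.25))
```

Both runs settle at the same prevalence even though the parameters differ by a factor of four. In this simplified setting, an informative prior on one of the two parameters (for example on the recovery rate gamma) is the kind of additional evidence that would pin down the other.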
Citations: 0
Convergence of SMACOF
Pub Date : 2024-07-17 DOI: arxiv-2407.12945
Jan De Leeuw
To study convergence of SMACOF we introduce a modification mSMACOF that rotates the configurations from each of the SMACOF iterations to principal components. This modification, called mSMACOF, has the same stress values as SMACOF in each iteration, but unlike SMACOF it produces a sequence of configurations that properly converges to a solution. We show that the modified algorithm can be implemented by iterating ordinary SMACOF to convergence, and then rotating the SMACOF solution to principal components. The speed of linear convergence of SMACOF and mSMACOF is the same, and is equal to the largest eigenvalue of the derivative of the Guttman transform, ignoring the trivial unit eigenvalues that result from rotational indeterminacy.
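The procedure described in the abstract (iterate ordinary SMACOF, then rotate the result to principal components) can be sketched as follows. This is a minimal illustration for the unit-weight metric case using NumPy, not the author's code; the function names, the random test data and the fixed iteration count are assumptions.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def stress(X, delta):
    """Raw stress: sum over pairs i<j of (delta_ij - d_ij(X))^2."""
    D = pairwise_distances(X)
    iu = np.triu_indices(len(X), k=1)
    return float(((delta[iu] - D[iu]) ** 2).sum())


def guttman_transform(X, delta):
    """One SMACOF update with unit weights: X <- B(X) X / n."""
    n = len(X)
    D = pairwise_distances(X)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(D > 0, delta / D, 0.0)
    B = -ratio
    np.fill_diagonal(B, 0.0)
    np.fill_diagonal(B, -B.sum(axis=1))  # rows of B sum to zero
    return B @ X / n


def rotate_to_principal_components(X):
    """Center X and rotate it so its columns are principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T


rng = np.random.default_rng(0)
delta = pairwise_distances(rng.normal(size=(10, 3)))  # target dissimilarities
X = rng.normal(size=(10, 2))                          # random 2-D start
history = [stress(X, delta)]
for _ in range(100):
    X = guttman_transform(X, delta)
    history.append(stress(X, delta))
X_rot = rotate_to_principal_components(X)
print(history[0], history[-1], stress(X_rot, delta))
```

Because a rotation leaves all pairwise distances unchanged, the rotated configuration has the same stress as the unrotated one; the rotation only removes the rotational indeterminacy of the solution.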
Citations: 0
Quantile Slice Sampling
Pub Date : 2024-07-17 DOI: arxiv-2407.12608
Matthew J. Heiner, Samuel B. Johnson, Joshua R. Christensen, David B. Dahl
We propose and demonstrate an alternate, effective approach to simple slice sampling. Using the probability integral transform, we first generalize Neal's shrinkage algorithm, standardizing the procedure to an automatic and universal starting point: the unit interval. This enables the introduction of approximate (pseudo-) targets through importance reweighting, a technique that has popularized elliptical slice sampling. Reasonably accurate pseudo-targets can boost sampler efficiency by requiring fewer rejections and by reducing target skewness. This strategy is effective when a natural, possibly crude, approximation to the target exists. Alternatively, obtaining a marginal pseudo-target from initial samples provides an intuitive and automatic tuning procedure. We consider two metrics for evaluating the quality of approximation; each can be used as a criterion to find an optimal pseudo-target or as an interpretable diagnostic. We examine performance of the proposed sampler relative to other popular, easily implemented MCMC samplers on standard targets in isolation, and as steps within a Gibbs sampler in a Bayesian modeling context. We extend the transformation method to multivariate slice samplers and demonstrate with a constrained state-space model for which a readily available forward-backward algorithm provides the target approximation.
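One way to read the idea in the abstract is sketched below: map the state to the unit interval with the CDF of a pseudo-target, then run the shrinkage procedure on (0, 1) against the reweighted density target/pseudo-target. This is an illustrative univariate sketch, not the authors' implementation; the logistic pseudo-target (chosen here because its CDF and quantile function have closed forms), the function names and the tuning values are all assumptions.

```python
import math
import random


def logistic_quantile(u, loc, scale):
    """Inverse CDF of the logistic pseudo-target."""
    return loc + scale * math.log(u / (1.0 - u))


def log_h(u, log_target, loc, scale):
    """Log of target(x)/pseudo(x) at x = Q^{-1}(u), up to a constant."""
    if u <= 0.0 or u >= 1.0:
        return -math.inf
    x = logistic_quantile(u, loc, scale)
    # The logistic density evaluated at its u-quantile is u*(1-u)/scale.
    log_q = math.log(u) + math.log1p(-u) - math.log(scale)
    return log_target(x) - log_q


def quantile_slice_step(x, log_target, loc, scale, rng):
    """One slice-sampling update using shrinkage on the unit interval."""
    u0 = 1.0 / (1.0 + math.exp(-(x - loc) / scale))  # pseudo-target CDF at x
    log_v = log_h(u0, log_target, loc, scale) + math.log(1.0 - rng.random())
    left, right = 0.0, 1.0  # universal starting interval
    while True:
        u1 = left + (right - left) * rng.random()
        if log_h(u1, log_target, loc, scale) > log_v:
            return logistic_quantile(u1, loc, scale)
        # Rejected: shrink the interval toward the current point.
        if u1 < u0:
            left = u1
        else:
            right = u1


def std_normal_logpdf(x):
    return -0.5 * x * x  # unnormalized


rng = random.Random(1)
scale = math.sqrt(3.0) / math.pi  # logistic with unit variance
x, draws = 0.0, []
for _ in range(20_000):
    x = quantile_slice_step(x, std_normal_logpdf, 0.0, scale, rng)
    draws.append(x)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)
```

When the pseudo-target is close to the target, the transformed density on (0, 1) is nearly flat, so most proposals are accepted on the first try. A heavier-tailed pseudo-target, as here, keeps the ratio bounded near the endpoints.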
Citations: 0
Scaling Hawkes processes to one million COVID-19 cases
Pub Date : 2024-07-16 DOI: arxiv-2407.11349
Seyoon Ko, Marc A. Suchard, Andrew J. Holbrook
Hawkes stochastic point process models have emerged as valuable statistical tools for analyzing viral contagion. The spatiotemporal Hawkes process characterizes the speeds at which viruses spread within human populations. Unfortunately, likelihood-based inference using these models requires $O(N^2)$ floating-point operations, for $N$ the number of observed cases. Recent work responds to the Hawkes likelihood's computational burden by developing efficient graphics processing unit (GPU)-based routines that enable Bayesian analysis of tens-of-thousands of observations. We build on this work and develop a high-performance computing (HPC) strategy that divides 30 Markov chains between 4 GPU nodes, each of which uses multiple GPUs to accelerate its chain's likelihood computations. We use this framework to apply two spatiotemporal Hawkes models to the analysis of one million COVID-19 cases in the United States between March 2020 and June 2023. In addition to brute-force HPC, we advocate for two simple strategies as scalable alternatives to successful approaches proposed for small data settings. First, we use known county-specific population densities to build a spatially varying triggering kernel in a manner that avoids computationally costly nearest neighbors search. Second, we use a cut-posterior inference routine that accounts for infections' spatial location uncertainty by iteratively sampling latent locations uniformly within their respective counties of occurrence, thereby avoiding full-blown latent variable inference for 1,000,000 infection locations.
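The quadratic cost mentioned in the abstract comes from the sum over all earlier events inside each event's intensity. The sketch below shows this for a purely temporal Hawkes process with an exponential triggering kernel, which is a deliberate simplification of the spatiotemporal models in the paper; the parameter names `mu`, `alpha`, `beta` and the kernel choice are assumptions made for illustration.

```python
import math


def hawkes_loglik(times, horizon, mu, alpha, beta):
    """Log-likelihood on [0, horizon] for the intensity
    lambda(t) = mu + sum over t_j < t of alpha*beta*exp(-beta*(t - t_j)).

    `times` must be sorted. Returns (log-likelihood, kernel evaluations).
    """
    loglik = -mu * horizon  # background part of the compensator
    kernel_evals = 0
    for i, t_i in enumerate(times):
        rate = mu
        for t_j in times[:i]:  # every earlier event: the O(N^2) part
            rate += alpha * beta * math.exp(-beta * (t_i - t_j))
            kernel_evals += 1
        loglik += math.log(rate)
        # Expected number of offspring of event i before the horizon.
        loglik -= alpha * (1.0 - math.exp(-beta * (horizon - t_i)))
    return loglik, kernel_evals


events = [0.5, 1.0, 1.2, 3.0, 3.1]
value, evals = hawkes_loglik(events, horizon=4.0, mu=0.5, alpha=0.3, beta=1.0)
print(value, evals)
```

With N events the inner loop runs N(N-1)/2 times, so one million cases imply roughly 5e11 kernel evaluations per likelihood call, which is why the abstract turns to GPU parallelism.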
Citations: 0
Automatic Parallel Tempering Markov Chain Monte Carlo with Nii-C
Pub Date : 2024-07-13 DOI: arxiv-2407.09915
Sheng Jin, Wenxin Jiang, Dong-Hong Wu
Due to the high dimensionality or multimodality that is common in modern astronomy, sampling Bayesian posteriors can be challenging. Several publicly available codes based on different sampling algorithms can solve these complex models, but the execution of the code is not always efficient or fast enough. The article introduces a C language general-purpose code, Nii-C (https://github.com/shengjin/nii-c.git), that implements a framework of Automatic Parallel Tempering Markov Chain Monte Carlo. Automatic in this context means that the parameters that ensure an efficient parallel tempering process can be set by a control system during the initial stages of a sampling process. The auto-tuned parameters consist of two parts, the temperature ladders of all parallel tempering Markov chains and the proposal distributions for all model parameters across all parallel tempering chains. In order to reduce dependencies in the compilation process and increase the code's execution speed, Nii-C code is constructed entirely in the C language and parallelised using the Message-Passing Interface protocol to optimise the efficiency of parallel sampling. These implementations facilitate rapid convergence in the sampling of high-dimensional and multi-modal distributions, as well as expeditious code execution time. The Nii-C code can be used in various research areas to trace complex distributions due to its high sampling efficiency and quick execution speed. This article presents a few applications of the Nii-C code.
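The temperature ladder matters because it controls how often neighbouring chains exchange states. The following sketch shows the generic parallel tempering swap rule, not Nii-C's C/MPI implementation or its auto-tuning logic. It assumes chain k targets the posterior raised to the power `betas[k]` (some codes temper only the likelihood instead), and all names are hypothetical.

```python
import math
import random


def swap_log_ratio(beta_i, beta_j, logp_i, logp_j):
    """Log Metropolis ratio for exchanging the states of chains i and j.

    logp_i and logp_j are untempered log-posterior values of the current
    states; beta_i and beta_j are the chains' inverse temperatures.
    """
    return (beta_i - beta_j) * (logp_j - logp_i)


def maybe_swap(states, logps, betas, i, j, rng):
    """Attempt one swap between chains i and j in place; return True if accepted."""
    log_r = swap_log_ratio(betas[i], betas[j], logps[i], logps[j])
    if math.log(1.0 - rng.random()) < log_r:
        states[i], states[j] = states[j], states[i]
        logps[i], logps[j] = logps[j], logps[i]
        return True
    return False


# The hotter chain (beta = 0.5) holds the more probable state, so handing it
# to the cold chain (beta = 1.0) is always accepted.
states, logps, betas = ["a", "b"], [-10.0, -2.0], [1.0, 0.5]
accepted = maybe_swap(states, logps, betas, 0, 1, random.Random(0))
print(accepted, states, logps)
```

If adjacent inverse temperatures are too far apart, the ratio is strongly negative for typical states and swaps are rarely accepted; if they are too close, swaps succeed but carry little new information. Balancing this is the kind of trade-off that ladder tuning addresses.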
Citations: 0
Counting $N$ Queens
Pub Date : 2024-07-11 DOI: arxiv-2407.08830
Nick Polson, Vadim Sokolov
Gauss proposed the problem of how to enumerate the number of solutions for placing $N$ queens on an $N\times N$ chess board, so no two queens attack each other. The N-queen problem is a classic problem in combinatorics. We describe a variety of Monte Carlo (MC) methods for counting the number of solutions. In particular, we propose a quantile re-ordering based on the Lorenz curve of a sum that is related to counting the number of solutions. We show this approach leads to an efficient polynomial-time solution. Other MC methods include vertical likelihood Monte Carlo, importance sampling, slice sampling, simulated annealing, energy-level sampling, and nested-sampling. Sampling binary matrices that identify the locations of the queens on the board can be done with a Swendsen-Wang style algorithm. Our Monte Carlo approach counts the number of solutions in polynomial time.
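As a point of reference for the Monte Carlo counting idea, the sketch below pairs an exact backtracking count with a simple sequential importance sampling estimator (a Knuth-style random probe). This is a generic textbook estimator related to the importance sampling the abstract lists among other methods; it is not the quantile re-ordering method the authors propose, and the function names are assumptions.

```python
import random


def count_exact(n):
    """Exact number of N-queens solutions by backtracking (small n only)."""
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for c in range(n):
            if c in cols or (row - c) in diag1 or (row + c) in diag2:
                continue
            total += place(row + 1, cols | {c},
                           diag1 | {row - c}, diag2 | {row + c})
        return total
    return place(0, frozenset(), frozenset(), frozenset())


def one_probe(n, rng):
    """Place queens row by row, choosing uniformly among safe columns.

    Returns the product of the numbers of safe choices if all rows are
    filled, else 0. Its expectation equals the number of solutions.
    """
    cols, diag1, diag2 = set(), set(), set()
    weight = 1
    for row in range(n):
        free = [c for c in range(n)
                if c not in cols
                and (row - c) not in diag1
                and (row + c) not in diag2]
        if not free:
            return 0  # dead end: this probe contributes nothing
        weight *= len(free)
        c = rng.choice(free)
        cols.add(c)
        diag1.add(row - c)
        diag2.add(row + c)
    return weight


def estimate_count(n, probes, rng):
    """Average of independent probes: an unbiased estimate of the count."""
    return sum(one_probe(n, rng) for _ in range(probes)) / probes


print(count_exact(8), estimate_count(8, 20_000, random.Random(0)))
```

Each probe costs polynomial time, but the variance of the weights grows with N, which is the motivation for more refined estimators such as those discussed in the abstract.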
Citations: 0
The 2023/24 VIEWS Prediction Challenge: Predicting the Number of Fatalities in Armed Conflict, with Uncertainty
Pub Date : 2024-07-08 DOI: arxiv-2407.11045
Håvard Hegre (Peace Research Institute Oslo; Department of Peace and Conflict Research, Uppsala University), Paola Vesco (Peace Research Institute Oslo; Department of Peace and Conflict Research, Uppsala University), Michael Colaresi (Department of Peace and Conflict Research, Uppsala University; University of Pittsburgh), Jonas Vestby (Peace Research Institute Oslo), Alexa Timlick (Peace Research Institute Oslo), Noorain Syed Kazmi (Peace Research Institute Oslo), Friederike Becker (Institute of Statistics), Marco Binetti (Center for Crisis Early Warning, University of the Bundeswehr Munich), Tobias Bodentien (Institute of Statistics), Tobias Bohne (Center for Crisis Early Warning, University of the Bundeswehr Munich), Patrick T. Brandt (School of Economic, Political, and Policy Sciences, University of Texas, Dallas), Thomas Chadefaux (Trinity College Dublin), Simon Drauz (Institute of Statistics), Christoph Dworschak (University of York), Vito D'Orazio (West Virginia University), Cornelius Fritz (Pennsylvania State University), Hannah Frank (Trinity College Dublin), Kristian Skrede Gleditsch (University of Essex; Peace Research Institute Oslo), Sonja Häffner (Center for Crisis Early Warning, University of the Bundeswehr Munich), Martin Hofer (University College London), Finn L. Klebe (University College London), Luca Macis (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Alexandra Malaga (Institute for Economic Analysis, Barcelona), Marius Mehrl (University of Leeds), Nils W. Metternich (University College London), Daniel Mittermaier (Center for Crisis Early Warning, University of the Bundeswehr Munich), David Muchlinski (Georgia Tech), Hannes Mueller (Institute for Economic Analysis, Barcelona; Barcelona School of Economics), Christian Oswald (Center for Crisis Early Warning, University of the Bundeswehr Munich), Paola Pisano (Department of Economics and Statistics Cognetti de Martiis, University of Turin), David Randahl (Department of Peace and Conflict Research, Uppsala University), Christopher Rauh (University of Cambridge), Lotta Rüter (Institute of Statistics), Thomas Schincariol (Trinity College Dublin), Benjamin Seimon (Fundació Economia Analitica), Elena Siletti (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Marco Tagliapietra (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Chandler Thornhill (Georgia Tech), Johan Vegelius (Department of Medical Sciences, Uppsala University), Julian Walterskirchen (Center for Crisis Early Warning, University of the Bundeswehr Munich)
This draft article outlines a prediction challenge where the target is to forecast the number of fatalities in armed conflicts, in the form of the UCDP `best' estimates, aggregated to the VIEWS units of analysis. It presents the format of the contributions, the evaluation metric, and the procedures, and a brief summary of the contributions. The article serves a function analogous to a pre-analysis plan: a statement of the forecasting models made publicly available before the true future prediction window commences. More information on the challenge, and all data referred to in this document, can be found at https://viewsforecasting.org/research/prediction-challenge-2023.
Citations: 0
posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms
Pub Date : 2024-07-06 DOI: arxiv-2407.04967
Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari
The generality and robustness of inference algorithms is critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, the fundamental problem arises in evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to the best practices in using posteriordb for model evaluation and comparison. To provide a wide range of realistic target densities, posteriordb currently comprises 120 representative models and has been instrumental in developing several general inference algorithms.
引用次数: 0
Gaussian process regression with log-linear scaling for common non-stationary kernels
Pub Date : 2024-07-04 DOI: arxiv-2407.03608
P. Michael Kielstra, Michael Lindsey
We introduce a fast algorithm for Gaussian process regression in low dimensions, applicable to a widely used family of non-stationary kernels. The non-stationarity of these kernels is induced by arbitrary spatially varying vertical and horizontal scales. In particular, any stationary kernel can be accommodated as a special case, and we focus especially on the generalization of the standard Matérn kernel. Our subroutine for kernel matrix-vector multiplications scales almost optimally as $O(N \log N)$, where $N$ is the number of regression points. Like the recently developed equispaced Fourier Gaussian process (EFGP) methodology, which is applicable only to stationary kernels, our approach exploits non-uniform fast Fourier transforms (NUFFTs). We offer a complete analysis controlling the approximation error of our method, and we validate the method's practical performance with numerical experiments. In particular, we demonstrate improved scalability compared to state-of-the-art rank-structured approaches in spatial dimension $d>1$.
Geometric statistics with subspace structure preservation for SPD matrices
Pub Date : 2024-07-02 DOI: arxiv-2407.03382
Cyrus Mostajeran, Nathaël Da Costa, Graham Van Goffrier, Rodolphe Sepulchre
We present a geometric framework for the processing of SPD-valued data that preserves subspace structures and is based on the efficient computation of extreme generalized eigenvalues. This is achieved through the use of the Thompson geometry of the semidefinite cone. We explore a particular geodesic space structure in detail and establish several properties associated with it. Finally, we review a novel inductive mean of SPD matrices based on this geometry.
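To make the role of extreme generalized eigenvalues concrete, the sketch below evaluates the standard Thompson distance between two SPD matrices, $d_T(A, B) = \max\{|\log \lambda_{\max}|, |\log \lambda_{\min}|\}$, where $\lambda_{\max}$ and $\lambda_{\min}$ are the largest and smallest generalized eigenvalues of the pencil $(B, A)$. It is a minimal illustration restricted to 2x2 matrices so that the eigenvalues have a closed form; the function names are hypothetical, and the sketch is not taken from the paper.

```python
import math


def extreme_generalized_eigs_2x2(A, B):
    """Smallest and largest lambda solving det(B - lambda * A) = 0 for 2x2 SPD A, B."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    # Characteristic polynomial: det_a * lam^2 - t * lam + det_b = 0
    t = a11 * b22 + a22 * b11 - 2.0 * a12 * b12
    disc = max(t * t - 4.0 * det_a * det_b, 0.0)  # clamp rounding noise
    lam_max = (t + math.sqrt(disc)) / (2.0 * det_a)
    lam_min = det_b / (det_a * lam_max)  # product of the roots is det_b / det_a
    return lam_min, lam_max


def thompson_distance_2x2(A, B):
    """Thompson distance: the larger of |log lam_min| and |log lam_max|."""
    lam_min, lam_max = extreme_generalized_eigs_2x2(A, B)
    return max(abs(math.log(lam_min)), abs(math.log(lam_max)))


identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[2.0, 0.0], [0.0, 0.25]]
print(thompson_distance_2x2(identity, scaled))  # equals log(4), since 1/0.25 = 4
```

Only the two extreme eigenvalues enter the distance, which is why the full spectrum is never needed. For larger matrices, a symmetric-definite generalized eigensolver such as `scipy.linalg.eigh(B, A)` would supply the same quantities.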