Background: Statisticians evaluating the impact of policy interventions such as screening or vaccination need to make use of mathematical and computational models of disease progression and spread. Calibration is the process of identifying the parameters of these models, and a Bayesian framework provides a natural way to do this probabilistically. Markov chain Monte Carlo (MCMC) is one of a number of computational tools that are useful in carrying out this calibration. Objective: In the context of complex models in particular, a key problem that arises is non-identifiability. One approach to this problem is to ensure that appropriately informative priors are specified on the joint parameter space. We give examples of how non-identifiability arises and may be addressed in practice. Methods: Using a basic SIS model, the calibration process and the associated challenge of non-identifiability are discussed, and it is illustrated how this problem arises in the context of a larger model for HPV and cervical cancer. Results: The conditions that allow the problem of non-identifiability to be resolved are demonstrated for the SIS model. For the larger HPV model, the impact on the calibration process is also discussed.
"Incorporating additional evidence as prior information to resolve non-identifiability in Bayesian disease model calibration", Daria Semochkina and Cathal Walsh. arXiv:2407.13451, published 2024-07-18.
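A minimal sketch (hypothetical, not the authors' code) of the non-identifiability described above: with equilibrium-prevalence data alone, an SIS model's transmission rate beta and recovery rate gamma enter the likelihood only through their ratio, so the posterior is flat along a ridge; an informative prior on gamma, e.g. from duration-of-infection evidence, breaks the ridge. All numerical values are illustrative.

```python
import numpy as np

# SIS equilibrium prevalence: p* = 1 - gamma/beta (when R0 = beta/gamma > 1).
def equilibrium_prevalence(beta, gamma):
    return max(0.0, 1.0 - gamma / beta)

# Two different (beta, gamma) pairs with the same ratio give identical
# equilibrium prevalence, so prevalence data alone cannot identify both.
p1 = equilibrium_prevalence(0.4, 0.2)
p2 = equilibrium_prevalence(0.8, 0.4)
assert abs(p1 - p2) < 1e-12

# A Gaussian log-likelihood for an observed prevalence of 0.5 is therefore
# flat along the ridge beta = 2 * gamma; an informative prior on gamma
# (hypothetical values below) restores identifiability.
def log_post(beta, gamma, obs=0.5, sd=0.01,
             gamma_prior_mean=0.2, gamma_prior_sd=0.02):
    ll = -0.5 * ((equilibrium_prevalence(beta, gamma) - obs) / sd) ** 2
    lp = -0.5 * ((gamma - gamma_prior_mean) / gamma_prior_sd) ** 2
    return ll + lp

# With the prior, (0.4, 0.2) is preferred over (0.8, 0.4) despite equal likelihood.
assert log_post(0.4, 0.2) > log_post(0.8, 0.4)
```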
To study the convergence of SMACOF we introduce mSMACOF, a modification that rotates the configuration from each SMACOF iteration to principal components. mSMACOF has the same stress values as SMACOF in each iteration, but unlike SMACOF it produces a sequence of configurations that properly converges to a solution. We show that the modified algorithm can be implemented by iterating ordinary SMACOF to convergence and then rotating the SMACOF solution to principal components. The speed of linear convergence of SMACOF and mSMACOF is the same, and is equal to the largest eigenvalue of the derivative of the Guttman transform, ignoring the trivial unit eigenvalues that result from rotational indeterminacy.
"Convergence of SMACOF", Jan De Leeuw. arXiv:2407.12945, published 2024-07-17.
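A small numerical sketch of the construction described above, using the standard unit-weight Guttman transform; `pc_rotate` is our rendering of the principal-components rotation, and the check confirms that the rotation leaves the stress unchanged in every iteration, as the abstract states.

```python
import numpy as np

def distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def stress(X, delta):
    d = distances(X)
    iu = np.triu_indices(len(X), 1)
    return ((delta[iu] - d[iu]) ** 2).sum()

def guttman(X, delta):
    # One SMACOF iteration (unit weights): X <- (1/n) B(X) X.
    d = distances(X)
    with np.errstate(divide="ignore", invalid="ignore"):
        B = np.where(d > 0, -delta / d, 0.0)
    np.fill_diagonal(B, 0.0)
    np.fill_diagonal(B, -B.sum(axis=1))
    return B @ X / len(X)

def pc_rotate(X):
    # mSMACOF step: rotate the (centered) configuration to principal axes.
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T

rng = np.random.default_rng(0)
delta = distances(rng.normal(size=(6, 2)))  # dissimilarities from random points
X = rng.normal(size=(6, 2))
for _ in range(50):
    X = guttman(X, delta)
    Xr = pc_rotate(X)
    # Rotation (and centering) preserves interpoint distances, hence stress.
    assert abs(stress(X, delta) - stress(Xr, delta)) < 1e-8
    X = Xr
```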
Matthew J. Heiner, Samuel B. Johnson, Joshua R. Christensen, David B. Dahl
We propose and demonstrate an alternative, effective approach to simple slice sampling. Using the probability integral transform, we first generalize Neal's shrinkage algorithm, standardizing the procedure to an automatic and universal starting point: the unit interval. This enables the introduction of approximate (pseudo-) targets through importance reweighting, a technique that has popularized elliptical slice sampling. Reasonably accurate pseudo-targets can boost sampler efficiency by requiring fewer rejections and by reducing target skewness. This strategy is effective when a natural, possibly crude, approximation to the target exists. Alternatively, obtaining a marginal pseudo-target from initial samples provides an intuitive and automatic tuning procedure. We consider two metrics for evaluating the quality of approximation; each can be used as a criterion to find an optimal pseudo-target or as an interpretable diagnostic. We examine the performance of the proposed sampler relative to other popular, easily implemented MCMC samplers on standard targets in isolation, and as steps within a Gibbs sampler in a Bayesian modeling context. We extend the transformation method to multivariate slice samplers and demonstrate with a constrained state-space model for which a readily available forward-backward algorithm provides the target approximation.
"Quantile Slice Sampling", arXiv:2407.12608, published 2024-07-17.
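An illustrative sketch of the transform-then-slice idea (our reading of the abstract, not the authors' implementation): a moment-matched normal pseudo-target maps a skewed Gamma target to the unit interval via the probability integral transform, where Neal's shrinkage procedure runs with the automatic bracket (0, 1).

```python
import numpy as np
from scipy import stats

# Target: Gamma(3, 1). Pseudo-target: a moment-matched normal, N(3, 3).
target = stats.gamma(a=3.0)
pseudo = stats.norm(loc=3.0, scale=np.sqrt(3.0))

def h(u):
    # Transformed target on (0, 1): pi(F^{-1}(u)) / q(F^{-1}(u)),
    # with F and q the pseudo-target's CDF and density.
    x = pseudo.ppf(u)
    return target.pdf(x) / pseudo.pdf(x)

def slice_step(u, rng):
    # Neal's shrinkage procedure, with the unit interval as initial bracket.
    y = rng.uniform(0.0, h(u))
    lo, hi = 0.0, 1.0
    while True:
        u_new = rng.uniform(lo, hi)
        if h(u_new) > y:
            return u_new
        if u_new < u:
            lo = u_new
        else:
            hi = u_new

rng = np.random.default_rng(1)
u = 0.5
draws = []
for _ in range(3000):
    u = slice_step(u, rng)
    draws.append(pseudo.ppf(u))  # map back to the original scale
draws = np.array(draws)
assert abs(draws.mean() - 3.0) < 0.3  # Gamma(3, 1) has mean 3
```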
Hawkes stochastic point process models have emerged as valuable statistical tools for analyzing viral contagion. The spatiotemporal Hawkes process characterizes the speeds at which viruses spread within human populations. Unfortunately, likelihood-based inference using these models requires $O(N^2)$ floating-point operations, for $N$ the number of observed cases. Recent work responds to the Hawkes likelihood's computational burden by developing efficient graphics processing unit (GPU)-based routines that enable Bayesian analysis of tens of thousands of observations. We build on this work and develop a high-performance computing (HPC) strategy that divides 30 Markov chains across 4 GPU nodes, each of which uses multiple GPUs to accelerate its chain's likelihood computations. We use this framework to apply two spatiotemporal Hawkes models to the analysis of one million COVID-19 cases in the United States between March 2020 and June 2023. In addition to brute-force HPC, we advocate for two simple strategies as scalable alternatives to successful approaches proposed for small-data settings. First, we use known county-specific population densities to build a spatially varying triggering kernel in a manner that avoids a computationally costly nearest-neighbors search. Second, we use a cut-posterior inference routine that accounts for the spatial location uncertainty of infections by iteratively sampling latent locations uniformly within their respective counties of occurrence, thereby avoiding full-blown latent variable inference for 1,000,000 infection locations.
"Scaling Hawkes processes to one million COVID-19 cases", Seyoon Ko, Marc A. Suchard, and Andrew J. Holbrook. arXiv:2407.11349, published 2024-07-16.
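The $O(N^2)$ cost mentioned above arises because each event's intensity sums over all earlier events. A naive reference implementation for a purely temporal Hawkes process with exponential triggering kernel (illustrative only; the paper's models are spatiotemporal and GPU-accelerated):

```python
import numpy as np

def hawkes_loglik(times, mu, alpha, beta, T):
    """Naive O(N^2) log-likelihood of a temporal Hawkes process with
    background rate mu and triggering kernel alpha * beta * exp(-beta * dt)."""
    times = np.asarray(times, dtype=float)
    loglik = 0.0
    for i, t in enumerate(times):
        dt = t - times[:i]                     # sum over all earlier events
        lam = mu + alpha * beta * np.exp(-beta * dt).sum()
        loglik += np.log(lam)
    # Compensator: integral of the intensity over [0, T].
    loglik -= mu * T + alpha * (1.0 - np.exp(-beta * (T - times))).sum()
    return loglik

# Sanity check: alpha = 0 reduces to a homogeneous Poisson process.
times = [0.5, 1.2, 3.4]
mu, T = 2.0, 4.0
poisson_ll = len(times) * np.log(mu) - mu * T
assert abs(hawkes_loglik(times, mu, 0.0, 1.0, T) - poisson_ll) < 1e-12
```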
Due to the high dimensionality or multimodality that is common in modern astronomy, sampling Bayesian posteriors can be challenging. Several publicly available codes based on different sampling algorithms can solve these complex models, but their execution is not always efficient or fast enough. This article introduces a general-purpose C code, Nii-C (https://github.com/shengjin/nii-c.git), that implements a framework for Automatic Parallel Tempering Markov Chain Monte Carlo. Automatic in this context means that the parameters that ensure an efficient parallel tempering process can be set by a control system during the initial stages of a sampling process. The auto-tuned parameters consist of two parts: the temperature ladders of all parallel tempering Markov chains, and the proposal distributions for all model parameters across all parallel tempering chains. To reduce dependencies in the compilation process and increase execution speed, Nii-C is written entirely in C and parallelised using the Message-Passing Interface protocol to optimise the efficiency of parallel sampling. These implementations facilitate rapid convergence when sampling high-dimensional and multi-modal distributions, as well as fast code execution. Owing to its high sampling efficiency and quick execution, Nii-C can be used to trace complex distributions in various research areas. This article presents a few applications of the Nii-C code.
"Automatic Parallel Tempering Markov Chain Monte Carlo with Nii-C", Sheng Jin, Wenxin Jiang, and Dong-Hong Wu. arXiv:2407.09915, published 2024-07-13.
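A conceptual Python sketch of the parallel tempering scheme that Nii-C implements in C: several chains target tempered versions of a bimodal density and occasionally swap states between adjacent temperatures. Here the temperature ladder and proposal scales are fixed by hand, whereas Nii-C's control system auto-tunes them.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    # Bimodal target: equal mixture of N(-4, 1) and N(4, 1), up to a constant.
    return np.logaddexp(-0.5 * (x + 4) ** 2, -0.5 * (x - 4) ** 2)

temps = np.array([1.0, 2.0, 4.0, 8.0])   # temperature ladder (Nii-C tunes this)
steps = np.array([1.0, 1.5, 2.5, 4.0])   # per-chain proposal scales (also tuned)
x = np.zeros(len(temps)) - 4.0           # all chains start in the left mode

cold = []
for it in range(20000):
    for k in range(len(temps)):          # within-chain Metropolis update
        prop = x[k] + steps[k] * rng.normal()
        if np.log(rng.uniform()) < (log_target(prop) - log_target(x[k])) / temps[k]:
            x[k] = prop
    k = rng.integers(len(temps) - 1)     # propose swapping adjacent temperatures
    d = (1 / temps[k] - 1 / temps[k + 1]) * (log_target(x[k + 1]) - log_target(x[k]))
    if np.log(rng.uniform()) < d:
        x[k], x[k + 1] = x[k + 1], x[k]
    cold.append(x[0])

cold = np.array(cold[5000:])
# Swaps with hotter chains let the cold chain escape its starting mode.
assert (cold > 2).any() and (cold < -2).any()
```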
Gauss proposed the problem of enumerating the number of solutions for placing $N$ queens on an $N \times N$ chess board so that no two queens attack each other. The $N$-queens problem is a classic problem in combinatorics. We describe a variety of Monte Carlo (MC) methods for counting the number of solutions. In particular, we propose a quantile re-ordering based on the Lorenz curve of a sum that is related to counting the number of solutions. We show that this approach leads to an efficient polynomial-time solution. Other MC methods include vertical likelihood Monte Carlo, importance sampling, slice sampling, simulated annealing, energy-level sampling, and nested sampling. Sampling the binary matrices that identify the locations of the queens on the board can be done with a Swendsen-Wang style algorithm. Our Monte Carlo approach counts the number of solutions in polynomial time.
"Counting $N$ Queens", Nick Polson and Vadim Sokolov. arXiv:2407.08830, published 2024-07-11.
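One of the simplest Monte Carlo counters in this family is sequential importance sampling: place queens row by row, choose uniformly among the columns not attacked by earlier rows, and average the product of branching counts, which is an unbiased estimate of the solution count (a generic sketch, not the authors' quantile re-ordering method).

```python
import numpy as np

def sis_estimate(n, samples, rng):
    """Sequential importance sampling estimate of the n-queens count.
    Each sampled placement path has probability prod(1 / len(legal)),
    so the weight prod(len(legal)) is an unbiased estimator of the count."""
    total = 0.0
    for _ in range(samples):
        cols, diag1, diag2 = set(), set(), set()
        weight = 1.0
        for row in range(n):
            legal = [c for c in range(n)
                     if c not in cols and row - c not in diag1 and row + c not in diag2]
            if not legal:
                weight = 0.0   # dead end: contributes zero
                break
            weight *= len(legal)
            c = legal[rng.integers(len(legal))]
            cols.add(c)
            diag1.add(row - c)
            diag2.add(row + c)
        total += weight
    return total / samples

rng = np.random.default_rng(3)
est = sis_estimate(8, 20000, rng)
assert 50 < est < 150   # the true 8-queens count is 92
```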
Håvard Hegre (Peace Research Institute Oslo; Department of Peace and Conflict Research, Uppsala University), Paola Vesco (Peace Research Institute Oslo; Department of Peace and Conflict Research, Uppsala University), Michael Colaresi (Department of Peace and Conflict Research, Uppsala University; University of Pittsburgh), Jonas Vestby (Peace Research Institute Oslo), Alexa Timlick (Peace Research Institute Oslo), Noorain Syed Kazmi (Peace Research Institute Oslo), Friederike Becker (Institute of Statistics), Marco Binetti (Center for Crisis Early Warning, University of the Bundeswehr Munich), Tobias Bodentien (Institute of Statistics), Tobias Bohne (Center for Crisis Early Warning, University of the Bundeswehr Munich), Patrick T. Brandt (School of Economic, Political, and Policy Sciences, University of Texas, Dallas), Thomas Chadefaux (Trinity College Dublin), Simon Drauz (Institute of Statistics), Christoph Dworschak (University of York), Vito D'Orazio (West Virginia University), Cornelius Fritz (Pennsylvania State University), Hannah Frank (Trinity College Dublin), Kristian Skrede Gleditsch (University of Essex; Peace Research Institute Oslo), Sonja Häffner (Center for Crisis Early Warning, University of the Bundeswehr Munich), Martin Hofer (University College London), Finn L. Klebe (University College London), Luca Macis (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Alexandra Malaga (Institute for Economic Analysis, Barcelona), Marius Mehrl (University of Leeds), Nils W. Metternich (University College London), Daniel Mittermaier (Center for Crisis Early Warning, University of the Bundeswehr Munich), David Muchlinski (Georgia Tech), Hannes Mueller (Institute for Economic Analysis, Barcelona; Barcelona School of Economics), Christian Oswald (Center for Crisis Early Warning, University of the Bundeswehr Munich), Paola Pisano (Department of Economics and Statistics Cognetti de Martiis, University of Turin), David Randahl (Department of Peace and Conflict Research, Uppsala University), Christopher Rauh (University of Cambridge), Lotta Rüter (Institute of Statistics), Thomas Schincariol (Trinity College Dublin), Benjamin Seimon (Fundació Economia Analitica), Elena Siletti (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Marco Tagliapietra (Department of Economics and Statistics Cognetti de Martiis, University of Turin), Chandler Thornhill (Georgia Tech), Johan Vegelius (Department of Medical Sciences, Uppsala University), Julian Walterskirchen (Center for Crisis Early Warning, University of the Bundeswehr Munich)
This draft article outlines a prediction challenge where the target is to forecast the number of fatalities in armed conflicts, in the form of the UCDP `best' estimates, aggregated to the VIEWS units of analysis. It presents the format of the contributions, the evaluation metric, the procedures, and a brief summary of the contributions. The article serves a function analogous to a pre-analysis plan: a statement of the forecasting models made publicly available before the true future prediction window commences. More information on the challenge, and all data referred to in this document, can be found at https://viewsforecasting.org/research/prediction-challenge-2023.
"The 2023/24 VIEWS Prediction Challenge: Predicting the Number of Fatalities in Armed Conflict, with Uncertainty", arXiv:2407.11045, published 2024-07-08.
Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari
The generality and robustness of inference algorithms is critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, a fundamental problem arises: evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to best practices in using posteriordb for model evaluation and comparison. To provide a wide range of realistic target densities, posteriordb currently comprises 120 representative models and has been instrumental in developing several general inference algorithms.
"posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms", arXiv:2407.04967, published 2024-07-06.
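A hypothetical sketch of how reference draws enable sampler evaluation; the helper below is our own illustration, not posteriordb's API. A candidate sampler's posterior means are compared against stored reference draws, scaled by the combined Monte Carlo standard error.

```python
import numpy as np

def compare_to_reference(candidate, reference, tol=4.0):
    """Crude accuracy check in the spirit of reference-draw benchmarking:
    flag any parameter whose candidate posterior mean deviates from the
    reference mean by more than tol combined Monte Carlo standard errors.
    Returns a boolean array, one entry per parameter (True = consistent)."""
    candidate, reference = np.atleast_2d(candidate), np.atleast_2d(reference)
    se = np.sqrt(reference.var(0, ddof=1) / len(reference)
                 + candidate.var(0, ddof=1) / len(candidate))
    z = np.abs(candidate.mean(0) - reference.mean(0)) / se
    return z < tol

rng = np.random.default_rng(4)
reference = rng.normal(0.0, 1.0, size=(10000, 2))   # stand-in for stored draws
good = rng.normal(0.0, 1.0, size=(10000, 2))        # an accurate sampler
biased = good + np.array([0.5, 0.0])                # biased in one coordinate

assert compare_to_reference(good, reference).all()
assert not compare_to_reference(biased, reference)[0]
```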
We introduce a fast algorithm for Gaussian process regression in low dimensions, applicable to a widely used family of non-stationary kernels. The non-stationarity of these kernels is induced by arbitrary spatially varying vertical and horizontal scales. In particular, any stationary kernel can be accommodated as a special case, and we focus especially on the generalization of the standard Matérn kernel. Our subroutine for kernel matrix-vector multiplications scales almost optimally as $O(N \log N)$, where $N$ is the number of regression points. Like the recently developed equispaced Fourier Gaussian process (EFGP) methodology, which is applicable only to stationary kernels, our approach exploits non-uniform fast Fourier transforms (NUFFTs). We offer a complete analysis controlling the approximation error of our method, and we validate the method's practical performance with numerical experiments. In particular, we demonstrate improved scalability compared to state-of-the-art rank-structured approaches in spatial dimension $d>1$.
"Gaussian process regression with log-linear scaling for common non-stationary kernels", P. Michael Kielstra and Michael Lindsey. arXiv:2407.03608, published 2024-07-04.
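For context, the dense $O(N^3)$ baseline that such $O(N \log N)$ methods accelerate, here with a stationary Matérn-3/2 kernel, the special case the paper generalizes. The lengthscale and noise values are illustrative.

```python
import numpy as np

def matern32(x1, x2, ell=0.2, sigma=1.0):
    # Matérn-3/2 kernel with vertical scale sigma and horizontal scale ell;
    # the paper's method lets both vary spatially.
    r = np.abs(x1[:, None] - x2[None, :]) / ell
    return sigma ** 2 * (1 + np.sqrt(3) * r) * np.exp(-np.sqrt(3) * r)

rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 1, 50))
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=50)

# Dense posterior mean: O(N^2) storage and O(N^3) solve.
noise = 0.1 ** 2
K = matern32(x, x) + noise * np.eye(len(x))
xs = np.linspace(0, 1, 7)
mean = matern32(xs, x) @ np.linalg.solve(K, y)

# The posterior mean should track the underlying sine function.
assert np.max(np.abs(mean - np.sin(2 * np.pi * xs))) < 0.35
```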
Cyrus Mostajeran, Nathaël Da Costa, Graham Van Goffrier, Rodolphe Sepulchre
We present a geometric framework for the processing of SPD-valued data that preserves subspace structures and is based on the efficient computation of extreme generalized eigenvalues. This is achieved through the use of the Thompson geometry of the semidefinite cone. We explore a particular geodesic space structure in detail and establish several properties associated with it. Finally, we review a novel inductive mean of SPD matrices based on this geometry.
"Geometric statistics with subspace structure preservation for SPD matrices", arXiv:2407.03382, published 2024-07-02.
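A small sketch of the Thompson geometry underlying the framework above (our illustration, with hypothetical matrices): the Thompson metric on the SPD cone is the largest absolute log generalized eigenvalue of the pencil (A, B), and it is invariant under congruence transformations.

```python
import numpy as np
from scipy.linalg import eigvalsh

def thompson(A, B):
    """Thompson metric on the SPD cone:
    d(A, B) = max_i |log lambda_i|, where lambda_i are the generalized
    eigenvalues solving A v = lambda B v."""
    lam = eigvalsh(A, B)          # generalized symmetric eigenproblem
    return np.max(np.abs(np.log(lam)))

rng = np.random.default_rng(6)
M = rng.normal(size=(4, 4))
A = M @ M.T + np.eye(4)           # a generic SPD matrix
B = np.eye(4)

# Scaling property: d(cA, A) = |log c|.
assert abs(thompson(2.0 * A, A) - np.log(2.0)) < 1e-10

# Invariance under congruence transformations A -> G^T A G.
G = rng.normal(size=(4, 4))
d1 = thompson(A, B)
d2 = thompson(G.T @ A @ G, G.T @ B @ G)
assert abs(d1 - d2) < 1e-8
```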