
arXiv - STAT - Computation: Latest Publications

Statistical Finite Elements via Interacting Particle Langevin Dynamics
Pub Date: 2024-09-11 DOI: arxiv-2409.07101
Alex Glyn-Davies, Connor Duffin, Ieva Kazlauskaite, Mark Girolami, Ö. Deniz Akyildiz
In this paper, we develop a class of interacting particle Langevin algorithms to solve inverse problems for partial differential equations (PDEs). In particular, we leverage the statistical finite elements (statFEM) formulation to obtain a finite-dimensional latent variable statistical model, where the parameter is that of the (discretised) forward map and the latent variable is the statFEM solution of the PDE, which is assumed to be partially observed. We then adapt a recently proposed expectation-maximisation-like scheme, the interacting particle Langevin algorithm (IPLA), to this problem and obtain a joint estimation procedure for the parameters and the latent variables. We consider three main examples: (i) estimating the forcing for a linear Poisson PDE, (ii) estimating the forcing for a nonlinear Poisson PDE, and (iii) estimating the diffusivity for a linear Poisson PDE. We provide computational complexity estimates for forcing estimation in the linear case. We also provide comprehensive numerical experiments and preconditioning strategies that significantly improve performance, showing that the proposed class of methods can be a strong choice for parameter inference in PDE models.
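The driver of IPLA is a pair of coupled Langevin updates: each latent-variable particle takes a Langevin step, while the parameter moves along the gradient averaged over the particle cloud, with its noise damped by the number of particles. Below is a minimal sketch of that structure on a toy Gaussian latent-variable model; the model, step size, and particle count are illustrative assumptions, not the paper's statFEM setup.

```python
# Toy model: latent x ~ N(theta, 1), observation y ~ N(x, sigma^2).
# We run an IPLA-style joint update for theta and N latent particles.
import numpy as np

rng = np.random.default_rng(0)
theta_true, sigma = 2.0, 0.5
y = rng.normal(rng.normal(theta_true, 1.0), sigma)   # one observation

N, gamma, n_iter = 100, 1e-2, 5000                   # particles, step size, iterations
theta = 0.0
X = rng.normal(size=N)                               # one latent particle per chain

for _ in range(n_iter):
    # Gradients of log p(theta, x, y) = log N(x; theta, 1) + log N(y; x, sigma^2)
    grad_x = (theta - X) + (y - X) / sigma**2
    grad_theta = np.mean(X - theta)
    # Langevin steps; the theta-noise is scaled by 1/sqrt(N) so the
    # parameter chain concentrates as the particle cloud grows
    X += gamma * grad_x + np.sqrt(2 * gamma) * rng.normal(size=N)
    theta += gamma * grad_theta + np.sqrt(2 * gamma / N) * rng.normal()

print(f"estimated theta ~ {theta:.3f}")              # should be near y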
Citations: 0
Graph sub-sampling for divide-and-conquer algorithms in large networks
Pub Date: 2024-09-11 DOI: arxiv-2409.06994
Eric Yanchenko
As networks continue to increase in size, current methods must be capable of handling large numbers of nodes and edges in order to be practically relevant. Instead of working directly with the entire (large) network, analyzing sub-networks has become a popular approach. Due to a network's inherent inter-connectedness, sub-sampling is not a trivial task. While this problem has gained attention in recent years, it has not received sufficient attention from the statistics community. In this work, we provide a thorough comparison of seven graph sub-sampling algorithms by applying them to divide-and-conquer algorithms for community structure and core-periphery (CP) structure. After discussing the various algorithms and sub-sampling routines, we derive theoretical results for the mis-classification rate of the divide-and-conquer algorithm for CP structure under various sub-sampling schemes. We then perform extensive experiments on both simulated and real-world data to compare the various methods. For the community detection task, we found that sampling nodes uniformly at random yields the best performance. For CP structure, on the other hand, there was no single winner, but algorithms which sampled core nodes at a higher rate consistently outperformed other sampling routines, e.g., random edge sampling and random walk sampling. The varying performance of the sampling algorithms on different tasks demonstrates the importance of carefully selecting a sub-sampling routine for the specific application.
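To make the winning strategy for community detection concrete, here is a hedged sketch of uniform node sampling followed by community detection on the induced subgraph; it assumes networkx (2.8 or later for the Louvain routine) and omits the divide-and-conquer aggregation step the paper studies.

```python
# Uniform node sub-sampling: draw nodes at random, keep the induced
# sub-network, and run community detection on it.
import random
import networkx as nx

random.seed(1)
G = nx.planted_partition_graph(l=4, k=250, p_in=0.05, p_out=0.005, seed=1)

frac = 0.2                                   # sampling fraction (illustrative)
nodes = random.sample(list(G.nodes), int(frac * G.number_of_nodes()))
H = G.subgraph(nodes).copy()                 # induced sub-network

communities = nx.community.louvain_communities(H, seed=1)
print(f"{H.number_of_nodes()} nodes sampled, {len(communities)} communities found")
```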
Citations: 0
Optimizing VarLiNGAM for Scalable and Efficient Time Series Causal Discovery
Pub Date: 2024-09-09 DOI: arxiv-2409.05500
Ziyang Jiao, Ce Guo, Wayne Luk
Causal discovery is designed to identify causal relationships in data, a task that has become increasingly complex due to the computational demands of traditional methods such as VarLiNGAM, which combines a Vector Autoregressive Model with a Linear Non-Gaussian Acyclic Model for time series data. This study is dedicated to optimising causal discovery specifically for time series data, which are common in practical applications. Time series causal discovery is particularly challenging due to the need to account for temporal dependencies and potential time lag effects. By designing a specialised dataset generator and reducing the computational complexity of the VarLiNGAM model from $O(m^3 \cdot n)$ to $O(m^3 + m^2 \cdot n)$, this study significantly improves the feasibility of processing large datasets. The proposed methods have been validated on advanced computational platforms and tested across simulated, real-world, and large-scale datasets, showcasing enhanced efficiency and performance. The optimised algorithm achieved a 7- to 13-fold speedup compared with the original algorithm and around a 4.5-fold speedup compared with the GPU-accelerated version on large-scale datasets with feature sizes between 200 and 400. Our methods aim to push the boundaries of current causal discovery capabilities, making them more robust, scalable, and applicable to real-world scenarios, thus facilitating breakthroughs in various fields such as healthcare and finance.
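One plausible way such a complexity drop can arise, sketched below: accumulate the Gram and cross-moment matrices of the series once in $O(m^2 n)$, then solve the VAR normal equations in $O(m^3)$, rather than paying a factor of $n$ on the cubic term. This is an illustrative reading of the speedup, not the authors' implementation.

```python
# VAR(1) coefficient estimation via precomputed sufficient statistics:
# forming G and C costs O(m^2 n); the single solve costs O(m^3).
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 10_000                    # features, time points (illustrative)
X = rng.normal(size=(n, m))          # stand-in multivariate time series

past, future = X[:-1], X[1:]
G = past.T @ past                    # O(m^2 n) Gram matrix, built once
C = past.T @ future                  # O(m^2 n) cross-moment matrix
B = np.linalg.solve(G, C)            # O(m^3) solve for VAR coefficients

print("VAR coefficient matrix:", B.shape)
```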
Citations: 0
Best Linear Unbiased Estimate from Privatized Histograms
Pub Date: 2024-09-06 DOI: arxiv-2409.04387
Jordan Awan, Adam Edwards, Paul Bartholomew, Andrew Sillers
In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs, in the sense that a quantity can be estimated by combining different combinations of privatized values. Indeed, this structure is present in the DP 2020 Decennial Census products published by the U.S. Census Bureau. With this structure, the DP output can be improved by enforcing self-consistency (i.e., estimators obtained by combining different values result in the same estimate), and we show that the minimum variance processing is a linear projection. However, standard projection algorithms are too computationally expensive in terms of both memory and execution time for applications such as the Decennial Census. We propose the Scalable Efficient Algorithm for Best Linear Unbiased Estimate (SEA BLUE), based on a two-step process of aggregation and differencing that 1) enforces self-consistency through a linear and unbiased procedure, 2) is computationally and memory efficient, 3) achieves the minimum variance solution under certain structural assumptions, and 4) is empirically shown to be robust to violations of these structural assumptions. We propose three methods of calculating confidence intervals from our estimates, under various assumptions. We apply SEA BLUE to two 2010 Census demonstration products, illustrating its scalability and validity.
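A toy version of the projection idea: given two noisy category counts and a separately noised total, project the release vector onto the subspace where the total equals the sum. Equal noise variances are assumed for simplicity, and SEA BLUE's scalable aggregation-and-differencing machinery is not reproduced; this only illustrates the linear-projection view of self-consistency.

```python
# Redundant DP releases: counts a, b and a separately noised total t.
# Self-consistency (t = a + b) is enforced by projecting onto the
# null space of the constraint matrix A.
import numpy as np

rng = np.random.default_rng(0)
true = np.array([40.0, 60.0, 100.0])          # a, b, and total t = a + b
z = true + rng.laplace(scale=2.0, size=3)     # privatized (noisy) releases

A = np.array([[1.0, 1.0, -1.0]])              # constraint: a + b - t = 0
P = np.eye(3) - A.T @ np.linalg.inv(A @ A.T) @ A
z_hat = P @ z                                  # self-consistent projection

print("raw releases:     ", np.round(z, 2))
print("projected release:", np.round(z_hat, 2), "(now satisfies t = a + b)")
```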
Citations: 0
A Bayesian Optimization through Sequential Monte Carlo and Statistical Physics-Inspired Techniques
Pub Date: 2024-09-04 DOI: arxiv-2409.03094
Anton Lebedev, Thomas Warford, M. Emre Şahin
In this paper, we propose an approach to Bayesian optimization using Sequential Monte Carlo (SMC) and concepts from the statistical physics of classical systems. Our method leverages the power of modern machine learning libraries such as NumPyro and JAX, allowing us to perform Bayesian optimization on multiple platforms, including CPUs, GPUs, and TPUs, and in parallel. Our approach keeps the barrier to exploring the methods low while maintaining high performance. We present a promising direction for developing more efficient and effective techniques for a wide range of optimization problems in diverse fields.
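A minimal sketch of the statistical-physics ingredient: SMC with an inverse-temperature annealing schedule targeting the Boltzmann density exp(-beta f(x)), so the particle cloud concentrates near minima of the objective. Plain NumPy stands in here for the NumPyro/JAX stack the paper uses; the objective, schedule, and move kernel are all illustrative assumptions.

```python
# Annealed SMC on exp(-beta * f(x)): reweight, resample, then rejuvenate
# with a random-walk Metropolis move at the current temperature.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return (x**2 - 1.0)**2 + 0.3 * x           # toy objective, two wells

N = 2000
x = rng.uniform(-3, 3, size=N)                 # initial particle cloud
betas = np.linspace(0.0, 20.0, 40)             # annealing schedule

for b0, b1 in zip(betas[:-1], betas[1:]):
    w = np.exp(-(b1 - b0) * f(x))              # incremental importance weights
    w /= w.sum()
    x = x[rng.choice(N, size=N, p=w)]          # multinomial resampling
    prop = x + 0.2 * rng.normal(size=N)        # Metropolis rejuvenation step
    accept = np.log(rng.uniform(size=N)) < -b1 * (f(prop) - f(x))
    x = np.where(accept, prop, x)

print(f"best particle: x ~ {x[np.argmin(f(x))]:.3f}, f = {f(x).min():.3f}")
```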
Citations: 0
Conditional logistic individual-level models of spatial infectious disease dynamics
Pub Date: 2024-09-04 DOI: arxiv-2409.02353
Tahmina Akter, Rob Deardon
Here, we introduce a novel framework for modelling the spatiotemporal dynamics of disease spread known as conditional logistic individual-level models (CL-ILMs). This framework alleviates much of the computational burden associated with traditional spatiotemporal individual-level models for epidemics, and facilitates the use of standard software for fitting logistic models when analysing spatiotemporal disease patterns. The models can be fitted in either a frequentist or Bayesian framework. Here, we apply the new spatial CL-ILM to both simulated and semi-real data from the UK 2001 foot-and-mouth disease epidemic.
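The practical payoff claimed above is that, conditionally, infection events can be fit with off-the-shelf logistic-regression software. A hedged sketch with statsmodels on a simulated epidemic snapshot follows; the distance-kernel exposure and all parameters are illustrative assumptions, not the paper's specification or its foot-and-mouth analysis.

```python
# Individual-level infection outcomes regressed on a spatial exposure
# covariate, fitted as an ordinary logistic GLM.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
xy = rng.uniform(0, 10, size=(n, 2))                 # farm locations
infected = rng.uniform(size=n) < 0.1                 # initially infected set

d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
exposure = (np.where(infected, 1.0, 0.0) / (1.0 + d**2)).sum(axis=1)

eta = -3.0 + 1.5 * exposure                          # true parameters (illustrative)
p = 1.0 / (1.0 + np.exp(-eta))
new_case = (rng.uniform(size=n) < p) & ~infected     # infections among susceptibles

X = sm.add_constant(exposure[~infected])
fit = sm.GLM(new_case[~infected].astype(float), X,
             family=sm.families.Binomial()).fit()
print(fit.params)                                    # intercept and exposure effect
```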
Citations: 0
Guidance for twisted particle filter: a continuous-time perspective
Pub Date: 2024-09-04 DOI: arxiv-2409.02399
Jianfeng Lu, Yuliang Wang
The particle filter (PF), also known as sequential Monte Carlo (SMC), is designed to approximate high-dimensional probability distributions and their normalizing constants in the discrete-time setting. To reduce the variance of the Monte Carlo approximation, several twisted particle filters (TPF) have been proposed by researchers, where one chooses or learns a twisting function that modifies the Markov transition kernel. In this paper, we study the TPF from a continuous-time perspective. Under suitable settings, we show that the discrete-time model converges to a continuous-time limit, which can be solved through a series of well-studied control-based importance sampling algorithms. This discrete-continuous connection allows the design of new TPF algorithms inspired by established continuous-time algorithms. As a concrete example, guided by existing importance sampling algorithms in the continuous-time setting, we propose a novel algorithm called the "Twisted-Path Particle Filter" (TPPF), where the twist function, parameterized by neural networks, minimizes a specific KL-divergence between path measures. Some numerical experiments are given to illustrate the capability of the proposed algorithm.
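For orientation, here is a plain bootstrap particle filter on a toy nonlinear state-space model, with comments marking where a twisting function psi would enter: the twisted filter samples from a kernel proportional to K(x'|x) psi(x') and corrects the weights by g(y|x') (K psi)(x) / psi(x'). With psi constant it reduces to the bootstrap filter shown; the model and parameters are illustrative assumptions.

```python
# Bootstrap particle filter for x_t = 0.9 x_{t-1} + 0.5 sin(x_{t-1}) + noise,
# y_t = x_t + noise. Comments flag the twisting hooks.
import numpy as np

rng = np.random.default_rng(0)
T, N, sig_x, sig_y = 50, 500, 0.5, 0.3

x_true = np.zeros(T)                       # simulate a latent path and data
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t-1] + 0.5 * np.sin(x_true[t-1]) + sig_x * rng.normal()
y = x_true + sig_y * rng.normal(size=T)

x = rng.normal(size=N)
logZ = 0.0
for t in range(T):
    if t > 0:                              # transition kernel K; a twisted
        x = 0.9 * x + 0.5 * np.sin(x) + sig_x * rng.normal(size=N)  # filter samples K_psi here
    logw = -0.5 * ((y[t] - x) / sig_y)**2  # likelihood g; twisted weights also
    w = np.exp(logw - logw.max())          # carry the (K psi)/psi correction
    logZ += logw.max() + np.log(w.mean())  # log-likelihood up to an additive
    x = x[rng.choice(N, size=N, p=w / w.sum())]  # constant; then resample

print(f"log-likelihood estimate (up to a constant): {logZ:.2f}")
```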
Citations: 0
Parameter estimation of hidden Markov models: comparison of EM and quasi-Newton methods with a new hybrid algorithm
Pub Date: 2024-09-04 DOI: arxiv-2409.02477
Sidonie Foulon (CESP, NeuroDiderot), Thérèse Truong (CESP), Anne-Louise Leutenegger (NeuroDiderot), Hervé Perdry (CESP)
Hidden Markov Models (HMMs) model a sequence of observations that depend on a hidden (or latent) state that follows a Markov chain. These models are widely used in diverse fields including ecology, speech recognition, and genetics. Parameter estimation in HMMs is typically performed using the Baum-Welch algorithm, a special case of the Expectation-Maximisation (EM) algorithm. While this method guarantees convergence to a local maximum, its convergence rate is usually slow. Alternative methods, such as direct maximisation of the likelihood using quasi-Newton methods (such as L-BFGS-B), can offer faster convergence but can be more complicated to implement due to the challenge of dealing with bounds on the parameter space. We propose a novel hybrid algorithm, QNEM, that combines the Baum-Welch and quasi-Newton algorithms. QNEM aims to leverage the strengths of both algorithms by switching from one method to the other based on the convexity of the likelihood function. We conducted a comparative analysis between QNEM, the Baum-Welch algorithm, an EM acceleration algorithm called SQUAREM (Varadhan, 2008, Scand J Statist), and the L-BFGS-B quasi-Newton method by applying these algorithms to four examples built on different models. We estimated the parameters of each model using the different algorithms and evaluated their performances. Our results show that the best-performing algorithm depends on the model considered. QNEM performs well overall, always being faster than or equivalent to L-BFGS-B. The Baum-Welch and SQUAREM algorithms are faster than the quasi-Newton and QNEM algorithms in certain scenarios with multiple optima. In conclusion, QNEM offers a promising alternative to existing algorithms.
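A hedged sketch of the hybrid idea on a deliberately simpler latent-variable model (a two-component Gaussian mixture rather than an HMM): run EM while it makes fast progress, then hand the incumbent estimate to L-BFGS-B. The relative-improvement switching rule below is an illustrative stand-in for QNEM's convexity-based criterion.

```python
# EM phase followed by a quasi-Newton (L-BFGS-B) phase on the same
# log-likelihood; unknown means, known unit variances and equal weights.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

def loglik(mu):
    return np.log(0.5 * norm.pdf(data[:, None], mu, 1.0).sum(axis=1)).sum()

mu, prev = np.array([-1.0, 1.0]), -np.inf
for it in range(100):                                 # EM phase
    resp = norm.pdf(data[:, None], mu, 1.0)
    resp /= resp.sum(axis=1, keepdims=True)           # E-step responsibilities
    mu = (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)  # M-step
    cur = loglik(mu)
    if cur - prev < 1e-4 * abs(cur):                  # slow progress: switch solver
        break
    prev = cur

res = minimize(lambda m: -loglik(m), mu, method="L-BFGS-B")  # quasi-Newton phase
print(f"switched after {it + 1} EM iterations; means = {np.round(res.x, 3)}")
```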
Citations: 0
Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning
Pub Date: 2024-09-02 DOI: arxiv-2409.01410
Vyacheslav Kungurtsev, Yuanfang Peng, Jianyang Gu, Saeed Vahidian, Anthony Quinn, Fadwa Idlahcen, Yiran Chen
Dataset distillation (DD) is an increasingly important technique that focuses on constructing a synthetic dataset capable of capturing the core information in training data, so that models trained on the synthetic data achieve performance comparable to models trained on the original. While DD has a wide range of applications, the theory supporting it is less well developed. New methods of DD are compared on a common set of benchmarks, rather than oriented towards any particular learning task. In this work, we present a formal model of DD, arguing that a precise characterization of the underlying optimization problem must specify the inference task associated with the application of interest. Without this task-specific focus, the DD problem is under-specified, and the selection of a DD algorithm for a particular task is merely heuristic. Our formalization reveals novel applications of DD across different modeling environments. We analyze existing DD methods through this broader lens, highlighting their strengths and limitations in terms of accuracy and faithfulness to optimal DD operation. Finally, we present numerical results for two case studies important in contemporary settings. Firstly, we address a critical challenge in medical data analysis: merging the knowledge from different datasets composed of intersecting, but not identical, sets of features, in order to construct a larger dataset in what is usually a small-sample setting. Secondly, we consider out-of-distribution error across boundary conditions for physics-informed neural networks (PINNs), showing the potential for DD to provide more physically faithful data. By establishing this general formulation of DD, we aim to establish a new research paradigm by which DD can be understood and from which new DD techniques can arise.
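A toy illustration of the task-specific point: for least-squares regression, the inference task only sees the sufficient statistics (X^T X, X^T y), so a synthetic set of just m points that reproduces them trains an identical model. This is a worked example of the under-specification argument in the simplest possible setting, not the authors' method.

```python
# Distill 5000 regression examples down to m = 3 synthetic points that
# match the task's sufficient statistics exactly.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=5000)

G = X.T @ X
L = np.linalg.cholesky(G)                     # G = L L^T
X_syn = L.T                                   # (3, 3): X_syn^T X_syn = G
y_syn = np.linalg.solve(L, X.T @ y)           # makes X_syn^T y_syn = X^T y

w_full = np.linalg.lstsq(X, y, rcond=None)[0]
w_syn = np.linalg.lstsq(X_syn, y_syn, rcond=None)[0]
print(np.allclose(w_full, w_syn))             # True: same trained model
```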
Citations: 0
Plasmode simulation for the evaluation of causal inference methods in homophilous social networks
Pub Date: 2024-09-02 DOI: arxiv-2409.01316
Vanessa McNealis, Erica E. M. Moodie, Nema Dean
Typical simulation approaches for evaluating the performance of statistical methods on populations embedded in social networks may fail to capture important features of real-world networks. It can therefore be unclear whether inference methods for causal effects due to interference that have been shown to perform well in such synthetic networks are applicable to social networks which arise in the real world. Plasmode simulation studies use a real dataset created from natural processes, but with part of the data-generation mechanism known. However, given the sensitivity of relational data, many network data are protected from unauthorized access or disclosure. In such cases, plasmode simulations cannot use released versions of real datasets, which often omit the network links, and instead can only rely on parameters estimated from them. A statistical framework for creating replicated simulation datasets from private social network data is developed and validated. The approach consists of simulating from a parametric exponential family random graph model fitted to the network data and resampling from the observed exposure and covariate distributions to preserve the associations among these variables.
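A hedged sketch of the replication pipeline using the simplest exponential-family graph model (independent dyads with a homophily term) in place of a full ERGM: fit dyad probabilities by logistic regression on a shared attribute, simulate a replicate network from the fit, and resample node covariates from their observed distribution. It assumes statsmodels; real ERGM fitting and simulation (e.g., R's ergm package) are not shown.

```python
# Fit an independent-dyad homophily model, then generate a replicated
# network plus resampled covariates.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 120
attr = rng.integers(0, 2, size=n)                    # observed binary attribute
same = (attr[:, None] == attr[None, :]).astype(float)
p_true = np.where(same == 1, 0.10, 0.02)             # homophilous edge probs
A = rng.uniform(size=(n, n)) < p_true
A = np.triu(A, 1)
A = A + A.T                                          # undirected, no self-loops

iu = np.triu_indices(n, 1)
Xd = sm.add_constant(same[iu])                       # dyad-level design matrix
fit = sm.GLM(A[iu].astype(float), Xd, family=sm.families.Binomial()).fit()

p_hat = fit.predict(Xd)                              # fitted dyad probabilities
A_rep = np.zeros((n, n))
A_rep[iu] = rng.uniform(size=iu[0].size) < p_hat     # simulated replicate
attr_rep = rng.choice(attr, size=n, replace=True)    # resampled covariates
print(f"observed edges: {A[iu].sum()}, replicate edges: {int(A_rep.sum())}")
```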
Citations: 0