
arXiv - STAT - Computation: Latest Papers

Armadillo and Eigen: A Tale of Two Linear Algebra Libraries
Pub Date : 2024-09-01 DOI: arxiv-2409.00568
Mauricio Vargas Sepulveda
This article introduces `cpp11eigen`, a new R package that integrates the powerful Eigen C++ library for linear algebra into the R programming environment. This article provides a detailed comparison between Armadillo and Eigen speed and syntax. The `cpp11eigen` package simplifies a part of the process of using C++ within R by offering additional ease of integration for those who require high-performance linear algebra operations in their R workflows. This work aims to discuss the tradeoff between computational efficiency and accessibility.
Citations: 0
Response probability distribution estimation of expensive computer simulators: A Bayesian active learning perspective using Gaussian process regression
Pub Date : 2024-08-31 DOI: arxiv-2409.00407
Chao Dang, Marcos A. Valdebenito, Nataly A. Manque, Jun Xu, Matthias G. R. Faes
Estimation of the response probability distributions of computer simulators in the presence of randomness is a crucial task in many fields. However, achieving this task with guaranteed accuracy remains an open computational challenge, especially for expensive-to-evaluate computer simulators. In this work, a Bayesian active learning perspective is presented to address the challenge, which is based on the use of the Gaussian process (GP) regression. First, estimation of the response probability distributions is conceptually interpreted as a Bayesian inference problem, as opposed to frequentist inference. This interpretation provides several important benefits: (1) it quantifies and propagates discretization error probabilistically; (2) it incorporates prior knowledge of the computer simulator, and (3) it enables the effective reduction of numerical uncertainty in the solution to a prescribed level. The conceptual Bayesian idea is then realized by using the GP regression, where we derive the posterior statistics of the response probability distributions in semi-analytical form and also provide a numerical solution scheme. Based on the practical Bayesian approach, a Bayesian active learning (BAL) method is further proposed for estimating the response probability distributions. In this context, the key contribution lies in the development of two crucial components for active learning, i.e., stopping criterion and learning function, by taking advantage of posterior statistics. It is empirically demonstrated by five numerical examples that the proposed BAL method can efficiently estimate the response probability distributions with desired accuracy.
Citations: 0
Stochastic Vector Approximate Message Passing with applications to phase retrieval
Pub Date : 2024-08-30 DOI: arxiv-2408.17102
Hajime Ueda, Shun Katakami, Masato Okada
Phase retrieval refers to the problem of recovering a high-dimensional vector $\boldsymbol{x} \in \mathbb{C}^N$ from the magnitude of its linear transform $\boldsymbol{z} = A \boldsymbol{x}$, observed through a noisy channel. To improve the ill-posed nature of the inverse problem, it is a common practice to observe the magnitude of linear measurements $\boldsymbol{z}^{(1)} = A^{(1)} \boldsymbol{x}, \ldots, \boldsymbol{z}^{(L)} = A^{(L)} \boldsymbol{x}$ using multiple sensing matrices $A^{(1)}, \ldots, A^{(L)}$, with ptychographic imaging being a remarkable example of such strategies. Inspired by existing algorithms for ptychographic reconstruction, we introduce stochasticity to Vector Approximate Message Passing (VAMP), a computationally efficient algorithm applicable to a wide range of Bayesian inverse problems. By testing our approach in the setup of phase retrieval, we show the superior convergence speed of the proposed algorithm.
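The measurement model in this abstract can be illustrated with a small sketch. It is not the authors' VAMP algorithm: it only builds noise-free magnitudes of $A^{(l)} \boldsymbol{x}$ for several sensing matrices and checks that a global phase rotation of $\boldsymbol{x}$ leaves them unchanged, which is one reason magnitude-only recovery is ill-posed. The helper names, the Gaussian sensing matrices, and the sizes `N`, `M`, `L` are assumptions made for illustration, and the noisy channel is omitted.

```python
import cmath
import random


def random_sensing_matrix(n_rows, n_cols, rng):
    """Hypothetical helper: an i.i.d. complex Gaussian sensing matrix."""
    return [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
             for _ in range(n_cols)] for _ in range(n_rows)]


def magnitude_measurements(sensing_matrices, x):
    """Return |A^(l) x| for every sensing matrix A^(l) (noise-free)."""
    return [[abs(sum(a * xi for a, xi in zip(row, x))) for row in matrix]
            for matrix in sensing_matrices]


rng = random.Random(0)
N, M, L = 4, 6, 3  # signal length, rows per matrix, number of matrices (assumed)
x = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(N)]
matrices = [random_sensing_matrix(M, N, rng) for _ in range(L)]

y = magnitude_measurements(matrices, x)

# A global phase rotation of x leaves every measured magnitude unchanged.
x_rotated = [cmath.exp(0.7j) * xi for xi in x]
y_rotated = magnitude_measurements(matrices, x_rotated)
max_gap = max(abs(u - v)
              for yu, yv in zip(y, y_rotated)
              for u, v in zip(yu, yv))
print(max_gap)  # differs from zero only by floating-point rounding
```

Using several matrices adds constraints on $\boldsymbol{x}$, but the global phase remains unrecoverable from magnitudes alone, so reconstructions are usually compared up to that phase.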
Citations: 0
Continuous Gaussian mixture solution for linear Bayesian inversion with application to Laplace priors
Pub Date : 2024-08-29 DOI: arxiv-2408.16594
Rafael Flock, Yiqiu Dong, Felipe Uribe, Olivier Zahm
We focus on Bayesian inverse problems with Gaussian likelihood, linear forward model, and priors that can be formulated as a Gaussian mixture. Such a mixture is expressed as an integral of Gaussian density functions weighted by a mixing density over the mixing variables. Within this framework, the corresponding posterior distribution also takes the form of a Gaussian mixture, and we derive the closed-form expression for its posterior mixing density. To sample from the posterior Gaussian mixture, we propose a two-step sampling method. First, we sample the mixture variables from the posterior mixing density, and then we sample the variables of interest from Gaussian densities conditioned on the sampled mixing variables. However, the posterior mixing density is relatively difficult to sample from, especially in high dimensions. Therefore, we propose to replace the posterior mixing density by a dimension-reduced approximation, and we provide a bound in the Hellinger distance for the resulting approximate posterior. We apply the proposed approach to a posterior with Laplace prior, where we introduce two dimension-reduced approximations for the posterior mixing density. Our numerical experiments indicate that samples generated via the proposed approximations have very low correlation and are close to the exact posterior.
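The two-step idea (sample the mixing variable, then sample a Gaussian conditioned on it) can be sketched on the prior alone. The example below relies on the standard fact that a Laplace(0, b) variable is a Gaussian scale mixture: if the variance V is exponential with mean 2b² and X given V is N(0, V), then X is Laplace with scale b. It does not implement the paper's posterior mixing density or its dimension-reduced approximations; the function names are hypothetical.

```python
import math
import random


def sample_laplace_via_mixture(scale, rng):
    """Draw one Laplace(0, scale) sample using its Gaussian scale-mixture form."""
    # Step 1: sample the mixing variable (a variance) from an exponential
    # distribution with mean 2 * scale**2 (expovariate takes the rate).
    variance = rng.expovariate(1.0 / (2.0 * scale ** 2))
    # Step 2: sample the variable of interest from the Gaussian
    # conditioned on the sampled mixing variable.
    return rng.gauss(0.0, math.sqrt(variance))


def draw_samples(n, scale=1.0, seed=0):
    """Draw n independent samples with a reproducible random stream."""
    rng = random.Random(seed)
    return [sample_laplace_via_mixture(scale, rng) for _ in range(n)]


samples = draw_samples(50_000, scale=1.5)
mean_abs = sum(abs(s) for s in samples) / len(samples)
second_moment = sum(s * s for s in samples) / len(samples)
# For Laplace(0, b): E|X| = b and E[X^2] = 2 * b**2.
print(mean_abs, second_moment)
```

In the posterior setting described in the abstract, step 1 is the hard part, because the mixing variables are no longer independent exponentials once the data are taken into account.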
Citations: 0
A review of sequential Monte Carlo methods for real-time disease modeling
Pub Date : 2024-08-28 DOI: arxiv-2408.15739
Dhorasso Temfack, Jason Wyse
Sequential Monte Carlo methods are a powerful framework for approximating the posterior distribution of a state variable in a sequential manner. They provide an attractive way of analyzing dynamic systems in real-time, taking into account the limitations of traditional approaches such as Markov Chain Monte Carlo methods, which are not well suited to data that arrives incrementally. This paper reviews and explores the application of Sequential Monte Carlo in dynamic disease modeling, highlighting its capacity for online inference and real-time adaptation to evolving disease dynamics. The integration of kernel density approximation techniques within the stochastic Susceptible-Exposed-Infectious-Recovered (SEIR) compartment model is examined, demonstrating the algorithm's effectiveness in monitoring time-varying parameters such as the effective reproduction number. Case studies, including simulations with synthetic data and analysis of real-world COVID-19 data from Ireland, demonstrate the practical applicability of this approach for informing timely public health interventions.
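A minimal sketch of sequential Monte Carlo is the bootstrap particle filter: propagate particles through the state model, weight them by the likelihood of each new observation, and resample. The example below uses a toy one-dimensional linear-Gaussian model, which is an assumption made for illustration and not the SEIR model or the kernel density techniques reviewed in the paper; all parameter names and default values are hypothetical.

```python
import math
import random


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              process_sd=0.5, obs_sd=1.0, seed=0):
    """Return the filtering mean E[x_t | y_1..y_t] after each observation.

    Toy model (assumed for illustration):
        x_t = a * x_{t-1} + N(0, process_sd^2)
        y_t = x_t + N(0, obs_sd^2)
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    filtering_means = []
    for y in observations:
        # Propagate every particle through the state transition.
        particles = [a * x + rng.gauss(0.0, process_sd) for x in particles]
        # Weight by the Gaussian observation likelihood, in log scale
        # with the maximum subtracted for numerical stability.
        log_w = [-0.5 * ((y - x) / obs_sd) ** 2 for x in particles]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average approximates the posterior mean at this step.
        filtering_means.append(sum(w * x for w, x in zip(weights, particles)))
        # Multinomial resampling to limit weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtering_means


means = bootstrap_particle_filter([5.0] * 50, n_particles=1000)
print(means[-1])
```

Each observation is processed once as it arrives, which is the property that makes this family of methods suited to data that arrives incrementally.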
Citations: 0
Sampling parameters of ordinary differential equations with Langevin dynamics that satisfy constraints
Pub Date : 2024-08-28 DOI: arxiv-2408.15505
Chris Chi, Jonathan Weare, Aaron R. Dinner
Fitting models to data to obtain distributions of consistent parameter values is important for uncertainty quantification, model comparison, and prediction. Standard Markov Chain Monte Carlo (MCMC) approaches for fitting ordinary differential equations (ODEs) to time-series data involve proposing trial parameter sets, numerically integrating the ODEs forward in time, and accepting or rejecting the trial parameter sets. When the model dynamics depend nonlinearly on the parameters, as is generally the case, trial parameter sets are often rejected, and MCMC approaches become prohibitively computationally costly to converge. Here, we build on methods for numerical continuation and trajectory optimization to introduce an approach in which we use Langevin dynamics in the joint space of variables and parameters to sample models that satisfy constraints on the dynamics. We demonstrate the method by sampling Hopf bifurcations and limit cycles of a model of a biochemical oscillator in a Bayesian framework for parameter estimation, and we obtain more than a hundredfold speedup relative to a leading ensemble MCMC approach that requires numerically integrating the ODEs forward in time. We describe numerical experiments that provide insight into the speedup. The method is general and can be used in any framework for parameter estimation and model selection.
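For readers unfamiliar with Langevin sampling, the sketch below shows the plain overdamped (unadjusted) Langevin update on a one-dimensional Gaussian target. It is only the basic building block: the joint variable-parameter space and the dynamical constraints described in the abstract are not modeled here, and the target, step size, and function names are assumptions chosen for illustration.

```python
import math
import random


def langevin_samples(grad_log_density, x0, step_size, n_steps, seed=0):
    """Unadjusted Langevin iteration:
        x <- x + step_size * grad log p(x) + sqrt(2 * step_size) * noise
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * step_size)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Drift toward high density plus Gaussian noise.
        x = x + step_size * grad_log_density(x) + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def grad_log_gaussian(x, mean=2.0, sd=1.0):
    """Gradient of the log density of N(mean, sd^2), the assumed toy target."""
    return -(x - mean) / sd ** 2


chain = langevin_samples(grad_log_gaussian, x0=0.0, step_size=0.05, n_steps=50_000)
chain_mean = sum(chain) / len(chain)
chain_var = sum((c - chain_mean) ** 2 for c in chain) / len(chain)
print(chain_mean, chain_var)
```

Because no accept/reject step is used, the chain carries a small step-size-dependent bias; for this Gaussian target the stationary variance is slightly above 1.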
Citations: 0
Investigating Complex HPV Dynamics Using Emulation and History Matching
Pub Date : 2024-08-28 DOI: arxiv-2408.15805
Andrew Iskauskas, Jamie A. Cohen, Danny Scarponi, Ian Vernon, Michael Goldstein, Daniel Klein, Richard G. White, Nicky McCreesh
The study of transmission and progression of human papillomavirus (HPV) is crucial for understanding the incidence of cervical cancers, and has been identified as a priority worldwide. The complexity of the disease necessitates a detailed model of HPV transmission and its progression to cancer; to infer properties of the above we require a careful process that can match to imperfect or incomplete observational data. In this paper, we describe the HPVsim simulator to satisfy the former requirement; to satisfy the latter we couple this stochastic simulator to a process of emulation and history matching using the R package hmer. With these tools, we are able to obtain a comprehensive collection of parameter combinations that could give rise to observed cancer data, and explore the implications of the variability of these parameter sets as it relates to future health interventions.
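History matching is commonly described through an implausibility measure: the distance between an observation and the emulator's prediction, scaled by the combined uncertainty, with parameter points ruled out when it exceeds a cutoff (3 is a conventional choice). The sketch below follows that general formulation from the history-matching literature. It is not the hmer API or the paper's workflow, and the toy emulator, function names, and values are assumptions.

```python
import math


def implausibility(observed, emulator_mean, emulator_var, obs_var,
                   discrepancy_var=0.0):
    """|observation - emulator mean| scaled by the total standard deviation."""
    total_var = emulator_var + obs_var + discrepancy_var
    return abs(observed - emulator_mean) / math.sqrt(total_var)


def non_implausible(points, observed, obs_var, emulate, cutoff=3.0):
    """Keep parameter points whose implausibility does not exceed the cutoff.

    `emulate` is a hypothetical callable returning (mean, variance)
    of the emulated simulator output at a parameter point.
    """
    kept = []
    for point in points:
        mean, var = emulate(point)
        if implausibility(observed, mean, var, obs_var) <= cutoff:
            kept.append(point)
    return kept


def toy_emulator(p):
    """Assumed stand-in emulator: output 2 * p with a small fixed variance."""
    return 2.0 * p, 0.01


candidates = [0.0, 1.0, 1.8, 2.0, 2.2, 3.0]
print(non_implausible(candidates, observed=4.0, obs_var=0.04, emulate=toy_emulator))
```

In practice the procedure is run in waves: the emulator is refitted on the surviving region and the cut is applied again, which progressively shrinks the set of acceptable parameter combinations.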
Citations: 0
A Model-Free Method to Quantify Memory Utilization in Neural Point Processes
Pub Date : 2024-08-28 DOI: arxiv-2408.15875
Gorana Mijatovic, Sebastiano Stramaglia, Luca Faes
Quantifying the predictive capacity of a neural system, intended as the capability to store information and actively use it for dynamic system evolution, is a key component of neural information processing. Information storage (IS), the main measure quantifying the active utilization of memory in a dynamic system, is only defined for discrete-time processes. While recent theoretical work laid the foundations for the continuous-time analysis of the predictive capacity stored in a process, methods for the effective computation of the related measures are needed to favor widespread utilization on neural data. This work introduces a method for the model-free estimation of the so-called memory utilization rate (MUR), the continuous-time counterpart of the IS, specifically designed to quantify the predictive capacity stored in neural point processes. The method employs nearest-neighbor entropy estimation applied to the inter-spike intervals measured from point-process realizations to quantify the extent of memory used by a spike train. An empirical procedure based on surrogate data is implemented to compensate for the estimation bias and detect statistically significant levels of memory. The method is validated in simulated Poisson processes and in realistic models of coupled cortical dynamics and heartbeat dynamics. It is then applied to real spike trains reflecting central and autonomic nervous system activities: in spontaneously growing cortical neuron cultures, the MUR detected increasing memory utilization across maturation stages, associated with emergent bursting synchronized activity; in the study of the neuro-autonomic modulation of human heartbeats, the MUR reflected the sympathetic activation occurring with postural but not with mental stress. The proposed approach offers a computationally reliable tool to analyze spike train data in computational neuroscience and physiology.
Citations: 0
NetSurvival.jl: A glimpse into relative survival analysis with Julia
Pub Date : 2024-08-28 DOI: arxiv-2408.15655
Rim Alhajal, Oskar Laverny
In many population-based medical studies, the specific cause of death is unidentified, unreliable or even unavailable. Relative survival analysis addresses this scenario, outside of standard (competing risks) survival analysis, to nevertheless estimate survival with respect to a specific cause. It separates the impact of the disease itself on mortality from other factors, such as age, sex, and general population trends. Different methods were created with the aim to construct consistent and efficient estimators for this purpose. The R package relsurv is the most commonly used today in application. With Julia continuously proving itself to be an efficient and powerful programming language, we felt the need to code a pure Julia take, thus NetSurvival.jl, of the standard routines and estimators in the field. The proposed implementation is clean, future-proof, well tested, and the package is correctly documented inside the rising JuliaSurv GitHub organization, ensuring trustability of the results. Through a comprehensive comparison in terms of performance and interface to relsurv, we highlight the benefits of the Julia developing environment.
An invitation to adaptive Markov chain Monte Carlo convergence theory
Pub Date : 2024-08-27 DOI: arxiv-2408.14903
Pietari Laitinen, Matti Vihola
Adaptive Markov chain Monte Carlo (MCMC) algorithms, which automatically tune their parameters based on past samples, have proved extremely useful in practice. The self-tuning mechanism makes them `non-Markovian', which means that their validity cannot be ensured by standard Markov chains theory. Several different techniques have been suggested to analyse their theoretical properties, many of which are technically involved. The technical nature of the theory may make the methods unnecessarily unappealing. We discuss one technique -- based on a martingale decomposition -- with uniformly ergodic Markov transitions. We provide an accessible and self-contained treatment in this setting, and give detailed proofs of the results discussed in the paper, which only require basic understanding of martingale theory and general state space Markov chain concepts. We illustrate how our conditions can accommodate different types of adaptation schemes, and can give useful insight to the requirements which ensure their validity.
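To make the self-tuning mechanism concrete, here is a minimal sketch of one common adaptation scheme: a random-walk Metropolis sampler whose proposal scale is adjusted after every step toward a target acceptance rate. It is a generic illustration, not the construction analysed in the paper; the function names, the 0.44 target rate, and the step-size exponent 0.6 are assumptions chosen for the example.

```python
import math
import random


def adaptive_rwm(log_target, x0, n_iter, target_accept=0.44, seed=0):
    """Random-walk Metropolis whose proposal scale adapts to past acceptances."""
    rng = random.Random(seed)
    x, log_scale = x0, 0.0
    samples, accepted = [], 0
    for n in range(1, n_iter + 1):
        # Propose a Gaussian move using the current (adapted) scale.
        proposal = x + math.exp(log_scale) * rng.gauss(0.0, 1.0)
        log_ratio = log_target(proposal) - log_target(x)
        accept_prob = math.exp(min(0.0, log_ratio))
        if rng.random() < accept_prob:
            x = proposal
            accepted += 1
        # Adaptation step: the scale depends on the whole history, so the
        # chain of states alone is no longer Markovian. The step size
        # n^(-0.6) shrinks over time (diminishing adaptation).
        log_scale += n ** -0.6 * (accept_prob - target_accept)
        samples.append(x)
    return samples, accepted / n_iter, math.exp(log_scale)


def log_std_normal(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples, accept_rate, scale = adaptive_rwm(log_std_normal, x0=0.0, n_iter=20000)
sample_mean = sum(samples) / len(samples)
```

Because the proposal scale is updated from past acceptance probabilities, each transition kernel depends on the history of the run, which is why convergence arguments of the kind the abstract describes are needed.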