Optimised Annealed Sequential Monte Carlo Samplers
Saifuddin Syed, Alexandre Bouchard-Côté, Kevin Chern, Arnaud Doucet
Annealed Sequential Monte Carlo (SMC) samplers are special cases of SMC samplers where the sequence of distributions can be embedded in a smooth path of distributions. Using this underlying path of distributions and a performance model based on the variance of the normalisation constant estimator, we systematically study dense schedule and large particle limits. From our theory and adaptive methods emerges a notion of global barrier capturing the inherent complexity of normalisation constant approximation under our performance model. We then turn the resulting approximations into surrogate objective functions of algorithm performance, and use them for methodology development. We obtain novel adaptive methodologies, Sequential SMC (SSMC) and Sequential AIS (SAIS) samplers, which address practical difficulties inherent in previous adaptive SMC methods. First, our SSMC algorithms are predictable: they produce a sequence of increasingly precise estimates at deterministic and known times. Second, SAIS, a special case of SSMC, enables schedule adaptation at a memory cost constant in the number of particles and requires much less communication. Finally, these characteristics make SAIS highly efficient on GPUs. We develop an open-source, high-performance GPU implementation based on our methodology and demonstrate up to a hundred-fold speed improvement compared to state-of-the-art adaptive AIS methods.
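To make the annealing idea concrete, here is a minimal sketch of a non-adaptive annealed importance sampling (AIS) estimator of a normalising constant on a toy one-dimensional problem. The fixed linear schedule, random-walk kernel, and Gaussian target are illustrative choices of mine, not the paper's adaptive SAIS/SSMC methodology.

```python
import numpy as np

rng = np.random.default_rng(0)

# Geometric path between a normalised N(0,1) reference and an unnormalised target.
def log_gamma0(x):            # reference density, Z0 = 1
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_gamma1(x):            # unnormalised target: 10 * N(3, 0.5^2) kernel
    return np.log(10.0) - 0.5 * ((x - 3.0) / 0.5) ** 2

def log_gamma(x, beta):       # annealed density along the path
    return (1 - beta) * log_gamma0(x) + beta * log_gamma1(x)

def ais_log_Z(n_particles=1000, betas=np.linspace(0.0, 1.0, 101), step=0.5):
    x = rng.standard_normal(n_particles)          # i.i.d. draws from gamma_0
    log_w = np.zeros(n_particles)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        log_w += log_gamma(x, b) - log_gamma(x, b_prev)   # incremental AIS weights
        # One random-walk Metropolis move per particle targeting gamma_b.
        prop = x + step * rng.standard_normal(n_particles)
        accept = np.log(rng.random(n_particles)) < log_gamma(prop, b) - log_gamma(x, b)
        x = np.where(accept, prop, x)
    m = log_w.max()                               # log-mean-exp of the weights
    return m + np.log(np.exp(log_w - m).mean())

# Estimates log(Z1/Z0); true value here is log(10 * 0.5 * sqrt(2*pi)) ~ 2.53.
print(ais_log_Z())
```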
{"title":"Optimised Annealed Sequential Monte Carlo Samplers","authors":"Saifuddin Syed, Alexandre Bouchard-Côté, Kevin Chern, Arnaud Doucet","doi":"arxiv-2408.12057","DOIUrl":"https://doi.org/arxiv-2408.12057","url":null,"abstract":"Annealed Sequential Monte Carlo (SMC) samplers are special cases of SMC\u0000samplers where the sequence of distributions can be embedded in a smooth path\u0000of distributions. Using this underlying path of distributions and a performance\u0000model based on the variance of the normalisation constant estimator, we\u0000systematically study dense schedule and large particle limits. From our theory\u0000and adaptive methods emerges a notion of global barrier capturing the inherent\u0000complexity of normalisation constant approximation under our performance model.\u0000We then turn the resulting approximations into surrogate objective functions of\u0000algorithm performance, and use them for methodology development. We obtain\u0000novel adaptive methodologies, Sequential SMC (SSMC) and Sequential AIS (SAIS)\u0000samplers, which address practical difficulties inherent in previous adaptive\u0000SMC methods. First, our SSMC algorithms are predictable: they produce a\u0000sequence of increasingly precise estimates at deterministic and known times.\u0000Second, SAIS, a special case of SSMC, enables schedule adaptation at a memory\u0000cost constant in the number of particles and require much less communication.\u0000Finally, these characteristics make SAIS highly efficient on GPUs. We develop\u0000an open-source, high-performance GPU implementation based on our methodology\u0000and demonstrate up to a hundred-fold speed improvement compared to\u0000state-of-the-art adaptive AIS methods.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"230 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive Stereographic MCMC
Cameron Bell, Krzysztof Łatuszyński, Gareth O. Roberts
In order to tackle the problem of sampling from heavy-tailed, high-dimensional distributions via Markov Chain Monte Carlo (MCMC) methods, Yang, Łatuszyński, and Roberts (2022) (arXiv:2205.12112) introduce the stereographic projection as a tool to compactify $\mathbb{R}^d$ and transform the problem into sampling from a density on the unit sphere $\mathbb{S}^d$. However, the improvement in algorithmic efficiency, as well as the computational cost of the implementation, are still significantly impacted by the parameters used in this transformation. To address this, we introduce adaptive versions of the Stereographic Random Walk (SRW), the Stereographic Slice Sampler (SSS), and the Stereographic Bouncy Particle Sampler (SBPS), which automatically update the parameters of the algorithms as the run progresses. The adaptive setup allows us to better exploit the power of the stereographic projection, even when the target distribution is neither centered nor homogeneous. We present a simulation study showing each algorithm's robustness to starting far from the mean in heavy-tailed, high-dimensional settings, as opposed to Hamiltonian Monte Carlo (HMC). We establish a novel framework for proving convergence of adaptive MCMC algorithms over collections of simultaneously uniformly ergodic Markov operators, including continuous-time processes. This framework allows us to prove LLNs and a CLT for our adaptive Stereographic algorithms.
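For context, the projection itself is simple to state: a point $x \in \mathbb{R}^d$ is mapped to the unit sphere $\mathbb{S}^d \subset \mathbb{R}^{d+1}$, with a scale parameter of the kind the adaptive algorithms tune. A minimal sketch using the radius-$R$ form commonly given; the function and variable names are mine.

```python
import numpy as np

def to_sphere(x, R=1.0):
    """Stereographic projection of x in R^d onto the unit sphere S^d in R^{d+1}."""
    s = np.sum(x**2)
    z = np.empty(x.shape[0] + 1)
    z[:-1] = 2 * R * x / (s + R**2)
    z[-1] = (s - R**2) / (s + R**2)
    return z

def from_sphere(z, R=1.0):
    """Inverse projection; the north pole z[-1] = 1 corresponds to infinity."""
    return R * z[:-1] / (1 - z[-1])

x = np.array([1.0, -2.0, 0.5])
z = to_sphere(x, R=2.0)
assert np.isclose(np.sum(z**2), 1.0)            # the image lies on the sphere
assert np.allclose(from_sphere(z, R=2.0), x)    # the map round-trips
```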
{"title":"Adaptive Stereographic MCMC","authors":"Cameron Bell, Krzystof Łatuszyński, Gareth O. Roberts","doi":"arxiv-2408.11780","DOIUrl":"https://doi.org/arxiv-2408.11780","url":null,"abstract":"In order to tackle the problem of sampling from heavy tailed, high\u0000dimensional distributions via Markov Chain Monte Carlo (MCMC) methods, Yang,\u0000Latuszy'nski, and Roberts (2022) (arXiv:2205.12112) introduces the\u0000stereographic projection as a tool to compactify $mathbb{R}^d$ and transform\u0000the problem into sampling from a density on the unit sphere $mathbb{S}^d$.\u0000However, the improvement in algorithmic efficiency, as well as the\u0000computational cost of the implementation, are still significantly impacted by\u0000the parameters used in this transformation. To address this, we introduce adaptive versions of the Stereographic Random\u0000Walk (SRW), the Stereographic Slice Sampler (SSS), and the Stereographic Bouncy\u0000Particle Sampler (SBPS), which automatically update the parameters of the\u0000algorithms as the run progresses. The adaptive setup allows us to better\u0000exploit the power of the stereographic projection, even when the target\u0000distribution is neither centered nor homogeneous. We present a simulation study\u0000showing each algorithm's robustness to starting far from the mean in heavy\u0000tailed, high dimensional settings, as opposed to Hamiltonian Monte Carlo (HMC).\u0000We establish a novel framework for proving convergence of adaptive MCMC\u0000algorithms over collections of simultaneously uniformly ergodic Markov\u0000operators, including continuous time processes. This framework allows us to\u0000prove LLNs and a CLT for our adaptive Stereographic algorithms.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Multiple Random Scan Strategy for Latent Space Models
Antonio Peruzzi, Roberto Casarin
Latent Space (LS) network models project the nodes of a network onto a $d$-dimensional latent space to achieve dimensionality reduction of the network while preserving its relevant features. Inference is often carried out within a Markov Chain Monte Carlo (MCMC) framework. Nonetheless, it is well known that the computational time for this class of models increases quadratically with the number of nodes. In this work, we build on the Random-Scan (RS) approach to propose an MCMC strategy that alleviates the computational burden for LS models while maintaining the benefits of a general-purpose technique. We call this novel strategy Multiple RS (MRS). This strategy reduces the computational cost by a multiplicative factor without severe consequences for the quality of the MCMC draws. Moreover, we introduce a novel adaptation strategy that consists of a probabilistic update of the set of latent coordinates of each node: our Adaptive MRS uses the acceptance rate of the Metropolis step to tune the probability of updating the latent coordinates. We show via simulation that the Adaptive MRS approach performs better than MRS in terms of mixing. Finally, we apply our algorithm to a multi-layer temporal LS model and show how our adaptive strategy may be beneficial in empirical applications.
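As a sketch of the basic mechanism (not the authors' adaptive scheme), a Multiple-RS sweep can be read as a Metropolis update applied to the latent coordinates of a random subset of nodes. The toy full conditional below ignores the network likelihood entirely and is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def multiple_random_scan_step(Z, log_post_node, m, step=0.1):
    """One Multiple-RS sweep (sketch): update the latent coordinates of m
    randomly chosen nodes with a random-walk Metropolis step each."""
    n, d = Z.shape
    for i in rng.choice(n, size=m, replace=False):
        prop = Z[i] + step * rng.standard_normal(d)
        if np.log(rng.random()) < log_post_node(i, prop, Z) - log_post_node(i, Z[i], Z):
            Z[i] = prop
    return Z

# Toy full conditional: a standard Gaussian prior per node; a real LS model
# would add edge-likelihood terms involving pairwise distances in Z.
log_post_node = lambda i, z, Z: -0.5 * np.sum(z**2)

Z = rng.standard_normal((50, 2))        # 50 nodes in a 2-d latent space
for _ in range(100):
    Z = multiple_random_scan_step(Z, log_post_node, m=5)
```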
{"title":"A Multiple Random Scan Strategy for Latent Space Models","authors":"Antonio Peruzzi, Roberto Casarin","doi":"arxiv-2408.11725","DOIUrl":"https://doi.org/arxiv-2408.11725","url":null,"abstract":"Latent Space (LS) network models project the nodes of a network on a\u0000$d$-dimensional latent space to achieve dimensionality reduction of the network\u0000while preserving its relevant features. Inference is often carried out within a\u0000Markov Chain Monte Carlo (MCMC) framework. Nonetheless, it is well-known that\u0000the computational time for this set of models increases quadratically with the\u0000number of nodes. In this work, we build on the Random-Scan (RS) approach to\u0000propose an MCMC strategy that alleviates the computational burden for LS models\u0000while maintaining the benefits of a general-purpose technique. We call this\u0000novel strategy Multiple RS (MRS). This strategy is effective in reducing the\u0000computational cost by a factor without severe consequences on the MCMC draws.\u0000Moreover, we introduce a novel adaptation strategy that consists of a\u0000probabilistic update of the set of latent coordinates of each node. Our\u0000Adaptive MRS adapts the acceptance rate of the Metropolis step to adjust the\u0000probability of updating the latent coordinates. We show via simulation that the\u0000Adaptive MRS approach performs better than MRS in terms of mixing. Finally, we\u0000apply our algorithm to a multi-layer temporal LS model and show how our\u0000adaptive strategy may be beneficial to empirical applications.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical ISAC: Fundamental Performance Limits and Transceiver Design
Alireza Ghazavi Khorasgani, Mahtab Mirmohseni, Ahmed Elzanaty
This paper characterizes the optimal capacity-distortion (C-D) tradeoff in an optical point-to-point (P2P) system with single-input single-output for communication and single-input multiple-output for sensing (SISO-SIMO-C/S) within an integrated sensing and communication (ISAC) framework. We introduce practical, asymptotically optimal maximum a posteriori (MAP) and maximum likelihood (ML) estimators for target distance, addressing nonlinear measurement-to-state relationships and non-conjugate priors. Our results show these estimators converge to the Bayesian Cramér-Rao bound (BCRB) as the number of sensing antennas increases. We also demonstrate that the achievable rate-CRB (AR-CRB) serves as an outer bound (OB) for the optimal C-D region. To optimize the input distribution across the Pareto boundary of the C-D region, we propose two algorithms: an iterative Blahut-Arimoto algorithm (BAA)-type method and a memory-efficient closed-form (CF) approach, including a CF optimal distribution for high optical signal-to-noise ratio (O-SNR) conditions. Additionally, we extend and modify the Deterministic-Random Tradeoff (DRT) to this optical ISAC context.
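For background on the BAA-type method mentioned above, here is the classical Blahut-Arimoto iteration for the capacity of a discrete memoryless channel. The paper's variant optimises over the C-D region under a distortion constraint, which this plain sketch omits.

```python
import numpy as np

def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Classical Blahut-Arimoto iteration for the capacity (in nats) of a
    discrete memoryless channel with transition matrix W[x, y] = P(y | x)."""
    nx = W.shape[0]
    p = np.full(nx, 1.0 / nx)                  # start from the uniform input
    for _ in range(max_iter):
        q = p[:, None] * W                     # joint P(x, y) ...
        q /= q.sum(axis=0, keepdims=True)      # ... normalised to P(x | y)
        # Update the input distribution: p(x) proportional to
        # exp(sum_y W(y|x) log q(x|y)).
        r = np.exp(np.sum(W * np.log(q + 1e-300), axis=1))
        r /= r.sum()
        if np.max(np.abs(r - p)) < tol:
            p = r
            break
        p = r
    # Mutual information at the fixed point.
    q = p[:, None] * W
    py = q.sum(axis=0)
    C = np.sum(q * (np.log(W + 1e-300) - np.log(py + 1e-300)))
    return C, p

# Binary symmetric channel with crossover 0.1: capacity = log 2 - H(0.1) nats.
W = np.array([[0.9, 0.1], [0.1, 0.9]])
C, p = blahut_arimoto(W)
```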
{"title":"Optical ISAC: Fundamental Performance Limits and Transceiver Design","authors":"Alireza Ghazavi Khorasgani, Mahtab Mirmohseni, Ahmed Elzanaty","doi":"arxiv-2408.11792","DOIUrl":"https://doi.org/arxiv-2408.11792","url":null,"abstract":"This paper characterizes the optimal capacity-distortion (C-D) tradeoff in an\u0000optical point-to-point (P2P) system with single-input single-output for\u0000communication and single-input multiple-output for sensing (SISO-SIMO-C/S)\u0000within an integrated sensing and communication (ISAC) framework. We introduce\u0000practical, asymptotically optimal maximum a posteriori (MAP) and maximum\u0000likelihood estimators (MLE) for target distance, addressing nonlinear\u0000measurement-to-state relationships and non-conjugate priors. Our results show\u0000these estimators converge to the Bayesian Cramer-Rao bound (BCRB) as sensing\u0000antennas increase. We also demonstrate that the achievable rate-CRB (AR-CRB)\u0000serves as an outer bound (OB) for the optimal C-D region. To optimize input\u0000distribution across the Pareto boundary of the C-D region, we propose two\u0000algorithms: an iterative Blahut-Arimoto algorithm (BAA)-type method and a\u0000memory-efficient closed-form (CF) approach, including a CF optimal distribution\u0000for high optical signal-to-noise ratio (O-SNR) conditions. Additionally, we\u0000extend and modify the Deterministic-Random Tradeoff (DRT) to this optical ISAC\u0000context.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"34 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
armadillo: An R Package to Use the Armadillo C++ Library
Mauricio Vargas Sepúlveda, Jonathan Schneider Malamud
This article introduces 'armadillo', a new R package that integrates the powerful Armadillo C++ linear algebra library into the R programming environment. Targeted primarily at social scientists and other non-programmers, this article explains the computational benefits of moving code to C++ in terms of speed and syntax. We provide a comprehensive overview of Armadillo's capabilities, highlighting its user-friendly syntax akin to MATLAB and its efficiency for computationally intensive tasks. The 'armadillo' package simplifies part of the process of using C++ within R, easing integration for those who require high-performance linear algebra operations in their R workflows. This work aims to bridge the gap between computational efficiency and accessibility, making advanced linear algebra operations more approachable for R users without extensive programming backgrounds.
{"title":"armadillo: An R Package to Use the Armadillo C++ Library","authors":"Mauricio Vargas Sepúlveda, Jonathan Schneider Malamud","doi":"arxiv-2408.11074","DOIUrl":"https://doi.org/arxiv-2408.11074","url":null,"abstract":"This article introduces 'armadillo', a new R package that integrates the\u0000powerful Armadillo C++ library for linear algebra into the R programming\u0000environment. Targeted primarily at social scientists and other non-programmers,\u0000this article explains the computational benefits of moving code to C++ in terms\u0000of speed and syntax. We provide a comprehensive overview of Armadillo's\u0000capabilities, highlighting its user-friendly syntax akin to MATLAB and its\u0000efficiency for computationally intensive tasks. The 'armadillo' package\u0000simplifies a part of the process of using C++ within R by offering additional\u0000ease of integration for those who require high-performance linear algebra\u0000operations in their R workflows. This work aims to bridge the gap between\u0000computational efficiency and accessibility, making advanced linear algebra\u0000operations more approachable for R users without extensive programming\u0000backgrounds.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Issues of parameterization and computation for posterior inference in partially identified models
Seren Lee, Paul Gustafson
A partially identified model, in which the parameters cannot be uniquely identified, often arises in statistical analysis. Researchers frequently use Bayesian inference to analyze such models, but when Bayesian inference with an off-the-shelf MCMC sampling algorithm is applied to a partially identified model, the computational performance can be poor. One remedy is importance sampling with a transparent reparameterization (TP). This method is preferable because the model is known to be identified with respect to the new parameterization, and at the same time it may allow faster, i.i.d. Monte Carlo sampling through conjugate convenience priors. In this paper, we explain the importance sampling method with the TP and with a pseudo-TP. We introduce the pseudo-TP, an alternative to the TP, since finding a TP is sometimes difficult. We then test the methods' performance in several scenarios and compare it to that of an off-the-shelf MCMC method, Gibbs sampling, applied in the original parameterization. While importance sampling with the TP (ISTP) generally shows better results than off-the-shelf MCMC methods, as seen in compute times and trace plots, finding the TP that the method requires may not be easy. On the other hand, the pseudo-TP method shows mixed results and room for improvement, since it relies on an approximation that may not be adequate for a given model and dataset.
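As a sketch of the underlying idea (generic self-normalised importance sampling, not the paper's specific TP construction): draw i.i.d. samples from a tractable proposal, such as a posterior under a convenient reparameterization, and reweight by the target-to-proposal density ratio. All names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def importance_estimate(h, log_target, log_proposal, sampler, n=10_000):
    """Self-normalised importance sampling: i.i.d. draws from a tractable
    proposal stand in for MCMC on an awkward parameterization."""
    x = sampler(n)
    log_w = log_target(x) - log_proposal(x)
    w = np.exp(log_w - log_w.max())       # stabilised, then self-normalised
    w /= w.sum()
    return np.sum(w * h(x))

# Toy check: target N(1, 2^2) known only up to a constant, proposal N(0, 3^2).
log_target = lambda x: -0.5 * ((x - 1.0) / 2.0) ** 2
log_prop = lambda x: -0.5 * (x / 3.0) ** 2 - np.log(3.0)
est = importance_estimate(lambda x: x, log_target, log_prop,
                          lambda n: 3.0 * rng.standard_normal(n))
# est ~= 1.0, the target posterior mean
```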
{"title":"Issues of parameterization and computation for posterior inference in partially identified models","authors":"Seren Lee, Paul Gustafson","doi":"arxiv-2408.10416","DOIUrl":"https://doi.org/arxiv-2408.10416","url":null,"abstract":"A partially identified model, where the parameters can not be uniquely\u0000identified, often arises during statistical analysis. While researchers\u0000frequently use Bayesian inference to analyze the models, when Bayesian\u0000inference with an off-the-shelf MCMC sampling algorithm is applied to a\u0000partially identified model, the computational performance can be poor. It is\u0000found that using importance sampling with transparent reparameterization (TP)\u0000is one remedy. This method is preferable since the model is known to be\u0000rendered as identified with respect to the new parameterization, and at the\u0000same time, it may allow faster, i.i.d. Monte Carlo sampling by using conjugate\u0000convenience priors. In this paper, we explain the importance sampling method\u0000with the TP and a pseudo-TP. We introduce the pseudo-TP, an alternative to TP,\u0000since finding a TP is sometimes difficult. Then, we test the methods'\u0000performance in some scenarios and compare it to the performance of the\u0000off-the-shelf MCMC method - Gibbs sampling - applied in the original\u0000parameterization. While the importance sampling with TP (ISTP) shows generally\u0000better results than off-the-shelf MCMC methods, as seen in the compute time and\u0000trace plots, it is also seen that finding a TP which is necessary for the\u0000method may not be easy. On the other hand, the pseudo-TP method shows a mixed\u0000result and room for improvement since it relies on an approximation, which may\u0000not be adequate for a given model and dataset.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
$statcheck$ is flawed by design and no valid spell checker for statistical results
Ingmar Böschen
The R package $statcheck$ is designed to extract statistical test results from text and check the consistency of the reported test statistics and corresponding p-values. Recently, it has also been featured as a spell checker for statistical results, aimed at improving reporting accuracy in scientific publications. In this study, I perform a check on $statcheck$ using a non-exhaustive list of 187 simple text strings with arbitrary statistical test results. These strings represent a wide range of textual representations of results, including correctly manageable results, non-targeted test statistics, variable reporting styles, and common typos. Since $statcheck$'s detection heuristic is tied to a specific set of statistical test results that strictly adhere to the American Psychological Association (APA) reporting guidelines, it is unable to detect and check any reported result that deviates even slightly from this narrow style. In practice, $statcheck$ is unlikely to detect many statistical test results reported in the literature. I conclude that the capabilities and usefulness of the $statcheck$ software are very limited and that it should be used neither to detect irregularities in results nor as a spell checker for statistical results. Future developments should aim to incorporate more flexible algorithms capable of handling a broader variety of reporting styles, such as those provided by $JATSdecoder$ and Large Language Models; these show promise in overcoming the limitations but cannot replace the critical eye of a knowledgeable reader.
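To illustrate the kind of consistency check $statcheck$ automates (this is not $statcheck$'s code, just a hypothetical Python analogue for a single t-test result): recompute the p-value implied by the reported statistic and degrees of freedom, then compare it with the reported p-value.

```python
from scipy import stats

def check_t_result(t, df, p_reported, two_sided=True, tol=1e-2):
    """Recompute the p-value implied by a reported t statistic and compare it
    with the reported one; the tolerance here is a loose illustrative choice,
    not the rounding logic a real checker would need."""
    p = stats.t.sf(abs(t), df)      # one-sided tail probability
    if two_sided:
        p *= 2
    return abs(p - p_reported) < tol, p

# E.g. "t(28) = 2.2, p = .03": the recomputed two-sided p is about .036.
ok, p = check_t_result(t=2.2, df=28, p_reported=0.03)
```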
{"title":"$statcheck$ is flawed by design and no valid spell checker for statistical results","authors":"Ingmar Böschen","doi":"arxiv-2408.07948","DOIUrl":"https://doi.org/arxiv-2408.07948","url":null,"abstract":"The R package $statcheck$ is designed to extract statistical test results\u0000from text and check the consistency of the reported test statistics and\u0000corresponding p-values. Recently, it has also been featured as a spell checker\u0000for statistical results, aimed at improving reporting accuracy in scientific\u0000publications. In this study, I perform a check on $statcheck$ using a\u0000non-exhaustive list of 187 simple text strings with arbitrary statistical test\u0000results. These strings represent a wide range of textual representations of\u0000results including correctly manageable results, non-targeted test statistics,\u0000variable reporting styles, and common typos. Since $statcheck$'s detection\u0000heuristic is tied to a specific set of statistical test results that strictly\u0000adhere to the American Psychological Association (APA) reporting guidelines, it\u0000is unable to detect and check any reported result that even slightly deviates\u0000from this narrow style. In practice, $statcheck$ is unlikely to detect many\u0000statistical test results reported in the literature. I conclude that the\u0000capabilities and usefulness of the $statcheck$ software are very limited and\u0000that it should not be used to detect irregularities in results nor as a spell\u0000checker for statistical results. Future developments should aim to incorporate\u0000more flexible algorithms capable of handling a broader variety of reporting\u0000styles, such as those provided by $JATSdecoder$ and Large Language Models,\u0000which show promise in overcoming these limitations but they cannot replace the\u0000critical eye of a knowledgeable reader.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling of Measurement Error in Financial Returns Data
Ajay Jasra, Mohamed Maama, Aleksandar Mijatović
In this paper we consider the modeling of measurement error for fund returns data. In particular, given access to a time series of discretely observed log-returns and the associated maximum over the observation period, we develop a stochastic model which models the true log-returns and maximum via a Lévy process, and the data as a measurement error thereof. The main technical difficulty in inferring this model, for instance via Bayesian parameter estimation, is that the joint transition density of the return and maximum is seldom known, nor can it be simulated exactly. Based upon the novel stick-breaking representation of [12], we provide an approximation of the model. We develop a Markov chain Monte Carlo (MCMC) algorithm to sample from the posterior of the approximated model and then extend this to a multilevel MCMC method which can reduce the computational cost of approximating posterior expectations, relative to ordinary MCMC. We implement our methodology on several applications, including real data.
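The paper's cost reduction comes from a multilevel MCMC scheme; as background, here is the plain multilevel Monte Carlo telescoping identity it builds on, sketched on a toy diffusion functional rather than the paper's Lévy model. The level construction and sample-size schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def f(dW):
    """Level-l quantity of interest: the time-average of a geometric-Brownian-
    like path built from increments dW (more increments = finer level)."""
    return np.exp(np.cumsum(dW)).mean()

def coupled_diff(level):
    """One sample of f_l - f_{l-1} using shared randomness: the coupling that
    makes the correction terms small and cheap to estimate."""
    dW = rng.standard_normal(2**level) / np.sqrt(2**level)   # fine increments
    return f(dW) - f(dW.reshape(-1, 2).sum(axis=1))          # coarsened pairs

def mlmc(L=6, n0=50_000):
    """Telescoping estimator E[f_L] = E[f_0] + sum_{l=1}^{L} E[f_l - f_{l-1}],
    with sample sizes shrinking geometrically on the expensive fine levels."""
    est = np.mean([f(rng.standard_normal(1)) for _ in range(n0)])
    for l in range(1, L + 1):
        n_l = max(n0 // 4**l, 200)
        est += np.mean([coupled_diff(l) for _ in range(n_l)])
    return est
```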
{"title":"Modeling of Measurement Error in Financial Returns Data","authors":"Ajay Jasra, Mohamed Maama, Aleksandar Mijatović","doi":"arxiv-2408.07405","DOIUrl":"https://doi.org/arxiv-2408.07405","url":null,"abstract":"In this paper we consider the modeling of measurement error for fund returns\u0000data. In particular, given access to a time-series of discretely observed\u0000log-returns and the associated maximum over the observation period, we develop\u0000a stochastic model which models the true log-returns and maximum via a L'evy\u0000process and the data as a measurement error there-of. The main technical\u0000difficulty of trying to infer this model, for instance Bayesian parameter\u0000estimation, is that the joint transition density of the return and maximum is\u0000seldom known, nor can it be simulated exactly. Based upon the novel stick\u0000breaking representation of [12] we provide an approximation of the model. We\u0000develop a Markov chain Monte Carlo (MCMC) algorithm to sample from the Bayesian\u0000posterior of the approximated posterior and then extend this to a multilevel\u0000MCMC method which can reduce the computational cost to approximate posterior\u0000expectations, relative to ordinary MCMC. We implement our methodology on\u0000several applications including for real data.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142224599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gaussian mixture Taylor approximations of risk measures constrained by PDEs with Gaussian random field inputs
Dingcheng Luo, Joshua Chen, Peng Chen, Omar Ghattas
This work considers the computation of risk measures for quantities of interest governed by PDEs with Gaussian random field parameters, using Taylor approximations. While efficient, Taylor approximations are local to the point of expansion and hence may degrade in accuracy when the variances of the input parameters are large. To address this challenge, we approximate the underlying Gaussian measure by a mixture of Gaussians with reduced variance in a dominant direction of parameter space. Taylor approximations are constructed at the means of each Gaussian mixture component and then combined to approximate the risk measures. The formulation is presented in the setting of infinite-dimensional Gaussian random parameters for risk measures including the mean, variance, and conditional value-at-risk (CVaR). We also provide detailed analysis of the approximation errors arising from two sources: the Gaussian mixture approximation and the Taylor approximations. Numerical experiments are conducted for a semilinear advection-diffusion-reaction equation with a random diffusion coefficient field and for the Helmholtz equation with a random wave speed field. For these examples, the proposed approximation strategy achieves less than $1\%$ relative error in estimating CVaR with only $\mathcal{O}(10)$ state PDE solves, which is comparable to a standard Monte Carlo estimate with $\mathcal{O}(10^4)$ samples, thus achieving a significant reduction in computational cost. The proposed method can therefore serve as a way to rapidly and accurately estimate risk measures under limited computational budgets.
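A heavily simplified sketch of the two ingredients, on a scalar toy problem rather than a PDE: a moment-matched Gaussian mixture with reduced per-component variance, and a second-order Taylor approximation of the expectation at each component mean. The target function and mixture grid are arbitrary illustrative choices, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(3)

# Quantity of interest: a nonlinear map standing in for the PDE solve.
f   = lambda x: np.exp(0.5 * x)
d2f = lambda x: 0.25 * np.exp(0.5 * x)      # analytic second derivative

sigma = 2.0
# Moment-matched 5-component mixture: means on a symmetric grid, with a common
# reduced variance so the mixture mean and variance agree with N(0, sigma^2).
means = sigma * np.array([-1.2, -0.6, 0.0, 0.6, 1.2])
s2 = sigma**2 - means @ means / means.size
weights = np.full(means.size, 1 / means.size)

# Second-order Taylor at each mean: E f(Y) ~ f(m) + 0.5 f''(m) s^2 per component.
taylor = np.sum(weights * (f(means) + 0.5 * d2f(means) * s2))

mc = f(sigma * rng.standard_normal(200_000)).mean()   # brute-force reference
# Exact value is exp(sigma^2 / 8) ~ 1.6487; the mixture-Taylor estimate is far
# closer to it than a single Taylor expansion at the origin would be.
```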
{"title":"Gaussian mixture Taylor approximations of risk measures constrained by PDEs with Gaussian random field inputs","authors":"Dingcheng Luo, Joshua Chen, Peng Chen, Omar Ghattas","doi":"arxiv-2408.06615","DOIUrl":"https://doi.org/arxiv-2408.06615","url":null,"abstract":"This work considers the computation of risk measures for quantities of\u0000interest governed by PDEs with Gaussian random field parameters using Taylor\u0000approximations. While efficient, Taylor approximations are local to the point\u0000of expansion, and hence may degrade in accuracy when the variances of the input\u0000parameters are large. To address this challenge, we approximate the underlying\u0000Gaussian measure by a mixture of Gaussians with reduced variance in a dominant\u0000direction of parameter space. Taylor approximations are constructed at the\u0000means of each Gaussian mixture component, which are then combined to\u0000approximate the risk measures. The formulation is presented in the setting of\u0000infinite-dimensional Gaussian random parameters for risk measures including the\u0000mean, variance, and conditional value-at-risk. We also provide detailed\u0000analysis of the approximations errors arising from two sources: the Gaussian\u0000mixture approximation and the Taylor approximations. Numerical experiments are\u0000conducted for a semilinear advection-diffusion-reaction equation with a random\u0000diffusion coefficient field and for the Helmholtz equation with a random wave\u0000speed field. For these examples, the proposed approximation strategy can\u0000achieve less than $1%$ relative error in estimating CVaR with only\u0000$mathcal{O}(10)$ state PDE solves, which is comparable to a standard Monte\u0000Carlo estimate with $mathcal{O}(10^4)$ samples, thus achieving significant\u0000reduction in computational cost. The proposed method can therefore serve as a\u0000way to rapidly and accurately estimate risk measures under limited\u0000computational budgets.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"35 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the generalizability of the optimal 0.234 acceptance rate in random-walk Metropolis and parallel tempering algorithms
Aidan Li, Liyan Wang, Tianye Dou, Jeffrey S. Rosenthal
For random-walk Metropolis (RWM) and parallel tempering (PT) algorithms, an asymptotic acceptance rate of around 0.234 is known to be optimal in the high-dimensional limit. Yet, the practical relevance of this value is uncertain due to the restrictive conditions underlying its derivation. We synthesise previous theoretical advances in extending the 0.234 acceptance rate to more general settings, and we demonstrate the applicability and generalizability of the 0.234 theory for practitioners with a comprehensive empirical simulation study on a variety of examples, examining how acceptance rates affect Expected Squared Jumping Distance (ESJD). Our experiments show that the optimality of the 0.234 acceptance rate for RWM is surprisingly robust, even in lower dimensions, across various proposal and multimodal target distributions which may or may not have an i.i.d. product density. Experiments on parallel tempering also show that the idealized 0.234 spacing of inverse temperatures may be approximately optimal for low dimensions and non-i.i.d. product target densities, and that constructing an inverse temperature ladder with spacings given by a swap acceptance rate of 0.234 is a viable strategy. However, we observe that the applicability of the 0.234 acceptance rate heuristic diminishes for both RWM and PT algorithms below a certain dimension, which differs based on the target density, and that inhomogeneously scaled components in the target density further reduce its applicability in lower dimensions.
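A minimal version of the kind of experiment the study runs: sweep the random-walk Metropolis proposal scale, record the acceptance rate and ESJD, and observe that acceptance near 0.234 roughly maximises ESJD on a Gaussian target. The dimension and scale grid below are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(4)

def rwm_esjd(log_target, x0, scale, n_iter=50_000):
    """Random-walk Metropolis; returns the acceptance rate and the Expected
    Squared Jumping Distance so different proposal scales can be compared."""
    x = np.asarray(x0, dtype=float)
    acc, esjd = 0, 0.0
    lp = log_target(x)
    for _ in range(n_iter):
        prop = x + scale * rng.standard_normal(x.size)
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:
            esjd += np.sum((prop - x) ** 2)   # rejected moves contribute zero
            x, lp = prop, lp_prop
            acc += 1
    return acc / n_iter, esjd / n_iter

# On a 20-dimensional standard Gaussian target, the theoretical optimum is a
# scale of about 2.38 / sqrt(d), giving acceptance near 0.234 and maximal ESJD.
d = 20
log_target = lambda x: -0.5 * np.sum(x**2)
for scale in (0.2, 2.38 / np.sqrt(d), 1.0, 2.0):
    print(scale, rwm_esjd(log_target, np.zeros(d), scale))
```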
{"title":"Exploring the generalizability of the optimal 0.234 acceptance rate in random-walk Metropolis and parallel tempering algorithms","authors":"Aidan Li, Liyan Wang, Tianye Dou, Jeffrey S. Rosenthal","doi":"arxiv-2408.06894","DOIUrl":"https://doi.org/arxiv-2408.06894","url":null,"abstract":"For random-walk Metropolis (RWM) and parallel tempering (PT) algorithms, an\u0000asymptotic acceptance rate of around 0.234 is known to be optimal in the\u0000high-dimensional limit. Yet, the practical relevance of this value is uncertain\u0000due to the restrictive conditions underlying its derivation. We synthesise\u0000previous theoretical advances in extending the 0.234 acceptance rate to more\u0000general settings, and demonstrate the applicability and generalizability of the\u00000.234 theory for practitioners with a comprehensive empirical simulation study\u0000on a variety of examples examining how acceptance rates affect Expected Squared\u0000Jumping Distance (ESJD). Our experiments show the optimality of the 0.234\u0000acceptance rate for RWM is surprisingly robust even in lower dimensions across\u0000various proposal and multimodal target distributions which may or may not have\u0000an i.i.d. product density. Experiments on parallel tempering also show that the\u0000idealized 0.234 spacing of inverse temperatures may be approximately optimal\u0000for low dimensions and non i.i.d. product target densities, and that\u0000constructing an inverse temperature ladder with spacings given by a swap\u0000acceptance of 0.234 is a viable strategy. However, we observe the applicability\u0000of the 0.234 acceptance rate heuristic diminishes for both RWM and PT\u0000algorithms below a certain dimension which differs based on the target density,\u0000and that inhomogeneously scaled components in the target density further\u0000reduces its applicability in lower dimensions.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142189568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}