
Latest publications in Foundations of Data Science (Springfield, Mo.)

Estimation and uncertainty quantification for the output from quantum simulators
Q2 MATHEMATICS, APPLIED Pub Date: 2019-03-07 DOI: 10.3934/FODS.2019007
R. Bennink, A. Jasra, K. Law, P. Lougovski
The problem of estimating certain distributions over {0,1}^d is considered here. The distribution represents a quantum system of d qubits, where there are non-trivial dependencies between the qubits. A maximum entropy approach is adopted to reconstruct the distribution from exact moments or observed empirical moments. The Robbins-Monro algorithm is used to solve the intractable maximum entropy problem, by constructing an unbiased estimator of the un-normalized target with a sequential Monte Carlo sampler at each iteration. In the case of empirical moments, this coincides with a maximum likelihood estimator. A Bayesian formulation is also considered in order to quantify uncertainty a posteriori. Several approaches are proposed to tackle this challenging problem, based on recently developed methodologies. In particular, unbiased estimators of the gradient of the log posterior are constructed and used within a provably convergent Langevin-based Markov chain Monte Carlo method. The methods are illustrated on classically simulated output from quantum simulators.
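As a concrete (purely classical) illustration of the moment-matching idea, the sketch below runs a Robbins-Monro iteration on the natural parameters of a maximum-entropy family over {0,1}^d. For tractability it enumerates the 2^d states and computes the dual gradient exactly, where the paper uses a sequential Monte Carlo estimate; the feature set (first and second moments) and the synthetic target distribution are illustrative assumptions, not the paper's setup.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d = 3  # number of qubits; {0,1}^d is small enough to enumerate exactly

def features(x):
    # Constraints: first moments E[x_i] and pairwise moments E[x_i x_j]
    pairs = [x[i] * x[j] for i in range(d) for j in range(i + 1, d)]
    return np.concatenate([x, pairs])

states = np.array(list(itertools.product([0, 1], repeat=d)), dtype=float)
Phi = np.array([features(x) for x in states])

# Target moments from an arbitrary "true" distribution (synthetic example)
p_true = rng.dirichlet(np.ones(2 ** d))
m_target = p_true @ Phi

# Robbins-Monro iteration on the natural parameters lam of the max-ent
# family p_lam(x) ∝ exp(lam · phi(x)); the dual gradient is
# m_target - E_lam[phi(X)], computed exactly here instead of by SMC.
lam = np.zeros(Phi.shape[1])
for k in range(1, 50001):
    w = np.exp(Phi @ lam)
    p_lam = w / w.sum()
    lam += (m_target - p_lam @ Phi) / k ** 0.6  # RM step sizes

w = np.exp(Phi @ lam)
p_lam = w / w.sum()  # reconstructed max-ent distribution matching m_target
```

With exact gradients this is deterministic gradient ascent on the concave dual; the SMC-within-RM scheme of the paper replaces `p_lam @ Phi` with an unbiased estimate.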
Citations: 2
Approximate Bayesian inference for geostatistical generalised linear models
Q2 MATHEMATICS, APPLIED Pub Date: 2019-03-07 DOI: 10.3934/FODS.2019002
E. Evangelou
The aim of this paper is to bring together recent developments in Bayesian generalised linear mixed models and geostatistics. We focus on approximate methods in both areas. A technique known as full-scale approximation, proposed by Sang and Huang (2012) for alleviating the computational drawbacks of large geostatistical data, is incorporated into the INLA methodology for approximate Bayesian inference. We also discuss how INLA can be used to approximate the posterior distribution of transformations of parameters, which is useful in practical applications. Issues regarding the choice of the parameters of the approximation, such as the knots and taper range, are also addressed. Emphasis is given to applications in the context of disease mapping, illustrating the methodology by modelling loa loa prevalence in Cameroon and malaria in the Gambia.
Citations: 0
Combinatorial Hodge theory for equitable kidney paired donation
Q2 MATHEMATICS, APPLIED Pub Date: 2019-03-07 DOI: 10.3934/FODS.2019004
Joshua L. Mike, V. Maroulas
Kidney Paired Donation (KPD) is a system whereby incompatible patient-donor pairs (PD pairs) are entered into a pool to find compatible cyclic kidney exchanges in which each pair gives and receives a kidney. The donation allocation decision problem for a KPD pool has traditionally been viewed within an economic theory and integer-programming framework. While previous allocation schemes work well to donate the maximum number of kidneys at a specific time, certain subgroups of patients are rarely matched in such an exchange. Consequently, these methods lead to systematic inequity in the exchange, where many patients are repeatedly denied a kidney. Our goal is to investigate inequity in the distribution of kidney allocation among patients, and to present an algorithm which minimizes allocation disparities. The method presented is inspired by cohomology and describes the cyclic structure in a kidney exchange efficiently; this structure is then used to search for an equitable kidney allocation. Another key result of our approach is a score function defined on PD pairs which measures cycle disparity within a KPD pool; i.e., this function measures the relative chance for each PD pair to take part in the kidney exchange if cycles are chosen uniformly. Specifically, we show that PD pairs with underdemanded donors or highly sensitized patients have lower scores than typical PD pairs. Furthermore, our results demonstrate that PD pair score and the chance to obtain a kidney are positively correlated when allocation is done by utility-optimal integer programming methods. In contrast, the chance to obtain a kidney through our method is independent of score, and thus unbiased in this regard.
Citations: 4
Particle filters for inference of high-dimensional multivariate stochastic volatility models with cross-leverage effects
Q2 MATHEMATICS, APPLIED Pub Date: 2019-02-25 DOI: 10.3934/fods.2019003
Yaxian Xu, A. Jasra
Multivariate stochastic volatility models are a popular and well-known class of models in the analysis of financial time series because of their ability to capture the important stylized facts of financial returns data. We consider the problems of filtering distribution estimation and marginal likelihood calculation for multivariate stochastic volatility models with cross-leverage effects in the high-dimensional case, that is, when the number of financial time series analyzed simultaneously (denoted by d) is large. The standard particle filter has been widely used in the literature to solve these intractable inference problems. It has excellent performance in low to moderate dimensions, but collapses in the high-dimensional case. In this article, two new and advanced particle filters proposed in [4], named the space-time particle filter and the marginal space-time particle filter, are explored for these estimation problems. Simulation and empirical studies show that the two advanced particle filters outperform the standard particle filter in both accuracy and stability. In addition, the Bayesian static model parameter estimation problem is considered in light of advances in particle Markov chain Monte Carlo methods. The particle marginal Metropolis-Hastings algorithm, applied with likelihood estimates from the space-time particle filter, successfully infers the static model parameters in a setting where the same algorithm with likelihood estimates from the standard particle filter fails.
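For context, the standard (bootstrap) particle filter that the paper takes as its baseline can be sketched as follows; the scalar linear-Gaussian state-space model here is a stand-in chosen for brevity, not the multivariate stochastic volatility model of the paper, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_pf(y, n=1000, phi=0.95, sigma=0.5, tau=1.0):
    """Standard bootstrap particle filter for the toy model
    x_t = phi*x_{t-1} + sigma*eps_t,  y_t = x_t + tau*nu_t.
    Returns filtering means and a log marginal likelihood estimate."""
    x = rng.normal(0.0, sigma / np.sqrt(1.0 - phi ** 2), n)   # stationary init
    means, log_z = [], 0.0
    for obs in y:
        x = phi * x + sigma * rng.normal(size=n)              # propagate
        logw = -0.5 * ((obs - x) / tau) ** 2                  # Gaussian likelihood
        log_z += np.log(np.mean(np.exp(logw))) - 0.5 * np.log(2 * np.pi * tau ** 2)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        means.append(w @ x)                                   # filtering mean
        x = x[rng.choice(n, size=n, p=w)]                     # multinomial resample
    return np.array(means), log_z

# Simulate data from the same model, then filter it
T, phi, sigma, tau = 200, 0.95, 0.5, 1.0
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = phi * x_true[t - 1] + sigma * rng.normal()
y = x_true + tau * rng.normal(size=T)

means, log_z = bootstrap_pf(y)
```

In high dimensions the importance weights above degenerate (one particle takes nearly all the mass), which is exactly the collapse that motivates the space-time particle filters explored in the paper.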
Citations: 5
Spectral methods to study the robustness of residual neural networks with infinite layers
Q2 MATHEMATICS, APPLIED Pub Date: 2019-01-01 DOI: 10.3934/fods.2020012
T. Trimborn, Stephan Gerster, G. Visconti
Recently, neural networks (NN) with an infinite number of layers have been introduced. For these very large NNs the training procedure is especially expensive, so there is interest in studying their robustness with respect to the input data, in order to avoid unnecessarily retraining the network. Typically, model-based statistical inference methods, e.g. Bayesian neural networks, are used to quantify uncertainties. Here, we consider a special class of residual neural networks and study the case in which the number of layers can be arbitrarily large. Kinetic theory then allows the network to be interpreted as a dynamical system, described by a partial differential equation. We study the robustness of the mean-field neural network with respect to perturbations in the initial data by applying UQ approaches to the loss functions.
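The dynamical-system reading of residual networks can be made concrete: with step size h = T/L, a residual block x ← x + h·tanh(Wx + b) is the forward-Euler discretisation of the ODE x'(t) = tanh(Wx(t) + b), so outputs stabilise as the number of layers grows. In this minimal sketch the weights are random placeholders, assumed shared across layers for simplicity (real networks have per-layer weights).

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 4
W = rng.normal(scale=0.5, size=(dim, dim))  # shared weights (placeholder)
b = rng.normal(scale=0.1, size=dim)
x0 = rng.normal(size=dim)

def resnet_forward(x, n_layers, T=1.0):
    """Residual network x_{k+1} = x_k + h*tanh(W x_k + b) with h = T/n_layers:
    the forward-Euler discretisation of x'(t) = tanh(W x(t) + b)."""
    h = T / n_layers
    for _ in range(n_layers):
        x = x + h * np.tanh(W @ x + b)
    return x

outs = {L: resnet_forward(x0, L) for L in (10, 100, 1000)}
# Successive refinements approach the continuous-depth (infinite-layer) limit,
# with forward-Euler error O(1/L)
err_coarse = np.linalg.norm(outs[10] - outs[1000])
err_fine = np.linalg.norm(outs[100] - outs[1000])
```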
Citations: 4
Issues using logistic regression with class imbalance, with a case study from credit risk modelling
Q2 MATHEMATICS, APPLIED Pub Date: 2019-01-01 DOI: 10.3934/fods.2019016
Yazhe Li, T. Bellotti, N. Adams
The class imbalance problem arises in two-class classification problems when the less frequent (minority) class is observed much less often than the majority class. This characteristic is endemic in many problems, such as modeling default or fraud detection. Recent work by Owen [19] has shown that, in a theoretical context related to infinite imbalance, logistic regression behaves in such a way that all data in the rare class can be replaced by their mean vector to achieve the same coefficient estimates. We build on Owen's results to show the phenomenon remains true for both weighted and penalized likelihood methods. Such results suggest that problems may occur if there is structure within the rare class that is not captured by the mean vector. We demonstrate this problem and suggest a relabelling solution based on clustering the minority class. In a simulation and a real mortgage dataset, we show that logistic regression is not able to provide the best out-of-sample predictive performance, and that an approach able to model underlying structure in the minority class is often superior.
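Owen's infinite-imbalance phenomenon can be probed numerically: with a heavily imbalanced sample, fitting logistic regression to the full data and to data in which every minority point is replaced by the minority mean vector yields nearly the same slope coefficients. The sketch below uses a plain Newton/IRLS fit on synthetic data; the data-generating parameters are illustrative assumptions, and at finite imbalance the agreement is approximate rather than exact.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_logistic(X, y, iters=100, ridge=1e-6):
    """Newton/IRLS fit of logistic regression with an intercept."""
    Z = np.hstack([np.ones((len(X), 1)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        W = p * (1.0 - p)
        H = Z.T @ (Z * W[:, None]) + ridge * np.eye(Z.shape[1])
        beta += np.linalg.solve(H, Z.T @ (y - p))
    return beta

# Heavily imbalanced synthetic data: majority ~ N(0, I), small shifted minority
n_maj, n_min = 20000, 40
X_maj = rng.normal(size=(n_maj, 2))
X_min = 0.5 * rng.normal(size=(n_min, 2)) + np.array([3.0, 2.0])
X = np.vstack([X_maj, X_min])
y = np.concatenate([np.zeros(n_maj), np.ones(n_min)])

# Owen's limit: the minority class enters the fit only through its mean
# vector, so collapsing every minority point onto the mean barely changes
# the estimated slope direction
X_collapsed = np.vstack([X_maj, np.tile(X_min.mean(axis=0), (n_min, 1))])

b_full = fit_logistic(X, y)
b_mean = fit_logistic(X_collapsed, y)
cos = b_full[1:] @ b_mean[1:] / (
    np.linalg.norm(b_full[1:]) * np.linalg.norm(b_mean[1:]))
```

If the minority class had two well-separated clusters instead of one blob, the mean vector would sit between them and the collapsed fit would diverge from the full fit, which is the failure mode the paper's relabelling solution targets.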
Citations: 9
Background material
Q2 MATHEMATICS, APPLIED Pub Date: 2018-12-11 DOI: 10.1090/surv/236/02
Citations: 0
Markov chain simulation for multilevel Monte Carlo
Q2 MATHEMATICS, APPLIED Pub Date: 2018-06-26 DOI: 10.3934/FODS.2021004
A. Jasra, K. Law, Yaxian Xu
This paper considers a new approach to using Markov chain Monte Carlo (MCMC) in contexts where one may adopt multilevel (ML) Monte Carlo. The underlying problem is to approximate expectations with respect to an underlying probability measure that is associated to a continuum problem, such as a continuous-time stochastic process. It is then assumed that the associated probability measure can only be used (e.g. sampled) under a discretized approximation. In such scenarios, it is known that, to achieve a target error, the computational effort can be reduced when using MLMC relative to exact sampling from the most accurate discretized probability. The ideas rely upon introducing hierarchies of discretizations, where less accurate approximations cost less to compute, and using an appropriate collapsing-sum expression for the target expectation. If a suitable coupling of the probability measures in the hierarchy is achieved, then a reduction in cost is possible. This article focuses on the case where exact sampling from such a coupling is not possible. We show that one can construct suitably coupled MCMC kernels when given only access to MCMC kernels which are invariant with respect to each discretized probability measure. We prove, under assumptions, that this coupled MCMC approach in an ML context can reduce the cost to achieve a given error, relative to exact sampling. Our approach is illustrated on a numerical example.
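The collapsing (telescoping) sum can be illustrated in the textbook MLMC setting where exact sampling of the coupling *is* possible, unlike the MCMC setting this paper addresses: an Euler-Maruyama discretisation of geometric Brownian motion, with each telescoping term estimated from coarse and fine paths driven by the same Brownian increments. Model parameters and sample sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def level_estimate(level, n_samples, T=1.0, mu=0.05, sigma=0.2, s0=1.0):
    """Coupled coarse/fine Euler-Maruyama estimate of one telescoping term
    E[P_l - P_{l-1}] for geometric Brownian motion with P = S_T. The coarse
    path reuses the fine path's Brownian increments -- the MLMC coupling."""
    n_fine = 2 ** level
    dt = T / n_fine
    dW = rng.normal(scale=np.sqrt(dt), size=(n_samples, n_fine))
    s_fine = np.full(n_samples, s0)
    for k in range(n_fine):
        s_fine = s_fine * (1.0 + mu * dt + sigma * dW[:, k])
    if level == 0:
        return float(np.mean(s_fine))
    s_coarse = np.full(n_samples, s0)
    dW_c = dW[:, 0::2] + dW[:, 1::2]  # pair up increments for the coarse grid
    for k in range(n_fine // 2):
        s_coarse = s_coarse * (1.0 + mu * 2 * dt + sigma * dW_c[:, k])
    return float(np.mean(s_fine - s_coarse))

# Collapsing sum E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}],
# with fewer samples on the finer, more expensive levels
estimate = sum(level_estimate(l, 100000 // 2 ** l + 1000) for l in range(6))
# Exact value for comparison: E[S_T] = s0 * exp(mu * T)
```

The coupling makes Var(P_l - P_{l-1}) shrink with the level, which is why the correction terms need few samples; the paper's contribution is constructing an analogous coupling out of level-wise invariant MCMC kernels when exact coupled sampling is unavailable.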
Citations: 8
Quantum topological data analysis with continuous variables
Q2 MATHEMATICS, APPLIED Pub Date: 2018-04-05 DOI: 10.3934/fods.2019017
G. Siopsis
I introduce a continuous-variable quantum topological data analysis algorithm. The goal of the quantum algorithm is to calculate the Betti numbers in persistent homology, which are the dimensions of the kernels of the combinatorial Laplacians. I accomplish this task with the use of qRAM to create an oracle which organizes sets of data. I then perform a continuous-variable phase estimation on a Dirac operator to get a probability distribution with eigenvalue peaks. The results also leverage an implementation of a continuous-variable conditional swap gate.
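The classical quantity the algorithm targets is easy to state in code: for a simplicial complex, the k-th Betti number is the dimension of the kernel of the k-th combinatorial Laplacian L_k = ∂_k^T ∂_k + ∂_{k+1} ∂_{k+1}^T. A minimal (purely classical, non-quantum) check on a hollow triangle:

```python
import numpy as np

# Hollow triangle: vertices {0,1,2}, oriented edges (0,1), (0,2), (1,2),
# no 2-simplex. Boundary matrix d1 maps edges to vertices: the column for
# edge (u,v) has -1 at u and +1 at v.
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]], dtype=float)

def betti(laplacian, tol=1e-10):
    """Betti number = dim ker of the combinatorial Laplacian (Hodge theory)."""
    eig = np.linalg.eigvalsh(laplacian)
    return int(np.sum(np.abs(eig) < tol))

L0 = d1 @ d1.T   # vertex Laplacian (no d0 term; this is the graph Laplacian)
L1 = d1.T @ d1   # edge Laplacian; the d2 term vanishes since there are no 2-simplices
print(betti(L0), betti(L1))  # -> 1 1  (one connected component, one loop)
```

The quantum algorithm estimates these kernel dimensions via phase estimation on a Dirac operator rather than by dense eigendecomposition, which is what makes it of interest for large complexes.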
Citations: 7
Modelling uncertainty using stochastic transport noise in a 2-layer quasi-geostrophic model
Q2 MATHEMATICS, APPLIED Pub Date: 2018-02-15 DOI: 10.3934/fods.2020010
C. Cotter, D. Crisan, Darryl D. Holm, Wei Pan, I. Shevchenko
The stochastic variational approach for geophysical fluid dynamics was introduced by Holm (Proc Roy Soc A, 2015) as a framework for deriving stochastic parameterisations for unresolved scales. This paper applies the variational stochastic parameterisation in a two-layer quasi-geostrophic model for a β-plane channel flow configuration. We present a new method for estimating the stochastic forcing (used in the parameterisation) to approximate unresolved components using data from the high-resolution deterministic simulation, and describe a procedure for computing physically-consistent initial conditions for the stochastic model. We also quantify the uncertainty of coarse-grid simulations relative to fine-grid ones in homogeneous (teeming with small-scale vortices) and heterogeneous (featuring horizontally elongated large-scale jets) flows, and analyse how the spread of stochastic solutions depends on different parameters of the model. The parameterisation is tested by comparing it with the true eddy-resolving solution that has reached a statistical equilibrium and with the deterministic solution modelled on a low-resolution grid. The results show that the proposed parameterisation depends significantly on the resolution of the stochastic model, gives good ensemble performance for both homogeneous and heterogeneous flows, and lays solid foundations for data assimilation.
Holm(Proc Roy Soc A,2015)引入了地球物理流体动力学的随机变分方法,作为导出未解决尺度的随机参数化的框架。本文将变分随机参数化应用于一个双层准地转模型中,该模型适用于一个平面通道流结构。我们提出了一种新的方法来估计随机强迫(用于参数化),以使用来自高分辨率确定性模拟的数据来近似未解决的分量,并描述了计算随机模型物理一致初始条件的过程。我们还量化了均匀流(与小尺度涡流结合)和非均匀流(具有水平伸长的大尺度射流)中粗网格模拟相对于细网格模拟的不确定性,并分析了随机解的传播如何取决于模型的不同参数。通过将参数化与达到某种统计平衡的真实涡流解析解和低分辨率网格上建模的确定性解进行比较来测试参数化。结果表明,所提出的参数化在很大程度上取决于随机模型的分辨率,并且对于均质流和非均质流都具有良好的集成性能,并且参数化为数据同化奠定了坚实的基础。
Citations: 47
Journal: Foundations of data science (Springfield, Mo.)