
Foundations of data science (Springfield, Mo.): Latest Articles

Geometric adaptive Monte Carlo in random environment
Q2 MATHEMATICS, APPLIED Pub Date : 2016-08-29 DOI: 10.3934/FODS.2021014
T. Papamarkou, Alexey Lindo, E. Ford
Manifold Markov chain Monte Carlo algorithms have been introduced to sample more effectively from challenging target densities exhibiting multiple modes or strong correlations. Such algorithms exploit the local geometry of the parameter space, thus enabling chains to achieve a faster convergence rate when measured in number of steps. However, acquiring local geometric information can often increase computational complexity per step to the extent that sampling from high-dimensional targets becomes inefficient in terms of total computational time. This paper analyzes the computational complexity of manifold Langevin Monte Carlo and proposes a geometric adaptive Monte Carlo sampler aimed at balancing the benefits of exploiting local geometry with computational cost to achieve a high effective sample size for a given computational cost. The suggested sampler is a discrete-time stochastic process in a random environment. The random environment makes it possible to switch between local geometric and adaptive proposal kernels with the help of a schedule. An exponential schedule is put forward that enables more frequent use of geometric information in early transient phases of the chain, while saving computational time in late stationary phases. The average complexity can be manually set depending on the need for geometric exploitation posed by the underlying model.
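The switching idea described in this abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the schedule form exp(-rate * n), the function names `geometric_prob` and `geometric_adaptive_sketch`, the use of a plain Langevin (MALA) step with an identity metric as a stand-in for a manifold Langevin kernel, and an empirical-covariance random walk as the adaptive kernel.

```python
import numpy as np

def geometric_prob(n, rate):
    # Hypothetical exponential schedule: probability of using the
    # gradient-based ("geometric") kernel at step n. Decays with n.
    return float(np.exp(-rate * n))

def log_gauss(x, mean, var):
    # Log density (up to a constant) of an isotropic Gaussian N(mean, var * I).
    d = x - mean
    return -0.5 * float(np.dot(d, d)) / var

def geometric_adaptive_sketch(log_target, grad_log_target, x0, n_steps,
                              rate=0.01, eps=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    dim = x.size
    chain = np.empty((n_steps, dim))
    used_geometric = np.zeros(n_steps, dtype=bool)
    for n in range(n_steps):
        # Random environment: a Bernoulli draw picks the kernel for this step.
        if rng.random() < geometric_prob(n, rate):
            used_geometric[n] = True
            # Langevin proposal (identity metric, a stand-in for a manifold kernel).
            mean_fwd = x + 0.5 * eps**2 * grad_log_target(x)
            y = mean_fwd + eps * rng.standard_normal(dim)
            mean_bwd = y + 0.5 * eps**2 * grad_log_target(y)
            log_ratio = (log_target(y) - log_target(x)
                         + log_gauss(x, mean_bwd, eps**2)
                         - log_gauss(y, mean_fwd, eps**2))
        else:
            # Adaptive random-walk proposal built from the chain history.
            if n >= 20:
                emp = np.atleast_2d(np.cov(chain[:n].T))
                cov = (2.38**2 / dim) * emp + 1e-6 * np.eye(dim)
            else:
                cov = eps**2 * np.eye(dim)
            y = x + np.linalg.cholesky(cov) @ rng.standard_normal(dim)
            log_ratio = log_target(y) - log_target(x)  # symmetric proposal
        # Metropolis-Hastings accept/reject.
        if np.log(rng.random()) < log_ratio:
            x = y
        chain[n] = x
    return chain, used_geometric
```

With `rate=0.01`, the gradient-based kernel is used almost always in the first few dozen steps and almost never after a few hundred, which mirrors the "early transient versus late stationary" trade-off the abstract describes. Raising or lowering `rate` changes the average per-step cost.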
Citations: 3
Consistent manifold representation for topological data analysis
Q2 MATHEMATICS, APPLIED Pub Date : 2016-06-07 DOI: 10.3934/FODS.2019001
Tyrus Berry, T. Sauer
For data sampled from an arbitrary density on a manifold embedded in Euclidean space, the Continuous k-Nearest Neighbors (CkNN) graph construction is introduced. It is shown that CkNN is geometrically consistent in the sense that, under certain conditions, the unnormalized graph Laplacian converges to the Laplace-Beltrami operator, spectrally as well as pointwise. It is proved for compact (and conjectured for noncompact) manifolds that CkNN is the unique unweighted construction that yields a geometry consistent with the connected components of the underlying manifold in the limit of large data. Thus CkNN produces a single graph that captures all topological features simultaneously, in contrast to persistent homology, which represents each homology generator at a separate scale. As applications, we derive a new fast clustering algorithm and a method to identify patterns in natural images topologically. Finally, we conjecture that CkNN is topologically consistent, meaning that the homology of the Vietoris-Rips complex (implied by the graph Laplacian) converges to the homology of the underlying manifold (implied by the Laplace-de Rham operators) in the limit of large data.
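A small Python sketch may help make the graph construction concrete. The abstract does not state the edge rule, so the rule used below is an assumption based on our reading of the CkNN construction: points x and y are joined when d(x, y) < delta * sqrt(d_k(x) * d_k(y)), where d_k is the distance to the k-th nearest neighbor. The function names, the parameter defaults, and the brute-force distance matrix are illustrative choices only.

```python
import numpy as np

def cknn_adjacency(points, k=5, delta=1.0):
    # Boolean adjacency matrix of an unweighted CkNN-style graph (assumed rule:
    # edge when d(x, y) < delta * sqrt(d_k(x) * d_k(y))).
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    # Column 0 after sorting is the point itself, so column k is the k-th neighbor.
    kth = np.sort(dist, axis=1)[:, k]
    scale = np.sqrt(np.outer(kth, kth))
    adj = dist < delta * scale
    np.fill_diagonal(adj, False)
    return adj

def count_components(adj):
    # Count connected components with a simple graph search.
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        stack = [start]
        seen[start] = True
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adj[i] & ~seen):
                seen[j] = True
                stack.append(j)
    return components

# Two well-separated clusters of evenly spaced points on a line.
line = np.column_stack([np.arange(10) * 0.1, np.zeros(10)])
data = np.vstack([line, line + 100.0])
graph = cknn_adjacency(data, k=2, delta=1.5)
n_clusters = count_components(graph)
```

Reading clusters off the connected components of a single graph is the kind of use the abstract points to with its clustering application. The brute-force distance matrix costs O(n^2) memory, so this sketch is only suitable for small data sets.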
Citations: 53
Flexible online multivariate regression with variational Bayes and the matrix-variate Dirichlet process
Q2 MATHEMATICS, APPLIED Pub Date : 2016-02-29 DOI: 10.3934/FODS.2019006
Meng Hwee Victor Ong, D. Nott, A. Jasra
Flexible regression methods where interest centres on the way that the whole distribution of a response vector changes with covariates are very useful in some applications. A recently developed technique in this regard uses the matrix-variate Dirichlet process as a prior for a mixing distribution on a coefficient in a multivariate linear regression model. The method is attractive, particularly in the multivariate setting, for the convenient way that it allows for borrowing strength across different component regressions and for its computational simplicity and tractability. The purpose of the present article is to develop fast online variational Bayes approaches to fitting this model and to investigate how they perform compared to MCMC and batch variational methods in a number of scenarios.
Citations: 0
Accelerating Metropolis-Hastings algorithms by Delayed Acceptance
Q2 MATHEMATICS, APPLIED Pub Date : 2015-03-03 DOI: 10.3934/FODS.2019005
Marco Banterle, C. Grazian, Anthony Lee, C. Robert
MCMC algorithms such as Metropolis-Hastings algorithms are slowed down by the computation of complex target distributions, as exemplified by huge datasets. We offer in this paper a useful generalisation of the Delayed Acceptance approach, devised to reduce the computational costs of such algorithms by a simple and universal divide-and-conquer strategy. The idea behind the generic acceleration is to divide the acceptance step into several parts, aiming at a major reduction in computing time that outranks the corresponding reduction in acceptance probability. Each of the components can be sequentially compared with a uniform variate, the first rejection signalling that the proposed value is considered no further. Moreover, we develop theoretical bounds for the variance of associated estimators with respect to the variance of the standard Metropolis-Hastings and detail some results on optimal scaling and general optimisation of the procedure. We illustrate those accelerating features on a series of examples.
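The staged acceptance step described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the target is assumed to factor as a product of terms supplied as log-densities (cheapest first), the proposal is a symmetric Gaussian random walk on the real line, and the names `delayed_acceptance_mh`, `cheap`, and `expensive` are hypothetical.

```python
import numpy as np

def delayed_acceptance_mh(log_factors, x0, n_steps, step=1.0, seed=0):
    # log_factors: list of functions whose sum is the log target density,
    # ordered from cheapest to most expensive to evaluate.
    rng = np.random.default_rng(seed)
    x = float(x0)
    current = [f(x) for f in log_factors]
    evals = [1] * len(log_factors)  # evaluation count per factor
    chain = np.empty(n_steps)
    for n in range(n_steps):
        y = x + step * rng.standard_normal()  # symmetric proposal
        proposed = []
        accepted = True
        for i, f in enumerate(log_factors):
            fy = f(y)
            evals[i] += 1
            proposed.append(fy)
            # Each factor's ratio is compared with its own uniform variate;
            # the first rejection stops the loop, so later (costlier)
            # factors are never evaluated for this proposal.
            if np.log(rng.random()) >= fy - current[i]:
                accepted = False
                break
        if accepted:
            x, current = y, proposed
        chain[n] = x
    return chain, evals

# Toy factorisation of a standard normal target into two halves.
cheap = lambda x: -0.25 * x**2
expensive = lambda x: -0.25 * x**2
chain, evals = delayed_acceptance_mh([cheap, expensive], 0.0, 1000)
```

Here `evals[1]` counts how often the "expensive" factor was actually computed. It falls below `evals[0]` whenever the cheap stage rejects, which is where the saving in computing time comes from, at the price of a lower overall acceptance probability than standard Metropolis-Hastings.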
Citations: 50