
SIAM Journal on Mathematics of Data Science: Latest Articles

Randomized Wasserstein Barycenter Computation: Resampling with Statistical Guarantees
Q1 MATHEMATICS, APPLIED Pub Date : 2020-12-11 DOI: 10.1137/20m1385263
F. Heinemann, A. Munk, Y. Zemel
We propose a hybrid resampling method to approximate finitely supported Wasserstein barycenters on large-scale datasets, which can be combined with any exact solver. Nonasymptotic bounds on the expected error of the objective value as well as the barycenters themselves allow one to calibrate computational cost and statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on a myriad of simulated datasets, as well as a real-data example, which are out of reach for state-of-the-art algorithms for computing Wasserstein barycenters.
Citations: 16
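The resampling idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one-dimensional measures, where the 2-Wasserstein barycenter of empirical measures with equally many uniform atoms is obtained by averaging sorted samples (that is, averaging quantile functions). The grid, the three Gaussian-shaped measures, and the sample size `S` are hypothetical choices made for illustration; in higher dimensions the averaging step would be replaced by an exact barycenter solver.

```python
import numpy as np

def resample(support, weights, size, rng):
    """Draw `size` i.i.d. points from a finitely supported measure."""
    idx = rng.choice(len(support), size=size, p=weights)
    return support[idx]

def barycenter_1d(samples):
    """W2 barycenter of 1D empirical measures with equally many uniform atoms.

    Each row of `samples` is one empirical measure. Sorting each row and
    averaging column-wise averages the quantile functions.
    """
    return np.mean(np.sort(samples, axis=1), axis=0)

rng = np.random.default_rng(0)

# Three hypothetical measures on a shared grid of 1000 support points.
grid = np.linspace(0.0, 1.0, 1000)
measures = []
for shift in (0.2, 0.5, 0.8):
    w = np.exp(-0.5 * ((grid - shift) / 0.05) ** 2)
    measures.append(w / w.sum())

S = 50  # resample size: trades computational cost against statistical accuracy
samples = np.stack([resample(grid, w, S, rng) for w in measures])
bary_atoms = barycenter_1d(samples)  # S atoms, each with weight 1/S
```

The barycenter is then computed on `S` atoms per measure instead of the full 1000-point supports, which is where the cost saving in this sketch comes from.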
Stochastic Tverberg Theorems With Applications in Multiclass Logistic Regression, Separability, and Centerpoints of Data
Q1 MATHEMATICS, APPLIED Pub Date : 2020-12-10 DOI: 10.1137/19m1277102
J. D. Loera, T. A. Hogan
We present new stochastic geometry theorems that give bounds on the probability that $m$ random data classes all contain a point in common in their convex hulls. These theorems relate to the existe...
Citations: 4
Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting, and Regularization
Q1 MATHEMATICS, APPLIED Pub Date : 2020-11-18 DOI: 10.1137/21m1415121
Ke Wang, Christos Thrampoulidis
Deep neural networks generalize well despite being exceedingly overparameterized and being trained without explicit regularization. This curious phenomenon has inspired extensive research activity in establishing its statistical principles: Under what conditions is it observed? How do these depend on the data and on the training algorithm? When does regularization benefit generalization? While such questions remain wide open for deep neural nets, recent works have attempted to gain insights by studying simpler, often linear, models. Our paper contributes to this growing line of work by examining binary linear classification under a generative Gaussian mixture model. Motivated by recent results on the implicit bias of gradient descent, we study both max-margin SVM classifiers (corresponding to logistic loss) and min-norm interpolating classifiers (corresponding to least-squares loss). First, we leverage an idea introduced in [V. Muthukumar et al., arXiv:2005.08054, (2020)] to relate the SVM solution to the min-norm interpolating solution. Second, we derive novel non-asymptotic bounds on the classification error of the latter. Combining the two, we present novel sufficient conditions on the covariance spectrum and on the signal-to-noise ratio (SNR) under which interpolating estimators achieve asymptotically optimal performance as overparameterization increases. Interestingly, our results extend to a noisy model with constant probability noise flips. Contrary to previously studied discriminative data models, our results emphasize the crucial role of the SNR and its interplay with the data covariance. Finally, via a combination of analytical arguments and numerical demonstrations, we identify conditions under which the interpolating estimator performs better than corresponding regularized estimates.
Citations: 18
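As a rough illustration of the min-norm interpolating classifier mentioned in the abstract above, the sketch below draws labels and features from a two-component Gaussian mixture and computes the minimum-norm solution of the least-squares problem via the pseudoinverse. The sizes `n` and `d`, the isotropic noise, and the mean vector `mu` are assumptions chosen for illustration, not settings taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 50, 500                 # overparameterized regime: d > n
mu = np.zeros(d)
mu[0] = 4.0                    # hypothetical mean vector; its norm sets the SNR

# Gaussian mixture model: x_i = y_i * mu + standard Gaussian noise.
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.standard_normal((n, d))

# Minimum-norm interpolator: the smallest-norm w satisfying X w = y.
w = np.linalg.pinv(X) @ y
interpolates = np.allclose(X @ w, y)

# Estimate the classification error of sign(<w, x>) on fresh data.
n_test = 2000
y_test = rng.choice([-1.0, 1.0], size=n_test)
X_test = y_test[:, None] * mu + rng.standard_normal((n_test, d))
test_error = np.mean(np.sign(X_test @ w) != y_test)
```

Although `w` fits the training labels exactly, the estimated test error can still be small when the signal in `mu` is strong relative to the noise, which is the kind of benign overfitting the abstract refers to.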
Memory Capacity of Neural Networks with Threshold and Rectified Linear Unit Activations
Q1 MATHEMATICS, APPLIED Pub Date : 2020-10-20 DOI: 10.1137/20m1314884
R. Vershynin
Overwhelming theoretical and empirical evidence shows that mildly overparametrized neural networks (those with more connections than the size of the training data) are often able to memorize the ...
Citations: 41
Towards Compact Neural Networks via End-to-End Training: A Bayesian Tensor Approach with Automatic Rank Determination
Q1 MATHEMATICS, APPLIED Pub Date : 2020-10-17 DOI: 10.1137/21m1391444
Cole Hawkins, Xing-er Liu, Zheng Zhang
While post-training model compression can greatly reduce the inference cost of a deep neural network, uncompressed training still consumes a huge amount of hardware resources, run-time and energy. It is highly desirable to directly train a compact neural network from scratch with low memory and low computational cost. Low-rank tensor decomposition is one of the most effective approaches to reduce the memory and computing requirements of large-size neural networks. However, directly training a low-rank tensorized neural network is a very challenging task because it is hard to determine a proper tensor rank a priori, which controls the model complexity and compression ratio in the training process. This paper presents a novel end-to-end framework for low-rank tensorized training of neural networks. We first develop a flexible Bayesian model that can handle various low-rank tensor formats (e.g., CP, Tucker, tensor train and tensor-train matrix) that compress neural network parameters in training. This model can automatically determine the tensor ranks inside a nonlinear forward model, which is beyond the capability of existing Bayesian tensor methods. We further develop a scalable stochastic variational inference solver to estimate the posterior density of large-scale problems in training. Our work provides the first general-purpose rank-adaptive framework for end-to-end tensorized training. Our numerical results on various neural network architectures show orders-of-magnitude parameter reduction and little accuracy loss (or even better accuracy) in the training process. Specifically, on a very large deep learning recommendation system with over $4.2\times 10^9$ model parameters, our method can reduce the variables to only $1.6\times 10^5$ automatically in the training process (i.e., by $2.6\times 10^4$ times) while achieving almost the same accuracy.
Citations: 17
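The reduction factor quoted in the abstract above can be checked with simple arithmetic, and the source of such savings can be seen in the simplest low-rank case. The sketch below is only an illustration: the second half uses a plain rank-`r` matrix factorization rather than the tensor formats (CP, Tucker, tensor train) named in the abstract, and the layer sizes `m`, `n` and rank `r` are hypothetical.

```python
# Part 1: check the reduction factor reported in the abstract.
full_params = 4.2e9          # "over 4.2 x 10^9" parameters, so this is a lower bound
compressed_params = 1.6e5
reported_ratio = full_params / compressed_params   # roughly 2.6 x 10^4

# Part 2: parameter count of a rank-r factorization W ~ U @ V,
# with U of shape (m, r) and V of shape (r, n). Sizes are hypothetical.
m, n, r = 1024, 1024, 16
dense_count = m * n                  # parameters in the dense weight matrix
low_rank_count = r * (m + n)         # parameters in the two factors
low_rank_ratio = dense_count / low_rank_count
```

The rank `r` directly controls both model complexity and compression ratio, which is why choosing it automatically during training matters.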
Consistency of Archetypal Analysis
Q1 MATHEMATICS, APPLIED Pub Date : 2020-10-16 DOI: 10.1137/20M1331792
B. Osting, Dong Wang, Yiming Xu, Dominique Zosso
Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data. For fixed $k$, the method finds a convex polytope with $k$ vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared distance between the data and the polytope is minimal. In this paper, we prove a consistency result that shows if the data is independently sampled from a probability measure with bounded support, then the archetype points converge to a solution of the continuum version of the problem, of which we identify and establish several properties. We also obtain the convergence rate of the optimal objective values under appropriate assumptions on the distribution. If the data is independently sampled from a distribution with unbounded support, we also prove a consistency result for a modified method that penalizes the dispersion of the archetype points. Our analysis is supported by detailed computational experiments of the archetype points for data sampled from the uniform distribution in a disk, the normal distribution, an annular distribution, and a Gaussian mixture model.
Citations: 3
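The optimization problem described in words in the abstract above can be written compactly as follows. The notation is an assumption introduced here for illustration: $x_1,\dots,x_n$ denote the data, $a_1,\dots,a_k$ the archetype points, and $d(x, P)$ the Euclidean distance from a point $x$ to a set $P$.

```latex
\min_{a_1,\dots,a_k \,\in\, \operatorname{conv}(x_1,\dots,x_n)}
\; \frac{1}{n} \sum_{i=1}^{n}
d^2\bigl(x_i,\ \operatorname{conv}(a_1,\dots,a_k)\bigr)
```

The constraint keeps the polytope inside the convex hull of the data, and the objective is the mean squared distance between the data and the polytope.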
Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
Q1 MATHEMATICS, APPLIED Pub Date : 2020-10-15 DOI: 10.1137/19m1258104
Abhishek K. Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar
Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimizati...
Citations: 4
Multilayer Modularity Belief Propagation to Assess Detectability of Community Structure
Q1 MATHEMATICS, APPLIED Pub Date : 2020-09-28 DOI: 10.1137/19m1279812
W. Weir, Benjamin Walker, Lenka Zdeborová, P. Mucha
Modularity-based community detection encompasses a number of widely used, efficient heuristics for identification of structure in networks. Recently, a belief propagation approach to modularity opt...
Citations: 2
Sequential Construction and Dimension Reduction of Gaussian Processes Under Inequality Constraints
Q1 MATHEMATICS, APPLIED Pub Date : 2020-09-09 DOI: 10.1137/21m1407513
F. Bachoc, A. F. López-Lopera, O. Roustant
Accounting for inequality constraints, such as boundedness, monotonicity or convexity, is challenging when modeling costly-to-evaluate black box functions. In this regard, finite-dimensional Gaussian process (GP) models bring a valuable solution, as they guarantee that the inequality constraints are satisfied everywhere. Nevertheless, these models are currently restricted to low-dimensional settings (up to dimension 5). Addressing this issue, we introduce the MaxMod algorithm that sequentially inserts one-dimensional knots or adds active variables, thereby performing dimension reduction and efficient knot allocation at the same time. We prove the convergence of this algorithm. In intermediary steps of the proof, we propose the notion of multi-affine extension and study its properties. We also prove the convergence of finite-dimensional GPs when the knots are not dense in the input space, extending the recent literature. With simulated and real data, we demonstrate that the MaxMod algorithm remains efficient in higher dimension (at least in dimension 20), and has a smaller computational complexity than other state-of-the-art constrained GP models when reaching a given approximation error.
Citations: 2
Exponential-Wrapped Distributions on Symmetric Spaces
Q1 MATHEMATICS, APPLIED Pub Date : 2020-09-04 DOI: 10.1137/21m1461551
Emmanuel Chevallier, Didong Li, Yulong Lu, D. Dunson
In many applications, the curvature of the space supporting the data makes the statistical modelling challenging. In this paper we discuss the construction and use of probability distributions wrapped around manifolds using exponential maps. These distributions have already been used on specific manifolds. We describe their construction in the unifying framework of affine locally symmetric spaces. Affine locally symmetric spaces are a broad class of manifolds containing many manifolds encountered in data sciences. We show that on these spaces, exponential-wrapped distributions enjoy interesting properties for practical use. We provide the generic expression of the Jacobian appearing in these distributions and compute it on two particular examples: Grassmannians and pseudo-hyperboloids. We illustrate the usefulness of such distributions in a classification experiment on simulated data.
Citations: 4
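To make the construction in the abstract above concrete, here is a minimal sampling sketch on the unit sphere, a standard example of a symmetric space. It is an illustration under stated assumptions, not the paper's code: the sphere replaces the Grassmannians and pseudo-hyperboloids treated in the paper, the tangent-space distribution is an isotropic Gaussian, and the function names and the scale `sigma` are hypothetical. Evaluating the density of the wrapped distribution would additionally require the Jacobian of the exponential map, which this sketch does not compute.

```python
import numpy as np

def exp_map_sphere(p, v):
    """Riemannian exponential map on the unit sphere at base point p.

    `v` holds tangent vectors at p (rows orthogonal to p).
    """
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(norm > 0, norm, 1.0)   # avoid division by zero when v = 0
    return np.cos(norm) * p + np.sin(norm) * v / safe

def sample_wrapped_gaussian(p, sigma, size, rng):
    """Sample tangent Gaussians at p and wrap them onto the sphere."""
    z = sigma * rng.standard_normal((size, p.shape[0]))
    v = z - (z @ p)[:, None] * p           # project onto the tangent space at p
    return exp_map_sphere(p, v)

rng = np.random.default_rng(0)
base = np.array([0.0, 0.0, 1.0])           # north pole of the 2-sphere
points = sample_wrapped_gaussian(base, sigma=0.3, size=1000, rng=rng)
```

Every row of `points` lies on the unit sphere, concentrated around `base` with spread governed by `sigma`.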