
Latest publications from the Journal of Machine Learning Research

Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-07-01
Miaoyan Wang, Lexin Li

We consider the problem of decomposing a higher-order tensor with binary entries. Such data problems arise frequently in applications such as neuroimaging, recommender systems, topic modeling, and sensor network localization. We propose a multilinear Bernoulli model, develop a rank-constrained likelihood-based estimation method, and obtain theoretical accuracy guarantees. In contrast to continuous-valued problems, the binary tensor problem exhibits an interesting phase transition phenomenon as a function of the signal-to-noise ratio. We establish an error bound for the parameter tensor estimate and show that the obtained rate is minimax optimal under the considered model. Furthermore, we develop an alternating optimization algorithm with convergence guarantees. The efficacy of our approach is demonstrated through both simulations and analyses of multiple data sets on the tasks of tensor completion and clustering.

Citations: 0
Quantile Graphical Models: Bayesian Approaches.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01
Nilabja Guha, Veera Baladandayuthapani, Bani K Mallick

Graphical models are ubiquitous tools for describing the interdependence between variables measured simultaneously, such as large-scale gene or protein expression data. Gaussian graphical models (GGMs) are well-established tools for probabilistic exploration of dependence structures via precision matrices, and they are derived under a multivariate normal joint distribution. However, they suffer from several shortcomings because they rest on Gaussian distributional assumptions. In this article, we propose a Bayesian quantile-based approach for sparse estimation of graphs. We demonstrate that the resulting graph estimation is robust to outliers and applicable under general distributional assumptions. Furthermore, we develop efficient variational Bayes approximations to scale the methods to large data sets. Our methods are applied to a novel cancer proteomics data set in which multiple proteomic antibodies are simultaneously assessed on tumor samples using reverse-phase protein array (RPPA) technology.
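The quantile building block behind this approach can be sketched in a few lines (our own frequentist toy, not the paper's Bayesian or variational machinery; the data and tuning choices are arbitrary): estimating a conditional quantile by minimizing the pinball (check) loss, which stays sensible under heavy-tailed, non-Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, tau = 500, 3, 0.5          # sample size, predictors, target quantile

X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed noise

def pinball(res, tau):
    """Check loss: tau * res for positive residuals, (tau - 1) * res otherwise."""
    return np.mean(np.where(res >= 0, tau * res, (tau - 1) * res))

# subgradient descent on the pinball loss
beta = np.zeros(p)
lr = 0.05
for _ in range(2000):
    res = y - X @ beta
    g = -X.T @ np.where(res >= 0, tau, tau - 1.0) / n   # a subgradient
    beta -= lr * g

err = np.max(np.abs(beta - beta_true))
```

In a quantile graphical model this regression is run for each node given the others (with sparsity penalties); the sketch shows only the loss that replaces the Gaussian likelihood.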

Citations: 0
Near-optimal Individualized Treatment Recommendations.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01
Haomiao Meng, Ying-Qi Zhao, Haoda Fu, Xingye Qiao

The individualized treatment recommendation (ITR) is an important analytic framework for precision medicine. The goal of ITR is to assign the best treatments to patients based on their individual characteristics. From the machine learning perspective, the solution to the ITR problem can be formulated as a weighted classification problem that maximizes the mean benefit of the recommended treatments given patients' characteristics. Several ITR methods have been proposed in both the binary setting and the multicategory setting. In practice, one may prefer a more flexible recommendation that includes multiple treatment options. This motivates us to develop methods to obtain a set of near-optimal individualized treatment recommendations alternative to each other, called alternative individualized treatment recommendations (A-ITR). We propose two methods to estimate the optimal A-ITR within the outcome weighted learning (OWL) framework. Simulation studies and a real data analysis for Type 2 diabetic patients with injectable antidiabetic treatments are conducted to show the usefulness of the proposed A-ITR framework. We also show the consistency of these methods and obtain an upper bound for the risk between the theoretically optimal recommendation and the estimated one. An R package, aitr, has been developed and is available at https://github.com/menghaomiao/aitr.
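The notion of a set of near-optimal recommendations can be illustrated schematically (a hypothetical sketch of ours, not the OWL estimator or the aitr package; `mu` and `eps` are made-up inputs): given estimated mean outcomes mu[i, a] for patient i under treatment a, recommend every treatment within a tolerance eps of that patient's best:

```python
import numpy as np

def alternative_recommendations(mu, eps):
    """S[i, a] is True iff treatment a is within eps of patient i's best
    estimated mean outcome (larger outcomes assumed better)."""
    mu = np.asarray(mu, dtype=float)
    best = mu.max(axis=1, keepdims=True)
    return mu >= best - eps

# hypothetical estimated mean outcomes for 2 patients and 3 treatments
mu = np.array([[1.0, 0.9, 0.2],    # treatments 0 and 1 nearly tie for patient 0
               [0.1, 0.8, 0.79]])  # treatments 1 and 2 nearly tie for patient 1
S = alternative_recommendations(mu, eps=0.15)
```

Shrinking eps to 0 recovers a single-best ITR; the paper's contribution is estimating such near-optimal sets with statistical guarantees rather than from plug-in means.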

Citations: 0
Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01
Artin Spiridonoff, Alex Olshevsky, Ioannis Ch Paschalidis

We consider the standard model of distributed optimization of a sum of functions F(z) = ∑_{i=1}^{n} f_i(z), where node i in a network holds the function f_i(z). We allow for a harsh network model characterized by asynchronous updates, message delays, unpredictable message losses, and directed communication among nodes. In this setting, we analyze a modification of the Gradient-Push method for distributed optimization, assuming that (i) node i is capable of generating gradients of its function f_i(z) corrupted by zero-mean bounded-support additive noise at each step, (ii) F(z) is strongly convex, and (iii) each f_i(z) has Lipschitz gradients. We show that our proposed method asymptotically performs as well as the best bounds on centralized gradient descent that takes steps in the direction of the sum of the noisy gradients of all the functions f_1(z), …, f_n(z) at each step.
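A synchronous, noise-free sketch of the underlying push-sum gradient idea (ours; the paper's contribution is precisely the harsher asynchronous setting with delays and message losses, which this sketch omits) looks like this for scalar quadratics f_i(z) = (z − c_i)²/2 on a directed ring with a column-stochastic mixing matrix:

```python
import numpy as np

n = 5
c = np.arange(n, dtype=float)          # node i holds f_i(z) = (z - c_i)**2 / 2
z_star = c.mean()                      # minimizer of F(z) = sum_i f_i(z)

# directed ring: node j keeps a node-specific fraction and pushes the rest
# to node j+1; columns sum to one (column-stochastic), rows need not
keep = np.array([0.2, 0.5, 0.3, 0.6, 0.4])
A = np.zeros((n, n))
for j in range(n):
    A[j, j] = keep[j]
    A[(j + 1) % n, j] = 1.0 - keep[j]

w = c.copy()                           # numerator iterates
y = np.ones(n)                         # push-sum weights correcting directedness
z = w / y
for t in range(3000):
    w = A @ w                          # mix along directed edges
    y = A @ y
    z = w / y                          # de-biased local estimates
    w -= (z - c) / (t + 1)             # gradient step, diminishing stepsize

err = np.max(np.abs(z - z_star))
```

The ratio z = w / y is what makes consensus work despite the matrix being only column-stochastic; with an undirected, doubly stochastic network the weights y would stay at one.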

Citations: 0
A Regularization-Based Adaptive Test for High-Dimensional Generalized Linear Models.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01 · Epub Date: 2020-07-26
Chong Wu, Gongjun Xu, Xiaotong Shen, Wei Pan

In spite of its urgent importance in the era of big data, testing high-dimensional parameters in generalized linear models (GLMs) in the presence of high-dimensional nuisance parameters has been largely under-studied, especially with regard to constructing powerful tests for general (and unknown) alternatives. Most existing tests are powerful only against certain alternatives and may yield incorrect Type I error rates under high-dimensional nuisance parameter situations. In this paper, we propose the adaptive interaction sum of powered score (aiSPU) test in the framework of penalized regression with a non-convex penalty, called the truncated Lasso penalty (TLP), which can maintain correct Type I error rates while yielding high statistical power across a wide range of alternatives. To calculate its p-values analytically, we derive its asymptotic null distribution. Via simulations, its superior finite-sample performance is demonstrated over several representative existing methods. In addition, we apply it and other representative tests to an Alzheimer's Disease Neuroimaging Initiative (ADNI) data set, detecting possible gene-gender interactions for Alzheimer's disease. We also make the R package "aispu", which implements the proposed test, available on GitHub.
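The sum-of-powered-score idea at the core of the test can be sketched as follows (a schematic simplification of ours, not the aispu package: no nuisance parameters, no TLP penalty, and the null score distribution is taken to be N(0, I), calibrated by Monte Carlo rather than the paper's asymptotic formula):

```python
import numpy as np

rng = np.random.default_rng(2)

def aspu_pvalue(U, gammas=(1, 2, 3, 4), B=2000, rng=rng):
    """Monte Carlo p-value of an adaptive SPU-style test for H0: E[U] = 0.
    SPU(gamma) = sum_j U_j**gamma; the adaptive statistic is the minimum
    p-value over gamma, recalibrated against the same null draws."""
    null = rng.normal(size=(B, U.size))             # null score draws
    p_each = np.empty(len(gammas))
    null_p = np.empty((B, len(gammas)))
    for k, g in enumerate(gammas):
        t_obs = np.sum(U ** g)
        t_null = np.sum(null ** g, axis=1)
        p_each[k] = np.mean(np.abs(t_null) >= abs(t_obs))
        # rank-based p-value of every null draw, used to calibrate min-p
        ranks = np.argsort(np.argsort(-np.abs(t_null)))
        null_p[:, k] = (ranks + 1) / B
    return np.mean(null_p.min(axis=1) <= p_each.min())

U_signal = rng.normal(size=50) + 0.6   # shifted scores: signal present
p_val = aspu_pvalue(U_signal)
```

Small gamma favors dense weak signals and large gamma favors sparse strong ones; taking the minimum p-value over gamma is what makes the test adaptive to unknown alternatives.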

Citations: 0
Minimax Nonparametric Parallelism Test.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01
Xin Xing, Meimei Liu, Ping Ma, Wenxuan Zhong

Testing the hypothesis of parallelism is a fundamental statistical problem arising in many applied sciences. In this paper, we develop a nonparametric parallelism test for inferring whether the trends in treatment and control groups are parallel. In particular, the proposed nonparametric parallelism test is a Wald-type test based on a smoothing spline ANOVA (SSANOVA) model, which can characterize complex patterns in the data. We derive that the asymptotic null distribution of the test statistic is a chi-square distribution, unveiling a new version of the Wilks phenomenon. Notably, we establish the minimax sharp lower bound on the distinguishable rate for the nonparametric parallelism test using information theory, and further prove that the proposed test is minimax optimal. Simulation studies are conducted to investigate the empirical performance of the proposed test. DNA methylation and neuroimaging studies are presented to illustrate potential applications of the test. The software is available at https://github.com/BioAlgs/Parallelism.
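A parametric toy version of the parallelism question (ours; the paper uses smoothing-spline ANOVA rather than the straight-line fit below, and this sketch assumes paired observations) reduces to checking that the pointwise difference between the two group trends has zero slope:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 200)
# two noisy trends that differ only by a vertical shift (i.e., parallel)
y_treat = np.sin(2 * np.pi * t) + 1.0 + 0.1 * rng.normal(size=t.size)
y_ctrl = np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=t.size)

# parallel trends <=> the pointwise difference is constant in t, so fit a line
diff = y_treat - y_ctrl
slope, intercept = np.polyfit(t, diff, 1)

# z-statistic for the slope; roughly N(0, 1) under the parallel null
resid = diff - (slope * t + intercept)
se = np.sqrt(resid.var(ddof=2) / np.sum((t - t.mean()) ** 2))
z = slope / se
```

A large |z| would reject parallelism; the paper's SSANOVA test plays the same role against arbitrary smooth (not just linear) departures.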

Citations: 0
Provable Convex Co-clustering of Tensors.
IF 4.3 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2020-01-01
Eric C Chi, Brian R Gaines, Will Wei Sun, Hua Zhou, Jian Yang

Cluster analysis is a fundamental tool for pattern discovery of complex heterogeneous data. Prevalent clustering methods mainly focus on vector or matrix-variate data and are not applicable to general-order tensors, which arise frequently in modern scientific and business applications. Moreover, there is a gap between statistical guarantees and computational efficiency for existing tensor clustering solutions due to the nature of their non-convex formulations. In this work, we bridge this gap by developing a provable convex formulation of tensor co-clustering. Our convex co-clustering (CoCo) estimator enjoys stability guarantees and its computational and storage costs are polynomial in the size of the data. We further establish a non-asymptotic error bound for the CoCo estimator, which reveals a surprising "blessing of dimensionality" phenomenon that does not exist in vector or matrix-variate cluster analysis. Our theoretical findings are supported by extensive simulated studies. Finally, we apply the CoCo estimator to the cluster analysis of advertisement click tensor data from a major online company. Our clustering results provide meaningful business insights to improve advertising effectiveness.
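The shape of a convex co-clustering objective can be written down directly (a reduced sketch of ours, not the CoCo estimator: we use unweighted all-pairs fusion penalties and only evaluate the objective, without solving the optimization): a squared-error fit term plus Euclidean norms of slice differences along every mode, which is what fuses slices into co-clusters:

```python
import numpy as np

def coco_objective(X, U, lam):
    """Squared-error fit plus all-pairs slice-fusion penalties along each mode."""
    fit = 0.5 * np.sum((X - U) ** 2)
    pen = 0.0
    for mode in range(X.ndim):
        Um = np.moveaxis(U, mode, 0)
        m = Um.shape[0]
        for i in range(m):
            for j in range(i + 1, m):
                pen += np.linalg.norm(Um[i] - Um[j])   # fuses slices i and j
    return fit + lam * pen

rng = np.random.default_rng(4)
X = rng.normal(size=(4, 4, 4))
obj_exact = coco_objective(X, X.copy(), lam=0.1)        # perfect fit, penalty only
obj_fused = coco_objective(X, np.zeros_like(X), lam=0.1)  # fully fused, fit only
```

Because both terms are convex in U, any minimizer is global; as lam grows, slices merge and the shared slice values define the co-clusters.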

Citations: 0
Proximal Distance Algorithms: Theory and Practice.
IF 6 · CAS Region 3 (Computer Science) · Q1 AUTOMATION & CONTROL SYSTEMS · Pub Date: 2019-04-01
Kevin L Keys, Hua Zhou, Kenneth Lange

Proximal distance algorithms combine the classical penalty method of constrained minimization with distance majorization. If f(x) is the loss function and C is the constraint set in a constrained minimization problem, then the proximal distance principle mandates minimizing the penalized loss f(x) + (ρ/2) dist(x, C)² and following the solution x_ρ to its limit as ρ tends to ∞. At each iteration the squared Euclidean distance dist(x, C)² is majorized by the spherical quadratic ‖x − P_C(x_k)‖², where P_C(x_k) denotes the projection of the current iterate x_k onto C. The minimum of the surrogate function f(x) + (ρ/2)‖x − P_C(x_k)‖² is given by the proximal map prox_{ρ⁻¹f}[P_C(x_k)]. The next iterate x_{k+1} automatically decreases the original penalized loss for fixed ρ. Since many explicit projections and proximal maps are known, it is straightforward to derive and implement novel optimization algorithms in this setting. These algorithms can take hundreds if not thousands of iterations to converge, but the simple nature of each iteration makes proximal distance algorithms competitive with traditional algorithms. For convex problems, proximal distance algorithms reduce to proximal gradient algorithms and therefore enjoy well-understood convergence properties. For nonconvex problems, one can attack convergence by invoking Zangwill's theorem. Our numerical examples demonstrate the utility of proximal distance algorithms in various high-dimensional settings, including a) linear programming, b) constrained least squares, c) projection to the closest kinship matrix, d) projection onto a second-order cone constraint, e) calculation of Horn's copositive matrix index, f) linear complementarity programming, and g) sparse principal components analysis. The proximal distance algorithm in each case is competitive with or superior in speed to traditional methods such as the interior point method and the alternating direction method of multipliers (ADMM). Source code for the numerical examples can be found at https://github.com/klkeys/proxdist.
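For a concrete instance, here is a small sketch (ours, not from the proxdist repository; the 2-variable problem and ρ schedule are made up) of the proximal distance iteration for nonnegative least squares: C is the nonnegative orthant, so P_C is coordinatewise clipping, and the proximal map of ρ⁻¹f for f(x) = ½‖Ax − b‖² is a ridge-type linear solve:

```python
import numpy as np

# hypothetical 2-variable nonnegative least squares: minimize 0.5*||Ax - b||^2
# subject to x >= 0; the unconstrained solution is [1, -1], and the KKT
# conditions give the constrained minimizer x_star = [0.5, 0]
A = np.array([[1.0, 0.0],
              [1.0, 1.0]])
b = np.array([1.0, 0.0])
x_star = np.array([0.5, 0.0])

AtA, Atb = A.T @ A, A.T @ b
I2 = np.eye(2)

def project(x):
    return np.clip(x, 0.0, None)       # P_C for the nonnegative orthant

x = np.zeros(2)
rho = 1.0
for _ in range(200):
    # prox of rho^{-1} f at the projected point: the minimizer of
    # 0.5*||A x - b||^2 + (rho/2)*||x - P_C(x_k)||^2 is a ridge-type solve
    x = np.linalg.solve(AtA + rho * I2, Atb + rho * project(x))
    rho = min(rho * 1.5, 1e8)          # geometric ramp of the penalty

err = np.max(np.abs(x - x_star))
```

Each iteration costs one projection and one small linear solve, illustrating why the per-iteration work stays cheap even when many iterations are needed.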

Citations: 0
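The iteration described above can be sketched for one simple case. The following is a minimal illustration, assuming nonnegative least squares, where C is the nonnegative orthant (so the projection is coordinate-wise clipping) and the quadratic surrogate has a closed-form minimizer; the geometric annealing schedule for ρ is an assumption for this sketch, not the paper's schedule.

```python
import numpy as np

def proximal_distance_nnls(A, b, rho=1.0, rho_inc=1.1, rho_max=1e8,
                           max_iter=5000, tol=1e-8):
    """Proximal distance sketch for min (1/2)||Ax - b||^2 s.t. x >= 0.

    Here C is the nonnegative orthant, so the projection P_C(x) is
    coordinate-wise clipping.  For a quadratic loss, the surrogate
    f(x) + (rho/2)||x - P_C(x_k)||^2 has the closed-form minimizer
    x = (A^T A + rho I)^{-1} (A^T b + rho P_C(x_k)).
    """
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    x = np.zeros(n)
    for _ in range(max_iter):
        p = np.maximum(x, 0.0)                       # projection onto C
        x_new = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * p)
        if np.linalg.norm(x_new - x) < tol * (1.0 + np.linalg.norm(x)):
            x = x_new
            break
        x = x_new
        rho = min(rho * rho_inc, rho_max)            # send rho toward infinity
    return np.maximum(x, 0.0)                        # final feasibility step
```

For example, with A the 2×2 identity and b = (1, −1), the iterates approach the nonnegative least squares solution (1, 0) as ρ grows.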
The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks.
IF 6 CAS Zone 3 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date : 2019-01-01
Arjun Sondhi, Ali Shojaie

We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new algorithm that requires conditioning only on small sets of variables. The proposed algorithm, which is essentially a modified version of the PC-Algorithm, offers significant gains in both computational complexity and estimation accuracy. In particular, it results in more efficient and accurate estimation in large networks containing hub nodes, which are common in biological systems. We prove the consistency of the proposed algorithm, and show that it also requires a less stringent faithfulness assumption than the PC-Algorithm. Simulations in low- and high-dimensional settings are used to illustrate these findings. An application to gene expression data suggests that the proposed algorithm can identify a greater number of clinically relevant genes than current methods.

Citations: 0
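The idea of conditioning only on small variable sets can be sketched with a PC-style skeleton search that restricts conditional-independence tests to sets of size at most one. This is an illustrative simplification, not the paper's exact reduced PC-Algorithm; the Fisher-z partial-correlation test and the fixed critical value (2.58, roughly a two-sided 1% level) are standard choices assumed here.

```python
import numpy as np
from itertools import combinations

def skeleton_small_sets(X, max_cond=1, z_crit=2.58):
    """PC-style skeleton search conditioning only on small sets.

    Removes the edge (i, j) whenever a Fisher-z partial-correlation
    test, for some conditioning set S with |S| <= max_cond, fails to
    reject conditional independence.  Illustrative only: the paper's
    reduced PC-Algorithm uses a more refined rule for choosing tests.
    """
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)
    adj = ~np.eye(p, dtype=bool)

    def partial_corr(i, j, S):
        # Partial correlation via the inverse of the relevant submatrix.
        idx = [i, j] + list(S)
        P = np.linalg.inv(corr[np.ix_(idx, idx)])
        return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

    for size in range(max_cond + 1):
        for i in range(p):
            for j in range(i + 1, p):
                if not adj[i, j]:
                    continue
                rest = [k for k in range(p) if k not in (i, j)]
                for S in combinations(rest, size):
                    r = np.clip(partial_corr(i, j, S), -0.999999, 0.999999)
                    z = 0.5 * np.log((1 + r) / (1 - r))     # Fisher z-transform
                    if np.sqrt(n - size - 3) * abs(z) < z_crit:
                        adj[i, j] = adj[j, i] = False       # cannot reject CI
                        break
    return adj
```

On data simulated from the chain X1 → X2 → X3, the returned skeleton keeps the edges (X1, X2) and (X2, X3) but drops (X1, X3), since X1 ⟂ X3 given X2.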
Causal Learning via Manifold Regularization.
IF 4.3 CAS Zone 3 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date : 2019-01-01
Steven M Hill, Chris J Oates, Duncan A Blythe, Sach Mukherjee

This paper frames causal structure estimation as a machine learning task. The idea is to treat indicators of causal relationships between variables as 'labels' and to exploit available data on the variables of interest to provide features for the labelling task. Background scientific knowledge or any available interventional data provide labels on some causal relationships and the remainder are treated as unlabelled. To illustrate the key ideas, we develop a distance-based approach (based on bivariate histograms) within a manifold regularization framework. We present empirical results on three different biological data sets (including examples where causal effects can be verified by experimental intervention), that together demonstrate the efficacy and general nature of the approach as well as its simplicity from a user's point of view.

Citations: 0
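The labelling setup can be sketched with a simple graph-based semi-supervised step: harmonic label propagation over a similarity graph built on one feature vector per candidate causal pair (e.g. a flattened bivariate histogram). This is a minimal stand-in for manifold regularization, not the authors' method; the RBF affinity and the feature layout are assumptions of the sketch.

```python
import numpy as np

def propagate_labels(F, y, labeled, sigma=1.0):
    """Harmonic label propagation on an RBF similarity graph.

    F holds one feature vector per candidate causal pair, y the labels
    of the labeled pairs, and `labeled` their row indices in F.  The
    unlabeled scores solve L_uu f_u = -L_ul y_l, where L is the graph
    Laplacian, so each unlabeled score is a weighted average of its
    neighbours' scores.
    """
    m = F.shape[0]
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    W = np.exp(-d2 / (2.0 * sigma ** 2))                  # RBF affinities
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W                        # graph Laplacian
    unlabeled = np.setdiff1d(np.arange(m), labeled)
    Luu = L[np.ix_(unlabeled, unlabeled)]
    Lul = L[np.ix_(unlabeled, labeled)]
    f = np.empty(m)
    f[labeled] = y
    f[unlabeled] = np.linalg.solve(Luu, -Lul @ y)         # harmonic solution
    return f
```

With two well-separated clusters of feature vectors and one labeled point per cluster (labels 0 and 1), the propagated scores of the unlabeled points land near their cluster's label.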