Convex optimization techniques for fitting sparse Gaussian graphical models
O. Banerjee, L. Ghaoui, A. d'Aspremont, G. Natsoulis
Proceedings of the 23rd International Conference on Machine Learning (ICML), June 25, 2006
DOI: 10.1145/1143844.1143856 (https://doi.org/10.1145/1143844.1143856)
Citations: 192
Abstract
We consider the problem of fitting a large-scale covariance matrix to multivariate Gaussian data in such a way that the inverse is sparse, thus providing model selection. Beginning with a dense empirical covariance matrix, we solve a maximum likelihood problem with an ℓ1-norm penalty term added to encourage sparsity in the inverse. For models with tens of nodes, the resulting problem can be solved using standard interior-point algorithms for convex optimization, but these methods scale poorly with problem size. We present two new algorithms aimed at solving problems with a thousand nodes. The first, based on Nesterov's first-order algorithm, yields a rigorous complexity estimate for the problem, with a much better dependence on problem size than interior-point methods. Our second algorithm uses block coordinate descent, updating rows/columns of the covariance matrix sequentially. Experiments with genomic data show that our method is able to uncover biologically interpretable connections among genes.
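The estimator described in the abstract is the ℓ1-penalized maximum likelihood problem: maximize log det(Θ) − tr(SΘ) − α‖Θ‖₁ over positive definite Θ, where S is the empirical covariance matrix, Θ estimates its inverse, and α controls sparsity. As a minimal sketch of this estimator, the snippet below uses scikit-learn's GraphicalLasso, a later coordinate-descent solver for the same objective; this is not the authors' implementation, and the data dimensions and penalty value are illustrative.

```python
# Minimal sketch: l1-penalized inverse-covariance estimation, the problem
# described in the abstract. GraphicalLasso is a later solver for the same
# objective, not the paper's code; data and penalty strength are illustrative.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Ground-truth sparse precision (inverse covariance) matrix with two edges.
p = 20
precision = np.eye(p)
precision[0, 1] = precision[1, 0] = 0.4
precision[2, 3] = precision[3, 2] = 0.4
covariance = np.linalg.inv(precision)

# Sample multivariate Gaussian data with that covariance.
X = rng.multivariate_normal(np.zeros(p), covariance, size=500)

# Solve  max_Theta  log det(Theta) - tr(S @ Theta) - alpha * ||Theta||_1,
# where S is the empirical covariance of X.
model = GraphicalLasso(alpha=0.05).fit(X)

# Nonzero off-diagonal entries of the estimated precision matrix correspond
# to edges in the recovered Gaussian graphical model.
est = model.precision_
edges = [(i, j) for i in range(p) for j in range(i + 1, p)
         if abs(est[i, j]) > 1e-4]
print("recovered edges:", edges)
```

Increasing alpha drives more off-diagonal entries of the estimated precision matrix to exactly zero, removing edges from the graph; this is the sparsity/model-selection trade-off the penalty term encodes.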