Learning Scalable Deep Kernels with Recurrent Structure.

IF 4.3 3区计算机科学 Q1 AUTOMATION & CONTROL SYSTEMS Journal of Machine Learning Research Pub Date : 2017-01-01

Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P Xing

{"title":"Learning Scalable Deep Kernels with Recurrent Structure.","authors":"Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P Xing","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure, and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance on several benchmarks, and thoroughly investigate a consequential autonomous driving application, where the predictive uncertainties provided by GP-LSTM are uniquely valuable.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"18 1","pages":"2850-2886"},"PeriodicalIF":4.3000,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6334642/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Machine Learning Research","FirstCategoryId":"94","ListUrlMain":"","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure, and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance on several benchmarks, and thoroughly investigate a consequential autonomous driving application, where the predictive uncertainties provided by GP-LSTM are uniquely valuable.

Abstract Image

微信好友朋友圈 QQ好友复制链接

本刊更多论文

用循环结构学习可扩展深度核。

语音、机器人、金融和生物学中的许多应用都处理顺序数据，其中排序问题和循环结构是常见的。然而，这种结构不能被标准核函数轻易捕获。为了对这种结构建模，我们提出了高斯过程的表达性闭形式核函数。所得模型GP-LSTM充分封装了长短期记忆(LSTM)循环网络的归纳偏差，同时保留了高斯过程的非参数概率优势。我们通过使用一种新的可证明收敛的半随机梯度过程优化高斯过程的边际似然来学习所提出的核的性质，并利用这些核的结构进行可扩展的训练和预测。这种方法为贝叶斯lstm提供了一种实用的表示。我们在几个基准测试中展示了最先进的性能，并深入研究了相应的自动驾驶应用，其中GP-LSTM提供的预测不确定性具有独特的价值。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Journal of Machine Learning Research 工程技术-计算机：人工智能

CiteScore

18.80

自引率

0.00%

发文量

审稿时长

3 months

期刊介绍： The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. JMLR has a commitment to rigorous yet rapid reviewing. JMLR seeks previously unpublished papers on machine learning that contain: new principled algorithms with sound empirical validation, and with justification of theoretical, psychological, or biological nature; experimental and/or theoretical studies yielding new insight into the design and behavior of learning in intelligent systems; accounts of applications of existing techniques that shed light on the strengths and weaknesses of the methods; formalization of new learning tasks (e.g., in the context of new applications) and of methods for assessing performance on those tasks; development of new analytical frameworks that advance theoretical studies of practical learning methods; computational models of data from natural learning systems at the behavioral or neural level; or extremely well-written surveys of existing work.

期刊最新文献

Convergence for nonconvex ADMM, with applications to CT imaging. Effect-Invariant Mechanisms for Policy Generalization. Nonparametric Regression for 3D Point Cloud Learning. Batch Normalization Preconditioning for Stochastic Gradient Langevin Dynamics Why Self-Attention is Natural for Sequence-to-Sequence Problems? A Perspective from Symmetries