Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory

Yanan Zheng, L. Wen, Jianmin Wang, Jun Yan, Lei Ji
{"title":"Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory","authors":"Yanan Zheng, L. Wen, Jianmin Wang, Jun Yan, Lei Ji","doi":"10.1145/3132847.3132952","DOIUrl":null,"url":null,"abstract":"Deep Generative Models (DGMs) are able to extract high-level representations from massive unlabeled data and are explainable from a probabilistic perspective. Such characteristics favor sequence modeling tasks. However, it still remains a huge challenge to model sequences with DGMs. Unlike real-valued data that can be directly fed into models, sequence data consist of discrete elements and require being transformed into certain representations first. This leads to the following two challenges. First, high-level features are sensitive to small variations of inputs as well as the way of representing data. Second, the models are more likely to lose long-term information during multiple transformations. In this paper, we propose a Hierarchical Deep Generative Model With Dual Memory to address the two challenges. Furthermore, we provide a method to efficiently perform inference and learning on the model. The proposed model extends basic DGMs with an improved hierarchically organized multi-layer architecture. Besides, our model incorporates memories along dual directions, respectively denoted as broad memory and deep memory. The model is trained end-to-end by optimizing a variational lower bound on data log-likelihood using the improved stochastic variational method. We perform experiments on several tasks with various datasets and obtain excellent results. The results of language modeling show our method significantly outperforms state-of-the-art results in terms of generative performance. Extended experiments including document modeling and sentiment analysis, prove the high-effectiveness of dual memory mechanism and latent representations. Text random generation provides a straightforward perception for advantages of our model.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"23 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3132952","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Deep Generative Models (DGMs) are able to extract high-level representations from massive unlabeled data and are explainable from a probabilistic perspective. These characteristics make them well suited to sequence modeling tasks. However, modeling sequences with DGMs remains a substantial challenge. Unlike real-valued data, which can be fed into models directly, sequence data consist of discrete elements and must first be transformed into suitable representations. This leads to two challenges. First, high-level features are sensitive to small variations of the inputs as well as to the way the data are represented. Second, the models are more likely to lose long-term information over multiple transformations. In this paper, we propose a Hierarchical Deep Generative Model with Dual Memory to address these two challenges, together with a method for performing inference and learning on the model efficiently. The proposed model extends basic DGMs with an improved, hierarchically organized multi-layer architecture. In addition, it incorporates memories along two directions, denoted broad memory and deep memory. The model is trained end-to-end by optimizing a variational lower bound on the data log-likelihood using an improved stochastic variational method. We perform experiments on several tasks with various datasets and obtain excellent results. The language modeling results show that our method significantly outperforms state-of-the-art results in terms of generative performance. Extended experiments, including document modeling and sentiment analysis, demonstrate the effectiveness of the dual memory mechanism and the latent representations. Random text generation gives an intuitive illustration of the advantages of our model.
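
As a rough illustration only (this page does not reproduce the paper's equations), the variational lower bound on the data log-likelihood mentioned above generalizes the standard evidence lower bound (ELBO) used for latent-variable sequence models. The sketch below uses generic notation (observed sequence x, latent variables z, generative parameters θ, inference parameters φ), not the paper's hierarchical, dual-memory formulation:

\[
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\!\big(q_\phi(z \mid x) \,\big\|\, p(z)\big)
\]

Training maximizes the right-hand side with respect to both θ and φ, typically with stochastic gradient estimates, which is what permits end-to-end training of the generative and inference components.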