Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation

AI Open (IF 14.8) · Pub Date: 2023-01-01 · DOI: 10.1016/j.aiopen.2023.05.001

Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed

AI Open, volume 4, pages 19-32. Journal Article. https://www.sciencedirect.com/science/article/pii/S2666651023000025


Abstract

This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin ($p < .05$; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
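As a rough illustration of what treating the missing paraphrase as a latent sequence can mean, the bound below is the generic evidence lower bound for an observed text $x$ with a latent sequence $y$. The symbols $q_\phi$, $p_\theta$, and the prior $p(y)$ are assumptions introduced here for illustration; the abstract does not state the exact objective used by VSAR, so this is a sketch of the standard variational form and not necessarily the paper's own loss.

```latex
% Generic ELBO with a latent sequence y (illustrative notation, not taken from the paper)
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(y \mid x)}\big[\log p_\theta(x \mid y)\big]
  \;-\; \mathrm{KL}\big(q_\phi(y \mid x)\,\|\,p(y)\big)
```

Under this reading, the first term rewards reconstructing the observed text from the inferred sequence, and the KL term keeps the inferred sequence close to a prior over text.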



