BiVaSE: A bilingual variational sentence encoder with randomly initialized Transformer layers

Impact Factor: 0.5 · Tier 3 (Literature) · LANGUAGE & LINGUISTICS · Acta Linguistica Academica · Pub Date: 2022-12-12 · DOI: 10.1556/2062.2022.00584
Bence Nyéki
Citations: 0

Abstract

Transformer-based NLP models have achieved state-of-the-art results in many NLP tasks including text classification and text generation. However, the layers of these models do not output any explicit representations for texts units larger than tokens (e.g. sentences), although such representations are required to perform text classification. Sentence encodings are usually obtained by applying a pooling technique during fine-tuning on a specific task. In this paper, a new sentence encoder is introduced. Relying on an autoencoder architecture, it was trained to learn sentence representations from the very beginning of its training. The model was trained on bilingual data with variational Bayesian inference. Sentence representations were evaluated in downstream and linguistic probing tasks. Although the newly introduced encoder generally performs worse than well-known Transformer-based encoders, the experiments show that it was able to learn to incorporate linguistic information in the sentence representations.
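The paper itself is not accompanied by code here; as a rough illustration of two ingredients the abstract mentions, the sketch below shows (a) masked mean pooling, a common way to collapse token vectors into a single sentence vector, and (b) the reparameterization trick and closed-form KL term used in variational Bayesian training. All function names and the toy vectors are illustrative assumptions, not taken from BiVaSE.

```python
import math
import random

def mean_pool(token_vecs, mask):
    """Masked mean pooling: average only the real (non-padding) token vectors."""
    dim = len(token_vecs[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vecs, mask):
        if keep:
            count += 1
            for i in range(dim):
                total[i] += vec[i]
    return [t / count for t in total]

def reparameterize(mu, log_var, rng):
    """z = mu + sigma * eps with eps ~ N(0, 1), so sampling stays differentiable."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_divergence(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

# Toy usage: two real tokens plus one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
sentence_vec = mean_pool(tokens, mask=[1, 1, 0])   # → [2.0, 3.0]
z = reparameterize([0.0, 0.0], [0.0, 0.0], random.Random(0))
```

In a variational sentence encoder, an encoder network would predict `mu` and `log_var` from the pooled vector, the decoder would reconstruct the sentence from the sampled `z`, and the KL term would regularize the latent space toward a standard normal prior.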
Source journal: Acta Linguistica Academica (Arts and Humanities – Literature and Literary Theory)
CiteScore: 1.00
Self-citation rate: 20.00%
Articles per year: 20
Journal description: Acta Linguistica Academica publishes papers on general linguistics. Papers presenting empirical material must have strong theoretical implications. The scope of the journal is not restricted to the core areas of linguistics; it also covers areas such as socio- and psycholinguistics, neurolinguistics, discourse analysis, the philosophy of language, language typology, and formal semantics. The journal also publishes book and dissertation reviews and advertisements.
Latest articles in this journal:
American linguistics in transition: From post-Bloomfieldian structuralism to generative grammar
Production of Mandarin and Fuzhou lexical tones in six- to seven-year-old Mandarin-Fuzhou bilingual children
No lowering, only paradigms: A paradigm-based account of linking vowels in Hungarian
Strange a construction: The "A egy N" in Hungarian
Another fortis-lenis language: A reanalysis of Old English obstruents