A character-level sequence-to-sequence method for subtitle learning

Haijun Zhang, Jingxuan Li, Yuzhu Ji, Heng Yue
{"title":"字幕学习的字符级序列到序列方法","authors":"Haijun Zhang, Jingxuan Li, Yuzhu Ji, Heng Yue","doi":"10.1109/INDIN.2016.7819265","DOIUrl":null,"url":null,"abstract":"This paper presents a character-level sequence-to-sequence learning method, RNNembed. Specifically, we embed a Recurrent Neural Network (RNN) into an encoder-decoder framework and generate character-level sequence representation as input. The dimension of input feature space can be significantly reduced as well as avoiding the need to handle unknown or rare words in sequences. In the language model, we improve the basic structure of a Gated Recurrent Unit (GRU) by adding an output gate, which is used for filtering out unimportant information involved in the attention scheme of the alignment model. Our proposed method was examined in a large-scale dataset on a task of English-to-Chinese translation. Experimental results demonstrate that the proposed approach achieves a translation performance comparable, or close, to conventional word-based and phrase-based systems.","PeriodicalId":421680,"journal":{"name":"2016 IEEE 14th International Conference on Industrial Informatics (INDIN)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"A character-level sequence-to-sequence method for subtitle learning\",\"authors\":\"Haijun Zhang, Jingxuan Li, Yuzhu Ji, Heng Yue\",\"doi\":\"10.1109/INDIN.2016.7819265\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a character-level sequence-to-sequence learning method, RNNembed. Specifically, we embed a Recurrent Neural Network (RNN) into an encoder-decoder framework and generate character-level sequence representation as input. The dimension of input feature space can be significantly reduced as well as avoiding the need to handle unknown or rare words in sequences. In the language model, we improve the basic structure of a Gated Recurrent Unit (GRU) by adding an output gate, which is used for filtering out unimportant information involved in the attention scheme of the alignment model. Our proposed method was examined in a large-scale dataset on a task of English-to-Chinese translation. 
Experimental results demonstrate that the proposed approach achieves a translation performance comparable, or close, to conventional word-based and phrase-based systems.\",\"PeriodicalId\":421680,\"journal\":{\"name\":\"2016 IEEE 14th International Conference on Industrial Informatics (INDIN)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-07-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE 14th International Conference on Industrial Informatics (INDIN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/INDIN.2016.7819265\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE 14th International Conference on Industrial Informatics (INDIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INDIN.2016.7819265","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 4

Abstract

This paper presents RNNembed, a character-level sequence-to-sequence learning method. Specifically, we embed a Recurrent Neural Network (RNN) into an encoder-decoder framework and generate character-level sequence representations as input. This significantly reduces the dimensionality of the input feature space and avoids the need to handle unknown or rare words in sequences. In the language model, we improve the basic structure of the Gated Recurrent Unit (GRU) by adding an output gate, which filters out unimportant information involved in the attention scheme of the alignment model. The proposed method was evaluated on a large-scale dataset for an English-to-Chinese translation task. Experimental results demonstrate that the proposed approach achieves translation performance comparable, or close, to that of conventional word-based and phrase-based systems.
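
The abstract describes two mechanisms concretely enough to sketch. First, character-level input: because a language's character inventory is small and closed, the input dimension shrinks and out-of-vocabulary tokens disappear. A minimal illustration in Python, with an assumed toy symbol set (the paper's actual character set and embedding pipeline are not specified here):

```python
# Minimal sketch of character-level input encoding. The symbol set
# below is an illustrative assumption, not the paper's configuration.
text = "subtitle learning"
vocab = sorted(set("abcdefghijklmnopqrstuvwxyz .,!?"))  # tens of symbols
char_to_id = {c: i for i, c in enumerate(vocab)}

# Every string maps onto this small, closed symbol set, so the input
# feature dimension stays tiny and no token is ever out-of-vocabulary.
ids = [char_to_id[c] for c in text.lower() if c in char_to_id]
print(len(vocab), ids[:5])
```

Second, the modified GRU. The abstract says an output gate is added to filter the information passed to the attention scheme, but gives no equations, so the sketch below makes an assumption: an LSTM-style output gate o_t, computed from the input and previous state, applied multiplicatively to the standard GRU hidden state before it reaches the alignment model. The class and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class OutputGatedGRUCell:
    """Standard GRU update plus an assumed output gate o_t; o_t * h_t is
    the filtered state that would feed the attention/alignment model."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        def mat(rows, cols):
            return rng.normal(0.0, 0.1, size=(rows, cols))
        # Weights for update gate z, reset gate r, candidate state,
        # and the added output gate o.
        self.Wz, self.Uz = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wr, self.Ur = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wh, self.Uh = mat(hidden_size, input_size), mat(hidden_size, hidden_size)
        self.Wo, self.Uo = mat(hidden_size, input_size), mat(hidden_size, hidden_size)

    def step(self, x, h_prev):
        z = sigmoid(self.Wz @ x + self.Uz @ h_prev)             # update gate
        r = sigmoid(self.Wr @ x + self.Ur @ h_prev)             # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h_prev))  # candidate state
        h = (1.0 - z) * h_prev + z * h_cand                     # standard GRU state
        o = sigmoid(self.Wo @ x + self.Uo @ h_prev)             # added output gate
        return h, o * h  # (recurrent state, filtered state for attention)

# Run a short sequence of (assumed) character embeddings through the cell.
cell = OutputGatedGRUCell(input_size=32, hidden_size=64)
h = np.zeros(64)
for x in np.random.default_rng(1).normal(size=(10, 32)):
    h, attn_state = cell.step(x, h)
```

The design choice mirrors the LSTM, where the output gate decouples what the cell remembers from what it exposes: here the full hidden state h_t still propagates through time, while the attention mechanism sees only the gated view o_t * h_t.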