基于多通道卷积神经网络的跨语言对话状态跟踪

Hongjie Shi, Takashi Ushio, M. Endo, K. Yamagami, Noriaki Horii
{"title":"基于多通道卷积神经网络的跨语言对话状态跟踪","authors":"Hongjie Shi, Takashi Ushio, M. Endo, K. Yamagami, Noriaki Horii","doi":"10.1109/SLT.2016.7846318","DOIUrl":null,"url":null,"abstract":"The fifth Dialog State Tracking Challenge (DSTC5) introduces a new cross-language dialog state tracking scenario, where the participants are asked to build their trackers based on the English training corpus, while evaluating them with the unlabeled Chinese corpus. Although the computer-generated translations for both English and Chinese corpus are provided in the dataset, these translations contain errors and careless use of them can easily hurt the performance of the built trackers. To address this problem, we propose a multichannel Convolutional Neural Networks (CNN) architecture, in which we treat English and Chinese language as different input channels of one single CNN model. In the evaluation of DSTC5, we found that such multichannel architecture can effectively improve the robustness against translation errors. Additionally, our method for DSTC5 is purely machine learning based and requires no prior knowledge about the target language. We consider this a desirable property for building a tracker in the cross-language context, as not every developer will be familiar with both languages.","PeriodicalId":281635,"journal":{"name":"2016 IEEE Spoken Language Technology Workshop (SLT)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":"{\"title\":\"A multichannel convolutional neural network for cross-language dialog state tracking\",\"authors\":\"Hongjie Shi, Takashi Ushio, M. Endo, K. Yamagami, Noriaki Horii\",\"doi\":\"10.1109/SLT.2016.7846318\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The fifth Dialog State Tracking Challenge (DSTC5) introduces a new cross-language dialog state tracking scenario, where the participants are asked to build their trackers based on the English training corpus, while evaluating them with the unlabeled Chinese corpus. Although the computer-generated translations for both English and Chinese corpus are provided in the dataset, these translations contain errors and careless use of them can easily hurt the performance of the built trackers. To address this problem, we propose a multichannel Convolutional Neural Networks (CNN) architecture, in which we treat English and Chinese language as different input channels of one single CNN model. In the evaluation of DSTC5, we found that such multichannel architecture can effectively improve the robustness against translation errors. Additionally, our method for DSTC5 is purely machine learning based and requires no prior knowledge about the target language. We consider this a desirable property for building a tracker in the cross-language context, as not every developer will be familiar with both languages.\",\"PeriodicalId\":281635,\"journal\":{\"name\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"34\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SLT.2016.7846318\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2016.7846318","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 34

摘要

第五次对话状态跟踪挑战(DSTC5)引入了一个新的跨语言对话状态跟踪场景,参与者被要求基于英语训练语料库构建自己的跟踪器,同时使用未标记的中文语料库对其进行评估。尽管数据集中提供了计算机生成的中英文语料库翻译,但这些翻译包含错误,并且不小心使用它们很容易损害构建的跟踪器的性能。为了解决这个问题,我们提出了一种多通道卷积神经网络(CNN)架构,其中我们将英语和中文作为一个CNN模型的不同输入通道。在对DSTC5的评估中,我们发现这种多通道架构可以有效地提高对翻译错误的鲁棒性。此外,我们的DSTC5方法纯粹是基于机器学习的,不需要关于目标语言的先验知识。我们认为这是在跨语言上下文中构建跟踪器的理想属性,因为不是每个开发人员都熟悉这两种语言。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
A multichannel convolutional neural network for cross-language dialog state tracking
The fifth Dialog State Tracking Challenge (DSTC5) introduces a new cross-language dialog state tracking scenario, where the participants are asked to build their trackers based on the English training corpus, while evaluating them with the unlabeled Chinese corpus. Although the computer-generated translations for both English and Chinese corpus are provided in the dataset, these translations contain errors and careless use of them can easily hurt the performance of the built trackers. To address this problem, we propose a multichannel Convolutional Neural Networks (CNN) architecture, in which we treat English and Chinese language as different input channels of one single CNN model. In the evaluation of DSTC5, we found that such multichannel architecture can effectively improve the robustness against translation errors. Additionally, our method for DSTC5 is purely machine learning based and requires no prior knowledge about the target language. We consider this a desirable property for building a tracker in the cross-language context, as not every developer will be familiar with both languages.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification Learning dialogue dynamics with the method of moments A study of speech distortion conditions in real scenarios for speech processing applications Comparing speaker independent and speaker adapted classification for word prominence detection Influence of corpus size and content on the perceptual quality of a unit selection MaryTTS voice
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1