Intent detection using semantically enriched word embeddings

Joo-Kyung Kim, Gökhan Tür, Asli Celikyilmaz, Bin Cao, Ye-Yi Wang
Published in: 2016 IEEE Spoken Language Technology Workshop (SLT), 2016-12-01
DOI: 10.1109/SLT.2016.7846297 (https://doi.org/10.1109/SLT.2016.7846297)
Citations: 87

Abstract

State-of-the-art targeted language understanding systems rely on deep learning methods that use one-hot word vectors or off-the-shelf word embeddings. Word embeddings can be enriched with information from semantic lexicons (such as WordNet and PPDB) to improve their semantic representation, but most previous research on enriching word embeddings has focused on intrinsic word-level tasks such as word analogy and antonym detection. In this work, we enrich word embeddings so that semantically similar words are pulled closer together and semantically dissimilar words are pushed farther apart in the embedding space, with the goal of improving an extrinsic task, namely intent detection for spoken language understanding. We use several semantic lexicons, such as WordNet, PPDB, and Macmillan Dictionary, to enrich the word embeddings, and then use the enriched embeddings as the initial word representations for intent detection. Thus, we enrich the embeddings outside the neural network rather than learning them within the network, and we build a bidirectional LSTM on top of the embeddings for intent detection. Our experiments on ATIS and on a real log dataset from Microsoft Cortana show that word embeddings enriched with semantic lexicons can improve intent detection.
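The abstract does not spell out the enrichment objective, so the following is only a minimal sketch of the general idea of pulling similar words together and pushing dissimilar words apart before the embeddings are handed to a classifier. The update rule, the function names `enrich` and `cosine`, the parameters `lr`, `steps`, and `keep`, and the three hand-made vectors are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def _unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def enrich(emb, similar, dissimilar, lr=0.1, steps=20, keep=0.5):
    """Toy attract/repel enrichment (an assumption, not the paper's objective).

    emb        : dict mapping word -> 1-D numpy array
    similar    : list of (word, word) pairs to pull together
    dissimilar : list of (word, word) pairs to push apart
    keep       : weight of the pull back toward each original vector
    """
    start = {w: _unit(np.asarray(v, dtype=float)) for w, v in emb.items()}
    vecs = dict(start)
    for _ in range(steps):
        # Accumulate every update direction before moving any vector.
        grads = {w: keep * start[w] for w in vecs}
        for a, b in similar:
            grads[a] = grads[a] + vecs[b]
            grads[b] = grads[b] + vecs[a]
        for a, b in dissimilar:
            grads[a] = grads[a] - vecs[b]
            grads[b] = grads[b] - vecs[a]
        # Take a small step, then renormalize to unit length.
        vecs = {w: _unit(vecs[w] + lr * grads[w]) for w in vecs}
    return vecs


# Hypothetical 3-d vectors, chosen by hand for illustration only.
toy = {
    "cheap": np.array([1.0, 0.0, 0.0]),
    "inexpensive": np.array([0.0, 1.0, 0.0]),
    "expensive": np.array([1.0, 0.0, 1.0]),
}
enriched = enrich(
    toy,
    similar=[("cheap", "inexpensive")],
    dissimilar=[("cheap", "expensive")],
)

for a, b in [("cheap", "inexpensive"), ("cheap", "expensive")]:
    before = cosine(toy[a], toy[b])
    after = cosine(enriched[a], enriched[b])
    print(f"{a} / {b}: {before:.3f} -> {after:.3f}")
```

In a pipeline like the one the abstract describes, vectors produced this way would be computed once from the lexicon pairs and then used to initialize the embedding layer of the bidirectional LSTM intent classifier. The `keep` term is a design choice in this sketch that stops words from drifting arbitrarily far from their pretrained positions.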