Persian Language Understanding in Task-Oriented Dialogue System for Online Shopping

Zeinab Borhanifard, Hossein Basafa, S. Z. Razavi, Heshaam Faili
Published in: 2020 11th International Conference on Information and Knowledge Technology (IKT)
Publication date: 2020-12-22
DOI: 10.1109/IKT51791.2020.9345639 (https://doi.org/10.1109/IKT51791.2020.9345639)
Citation count: 2

Abstract

Natural language understanding is a critical module in task-oriented dialogue systems. Recent state-of-the-art approaches use deep learning methods and transformers to improve the performance of dialogue systems. In this work, we propose a natural language understanding model with a shopping-specific named entity recognizer, built on a BERT transformer with joint learning, for task-oriented dialogue systems in the Persian language. Since no published dataset is available for Persian online shopping dialogue systems, we propose two methods for generating training data: a fully simulated method and a semi-simulated method. We created a simulated dataset with a hybrid of rule-based and template-based generation methods, and a semi-simulated dataset in which the language generation part is done by a human to increase the quality of the dataset. Our experiments with the natural language understanding module show that combining the two datasets can improve results. These dataset generation methods can also be applied in other domains for low-resource languages in task-oriented dialogue systems, to solve the cold-start problem of datasets.
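To make the template-based part of the fully simulated method more concrete, the following is a minimal sketch of how labeled training utterances could be produced by filling slot placeholders in templates. It is not the authors' implementation: the intents, templates, slot names, and slot values are all hypothetical, and English strings are used only for readability (the paper targets Persian). Each generated sample carries an intent label and per-token BIO slot tags, which is the kind of supervision a joint intent and slot model needs.

```python
import random

# Hypothetical templates (not from the paper): "{slot}" marks a slot placeholder.
TEMPLATES = {
    "search_product": [
        "I am looking for a {color} {product}",
        "do you have a {product} from {brand}",
    ],
    "ask_price": ["how much is the {brand} {product}"],
}

# Hypothetical slot vocabularies; multi-word values exercise the I- tags.
SLOT_VALUES = {
    "product": ["phone", "laptop bag"],
    "color": ["black", "red"],
    "brand": ["Samsung", "Apple"],
}


def fill_template(template, rng):
    """Fill one template and return (tokens, BIO tags) of equal length."""
    tokens, tags = [], []
    for word in template.split():
        if word.startswith("{") and word.endswith("}"):
            slot = word[1:-1]
            value_tokens = rng.choice(SLOT_VALUES[slot]).split()
            tokens.extend(value_tokens)
            # First token of the value gets B-, the remaining tokens get I-.
            tags.append("B-" + slot)
            tags.extend(["I-" + slot] * (len(value_tokens) - 1))
        else:
            tokens.append(word)
            tags.append("O")
    return tokens, tags


def generate(n, seed=0):
    """Generate n labeled samples; the seed makes the output reproducible."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        intent = rng.choice(sorted(TEMPLATES))
        template = rng.choice(TEMPLATES[intent])
        tokens, tags = fill_template(template, rng)
        samples.append({"intent": intent, "tokens": tokens, "tags": tags})
    return samples


if __name__ == "__main__":
    for sample in generate(3):
        print(sample["intent"], list(zip(sample["tokens"], sample["tags"])))
```

In a semi-simulated setting as described in the abstract, the intent and slot values could still be sampled this way while a human writes the surface utterance instead of a template.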