Survey of Available Datasets for Designing Task Oriented Dialogue Agents

Remote Sensing Information (遥感信息) Pub Date: 2019-12-20 DOI: 10.1109/ICMRSISIIT46373.2020.9405898

Manisha Thakkar, N. Pise


Citations: 0

Abstract

Dialogue systems have grown increasingly popular with recent advances in neural approaches and NLP applied to conversational AI. Many users rely on Alexa, Siri, Cortana, and Google Mini to perform small tasks and control home appliances hands-free. Enterprises are also deploying 24 × 7 dialogue agents in place of traditional customer support to increase user engagement and improve their processes. Dialogue systems are also integrated with robots to improve human-robot dialogue.

Conversational agents are classified into two main types: social bots (chit-chat bots) and task-oriented dialogue agents. Social bots aim to engage users in unstructured, human-like conversation. They have no fixed goal to complete and focus instead on carrying out open-domain conversations; examples include ALIZA and Microsoft XiaoIce. Task-oriented dialogue agents, on the other hand, help users accomplish specific tasks in particular domains such as restaurant booking, flight reservation, and customer support. They are widely used to control home appliances and carry out simple everyday tasks. Siri, Alexa, Google Mini, and Cortana are task-oriented dialogue agents. There is increasing interest in building task-completion dialogue agents that span multiple sub-domains to accomplish a complex user goal.

With the growing acceptance of dialogue agents, high-quality, large-scale dialogue datasets are needed for task-oriented dialogue agents to perform well in changing environments. Neural approaches, which are frequently applied to design intelligent dialogue agents, require very large datasets. However, building intelligent task-completion dialogue systems poses the following challenges. First, many datasets are available for chit-chat bots, but they are not directly relevant to task-oriented systems. Second, the system must scale to new domains with limited in-domain data.

In this paper, we study different data collection methods, the important characteristics of dialogue datasets, and their potential uses. We present a survey of publicly available datasets and their applicability to designing modern task-oriented dialogue agents.




Source journal: Remote Sensing Information (遥感信息)

Self-citation rate: 0.00%

Number of articles published: 3984

About the journal: Remote Sensing Information is a bimonthly academic journal supervised by the Ministry of Natural Resources of the People's Republic of China and sponsored by the China Academy of Surveying and Mapping Science. Since its inception in 1986, it has been one of the authoritative journals in the field of remote sensing in China. In 2014, it was recognised as one of the first batch of national academic journals, and it has been named a Core Journal of the China Science Citation Database, a Chinese Core Journal, and a Core Journal of Science and Technology of China. The journal won the Excellence Award (First Prize) of the National Excellent Surveying, Mapping and Geographic Information Journal Award in both 2011 and 2017. Remote Sensing Information is dedicated to reporting cutting-edge theoretical and applied results in remote sensing science and technology, promoting academic exchange at home and abroad, and advancing the application of remote sensing and the development of the related industry. The journal adheres to the principles of openness, fairness, and professionalism, follows an anonymous peer-review system, and enjoys good social credibility. Its main columns include Review, Theoretical Research, Innovative Applications, Special Reports, International News, Famous Experts' Forum, and Geographic National Condition Monitoring, covering fields such as surveying and mapping, forestry, agriculture, geology, meteorology, oceanography, the environment, and national defence. Remote Sensing Information aims to provide a high-level academic exchange platform for remote sensing experts and scholars at home and abroad, to enhance academic influence, and to support the protection of natural resources, green technology innovation, and the construction of ecological civilisation.