Generating Predictable and Adaptive Dialog Policies in Single- and Multi-domain Goal-oriented Dialog Systems

Nhat X. T. Le, A.B. Siddique, Fuad Jamour, Samet Oymak, Vagelis Hristidis
{"title":"Generating Predictable and Adaptive Dialog Policies in Single- and Multi-domain Goal-oriented Dialog Systems","authors":"Nhat X. T. Le, A.B. Siddique, Fuad Jamour, Samet Oymak, Vagelis Hristidis","doi":"10.1142/s1793351x21400109","DOIUrl":null,"url":null,"abstract":"Most existing commercial goal-oriented chatbots are diagram-based; i.e. they follow a rigid dialog flow to fill the slot values needed to achieve a user’s goal. Diagram-based chatbots are predictable, thus their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, may violate the intended business constraints, and require large training datasets to produce a mature policy. We propose a framework that achieves a middle ground between the diagram-based and RL-based chatbots: we constrain the space of possible chatbot responses using a novel structure, the chatbot dependency graph, and use RL to dynamically select the best valid responses. Dependency graphs are directed graphs that conveniently express a chatbot’s logic by defining the dependencies among slots: all valid dialog flows are encapsulated in one dependency graph. Our experiments in both single-domain and multi-domain settings show that our framework quickly adapts to user characteristics and achieves up to 23.77% improved success rate compared to a state-of-the-art RL model.","PeriodicalId":217956,"journal":{"name":"Int. J. Semantic Comput.","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Semantic Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s1793351x21400109","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Most existing commercial goal-oriented chatbots are diagram-based; i.e. they follow a rigid dialog flow to fill the slot values needed to achieve a user’s goal. Diagram-based chatbots are predictable, thus their adoption in commercial settings; however, their lack of flexibility may cause many users to leave the conversation before achieving their goal. On the other hand, state-of-the-art research chatbots use Reinforcement Learning (RL) to generate flexible dialog policies. However, such chatbots can be unpredictable, may violate the intended business constraints, and require large training datasets to produce a mature policy. We propose a framework that achieves a middle ground between the diagram-based and RL-based chatbots: we constrain the space of possible chatbot responses using a novel structure, the chatbot dependency graph, and use RL to dynamically select the best valid responses. Dependency graphs are directed graphs that conveniently express a chatbot’s logic by defining the dependencies among slots: all valid dialog flows are encapsulated in one dependency graph. Our experiments in both single-domain and multi-domain settings show that our framework quickly adapts to user characteristics and achieves up to 23.77% improved success rate compared to a state-of-the-art RL model.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在单域和多域目标导向对话系统中生成可预测和自适应的对话策略
大多数现有的商业目标导向聊天机器人都是基于图表的;也就是说,它们遵循严格的对话流程来填充实现用户目标所需的槽值。基于图表的聊天机器人是可预测的,因此它们在商业环境中的采用;然而,它们缺乏灵活性可能会导致许多用户在实现目标之前离开对话。另一方面,最先进的研究聊天机器人使用强化学习(RL)来生成灵活的对话策略。然而,这种聊天机器人可能是不可预测的,可能违反预期的业务约束,并且需要大量的训练数据集来生成成熟的策略。我们提出了一个介于基于图的聊天机器人和基于强化学习的聊天机器人之间的框架:我们使用一种新的结构,即聊天机器人依赖图来约束可能的聊天机器人响应的空间,并使用强化学习来动态选择最佳有效响应。依赖图是有向图,它通过定义插槽之间的依赖关系来方便地表达聊天机器人的逻辑:所有有效的对话流都封装在一个依赖图中。我们在单域和多域设置下的实验表明,我们的框架可以快速适应用户特征,与最先进的强化学习模型相比,成功率提高了23.77%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Guest Editorial - Special Issue on IEEE AIKE 2022 TemporalDedup: Domain-Independent Deduplication of Redundant and Errant Temporal Data Knowledge Graph-Based Explainable Artificial Intelligence for Business Process Analysis Knowledge Graph-Based Integration of Autonomous Driving Datasets Confidence-Based Cheat Detection Through Constrained Order Inference of Temporal Sequences
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1