Democratizing Chatbot Debugging: A Computational Framework for Evaluating and Explaining Inappropriate Chatbot Responses

Xu Han, Michelle X. Zhou, Yichen Wang, Wenxi Chen, Tom Yeh
Proceedings of the 5th International Conference on Conversational User Interfaces. Published 2023-06-16. DOI: 10.1145/3571884.3604308 (https://doi.org/10.1145/3571884.3604308)

Abstract

Evaluating and understanding the inappropriateness of chatbot behaviors can be challenging, particularly for chatbot designers without technical backgrounds. To democratize the debugging process of chatbot misbehaviors for non-technical designers, we propose a framework that leverages dialogue act (DA) modeling to automate the evaluation and explanation of chatbot response inappropriateness. The framework first produces characterizations of context-aware DAs based on discourse analysis theory and real-world human-chatbot transcripts. It then automatically extracts features to identify the appropriateness level of a response and can explain the causes of the inappropriate response by examining the DA mismatch between the response and its conversational context. Using interview chatbots as a testbed, our framework achieves comparable classification accuracy with higher explainability and fewer computational resources than the deep learning baseline, making it the first step in utilizing DAs for chatbot response appropriateness evaluation and explanation.
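
The idea of explaining an inappropriate response through a DA mismatch with its conversational context can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the DA labels, the keyword-based tagger `tag_dialogue_act`, the `EXPECTED_RESPONSE_DAS` table, and the `evaluate_response` function are hypothetical and are not taken from the paper, whose actual DA characterizations and features are not given in the abstract.

```python
from dataclasses import dataclass

# Hypothetical toy tagger: the labels and surface cues below are illustrative
# assumptions, not the paper's DA taxonomy or feature set.
def tag_dialogue_act(utterance: str) -> str:
    text = utterance.strip().lower()
    if text.endswith("?"):
        return "question"
    if text.startswith(("thanks", "thank you")):
        return "thanking"
    if text.startswith(("sorry", "i apologize")):
        return "apology"
    return "statement"

# Hypothetical table: which response DAs are assumed to fit each context DA.
EXPECTED_RESPONSE_DAS = {
    "question": {"statement", "apology"},
    "statement": {"question", "statement", "thanking"},
    "thanking": {"statement"},
    "apology": {"statement"},
}

@dataclass
class Evaluation:
    appropriate: bool
    explanation: str

def evaluate_response(user_turn: str, bot_response: str) -> Evaluation:
    """Compare the response's DA with the DAs expected after the user's turn."""
    context_da = tag_dialogue_act(user_turn)
    response_da = tag_dialogue_act(bot_response)
    expected = EXPECTED_RESPONSE_DAS[context_da]
    if response_da in expected:
        return Evaluation(
            True,
            f"Response act '{response_da}' fits context act '{context_da}'.",
        )
    return Evaluation(
        False,
        f"Response act '{response_da}' does not fit context act "
        f"'{context_da}'; expected one of {sorted(expected)}.",
    )

if __name__ == "__main__":
    # The user asks a question and the bot replies with another question.
    result = evaluate_response(
        "What is this interview for?", "What is your favorite hobby?"
    )
    print(result.appropriate, result.explanation)
```

In this sketch the explanation is simply the mismatched pair of DA labels, which is the kind of output a non-technical designer could read directly; a real system would presumably replace the keyword tagger with a trained DA classifier.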