Measuring perceived empathy in dialogue systems

AI & Society; IF 4.7; JCR Q2 (Computer Science, Artificial Intelligence); Pub Date: 2023-07-23; DOI: 10.1007/s00146-023-01715-z

Shauna Concannon, Marcus Tomalin

AI & Society, 39(5), pp. 2233–2247. Article: https://link.springer.com/article/10.1007/s00146-023-01715-z (PDF: https://link.springer.com/content/pdf/10.1007/s00146-023-01715-z.pdf)


Abstract

Dialogue systems (DSs), from Virtual Personal Assistants such as Siri, Cortana, and Alexa to state-of-the-art systems such as BlenderBot3 and ChatGPT, are already widely available, are used in a variety of applications, and are increasingly part of many people’s lives. However, the task of enabling them to use empathetic language more convincingly is still an emerging research topic. Such systems generally make use of complex neural networks to learn the patterns of typical human language use, and the interactions in which the systems participate are usually mediated via either text-based or speech-based interfaces. In human–human interaction, empathy has been shown to promote prosocial behaviour and improve interaction. In the context of dialogue systems, to advance the understanding of how perceptions of empathy affect interactions, it is necessary to bring greater clarity to how empathy is measured and assessed. Assessing the way dialogue systems create perceptions of empathy brings together a range of technological, psychological, and ethical considerations that merit greater scrutiny than they have received so far. However, there is currently no widely accepted evaluation method for determining the degree of empathy that any given system possesses (or, at least, appears to possess). At present, different research teams use a variety of automated metrics, alongside different forms of subjective human assessment such as questionnaires, self-assessment measures and narrative engagement scales. This diversity of evaluation practice means that, given two DSs, it is usually impossible to determine which of them conveys the greater degree of empathy in its dialogic exchanges with human users. Acknowledging this problem, the present article provides an overview of how empathy is measured in human–human interactions and considers some of the ways it is currently measured in human–DS interactions. Finally, it introduces a novel third-person analytical framework, called the Empathy Scale for Human–Computer Communication (ESHCC), to support greater uniformity in how perceived empathy is measured during interactions with state-of-the-art DSs.


Source journal

AI & Society (Computer Science, Artificial Intelligence)

CiteScore: 8.00

Self-citation rate: 20.00%

Articles published: 257

About the journal: AI & Society: Knowledge, Culture and Communication is an international journal publishing refereed scholarly articles, position papers, debates, short communications, and reviews of books and other publications. Established in 1987, the Journal focuses on societal issues including the design, use, management, and policy of information, communications and new media technologies, with a particular emphasis on cultural, social, cognitive, economic, ethical, and philosophical implications. AI & Society has a broad scope and is strongly interdisciplinary. We welcome contributions and participation from researchers and practitioners in a variety of fields including information technologies, humanities, social sciences, arts and sciences. This includes broader societal and cultural impacts, for example on governance, security, sustainability, identity, inclusion, working life, corporate and community welfare, and well-being of people. Co-authored articles from diverse disciplines are encouraged. AI & Society seeks to promote an understanding of the potential, transformative impacts and critical consequences of pervasive technology for societies. Technological innovations, including new sciences such as biotech, nanotech and neuroscience, offer great potential for societies, but also pose existential risk. Rooted in the human-centred tradition of science and technology, the Journal acts as a catalyst, promoter and facilitator of engagement with diversity of voices and over-the-horizon issues of arts, science, technology and society. AI & Society expects that, in keeping with the ethos of the journal, submissions should provide a substantial and explicit argument on the societal dimension of research, particularly the benefits, impacts and implications for society. This may include factors such as trust, biases, privacy, reliability, responsibility, and competence of AI systems.
Such arguments should be validated by critical comment on current research in this area. Curmudgeon Corner will retain its opinionated ethos. The journal is in three parts: a) full length scholarly articles; b) strategic ideas, critical reviews and reflections; c) Student Forum is for emerging researchers and new voices to communicate their ongoing research to the wider academic community, mentored by the Journal Advisory Board; Book Reviews and News; Curmudgeon Corner for the opinionated. Papers in the Original Section may include original papers, which are underpinned by theoretical, methodological, conceptual or philosophical foundations. The Open Forum Section may include strategic ideas, critical reviews and potential implications for society of current research. Network Research Section papers make substantial contributions to theoretical and methodological foundations within societal domains. These will be multi-authored papers that include a summary of the contribution of each author to the paper. Original, Open Forum and Network papers are peer reviewed. The Student Forum Section may include theoretical, methodological, and application orientations of ongoing research including case studies, as well as contextual action research experiences. Papers in this section are normally single-authored and are also formally reviewed. Curmudgeon Corner is a short opinionated column on trends in technology, arts, science and society, commenting emphatically on issues of concern to the research community and wider society. Normal word length: Original and Network Articles 10k, Open Forum 8k, Student Forum 6k, Curmudgeon 1k. Exceptions to the co-author limits for Original and Open Forum (4), Network (10), Student (3) and Curmudgeon (2) articles will be considered for special contributions. Please do not send your submissions by email but use the "Submit manuscript" button.
NOTE TO AUTHORS: The Journal expects its authors to include, in their submissions: a) An acknowledgement of the pre-accept/pre-publication versions of their manuscripts on non-commercial and academic sites. b) Images: obtain permissions from the copyright holder/original sources. c) Formal permission from their ethics committees when conducting studies with people.
