Timo Strohmann, Dominik Siemon, Bijan Khosrawi-Rad, S. Robra-Bissantz
Toward a design theory for virtual companionship
Human-Computer Interaction, pp. 194-234, published 2022-07-18
DOI: 10.1080/07370024.2022.2084620 (https://doi.org/10.1080/07370024.2022.2084620)
Citations: 7
Abstract
Significant technological advances in the field of artificial intelligence (AI), driven by increased computing power, the ubiquitous availability of data, and new algorithms, have brought new forms of intelligent systems and services to the market (Choudhury et al., 2020; Clark et al., 2019a; Diederich et al., 2022; Kaplan & Haenlein, 2019; Ransbotham et al., 2018; Robert et al., 2020; Rzepka & Berger, 2018). In addition to virtual assistants such as Apple’s Siri or Amazon’s Alexa, companies increasingly develop chatbots and enterprise bots to interact with customers (Diederich et al., 2022; Maedche et al., 2016; McTear et al., 2016). What all these systems have in common is that they allow their users to interact with them in natural language, which is why they are subsumed under the term conversational agent (CA) (Diederich et al., 2022; McTear et al., 2016). CAs already have various use cases today, ranging from executing smartphone functions, such as creating calendar entries or sending messages, through smart home control, to interaction in the healthcare context (Ahmad et al., 2022; Elshan et al., 2022; Gnewuch et al., 2017; McTear et al., 2016; Sin & Munteanu, 2020). Thus, CAs currently offer a new way of interacting with information technology (Morana et al., 2017). Recent literature reviews show a growing interest in CAs and AI-enabled systems (Diederich et al., 2022; Elshan et al., 2022; Nißen et al., 2021; Rzepka & Berger, 2018), but only a limited variety of application contexts, mostly focused on short-term interactions in marketing, sales, and support. Application scenarios that require long-term interaction exist but are under-researched (Diederich et al., 2022; Elshan et al., 2022).
Additionally, current applications show that the main goal of CAs is to provide personal assistant functionality, while less attention goes to the actual interaction with the system, which should be improved by incorporating social behaviors (Elshan et al., 2022; Gnewuch et al., 2017; Krämer et al., 2011; Nißen et al., 2021; Rzepka & Berger, 2018). Most of these interactions are initiated by the user and not by the CA, which means that the CA acts reactively rather than proactively. Moreover, these interactions are isolated, transactional, and based on predefined paths, as if they were starting over every time (Seymour et al., 2018). Although, from a technological perspective, CAs can at present predominantly conduct restricted conversations on a specific topic (Diederich et al., 2022), modern language prediction models such as the Generative Pre-trained Transformer 3 (GPT-3) are able to fundamentally expand the capabilities of CAs. They achieve this by enabling open-topic and richer conversations with a strong interpersonal character (Brown et al., 2020). GPT-3 and many other recent language models are built on the Transformer (Vaswani et al., 2017), a neural network architecture invented by Google Research in 2017. Google’s recent language model LaMDA shows how human-like ways of interacting can be achieved by enabling open-topic conversations based on the modern
About the journal:
Human-Computer Interaction (HCI) is a multidisciplinary journal defining and reporting
on fundamental research in human-computer interaction. The goal of HCI is to be a journal
of the highest quality that combines the best research and design work to extend our
understanding of human-computer interaction. The target audience is the research
community with an interest in both the scientific implications and practical relevance of
how interactive computer systems should be designed and how they are actually used. HCI is
concerned with the theoretical, empirical, and methodological issues of interaction science
and system design as it affects the user.