
Latest publications from ACM Transactions on Interactive Intelligent Systems

Exploring the Effects of Self-correction Behavior of an Intelligent Virtual Character during a Jigsaw Puzzle Co-solving Task
IF 3.6 · CAS Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-10 · DOI: 10.1145/3688006
Minsoo Choi, Siqi Guo, Alexandros Koilias, Matias Volonte, Dominic Kao, Christos Mousas
Although researchers have explored how humans perceive the intelligence of virtual characters, few studies have focused on the ability of intelligent virtual characters to fix their mistakes. Thus, we explored the self-correction behavior of a virtual character with different intelligence capabilities in a within-group design study (N = 23). For this study, we developed a virtual character that can solve a jigsaw puzzle and whose self-correction behavior is controlled by two parameters, namely Intelligence and Accuracy of Self-correction. Then, we integrated the virtual character into our virtual reality experience and asked participants to co-solve a jigsaw puzzle. During the study, our participants were exposed to five experimental conditions resulting from combinations of the Intelligence and Accuracy of Self-correction parameters. In each condition, we asked our participants to respond to a survey examining their perceptions of the virtual character's intelligence and awareness (private, public, and surroundings awareness) and user experiences, including trust, enjoyment, performance, frustration, and desire for future interaction. We also collected application logs, including participants' dwell gaze data, completion times, and the number of puzzle pieces they placed to co-solve the jigsaw puzzle. All survey ratings and the completion time showed statistically significant differences across conditions. Our results indicated that higher levels of Intelligence and Accuracy of Self-correction enhanced not only our participants' perceptions of the virtual character's intelligence, awareness (private, public, and surroundings), trustworthiness, and performance but also increased their enjoyment and desire for future interaction with the virtual character while reducing their frustration and completion time. Moreover, we found that as the Intelligence and Accuracy of Self-correction increased, participants had to place fewer puzzle pieces and needed less time to complete the jigsaw puzzle. Lastly, regardless of the experimental condition to which we exposed our participants, they gazed at the virtual character for more time than at the puzzle pieces and puzzle goal in the virtual environment.
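To make the two behavioral parameters concrete, here is a minimal Python sketch of how a placement-and-correction step could be gated by them. The function and parameter names (`place_piece`, `intelligence`, `self_correction_accuracy`) are illustrative assumptions, not the authors' implementation.

```python
import random

def place_piece(correct_slot: int, num_slots: int,
                intelligence: float, self_correction_accuracy: float) -> int:
    """Simulate one piece placement by the virtual character.

    intelligence: probability of a correct placement on the first attempt.
    self_correction_accuracy: probability that a wrong placement is later fixed.
    Both names are hypothetical stand-ins for the paper's two parameters.
    """
    if random.random() < intelligence:
        return correct_slot                       # correct on the first try
    wrong = random.choice([s for s in range(num_slots) if s != correct_slot])
    if random.random() < self_correction_accuracy:
        return correct_slot                       # mistake noticed and corrected
    return wrong                                  # mistake left in place

# Example: a character that is often right and usually fixes its own errors.
final_slot = place_piece(correct_slot=3, num_slots=25,
                         intelligence=0.7, self_correction_accuracy=0.9)
```

Crossing two levels of each parameter (plus a control) would yield the paper's five experimental conditions.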
Citations: 0
Categorical and Continuous Features in Counterfactual Explanations of AI Systems
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-06-20 · DOI: 10.1145/3673907
Greta Warren, Ruth M.J. Byrne, Mark T. Keane

Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The developers of these algorithms claim they meet user requirements in generating counterfactual explanations with “plausible”, “actionable” or “causally important” features. However, few of these claims have been tested in controlled psychological studies. Hence, we know very little about which aspects of counterfactual explanations really help users understand the decisions of AI systems. Nor do we know whether counterfactual explanations are an advance on more traditional causal explanations that have a longer history in AI (e.g., in expert systems). Accordingly, we carried out three user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, where users’ responses were measured objectively (using predictive accuracy) and subjectively (using satisfaction and trust judgments). Study 1 (N = 127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy than no-explanation controls but elicited no more accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust than causal explanations. In Study 2 (N = 136) we transformed the continuous features of presented items to be categorical (i.e., binary) and found that these converted features led to highly accurate responding. Study 3 (N = 211) explicitly compared matched items involving either mixed features (i.e., a mix of categorical and continuous features) or categorical features (i.e., categorical and categorically-transformed continuous features), and found that users were more accurate when categorically-transformed features were used instead of continuous ones. It also replicated the dissociation between objective and subjective effects of explanations. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.
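As a rough illustration of the categorical-versus-continuous distinction, the sketch below generates a counterfactual for a toy drink-driving classifier by first trying a categorical flip and then shrinking a continuous feature. The feature names, the BAC formula, and the 0.05 threshold are all invented for illustration; this is not the study's app or model.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Case:
    ate_food: bool        # categorical (binary) feature
    weight_kg: float      # continuous feature
    drinks: int           # continuous (count) feature

def over_limit(c: Case) -> bool:
    """Toy stand-in for a blood-alcohol prediction (invented formula)."""
    bac = c.drinks * 1.1 / c.weight_kg - (0.015 if c.ate_food else 0.0)
    return bac >= 0.05

def counterfactual(c: Case) -> Optional[Case]:
    """Smallest change that flips an over-limit prediction: try the
    categorical flip first, then shrink the continuous drink count."""
    if not over_limit(c):
        return None                          # already under the limit
    flipped = replace(c, ate_food=True)
    if not over_limit(flipped):
        return flipped                       # the categorical flip suffices
    for fewer in range(c.drinks - 1, -1, -1):
        candidate = replace(c, drinks=fewer)
        if not over_limit(candidate):
            return candidate                 # shrink a continuous feature
    return None

print(counterfactual(Case(ate_food=False, weight_kg=70.0, drinks=4)))
# -> Case(ate_food=True, weight_kg=70.0, drinks=4): the categorical flip suffices.
```

A categorical counterfactual ("if you had eaten, you would be under the limit") is a single discrete flip, whereas a continuous one requires the user to reason about magnitudes, which may relate to the comprehension gap the studies report.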

Citations: 0
ID.8: Co-Creating Visual Stories with Generative AI
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-06-15 · DOI: 10.1145/3672277
Victor Nikhil Antony, Chien-Ming Huang

Storytelling is an integral part of human culture and significantly impacts cognitive and socio-emotional development and connection. Despite the importance of interactive visual storytelling, the process of creating such content requires specialized skills and is labor-intensive. This paper introduces ID.8, an open-source system designed for the co-creation of visual stories with generative AI. We focus on enabling an inclusive storytelling experience by simplifying the content creation process and allowing for customization. Our user evaluation confirms a generally positive user experience in domains such as enjoyment and exploration, while highlighting areas for improvement, particularly in immersiveness, alignment, and partnership between the user and the AI system. Overall, our findings indicate promising possibilities for empowering people to create visual stories with generative AI. This work contributes a novel content authoring system, ID.8, and insights into the challenges and potential of using generative AI for multimedia content creation.

Citations: 0
Visualization for Recommendation Explainability: A Survey and New Perspectives
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-06-11 · DOI: 10.1145/3672276
Mohamed Amine Chatti, Mouadh Guesmi, Arham Muslim

Providing system-generated explanations for recommendations represents an important step towards transparent and trustworthy recommender systems. Explainable recommender systems provide a human-understandable rationale for their outputs. Over the past two decades, explainable recommendation has attracted much attention in the recommender systems research community. This paper aims to provide a comprehensive review of research efforts on visual explanation in recommender systems. More concretely, we systematically review the literature on explanations in recommender systems based on four dimensions, namely explanation aim, explanation scope, explanation method, and explanation format. Recognizing the importance of visualization, we approach the recommender system literature from the angle of explanatory visualizations, that is, using visualizations as a display style of explanation. As a result, we derive a set of guidelines that might be constructive for designing explanatory visualizations in recommender systems and identify perspectives for future work in this field. The aim of this review is to help recommendation researchers and practitioners better understand the potential of visually explainable recommendation research and to support them in the systematic design of visual explanations in current and future recommender systems.

Citations: 0
Unpacking Human-AI interactions: From interaction primitives to a design space
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-06-08 · DOI: 10.1145/3664522
Konstantinos Tsiakas, Dave Murray-Rust

This paper aims to develop a semi-formal representation for Human-AI (HAI) interactions by building a set of interaction primitives that can specify the information exchanges between users and AI systems during their interaction. We show how these primitives can be combined into a set of interaction patterns which can capture common interactions between humans and AI/ML models. The motivation behind this is twofold: firstly, to provide a compact generalisation of existing practices for the design and implementation of HAI interactions; and secondly, to support the creation of new interactions by extending the design space of HAI interactions. Taking into consideration frameworks, guidelines, and taxonomies related to human-centered design and implementation of AI systems, we define a vocabulary for describing information exchanges based on the model's characteristics and interactional capabilities. Based on this vocabulary, a message passing model for interactions between humans and models is presented, which we demonstrate can account for existing HAI interaction systems and approaches. Finally, we build this into design patterns which can describe common interactions between users and models, and we discuss how this approach can be used towards a design space for HAI interactions that creates new possibilities for designs as well as keeping track of implementation issues and concerns.
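A minimal sketch of what a message-passing representation of such interaction primitives could look like in code; the primitive names and data structures below are our own illustrative choices, not the vocabulary defined in the paper.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Any, List

class Sender(Enum):
    HUMAN = auto()
    MODEL = auto()

class Primitive(Enum):
    """Illustrative information-exchange primitives (names are ours)."""
    PROVIDE_INPUT = auto()     # human supplies data to the model
    REQUEST_OUTPUT = auto()    # human asks for a prediction or generation
    RETURN_OUTPUT = auto()     # model responds with a result
    EXPLAIN = auto()           # model exposes rationale or confidence
    GIVE_FEEDBACK = auto()     # human corrects or rates the output

@dataclass
class Message:
    sender: Sender
    primitive: Primitive
    payload: Any

@dataclass
class Interaction:
    """An interaction pattern is simply an ordered exchange of messages."""
    log: List[Message] = field(default_factory=list)

    def send(self, sender: Sender, primitive: Primitive, payload: Any) -> None:
        self.log.append(Message(sender, primitive, payload))

# A minimal "predict-then-correct" pattern:
session = Interaction()
session.send(Sender.HUMAN, Primitive.PROVIDE_INPUT, {"image": "sample.jpg"})
session.send(Sender.MODEL, Primitive.RETURN_OUTPUT, {"label": "cat", "p": 0.62})
session.send(Sender.HUMAN, Primitive.GIVE_FEEDBACK, {"correct_label": "dog"})
```

Composing such message sequences into named patterns is one way a design space of HAI interactions could be enumerated and compared.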

Citations: 0
A Reasoning and Value Alignment Test to Assess Advanced GPT Reasoning
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 Computer Science · Pub Date: 2024-06-03 · DOI: 10.1145/3670691
Timothy R. McIntosh, Tong Liu, Teo Susnjak, Paul Watters, Malka N. Halgamuge

In response to diverse perspectives on Artificial General Intelligence (AGI), ranging from potential safety and ethical concerns to more extreme views about the threats it poses to humanity, this research presents a generic method to gauge the reasoning capabilities of Artificial Intelligence (AI) models as a foundational step in evaluating safety measures. Recognizing that AI reasoning measures cannot be wholly automated, due to factors such as cultural complexity, we conducted an extensive examination of five commercial Generative Pre-trained Transformers (GPTs), focusing on their comprehension and interpretation of culturally intricate contexts. Utilizing our novel “Reasoning and Value Alignment Test”, we assessed the GPT models’ ability to reason in complex situations and grasp local cultural subtleties. Our findings have indicated that, although the models have exhibited high levels of human-like reasoning, significant limitations remained, especially concerning the interpretation of cultural contexts. This paper also explored potential applications and use-cases of our Test, underlining its significance in AI training, ethics compliance, sensitivity auditing, and AI-driven cultural consultation. We concluded by emphasizing its broader implications in the AGI domain, highlighting the necessity for interdisciplinary approaches, wider accessibility to various GPT models, and a profound understanding of the interplay between GPT reasoning and cultural sensitivity.

Citations: 0
AutoRL X: Automated Reinforcement Learning on the Web
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 Computer Science · Pub Date: 2024-06-03 · DOI: 10.1145/3670692
Loraine Franke, Daniel Karl I. Weidele, Nima Dehmamy, Lipeng Ning, Daniel Haehn

Reinforcement Learning (RL) is crucial in decision optimization, but its inherent complexity often presents challenges in interpretation and communication. Building upon AutoDOViz, an interface that pushed the boundaries of Automated RL for Decision Optimization, this paper unveils an open-source expansion with a web-based platform for RL. Our work introduces a taxonomy of RL visualizations and launches a dynamic web platform, leveraging backend flexibility for AutoRL frameworks like ARLO, and Svelte.js for a smooth interactive user experience in the front end. Since AutoDOViz is not open-source, we present AutoRL X, a new interface designed to visualize RL processes. AutoRL X is shaped by the extensive user feedback and expert interviews from AutoDOViz studies, and it brings forth an intelligent interface with real-time, intuitive visualization capabilities that enhance understanding, collaborative efforts, and personalization of RL agents. Addressing the gap in accurately representing complex real-world challenges within standard RL environments, we demonstrate our tool's application in healthcare, explicitly optimizing brain stimulation trajectories. A user study contrasts the performance of human users optimizing electric fields via a 2D interface with RL agents' behavior that we visually analyze in AutoRL X, assessing the practicality of automated RL. All our data and code are openly available at: https://github.com/lorifranke/autorlx.

Citations: 0
Accelerating Scientific Paper Skimming with Augmented Intelligence Through Customizable Faceted Highlights
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 Computer Science · Pub Date: 2024-05-23 · DOI: 10.1145/3665648
Raymond Fok, Luca Soldaini, Cassidy Trier, Erin Bransom, Kelsey MacMillan, Evie (Yu-Yen) Cheng, Hita Kambhamettu, Jonathan Bragg, Kyle Lo, Marti A. Hearst, Andrew Head, Daniel S. Weld
Scholars need to keep up with an exponentially increasing flood of scientific papers. To aid this challenge, we introduce Scim, a novel intelligent interface that helps scholars skim papers to rapidly review and gain a cursory understanding of their contents. Scim supports the skimming process by highlighting salient content within a paper, directing a scholar's attention. These automatically-extracted highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by scholars. We evaluate Scim with an in-lab usability study and a longitudinal diary study, revealing how its highlights facilitate the more efficient construction of a conceptualization of a paper. Finally, we describe the process of scaling highlights from their conception within Scim, a research prototype, to production on over 521,000 papers within the Semantic Reader, a publicly-available augmented reading interface for scientific papers. We conclude by discussing design considerations and tensions for the design of future skimming tools with augmented intelligence.
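As a sketch of the selection problem such an interface faces, the function below keeps the top-scoring candidate highlights while spreading picks across position bins and respecting a density budget. The scoring, binning, and API are assumptions for illustration, not Scim's published pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A candidate highlight: (relative position in the paper [0,1], facet, salience).
Highlight = Tuple[float, str, float]

def select_highlights(candidates: List[Highlight],
                      density: float, n_bins: int = 10) -> List[Highlight]:
    """Keep roughly `density` of the candidates, favoring high salience
    while spreading picks across position bins. All details here are
    illustrative assumptions."""
    budget = max(1, int(len(candidates) * density))
    per_bin = max(1, budget // n_bins)
    bins: Dict[int, List[Highlight]] = defaultdict(list)
    for h in candidates:
        bins[min(int(h[0] * n_bins), n_bins - 1)].append(h)
    pooled: List[Highlight] = []
    for b in bins:                              # best few from every bin
        pooled.extend(sorted(bins[b], key=lambda h: -h[2])[:per_bin])
    best = sorted(pooled, key=lambda h: -h[2])[:budget]
    return sorted(best)                         # reading order for display

picks = select_highlights(
    [(0.05, "objective", 0.9), (0.12, "method", 0.4),
     (0.55, "result", 0.8), (0.91, "conclusion", 0.7)],
    density=0.5)                                # -> the objective and result highlights
```

Binning by position approximates the "evenly distributed" property, while the `density` parameter mirrors the scholar-configurable highlight density the abstract describes.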
Citations: 0
Reassuring, Misleading, Debunking: Comparing Effects of XAI Methods on Human Decisions
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 Computer Science · Pub Date: 2024-05-22 · DOI: 10.1145/3665647
Christina Humer, Andreas Hinterreiter, Benedikt Leichtmann, Martina Mara, Marc Streit
Trust calibration is essential in AI-assisted decision-making. If human users understand the rationale on which an AI model has made a prediction, they can decide whether they consider this prediction reasonable. Especially in high-risk tasks such as mushroom hunting (where a wrong decision may be fatal), it is important that users make correct choices to trust or overrule the AI. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and subsequently calibrating user trust. So far, however, it remains unclear which approaches are most effective. In this paper, the effects of XAI methods on human AI-assisted decision-making in the high-risk task of mushroom picking were tested. For that endeavor, the effects of (i) Grad-CAM attributions, (ii) nearest-neighbor examples, and (iii) network-dissection concepts were compared in a between-subjects experiment with N = 501 participants representing end-users of the system. In general, nearest-neighbor examples improved decision correctness the most. However, varying effects for different task items became apparent. All explanations seemed to be particularly effective when they revealed reasons to (i) doubt a specific AI classification when the AI was wrong and (ii) trust a specific AI classification when the AI was correct. Our results suggest that well-established methods, such as Grad-CAM attribution maps, might not be as beneficial to end users as expected and that XAI techniques for use in real-world scenarios must be chosen carefully.
Citations: 0
Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles
IF 3.4 · CAS Zone 4 (Computer Science) · Q2 Computer Science · Pub Date: 2024-05-08 · DOI: 10.1145/3663740
Andrew Anderson, Jimena Noa Guevara, Fatima Moussaoui, Tianyi Li, Mihaela Vorvoreanu, Margaret Burnett

Motivations: Recent research has emerged on how to improve AI products' Human-AI Interaction (HAI) User Experience (UX) in general, but relatively little is known about HAI-UX inclusivity. For example, what kinds of users are supported, and who are left out? What product changes would make a product more inclusive?

Objectives: To help fill this gap, we present an approach to measuring what kinds of diverse users an AI product leaves out and how to act upon that knowledge. To bring actionability to the results, the approach focuses on users’ problem-solving diversity. Thus, our specific objectives were: (1) to show how the measure can reveal which participants with diverse problem-solving styles were left behind in a set of AI products; and (2) to relate participants’ problem-solving diversity to their demographic diversity, specifically gender and age.

Methods: We performed 18 experiments, discarding two that failed manipulation checks. Each experiment was a 2x2 factorial experiment with online participants, comparing two AI products: one deliberately violating one of 18 HAI guidelines and the other applying the same guideline. For our first objective, we used our measure to analyze how much each AI product gained/lost HAI-UX inclusivity compared to its counterpart, where inclusivity meant supportiveness to participants with particular problem-solving styles. For our second objective, we analyzed how participants' problem-solving styles aligned with their gender identities and ages.

Results & Implications: Participants' diverse problem-solving styles revealed six types of inclusivity results: (1) the AI products that followed an HAI guideline were almost always more inclusive across diversity of problem-solving styles than the products that did not follow that guideline, but "who" got most of the inclusivity varied widely by guideline and by problem-solving style; (2) when an AI product had risk implications, four variables' values varied in tandem: participants' feelings of control, their (lack of) suspicion, their trust in the product, and their certainty while using the product; (3) the more control an AI product offered users, the more inclusive it was; (4) whether an AI product was learning from "my" data or other people's affected how inclusive that product was; (5) participants' problem-solving styles skewed differently by gender and age group; and (6) almost all of the results suggested actions that HAI practitioners could take to improve their products' inclusivity further. Together, these results suggest that the demographic inclusivity of an AI product (e.g., across a wide range of genders, ages, etc.) can often be improved by improving the product's support of diverse problem-solving styles.
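A toy version of such an inclusivity comparison might, for each problem-solving style, contrast mean supportiveness ratings between the guideline-following and guideline-violating product variants. The scoring below is a simplified stand-in, not the paper's validated measure, and the style names and ratings are invented.

```python
from statistics import mean
from typing import Dict, List

def inclusivity_gap(ratings: Dict[str, Dict[str, List[float]]]) -> Dict[str, float]:
    """For each problem-solving style, mean supportiveness of the
    guideline-following variant minus the guideline-violating one.
    Positive values mean the guideline helped that style.
    A simplified stand-in for the paper's measure."""
    return {style: mean(r["follows"]) - mean(r["violates"])
            for style, r in ratings.items()}

gaps = inclusivity_gap({
    "risk-averse": {"follows": [4.5, 4.0], "violates": [2.0, 2.5]},
    "tinkering":   {"follows": [4.0, 3.5], "violates": [3.8, 4.1]},
})
# gaps ~= {'risk-averse': 2.0, 'tinkering': -0.2}: in this invented data,
# the guideline mainly helped risk-averse participants.
```

Reporting the gap per style, rather than one pooled average, is what makes visible "who" gains or loses inclusivity when a guideline is applied.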

Citations: 0