ACM Transactions on Interactive Intelligent Systems最新文献

Categorical and Continuous Features in Counterfactual Explanations of AI Systems 人工智能系统反事实解释中的分类特征和连续特征

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-20 DOI: 10.1145/3673907

Greta Warren, Ruth M.J. Byrne, Mark T. Keane

Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The developers of these algorithms claim they meet user requirements in generating counterfactual explanations with “plausible”, “actionable” or “causally important” features. However, few of these claims have been tested in controlled psychological studies. Hence, we know very little about which aspects of counterfactual explanations really help users understand the decisions of AI systems. Nor do we know whether counterfactual explanations are an advance on more traditional causal explanations that have a longer history in AI (e.g., in expert systems). Accordingly, we carried out three user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, where users’ responses were measured objectively (using predictive accuracy) and subjectively (using satisfaction and trust judgments). Study 1 (N = 127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy than no-explanation controls but elicited no more accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust than causal explanations. In Study 2 (N = 136) we transformed the continuous features of presented items to be categorical (i.e., binary) and found that these converted features led to highly accurate responding. Study 3 (N = 211) explicitly compared matched items involving either mixed features (i.e., a mix of categorical and continuous features) or categorical features (i.e., categorical and categorically-transformed continuous features), and found that users were more accurate when categorically-transformed features were used instead of continuous ones. It also replicated the dissociation between objective and subjective effects of explanations. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.

最近，eXplainable AI（XAI）的研究重点是使用反事实解释来解决人工智能系统决策中的可解释性、算法追索权和偏差问题。这些算法的开发者声称，他们在生成具有 "可信"、"可操作 "或 "因果关系重要 "特征的反事实解释时满足了用户需求。然而，这些说法很少经过受控心理学研究的检验。因此，我们对反事实解释的哪些方面真正有助于用户理解人工智能系统的决策知之甚少。我们也不知道反事实解释是否比人工智能领域（如专家系统）历史更悠久的传统因果解释更先进。因此，我们进行了三项用户研究，以(i) 检验分类特征和连续特征在特征类型上的基本区别，(ii) 比较反事实解释和因果解释的相对有效性。这些研究使用了一个模拟的自动决策应用程序，该应用程序根据预测的血液酒精含量确定饮酒后的安全驾驶限制，并对用户的反应进行了客观测量（使用预测准确性）和主观测量（使用满意度和信任度判断）。研究 1（N=127）表明，与连续特征的解释相比，用户更容易理解涉及分类特征的解释。研究还发现了客观和主观测量之间的差异：反事实解释比无解释对照组的准确率更高，但准确率并不比因果解释高，但反事实解释比因果解释更能引起满意度和信任度。在研究 2（N = 136）中，我们将呈现项目的连续特征转换成了分类特征（即二元特征），结果发现这些转换后的特征导致了高度准确的反应。研究 3（N = 211）明确比较了涉及混合特征（即分类特征和连续特征的混合）或分类特征（即分类特征和经分类转换的连续特征）的匹配项目，结果发现，当使用经分类转换的特征而不是连续特征时，用户的回答更准确。研究还重复了解释的客观效果和主观效果之间的分离。这些发现为当前和未来 XAI 中的反事实解释方法划定了重要的边界条件。

{"title":"Categorical and Continuous Features in Counterfactual Explanations of AI Systems","authors":"Greta Warren, Ruth M.J. Byrne, Mark T. Keane","doi":"10.1145/3673907","DOIUrl":"https://doi.org/10.1145/3673907","url":null,"abstract":"Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The developers of these algorithms claim they meet user requirements in generating counterfactual explanations with “plausible”, “actionable” or “causally important” features. However, few of these claims have been tested in controlled psychological studies. Hence, we know very little about which aspects of counterfactual explanations really help users understand the decisions of AI systems. Nor do we know whether counterfactual explanations are an advance on more traditional causal explanations that have a longer history in AI (e.g., in expert systems). Accordingly, we carried out three user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, where users’ responses were measured objectively (using predictive accuracy) and subjectively (using satisfaction and trust judgments). Study 1 (N = 127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy than no-explanation controls but elicited no more accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust than causal explanations. In Study 2 (N = 136) we transformed the continuous features of presented items to be categorical (i.e., binary) and found that these converted features led to highly accurate responding. Study 3 (N = 211) explicitly compared matched items involving either mixed features (i.e., a mix of categorical and continuous features) or categorical features (i.e., categorical and categorically-transformed continuous features), and found that users were more accurate when categorically-transformed features were used instead of continuous ones. It also replicated the dissociation between objective and subjective effects of explanations. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"11 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

ID.8: Co-Creating Visual Stories with Generative AI ID.8：利用生成式人工智能共同创作视觉故事

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-15 DOI: 10.1145/3672277

Victor Nikhil Antony, Chien-Ming Huang

Storytelling is an integral part of human culture and significantly impacts cognitive and socio-emotional development and connection. Despite the importance of interactive visual storytelling, the process of creating such content requires specialized skills and is labor-intensive. This paper introduces ID.8, an open-source system designed for the co-creation of visual stories with generative AI. We focus on enabling an inclusive storytelling experience by simplifying the content creation process and allowing for customization. Our user evaluation confirms a generally positive user experience in domains such as enjoyment and exploration, while highlighting areas for improvement, particularly in immersiveness, alignment, and partnership between the user and the AI system. Overall, our findings indicate promising possibilities for empowering people to create visual stories with generative AI. This work contributes a novel content authoring system, ID.8, and insights into the challenges and potential of using generative AI for multimedia content creation.

讲故事是人类文化不可分割的一部分，对认知和社会情感的发展和联系有着重要影响。尽管交互式视觉故事非常重要，但创建此类内容的过程需要专业技能，而且是劳动密集型的。本文介绍 ID.8，这是一个开源系统，旨在利用生成式人工智能共同创作视觉故事。我们的重点是通过简化内容创建流程和允许定制来实现包容性的故事体验。我们的用户评估证实，在欣赏和探索等领域，用户体验普遍良好，同时也指出了需要改进的地方，特别是在沉浸感、一致性以及用户与人工智能系统之间的合作关系方面。总之，我们的研究结果表明，利用生成式人工智能增强人们创作视觉故事的能力大有可为。这项研究成果为我们提供了一个新颖的内容创作系统 ID.8，并深入探讨了在多媒体内容创作中使用生成式人工智能所面临的挑战和潜力。

引用次数: 0

Visualization for Recommendation Explainability: A Survey and New Perspectives 可视化推荐的可解释性：调查与新视角

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-11 DOI: 10.1145/3672276

Mohamed Amine Chatti, Mouadh Guesmi, Arham Muslim

Providing system-generated explanations for recommendations represents an important step towards transparent and trustworthy recommender systems. Explainable recommender systems provide a human-understandable rationale for their outputs. Over the past two decades, explainable recommendation has attracted much attention in the recommender systems research community. This paper aims to provide a comprehensive review of research efforts on visual explanation in recommender systems. More concretely, we systematically review the literature on explanations in recommender systems based on four dimensions, namely explanation aim, explanation scope, explanation method, and explanation format. Recognizing the importance of visualization, we approach the recommender system literature from the angle of explanatory visualizations, that is using visualizations as a display style of explanation. As a result, we derive a set of guidelines that might be constructive for designing explanatory visualizations in recommender systems and identify perspectives for future work in this field. The aim of this review is to help recommendation researchers and practitioners better understand the potential of visually explainable recommendation research and to support them in the systematic design of visual explanations in current and future recommender systems.

为推荐提供由系统生成的解释，是实现透明、可信的推荐系统的重要一步。可解释的推荐系统为其输出提供了人类可以理解的理由。过去二十年来，可解释推荐在推荐系统研究领域引起了广泛关注。本文旨在全面回顾有关推荐系统中视觉解释的研究工作。更具体地说，我们从解释目的、解释范围、解释方法和解释格式四个维度系统地回顾了推荐系统中的解释文献。由于认识到可视化的重要性，我们从解释可视化的角度，即使用可视化作为解释的显示方式，来研究推荐系统的相关文献。因此，我们得出了一套对推荐系统中解释性可视化设计可能具有建设性的指导原则，并确定了这一领域未来工作的前景。本综述旨在帮助推荐研究人员和从业人员更好地理解可视化解释推荐研究的潜力，并支持他们在当前和未来的推荐系统中系统地设计可视化解释。

{"title":"Visualization for Recommendation Explainability: A Survey and New Perspectives","authors":"Mohamed Amine Chatti, Mouadh Guesmi, Arham Muslim","doi":"10.1145/3672276","DOIUrl":"https://doi.org/10.1145/3672276","url":null,"abstract":"Providing system-generated explanations for recommendations represents an important step towards transparent and trustworthy recommender systems. Explainable recommender systems provide a human-understandable rationale for their outputs. Over the past two decades, explainable recommendation has attracted much attention in the recommender systems research community. This paper aims to provide a comprehensive review of research efforts on visual explanation in recommender systems. More concretely, we systematically review the literature on explanations in recommender systems based on four dimensions, namely explanation aim, explanation scope, explanation method, and explanation format. Recognizing the importance of visualization, we approach the recommender system literature from the angle of explanatory visualizations, that is using visualizations as a display style of explanation. As a result, we derive a set of guidelines that might be constructive for designing explanatory visualizations in recommender systems and identify perspectives for future work in this field. The aim of this review is to help recommendation researchers and practitioners better understand the potential of visually explainable recommendation research and to support them in the systematic design of visual explanations in current and future recommender systems.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"28 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Unpacking Human-AI interactions: From interaction primitives to a design space 解读人机交互：从交互基元到设计空间

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-08 DOI: 10.1145/3664522

Konstantinos Tsiakas, Dave Murray-Rust

This paper aims to develop a semi-formal representation for Human-AI (HAI) interactions, by building a set of interaction primitives which can specify the information exchanges between users and AI systems during their interaction. We show how these primitives can be combined into a set of interaction patterns which can capture common interactions between humans and AI/ML models. The motivation behind this is twofold: firstly, to provide a compact generalisation of existing practices for the design and implementation of HAI interactions; and secondly, to support the creation of new interactions by extending the design space of HAI interactions. Taking into consideration frameworks, guidelines and taxonomies related to human-centered design and implementation of AI systems, we define a vocabulary for describing information exchanges based on the model’s characteristics and interactional capabilities. Based on this vocabulary, a message passing model for interactions between humans and models is presented, which we demonstrate can account for existing HAI interaction systems and approaches. Finally, we build this into design patterns which can describe common interactions between users and models, and we discuss how this approach can be used towards a design space for HAI interactions that creates new possibilities for designs as well as keeping track of implementation issues and concerns.

本文旨在通过建立一套交互基元，为人类与人工智能（HAI）的交互开发一种半正式的表示方法，这套交互基元可以指定用户与人工智能系统在交互过程中的信息交流。我们展示了如何将这些基元组合成一套交互模式，从而捕捉人类与人工智能/ML 模型之间的常见交互。这样做的动机有两个：首先，为设计和实现人工智能交互提供现有实践的紧凑概括；其次，通过扩展人工智能交互的设计空间，支持创建新的交互。考虑到与人工智能系统以人为本的设计和实施相关的框架、指南和分类法，我们定义了一个词汇表，用于描述基于模型特征和交互能力的信息交换。在此词汇的基础上，我们提出了一个用于人类与模型之间交互的信息传递模型，并证明该模型可用于现有的人工智能交互系统和方法。最后，我们将其构建为可描述用户与模型之间常见交互的设计模式，并讨论了如何利用这种方法为人工智能交互设计空间，从而为设计创造新的可能性，并跟踪实施问题和关注点。

{"title":"Unpacking Human-AI interactions: From interaction primitives to a design space","authors":"Konstantinos Tsiakas, Dave Murray-Rust","doi":"10.1145/3664522","DOIUrl":"https://doi.org/10.1145/3664522","url":null,"abstract":"This paper aims to develop a semi-formal representation for Human-AI (HAI) interactions, by building a set of interaction primitives which can specify the information exchanges between users and AI systems during their interaction. We show how these primitives can be combined into a set of interaction patterns which can capture common interactions between humans and AI/ML models. The motivation behind this is twofold: firstly, to provide a compact generalisation of existing practices for the design and implementation of HAI interactions; and secondly, to support the creation of new interactions by extending the design space of HAI interactions. Taking into consideration frameworks, guidelines and taxonomies related to human-centered design and implementation of AI systems, we define a vocabulary for describing information exchanges based on the model’s characteristics and interactional capabilities. Based on this vocabulary, a message passing model for interactions between humans and models is presented, which we demonstrate can account for existing HAI interaction systems and approaches. Finally, we build this into design patterns which can describe common interactions between users and models, and we discuss how this approach can be used towards a design space for HAI interactions that creates new possibilities for designs as well as keeping track of implementation issues and concerns.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"36 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141553220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Reasoning and Value Alignment Test to Assess Advanced GPT Reasoning 评估高级 GPT 推理能力的推理和价值排列测试

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-03 DOI: 10.1145/3670691

Timothy R. McIntosh, Tong Liu, Teo Susnjak, Paul Watters, Malka N. Halgamuge

In response to diverse perspectives on Artificial General Intelligence (AGI), ranging from potential safety and ethical concerns to more extreme views about the threats it poses to humanity, this research presents a generic method to gauge the reasoning capabilities of Artificial Intelligence (AI) models as a foundational step in evaluating safety measures. Recognizing that AI reasoning measures cannot be wholly automated, due to factors such as cultural complexity, we conducted an extensive examination of five commercial Generative Pre-trained Transformers (GPTs), focusing on their comprehension and interpretation of culturally intricate contexts. Utilizing our novel “Reasoning and Value Alignment Test”, we assessed the GPT models’ ability to reason in complex situations and grasp local cultural subtleties. Our findings have indicated that, although the models have exhibited high levels of human-like reasoning, significant limitations remained, especially concerning the interpretation of cultural contexts. This paper also explored potential applications and use-cases of our Test, underlining its significance in AI training, ethics compliance, sensitivity auditing, and AI-driven cultural consultation. We concluded by emphasizing its broader implications in the AGI domain, highlighting the necessity for interdisciplinary approaches, wider accessibility to various GPT models, and a profound understanding of the interplay between GPT reasoning and cultural sensitivity.

针对有关人工智能（AGI）的各种观点，从潜在的安全和道德问题到对人类构成威胁的更极端观点，本研究提出了一种通用方法来衡量人工智能（AI）模型的推理能力，作为评估安全措施的基础步骤。我们认识到，由于文化复杂性等因素，人工智能推理措施不可能完全自动化，因此我们对五种商用生成预训练转换器（GPT）进行了广泛研究，重点关注它们对错综复杂的文化背景的理解和解释。我们利用新颖的 "推理和价值一致性测试"，评估了 GPT 模型在复杂情况下的推理能力以及对当地文化微妙之处的把握能力。我们的研究结果表明，尽管模型表现出了高水平的类人推理能力，但仍存在很大的局限性，尤其是在文化背景的解释方面。本文还探讨了 "测试 "的潜在应用和使用案例，强调了其在人工智能培训、道德合规、敏感性审计和人工智能驱动的文化咨询方面的重要意义。最后，我们强调了它在人工智能领域的广泛影响，强调了跨学科方法的必要性、更广泛地使用各种 GPT 模型的可能性，以及对 GPT 推理和文化敏感性之间相互作用的深刻理解。

{"title":"A Reasoning and Value Alignment Test to Assess Advanced GPT Reasoning","authors":"Timothy R. McIntosh, Tong Liu, Teo Susnjak, Paul Watters, Malka N. Halgamuge","doi":"10.1145/3670691","DOIUrl":"https://doi.org/10.1145/3670691","url":null,"abstract":"In response to diverse perspectives on Artificial General Intelligence (AGI), ranging from potential safety and ethical concerns to more extreme views about the threats it poses to humanity, this research presents a generic method to gauge the reasoning capabilities of Artificial Intelligence (AI) models as a foundational step in evaluating safety measures. Recognizing that AI reasoning measures cannot be wholly automated, due to factors such as cultural complexity, we conducted an extensive examination of five commercial Generative Pre-trained Transformers (GPTs), focusing on their comprehension and interpretation of culturally intricate contexts. Utilizing our novel “Reasoning and Value Alignment Test”, we assessed the GPT models’ ability to reason in complex situations and grasp local cultural subtleties. Our findings have indicated that, although the models have exhibited high levels of human-like reasoning, significant limitations remained, especially concerning the interpretation of cultural contexts. This paper also explored potential applications and use-cases of our Test, underlining its significance in AI training, ethics compliance, sensitivity auditing, and AI-driven cultural consultation. We concluded by emphasizing its broader implications in the AGI domain, highlighting the necessity for interdisciplinary approaches, wider accessibility to various GPT models, and a profound understanding of the interplay between GPT reasoning and cultural sensitivity.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"43 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141257527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

AutoRL X: Automated Reinforcement Learning on the Web AutoRL X：网络自动强化学习

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-06-03 DOI: 10.1145/3670692

Loraine Franke, Daniel Karl I. Weidele, Nima Dehmamy, Lipeng Ning, Daniel Haehn

Reinforcement Learning (RL) is crucial in decision optimization, but its inherent complexity often presents challenges in interpretation and communication. Building upon AutoDOViz — an interface that pushed the boundaries of Automated RL for Decision Optimization — this paper unveils an open-source expansion with a web-based platform for RL. Our work introduces a taxonomy of RL visualizations and launches a dynamic web platform, leveraging backend flexibility for AutoRL frameworks like ARLO and Svelte.js for a smooth interactive user experience in the front end. Since AutoDOViz is not open-source, we present AutoRL X, a new interface designed to visualize RL processes. AutoRL X is shaped by the extensive user feedback and expert interviews from AutoDOViz studies, and it brings forth an intelligent interface with real-time, intuitive visualization capabilities that enhance understanding, collaborative efforts, and personalization of RL agents. Addressing the gap in accurately representing complex real-world challenges within standard RL environments, we demonstrate our tool's application in healthcare, explicitly optimizing brain stimulation trajectories. A user study contrasts the performance of human users optimizing electric fields via a 2D interface with RL agents’ behavior that we visually analyze in AutoRL X, assessing the practicality of automated RL. All our data and code is openly available at: https://github.com/lorifranke/autorlx.

强化学习（RL）在决策优化中至关重要，但其固有的复杂性往往给解释和交流带来挑战。AutoDOViz是一个用于决策优化的自动强化学习界面，本文以该界面为基础，揭示了强化学习网络平台的开源扩展。我们的工作引入了 RL 可视化分类法，并启动了一个动态网络平台，利用 ARLO 和 Svelte.js 等 AutoRL 框架的后端灵活性，在前端提供流畅的交互式用户体验。由于 AutoDOViz 并非开源，因此我们推出了 AutoRL X，这是一个专为可视化 RL 过程而设计的新界面。AutoRL X 是根据 AutoDOViz 研究中广泛的用户反馈和专家访谈形成的，它带来了一个具有实时、直观可视化功能的智能界面，可增强对 RL 代理的理解、协作努力和个性化。为了弥补在标准 RL 环境中准确呈现复杂现实世界挑战的不足，我们展示了我们的工具在医疗保健领域的应用，明确优化了脑刺激轨迹。一项用户研究对比了人类用户通过二维界面优化电场的表现和我们在 AutoRL X 中可视化分析的 RL 代理行为，评估了自动 RL 的实用性。我们的所有数据和代码均可在以下网址公开获取：https://github.com/lorifranke/autorlx。

{"title":"AutoRL X: Automated Reinforcement Learning on the Web","authors":"Loraine Franke, Daniel Karl I. Weidele, Nima Dehmamy, Lipeng Ning, Daniel Haehn","doi":"10.1145/3670692","DOIUrl":"https://doi.org/10.1145/3670692","url":null,"abstract":"Reinforcement Learning (RL) is crucial in decision optimization, but its inherent complexity often presents challenges in interpretation and communication. Building upon AutoDOViz — an interface that pushed the boundaries of Automated RL for Decision Optimization — this paper unveils an open-source expansion with a web-based platform for RL. Our work introduces a taxonomy of RL visualizations and launches a dynamic web platform, leveraging backend flexibility for AutoRL frameworks like ARLO and Svelte.js for a smooth interactive user experience in the front end. Since AutoDOViz is not open-source, we present AutoRL X, a new interface designed to visualize RL processes. AutoRL X is shaped by the extensive user feedback and expert interviews from AutoDOViz studies, and it brings forth an intelligent interface with real-time, intuitive visualization capabilities that enhance understanding, collaborative efforts, and personalization of RL agents. Addressing the gap in accurately representing complex real-world challenges within standard RL environments, we demonstrate our tool's application in healthcare, explicitly optimizing brain stimulation trajectories. A user study contrasts the performance of human users optimizing electric fields via a 2D interface with RL agents’ behavior that we visually analyze in AutoRL X, assessing the practicality of automated RL. All our data and code is openly available at: https://github.com/lorifranke/autorlx.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"24 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141257271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles 通过五种用户解决问题的方式衡量人机交互中的用户体验包容性

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-05-08 DOI: 10.1145/3663740

Andrew Anderson, Jimena Noa Guevara, Fatima Moussaoui, Tianyi Li, Mihaela Vorvoreanu, Margaret Burnett

Motivations: Recent research has emerged on generally how to improve AI products’ Human-AI Interaction (HAI) User Experience (UX), but relatively little is known about HAI-UX inclusivity. For example, what kinds of users are supported, and who are left out? What product changes would make it more inclusive?

Objectives: To help fill this gap, we present an approach to measuring what kinds of diverse users an AI product leaves out and how to act upon that knowledge. To bring actionability to the results, the approach focuses on users’ problem-solving diversity. Thus, our specific objectives were: (1) to show how the measure can reveal which participants with diverse problem-solving styles were left behind in a set of AI products; and (2) to relate participants’ problem-solving diversity to their demographic diversity, specifically gender and age.

Methods: We performed 18 experiments, discarding two that failed manipulation checks. Each experiment was a 2x2 factorial experiment with online participants, comparing two AI products: one deliberately violating one of 18 HAI guideline and the other applying the same guideline. For our first objective, we used our measure to analyze how much each AI product gained/lost HAI-UX inclusivity compared to its counterpart, where inclusivity meant supportiveness to participants with particular problem-solving styles. For our second objective, we analyzed how participants’ problem-solving styles aligned with their gender identities and ages.

Results & Implications: Participants’ diverse problem-solving styles revealed six types of inclusivity results: (1) the AI products that followed an HAI guideline were almost always more inclusive across diversity of problem-solving styles than the products that did not follow that guideline—but “who” got most of the inclusivity varied widely by guideline and by problem-solving style; (2) when an AI product had risk implications, four variables’ values varied in tandem: participants’ feelings of control, their (lack of) suspicion, their trust in the product, and their certainty while using the product; (3) the more control an AI product offered users, the more inclusive it was; (4) whether an AI product was learning from “my” data or other people’s affected how inclusive that product was; (5) participants’ problem-solving styles skewed differently by gender and age group; and (6) almost all of the results suggested actions that HAI practitioners could take to improve their products’ inclusivity further. Together, these results suggest that a key to improving the demographic inclusivity of an AI product (e.g., across a wide range of genders, ages, etc.) can often be obtained by improving the product’s support of diverse problem-solving styles.

动机：最近出现了一些关于如何改善人工智能产品的人机交互（HAI）用户体验（UX）的研究，但对 HAI-UX 的包容性却知之甚少。例如，哪些用户得到了支持，哪些用户被排除在外？怎样的产品改变才能使其更具包容性？为了帮助填补这一空白，我们提出了一种方法来衡量人工智能产品遗漏了哪些类型的不同用户，以及如何根据这些知识采取行动。为了使结果具有可操作性，该方法侧重于用户解决问题的多样性。因此，我们的具体目标是(1)展示该方法如何揭示哪些具有不同问题解决风格的参与者被一组人工智能产品所遗漏；(2)将参与者的问题解决多样性与他们的人口统计学多样性（特别是性别和年龄）联系起来：我们进行了 18 次实验，剔除了两次未通过操作检查的实验。每个实验都是由在线参与者参与的 2x2 因式实验，比较两个人工智能产品：一个故意违反了 18 项 HAI 准则中的一项，另一个则采用了相同的准则。在第一个目标中，我们使用我们的测量方法来分析每种人工智能产品与其对应产品相比在 HAI-UX 包容性方面的得失。在第二个目标中，我们分析了参与者解决问题的风格如何与他们的性别认同和年龄相吻合：参与者解决问题的不同风格揭示了六种类型的包容性结果：（1）遵循 HAI 指南的人工智能产品几乎总是比不遵循该指南的产品在解决问题风格的多样性方面更具包容性--但 "谁 "获得了最大的包容性因指南和解决问题风格的不同而有很大差异；（2）当人工智能产品具有风险影响时，四个变量的值同时变化：(3) 人工智能产品为用户提供的控制越多，其包容性就越强；(4) 人工智能产品是从 "我 "的数据还是从其他人的数据中学习，会影响该产品的包容性；(5) 不同性别和年龄组的参与者解决问题的风格也不同；(6) 几乎所有结果都建议人工智能从业者采取行动，进一步提高其产品的包容性。总之，这些结果表明，提高人工智能产品的人口包容性（例如，跨越广泛的性别、年龄等）的关键往往可以通过改善产品对不同问题解决方式的支持来实现。

{"title":"Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles","authors":"Andrew Anderson, Jimena Noa Guevara, Fatima Moussaoui, Tianyi Li, Mihaela Vorvoreanu, Margaret Burnett","doi":"10.1145/3663740","DOIUrl":"https://doi.org/10.1145/3663740","url":null,"abstract":"Motivations: Recent research has emerged on generally how to improve AI products’ Human-AI Interaction (HAI) User Experience (UX), but relatively little is known about HAI-UX inclusivity. For example, what kinds of users are supported, and who are left out? What product changes would make it more inclusive?Objectives: To help fill this gap, we present an approach to measuring what kinds of diverse users an AI product leaves out and how to act upon that knowledge. To bring actionability to the results, the approach focuses on users’ problem-solving diversity. Thus, our specific objectives were: (1) to show how the measure can reveal which participants with diverse problem-solving styles were left behind in a set of AI products; and (2) to relate participants’ problem-solving diversity to their demographic diversity, specifically gender and age.Methods: We performed 18 experiments, discarding two that failed manipulation checks. Each experiment was a 2x2 factorial experiment with online participants, comparing two AI products: one deliberately violating one of 18 HAI guideline and the other applying the same guideline. For our first objective, we used our measure to analyze how much each AI product gained/lost HAI-UX inclusivity compared to its counterpart, where inclusivity meant supportiveness to participants with particular problem-solving styles. For our second objective, we analyzed how participants’ problem-solving styles aligned with their gender identities and ages.Results & Implications: Participants’ diverse problem-solving styles revealed six types of inclusivity results: (1) the AI products that followed an HAI guideline were almost always more inclusive across diversity of problem-solving styles than the products that did not follow that guideline—but “who” got most of the inclusivity varied widely by guideline and by problem-solving style; (2) when an AI product had risk implications, four variables’ values varied in tandem: participants’ feelings of control, their (lack of) suspicion, their trust in the product, and their certainty while using the product; (3) the more control an AI product offered users, the more inclusive it was; (4) whether an AI product was learning from “my” data or other people’s affected how inclusive that product was; (5) participants’ problem-solving styles skewed differently by gender and age group; and (6) almost all of the results suggested actions that HAI practitioners could take to improve their products’ inclusivity further. Together, these results suggest that a key to improving the demographic inclusivity of an AI product (e.g., across a wide range of genders, ages, etc.) can often be obtained by improving the product’s support of diverse problem-solving styles.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"25 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140925548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Cooperative Multi-Objective Bayesian Design Optimization 合作式多目标贝叶斯设计优化

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-04-17 DOI: 10.1145/3657643

George Mo, John Dudley, Liwei Chan, Yi-Chi Liao, Antti Oulasvirta, Per Ola Kristensson

Computational methods can potentially facilitate user interface design by complementing designer intuition, prior experience, and personal preference. Framing a user interface design task as a multi-objective optimization problem can help with operationalizing and structuring this process at the expense of designer agency and experience. While offering a systematic means of exploring the design space, the optimization process cannot typically leverage the designer’s expertise in quickly identifying that a given ‘bad’ design is not worth evaluating. We here examine a cooperative approach where both the designer and optimization process share a common goal, and work in partnership by establishing a shared understanding of the design space. We tackle the research question: how can we foster cooperation between the designer and a systematic optimization process in order to best leverage their combined strength? We introduce and present an evaluation of a cooperative approach that allows the user to express their design insight and work in concert with a multi-objective design process. We find that the cooperative approach successfully encourages designers to explore more widely in the design space than when they are working without assistance from an optimization process. The cooperative approach also delivers design outcomes that are comparable to an optimization process run without any direct designer input, but achieves this with greater efficiency and substantially higher designer engagement levels.

计算方法可以补充设计者的直觉、先前经验和个人偏好，从而为用户界面设计提供潜在的便利。将用户界面设计任务构建为一个多目标优化问题，有助于在牺牲设计者的主观能动性和经验的前提下，实现这一过程的可操作性和结构化。虽然优化过程提供了一种探索设计空间的系统化方法，但它通常无法利用设计者的专业知识来快速识别某个 "糟糕 "的设计是否值得评估。在这里，我们研究了一种合作方法，在这种方法中，设计者和优化流程拥有共同的目标，并通过建立对设计空间的共同理解来开展合作。我们要解决的研究问题是：如何促进设计者与系统优化流程之间的合作，从而最大限度地发挥它们的综合优势？我们介绍并评估了一种合作方法，这种方法允许用户表达自己的设计见解，并与多目标设计流程协同工作。我们发现，与没有优化流程协助的情况相比，合作方法成功地鼓励了设计人员在设计空间中进行更广泛的探索。合作方法所产生的设计结果与没有设计人员直接输入的优化流程不相上下，但效率更高，设计人员的参与度也更高。

{"title":"Cooperative Multi-Objective Bayesian Design Optimization","authors":"George Mo, John Dudley, Liwei Chan, Yi-Chi Liao, Antti Oulasvirta, Per Ola Kristensson","doi":"10.1145/3657643","DOIUrl":"https://doi.org/10.1145/3657643","url":null,"abstract":"Computational methods can potentially facilitate user interface design by complementing designer intuition, prior experience, and personal preference. Framing a user interface design task as a multi-objective optimization problem can help with operationalizing and structuring this process at the expense of designer agency and experience. While offering a systematic means of exploring the design space, the optimization process cannot typically leverage the designer’s expertise in quickly identifying that a given ‘bad’ design is not worth evaluating. We here examine a cooperative approach where both the designer and optimization process share a common goal, and work in partnership by establishing a shared understanding of the design space. We tackle the research question: how can we foster cooperation between the designer and a systematic optimization process in order to best leverage their combined strength? We introduce and present an evaluation of a cooperative approach that allows the user to express their design insight and work in concert with a multi-objective design process. We find that the cooperative approach successfully encourages designers to explore more widely in the design space than when they are working without assistance from an optimization process. The cooperative approach also delivers design outcomes that are comparable to an optimization process run without any direct designer input, but achieves this with greater efficiency and substantially higher designer engagement levels.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"29 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140615493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Spatial Constraint Model for Manipulating Static Visualizations 操纵静态可视化的空间约束模型

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-04-11 DOI: 10.1145/3657642

Can Liu, Yu Zhang, Cong Wu, Chen Li, Xiaoru Yuan

We introduce a spatial constraint model to characterize the positioning and interactions in visualizations, thereby facilitating the activation of static visualizations. Our model provides users with the capability to manipulate visualizations through operations such as selection, filtering, navigation, arrangement, and aggregation. Building upon this conceptual framework, we propose a prototype system designed to activate pre-existing visualizations by imbuing them with intelligent interactions. This augmentation is accomplished through the integration of visual objects with forces. The instantiation of our spatial constraint model enables seamless animated transitions between distinct visualization layouts. To demonstrate the efficacy of our approach, we present usage scenarios that involve the activation of visualizations within real-world contexts.

我们引入了一个空间约束模型来描述可视化中的定位和交互，从而促进静态可视化的激活。我们的模型为用户提供了通过选择、过滤、导航、排列和聚合等操作操纵可视化的能力。在这一概念框架的基础上，我们提出了一个原型系统，旨在通过智能交互激活已有的可视化。这种增强是通过将视觉对象与力整合来实现的。空间约束模型的实例化可实现不同可视化布局之间的无缝动画转换。为了展示我们方法的功效，我们介绍了在现实世界中激活可视化的使用场景。

引用次数: 0

generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation generAItor：为语言模型的可解释性和适应性生成环中树文本

IF 3.4 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-03-14 DOI: 10.1145/3652028

Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Mennatallah El-Assady

Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.

大语言模型（LLM）被广泛应用于各种下游任务，如自动完成、辅助写作或基于聊天的文本生成。然而，人们对底层搜索算法的输出候选结果探索不足，解释不够。针对这一不足，我们提出了一种 "环中树 "方法，即以波束搜索树的可视化表示作为分析、解释和调整生成输出的核心组件。为了支持这些任务，我们提出了一种可视化分析技术 generAItor，用各种特定任务小部件来增强中央波束搜索树，提供有针对性的可视化和交互可能性。我们的方法允许多层次的互动，并提供了一个迭代管道，包括生成、探索和比较候选输出，以及根据调整后的数据对模型进行微调。我们的案例研究表明，我们的工具在性别偏见分析方面产生了新的见解，超越了最先进的基于模板的方法。此外，我们还在一项定性用户研究中展示了我们方法的适用性。最后，我们定量评估了模型对少量样本的适应性，如在文本生成使用案例中出现的情况。

{"title":"generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation","authors":"Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Mennatallah El-Assady","doi":"10.1145/3652028","DOIUrl":"https://doi.org/10.1145/3652028","url":null,"abstract":"Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.","PeriodicalId":48574,"journal":{"name":"ACM Transactions on Interactive Intelligent Systems","volume":"248 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140126424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0