ACM Transactions on Interactive Intelligent Systems: Latest Articles (Page 2)

“It would work for me too”: How Online Communities Shape Software Developers’ Trust in AI-Powered Code Generation Tools

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-03-09 DOI: 10.1145/3651990

Ruijia Cheng, Ruotong Wang, Thomas Zimmermann, Denae Ford

While revolutionary AI-powered code generation tools have been rising rapidly, we know little about how software developers form appropriate trust in those AI tools, or how to help them do so. Through a two-phase formative study, we investigate how online communities shape developers’ trust in AI tools and how we can leverage community features to facilitate appropriate user trust. Through interviewing 17 developers, we find that developers collectively make sense of AI tools using the experiences shared by community members and leverage community signals to evaluate AI suggestions. We then surface design opportunities and conduct 11 design probe sessions to explore the design space of using community features to support user trust in AI code generation systems. We synthesize our findings and extend an existing model of user trust in AI technologies with sociotechnical factors. We map out the design considerations for integrating user community into the AI code generation experience.

Citations: 0

Insights into Natural Language Database Query Errors: From Attention Misalignment to User Handling Strategies

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-03-02 DOI: 10.1145/3650114

Zheng Ning, Yuan Tian, Zheng Zhang, Tianyi Zhang, Toby Jia-Jun Li

Querying structured databases with natural language (NL2SQL) has remained a difficult problem for years. Recently, advances in machine learning (ML), natural language processing (NLP), and large language models (LLMs) have led to significant improvements in performance, with the best model achieving ∼85% accuracy on the benchmark Spider dataset. However, there is still no systematic understanding of the types and causes of errors in erroneous queries, or of the effectiveness of error-handling mechanisms. To bridge the gap, a taxonomy of errors made by four representative NL2SQL models was built in this work, along with an in-depth analysis of the errors. Second, the causes of model errors were explored by analyzing the model-human attention alignment to the natural language query. Last, a within-subjects user study with 26 participants was conducted to investigate the effectiveness of three interactive error-handling mechanisms in NL2SQL. Findings from this paper shed light on the design of model structure and error discovery and repair strategies for natural language data query interfaces in the future.

Citations: 0

Man and the Machine: Effects of AI-assisted Human Labeling on Interactive Annotation of Real-Time Video Streams

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-02-29 DOI: 10.1145/3649457

Marko Radeta, Ruben Freitas, Claudio Rodrigues, Agustin Zuniga, Ngoc Thi Nguyen, Huber Flores, Petteri Nurmi

AI-assisted interactive annotation is a powerful way to facilitate data annotation, a prerequisite for constructing robust AI models. While AI-assisted interactive annotation has been extensively studied in static settings, less is known about its usage in dynamic scenarios where the annotators operate under time and cognitive constraints, e.g., while detecting suspicious or dangerous activities from real-time surveillance feeds. Understanding how AI can assist annotators in these tasks and facilitate consistent annotation is paramount to ensure high performance for AI models trained on these data. We address this gap in interactive machine learning (IML) research, contributing an extensive investigation of the benefits, limitations, and challenges of AI-assisted annotation in dynamic application use cases. We address both the effects of AI on annotators and the effects of (AI) annotations on the performance of AI models trained on annotated data in real-time video annotations. We conduct extensive experiments that compare annotation performance at two annotator levels (expert and non-expert) and two interactive labelling techniques (with and without AI assistance). In a controlled study with N = 34 annotators and a follow-up study in which 51,963 images and their annotation labels were input to the AI model, we demonstrate that the benefits of AI-assisted models are greatest for non-expert users and for cases where targets are only partially or briefly visible. Expert users tend to outperform the AI model or perform on par with it. Labels combining AI and expert annotations result in the best overall performance, as the AI reduces overflow and latency in the expert annotations. We derive guidelines for the use of AI-assisted human annotation in real-time dynamic use cases.

Citations: 0

Talk2Data: A Natural Language Interface for Exploratory Visual Analysis via Question Decomposition

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-02-07 DOI: 10.1145/3643894

Yi Guo, Danqing Shi, Mingjuan Guo, Yanqiu Wu, Nan Cao, Qing Chen

Through a natural language interface (NLI) for exploratory visual analysis, users can directly “ask” analytical questions about the given tabular data. This process greatly improves user experience and lowers the technical barriers of data analysis. Existing techniques focus on generating a visualization from a concrete question. However, complex questions that require multiple data queries and visualizations to answer are frequently asked in data exploration and analysis, and they cannot be easily answered with the existing techniques. To address this issue, in this paper, we introduce Talk2Data, a natural language interface for exploratory visual analysis that supports answering complex questions. It leverages an advanced deep-learning model to resolve complex questions into a series of simple questions that could gradually elaborate on the users’ requirements. To present answers, we design a set of annotated and captioned visualizations to represent the answers in a form that supports interpretation and narration. We conducted an ablation study and a controlled user study to evaluate Talk2Data’s effectiveness and usefulness.

Citations: 0

Entity Footprinting: Modeling Contextual User States via Digital Activity Monitoring

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-02-05 DOI: 10.1145/3643893

Zeinab R. Yousefi, Tung Vuong, Marie AlGhossein, Tuukka Ruotsalo, Giulio Jacucci, Samuel Kaski

Our digital life consists of activities that are organized around tasks and exhibit different user states in the digital contexts around these activities. Previous works have shown that digital activity monitoring can be used to predict entities that users will need to perform digital tasks. Methods have also been developed to automatically detect the tasks of a user. However, these studies typically support only specific applications and tasks, and relatively little research has been conducted on real-life digital activities. This paper introduces user state modeling and prediction with contextual information captured as entities, recorded from real-world digital user behavior. The approach, called entity footprinting, is a system that records users’ digital activities on their screens and proactively provides useful entities across application boundaries without requiring explicit query formulation. Our methodology is to detect contextual user states using latent representations of entities occurring in digital activities. Using topic models and recurrent neural networks, the model learns the latent representation of concurrent entities and their sequential relationships. We report a field study in which the digital activities of thirteen people were recorded continuously for 14 days. The model learned from this data is used to 1) predict contextual user states, and 2) predict relevant entities for the detected states. The results show improved user state detection accuracy and entity prediction performance compared to static, heuristic, and basic topic models. Our findings have implications for the design of proactive recommendation systems that can implicitly infer users’ contextual state by monitoring users’ digital activities and proactively recommending the right information at the right time.

Citations: 0

Predicting Group Choices from Group Profiles

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2024-01-10 DOI: 10.1145/3639710

Hanif Emamgholizadeh, Amra Delić, Francesco Ricci

Group recommender systems (GRSs) identify items to recommend to a group of people by aggregating group members’ individual preferences into a group profile, and selecting the items that have the largest score in the group profile. The GRS predicts that these recommendations would be chosen by the group, by assuming that the group is applying the same preference aggregation strategy as the one adopted by the GRS. However, predicting the choice of a group is more complex since the GRS is not aware of the exact preference aggregation strategy that is going to be used by the group.

To this end, the aim of this paper is to validate the research hypothesis that, by using a machine learning approach and a data set of observed group choices, it is possible to predict a group’s final choice, better than by using a standard preference aggregation strategy. Inspired by the Decision Scheme theory, which first tried to address the group choice prediction problem, we search for a group profile definition that, in conjunction with a machine learning model, can be used to accurately predict a group choice. Moreover, to cope with the data scarcity problem, we propose two data augmentation methods, which add synthetic group profiles to the training data, and we hypothesize they can further improve the choice prediction accuracy.

We validate our research hypotheses by using a data set containing 282 participants organized in 79 groups. The experiments indicate that the proposed method outperforms baseline aggregation strategies when used for group choice prediction. The method we propose is robust to missing preference data and achieves a performance superior to what humans can achieve on the group choice prediction task. Finally, the proposed data augmentation method can also improve the prediction accuracy. Our approach can be exploited in novel GRSs to identify the items that the group is likely to choose and to help groups to make even better and fairer choices.
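
As a rough illustration of the baseline the abstract argues against, the following sketch aggregates individual ratings into a group profile and picks the top-scoring item. It is not the authors' method: the member names, items, ratings, and the helper names `group_profile` and `recommend` are all hypothetical, and averaging and "least misery" (taking the minimum) are used here only as two commonly cited aggregation strategies.

```python
from statistics import mean

def group_profile(ratings, aggregate=mean):
    """Aggregate per-member item ratings into one score per item.

    ratings: dict mapping member -> dict mapping item -> numeric rating.
    aggregate: function applied to the list of member ratings for an item.
    Only items rated by every member are kept.
    """
    shared_items = set.intersection(*(set(r) for r in ratings.values()))
    return {
        item: aggregate([member[item] for member in ratings.values()])
        for item in shared_items
    }

def recommend(profile):
    """Return the item with the largest score in the group profile."""
    # Sorting first makes tie-breaking deterministic (alphabetical order).
    return max(sorted(profile), key=profile.get)

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"museum": 5, "hike": 3, "beach": 2},
    "ben": {"museum": 5, "hike": 3, "beach": 4},
    "cy":  {"museum": 1, "hike": 4, "beach": 3},
}

by_average = recommend(group_profile(ratings, mean))      # average strategy
by_least_misery = recommend(group_profile(ratings, min))  # least-misery strategy
print(by_average, by_least_misery)
```

With these made-up ratings the two strategies disagree (averaging favors the museum, least misery favors the hike), which is the difficulty the abstract points to: a GRS that assumes one fixed strategy may mispredict a group that actually decides another way.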

Citations: 0

How should an AI trust its human teammates? Exploring possible cues of artificial trust

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2023-12-06 DOI: 10.1145/3635475

Carolina Centeio Jorge, Catholijn M. Jonker, Myrthe L. Tielman

In teams composed of humans, we use trust in others to make decisions, such as what to do next, who to help and who to ask for help. When a team member is artificial, they should also be able to assess whether a human teammate is trustworthy for a certain task. We see trustworthiness as the combination of (1) whether someone will do a task and (2) whether they can do it. With building beliefs in trustworthiness as an ultimate goal, we explore which internal factors (krypta) of the human may play a role (e.g. ability, benevolence and integrity) in determining trustworthiness, according to existing literature. Furthermore, we investigate which observable metrics (manifesta) an agent may take into account as cues for the human teammate’s krypta in an online 2D grid-world experiment (n=54). Results suggest that cues of ability, benevolence and integrity influence trustworthiness. However, we observed that trustworthiness is mainly influenced by the human’s playing strategy and cost-benefit analysis, which deserves further investigation. This is a first step towards building informed beliefs of human trustworthiness in human-AI teamwork.

Citations: 0

I Know This Looks Bad, But I Can Explain: Understanding When AI Should Explain Actions In Human-AI Teams

IF 3.4 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2023-12-02 DOI: 10.1145/3635474

Rui Zhang, Christopher Flathmann, Geoff Musick, Beau Schelble, Nathan J. McNeese, Bart Knijnenburg, Wen Duan

Explanation of artificial intelligence (AI) decision-making has become an important research area in human-computer interaction (HCI) and computer-supported teamwork research. While plenty of research has investigated AI explanations with an intent to improve AI transparency and human trust in AI, how AI explanations function in teaming environments remains unclear. Given that a major benefit of AI giving explanations is to increase human trust, understanding how AI explanations impact human trust is crucial to effective human-AI teamwork. An online experiment was conducted with 156 participants to explore this question by examining how a teammate’s explanations impact the perceived trust of the teammate and the effectiveness of the team, and how these impacts vary based on whether the teammate is a human or an AI. This study shows that explanations facilitate trust in AI teammates when explaining why an AI disobeyed humans’ orders but hinder trust when explaining why an AI lied to humans. In addition, participants’ personal characteristics (e.g., their gender and the individual’s ethical framework) impacted their perceptions of AI teammates both directly and indirectly in different scenarios. Our study contributes to interactive intelligent systems and HCI by shedding light on how an AI teammate’s actions and corresponding explanations are perceived by humans while identifying factors that impact trust and perceived effectiveness. This work provides an initial understanding of AI explanations in human-AI teams, which future research can build upon in exploring AI explanation implementation in collaborative environments.

Citations: 0

Meaningful Explanation Effect on User’s Trust in an AI Medical System: Designing Explanations for Non-Expert Users

Zone 4, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2023-11-08 DOI: 10.1145/3631614

Retno Larasati, Anna De Liddo, Enrico Motta

Whereas most research on AI system explanation for healthcare applications looks at developing algorithmic explanations targeted at AI experts or medical professionals, the questions we raise are: How do we build meaningful explanations for laypeople? And how does a meaningful explanation affect users’ trust perceptions? Our research investigates how the key factors affecting human-AI trust change in the light of human expertise, and how to design explanations specifically targeted at non-experts. By means of a stage-based design method, we map the ways laypeople understand AI explanations in a User Explanation Model. We also map the practice of both medical professionals and AI experts in an Expert Explanation Model. A Target Explanation Model is then proposed, which represents how experts’ practice and laypeople’s understanding can be combined to design meaningful explanations. Design guidelines for meaningful AI explanations are proposed, and a prototype of AI system explanation for non-expert users in a breast cancer scenario is presented and assessed on how it affects users’ trust perceptions.

Citations: 0

Explainable Activity Recognition in Videos using Deep Learning and Tractable Probabilistic Models

Zone 4, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

ACM Transactions on Interactive Intelligent Systems

Pub Date : 2023-10-12 DOI: 10.1145/3626961

Chiradeep Roy, Mahsan Nourani, Shivvrat Arya, Mahesh Shanbhag, Tahrima Rahman, Eric D. Ragan, Nicholas Ruozzi, Vibhav Gogate

We consider the following video activity recognition (VAR) task: given a video, infer the set of activities being performed in the video and assign each frame to an activity. Although VAR can be solved accurately using existing deep learning techniques, deep networks are neither interpretable nor explainable, and as a result their use is problematic in high-stakes decision-making applications (e.g., healthcare, experimental biology, aviation, and law). In such applications, failure may lead to disastrous consequences, and therefore the user must be able either to understand the inner workings of the model or to probe it to understand its reasoning patterns for a given decision. We address these limitations of deep networks by proposing a new approach that feeds the output of a deep model into a tractable, interpretable probabilistic model called a dynamic conditional cutset network, which is defined over the explanatory and output variables, and then performs joint inference over the combined model. The two key benefits of using cutset networks are: (a) they explicitly model the relationship between the output and explanatory variables, and as a result the combined model is likely to be more accurate than the vanilla deep model; and (b) they can answer reasoning queries in polynomial time, and as a result they can derive meaningful explanations by efficiently answering explanation queries. We demonstrate the efficacy of our approach on two datasets, Textually Annotated Cooking Scenes (TACoS) and wet lab, using conventional evaluation measures such as the Jaccard Index and Hamming Loss, as well as a human-subjects study.
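
The abstract names the Jaccard Index and Hamming Loss as evaluation measures without defining them. The sketch below shows how these two measures are commonly computed for per-frame multi-label predictions, assuming each frame's activities are represented as a set of labels. The function names, the label vocabulary, and the toy data are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Set


def jaccard_index(true: Set[str], pred: Set[str]) -> float:
    """Size of the intersection divided by size of the union.

    Defined here as 1.0 when both sets are empty (an assumed convention).
    """
    union = true | pred
    if not union:
        return 1.0
    return len(true & pred) / len(union)


def hamming_loss(true: List[Set[str]], pred: List[Set[str]],
                 labels: List[str]) -> float:
    """Fraction of (frame, label) decisions on which prediction and truth disagree."""
    if len(true) != len(pred):
        raise ValueError("true and pred must cover the same frames")
    if not true or not labels:
        raise ValueError("need at least one frame and one label")
    wrong = sum(
        (label in t) != (label in p)
        for t, p in zip(true, pred)
        for label in labels
    )
    return wrong / (len(true) * len(labels))


# Hypothetical toy data: three frames, three possible activity labels.
labels = ["cut", "stir", "wash"]
true_frames = [{"cut"}, {"cut", "stir"}, {"wash"}]
pred_frames = [{"cut"}, {"stir"}, {"wash", "stir"}]

# Mean per-frame Jaccard index: (1 + 1/2 + 1/2) / 3
mean_jaccard = sum(
    jaccard_index(t, p) for t, p in zip(true_frames, pred_frames)
) / len(true_frames)

# Hamming loss: 2 disagreeing decisions out of 3 frames * 3 labels
loss = hamming_loss(true_frames, pred_frames, labels)
```

A higher Jaccard index and a lower Hamming loss both indicate closer agreement between predicted and true activity sets. The paper may aggregate over frames or videos differently from this per-frame average.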

Citations: 0