
Latest articles from ACM Transactions on Interactive Intelligent Systems

Interactions for Socially Shared Regulation in Collaborative Learning: An Interdisciplinary Multimodal Dataset
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-22 | DOI: 10.1145/3658376
Yante Li, Yang Liu, Andy Nguyen, Henglin Shi, Eija Vuorenmaa, Sanna Järvelä, Guoying Zhao
Socially shared regulation plays a pivotal role in the success of collaborative learning. However, evaluating socially shared regulation of learning (SSRL) proves challenging due to the dynamic and infrequent cognitive and socio-emotional interactions that constitute the focal point of SSRL. To address this challenge, this paper gathers interdisciplinary researchers to establish a multimodal dataset with cognitive and socio-emotional interactions for SSRL study. Firstly, to induce cognitive and socio-emotional interactions, learning science researchers designed a special collaborative learning task with regulatory trigger events within triads for the SSRL study. Secondly, this dataset includes various modalities, such as video, Kinect data, audio, and physiological data (accelerometer, EDA, heart rate), from 81 high school students in 28 groups, offering a comprehensive view of the SSRL process. Thirdly, three-level verbal interaction annotations and non-verbal interactions including facial expression, eye gaze, gesture, and posture are provided, which could further contribute to interdisciplinary fields such as computer science, sociology, and education. In addition, comprehensive analysis verifies the dataset’s effectiveness. As far as we know, this is the first multimodal dataset for studying SSRL among triadic group members.
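Physiological streams like those above only become useful once they are aligned with the annotated interaction windows. As a minimal, hedged sketch of that alignment step (the field names, sampling rate, and window labels below are invented, not the dataset’s actual schema), one could average a 1 Hz heart-rate stream within each annotated window:

```python
def mean_in_window(samples, start, end):
    """Average the values of (timestamp, value) samples with start <= t < end."""
    vals = [v for t, v in samples if start <= t < end]
    return sum(vals) / len(vals) if vals else None

# Hypothetical 1 Hz heart-rate stream and two annotated interaction
# windows (seconds from session start); both are invented.
heart_rate = [(t, 70 + (t % 3)) for t in range(0, 10)]
windows = [(0, 5, "cognitive"), (5, 10, "socio-emotional")]

features = {label: mean_in_window(heart_rate, s, e) for s, e, label in windows}
print(features)
```

The same windowing would apply to EDA or accelerometer samples; only the stream changes.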
Citations: 0
Cooperative Multi-Objective Bayesian Design Optimization
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-17 | DOI: 10.1145/3657643
George Mo, John Dudley, Liwei Chan, Yi-Chi Liao, Antti Oulasvirta, Per Ola Kristensson

Computational methods can potentially facilitate user interface design by complementing designer intuition, prior experience, and personal preference. Framing a user interface design task as a multi-objective optimization problem can help with operationalizing and structuring this process at the expense of designer agency and experience. While offering a systematic means of exploring the design space, the optimization process cannot typically leverage the designer’s expertise in quickly identifying that a given ‘bad’ design is not worth evaluating. We here examine a cooperative approach where both the designer and optimization process share a common goal, and work in partnership by establishing a shared understanding of the design space. We tackle the research question: how can we foster cooperation between the designer and a systematic optimization process in order to best leverage their combined strength? We introduce and present an evaluation of a cooperative approach that allows the user to express their design insight and work in concert with a multi-objective design process. We find that the cooperative approach successfully encourages designers to explore more widely in the design space than when they are working without assistance from an optimization process. The cooperative approach also delivers design outcomes that are comparable to an optimization process run without any direct designer input, but achieves this with greater efficiency and substantially higher designer engagement levels.
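The cooperative loop described above can be caricatured in a few lines: a designer-supplied veto prunes obviously ‘bad’ candidates before the expensive evaluation, and the surviving evaluations are reduced to a Pareto front. This sketch deliberately omits the Bayesian surrogate and uses invented toy objectives; it illustrates only the veto-plus-Pareto idea, not the paper’s method:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Invented design space: (button_size, spacing), with two conflicting toy
# objectives: error rate falls with size, screen use grows with size * spacing.
candidates = [(s, g) for s in range(1, 6) for g in range(1, 4)]
designer_veto = lambda s, g: s < 2   # designer insight: size-1 buttons are unusable
evaluated = [(1.0 / s, s * g) for s, g in candidates if not designer_veto(s, g)]
front = pareto_front(evaluated)
print(sorted(front))
```

In a real Bayesian setup, the veto would spare the surrogate model from wasting acquisitions on regions the designer already knows are dead ends.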

Citations: 0
A Spatial Constraint Model for Manipulating Static Visualizations
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-04-11 | DOI: 10.1145/3657642
Can Liu, Yu Zhang, Cong Wu, Chen Li, Xiaoru Yuan

We introduce a spatial constraint model to characterize the positioning and interactions in visualizations, thereby facilitating the activation of static visualizations. Our model provides users with the capability to manipulate visualizations through operations such as selection, filtering, navigation, arrangement, and aggregation. Building upon this conceptual framework, we propose a prototype system designed to activate pre-existing visualizations by imbuing them with intelligent interactions. This augmentation is accomplished through the integration of visual objects with forces. The instantiation of our spatial constraint model enables seamless animated transitions between distinct visualization layouts. To demonstrate the efficacy of our approach, we present usage scenarios that involve the activation of visualizations within real-world contexts.
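To make the manipulation operations concrete, here is a toy, hypothetical simplification of two of them, selection by region and filtering by value, over static marks. The mark structure and operation signatures are invented for illustration, not taken from the prototype system:

```python
# Invented marks: a tiny bar-chart-like structure with position and value.
marks = [{"x": x, "y": x * 10, "label": f"bar{x}"} for x in range(5)]

def select(marks, x0, x1):
    """Selection: marks whose x position falls inside the region [x0, x1]."""
    return [m for m in marks if x0 <= m["x"] <= x1]

def filter_marks(marks, min_y):
    """Filtering: keep marks whose value passes a threshold."""
    return [m for m in marks if m["y"] >= min_y]

picked = filter_marks(select(marks, 1, 3), min_y=20)
print([m["label"] for m in picked])
```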

Citations: 0
generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-14 | DOI: 10.1145/3652028
Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Mennatallah El-Assady

Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.
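A beam search tree of the kind described above can be built by keeping a parent link for every expanded candidate. The sketch below uses a deterministic toy next-token distribution as a stand-in for a real language model’s scores; it is illustrative only, not the paper’s implementation:

```python
import math

def toy_lm(context):
    """Stand-in for an LLM: a fixed next-token distribution (context ignored)."""
    return {"a": 0.5, "b": 0.3, "c": 0.2}

def beam_search(steps=3, width=2):
    tree = {0: (None, "<s>", 0.0)}  # node id -> (parent id, token, cumulative logprob)
    beam, next_id = [0], 1
    for _ in range(steps):
        candidates = []
        for node in beam:
            for tok, p in toy_lm(node).items():
                tree[next_id] = (node, tok, tree[node][2] + math.log(p))
                candidates.append(next_id)
                next_id += 1
        beam = sorted(candidates, key=lambda n: tree[n][2], reverse=True)[:width]
    return tree, beam

tree, beam = beam_search()
# Walk parent links from the best leaf to recover its token sequence;
# the full `tree` is what a beam-search-tree visualization would draw.
node, seq = beam[0], []
while node is not None:
    parent, tok, _ = tree[node]
    seq.append(tok)
    node = parent
print("".join(reversed(seq)))
```

Pruned branches stay in `tree`, which is exactly what makes the discarded candidates explorable rather than invisible.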

Citations: 0
“It would work for me too”: How Online Communities Shape Software Developers’ Trust in AI-Powered Code Generation Tools
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-09 | DOI: 10.1145/3651990
Ruijia Cheng, Ruotong Wang, Thomas Zimmermann, Denae Ford

While revolutionary AI-powered code generation tools have been rising rapidly, we know little about how software developers form trust in those AI tools, or how to help them form appropriate trust. Through a two-phase formative study, we investigate how online communities shape developers’ trust in AI tools and how we can leverage community features to facilitate appropriate user trust. Through interviews with 17 developers, we find that developers collectively make sense of AI tools using the experiences shared by community members and leverage community signals to evaluate AI suggestions. We then surface design opportunities and conduct 11 design probe sessions to explore the design space of using community features to support user trust in AI code generation systems. We synthesize our findings and extend an existing model of user trust in AI technologies with sociotechnical factors. We map out the design considerations for integrating user community into the AI code generation experience.

Citations: 0
Insights into Natural Language Database Query Errors: From Attention Misalignment to User Handling Strategies
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-03-02 | DOI: 10.1145/3650114
Zheng Ning, Yuan Tian, Zheng Zhang, Tianyi Zhang, Toby Jia-Jun Li

Querying structured databases with natural language (NL2SQL) has remained a difficult problem for years. Recently, advances in machine learning (ML), natural language processing (NLP), and large language models (LLMs) have led to significant improvements in performance, with the best model achieving ∼85% accuracy on the benchmark Spider dataset. However, there is still a lack of systematic understanding of the types and causes of errors in erroneous queries, and of the effectiveness of error-handling mechanisms. To bridge the gap, a taxonomy of errors made by four representative NL2SQL models was built in this work, along with an in-depth analysis of the errors. Second, the causes of model errors were explored by analyzing the alignment between model and human attention to the natural language query. Last, a within-subjects user study with 26 participants was conducted to investigate the effectiveness of three interactive error-handling mechanisms in NL2SQL. Findings from this paper shed light on the design of model structure and error discovery and repair strategies for natural language data query interfaces in the future.
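One common way NL2SQL output is scored, and a natural starting point for error analysis, is execution accuracy: whether the predicted and the gold SQL return the same rows on a sample database. The schema and queries below are invented for illustration; this is not the paper’s evaluation code:

```python
import sqlite3

# Invented sample database in the style of a Spider table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)",
                 [("Ann", 30), ("Bo", 25), ("Cy", 41)])

def execution_match(pred_sql, gold_sql):
    """Execution accuracy for one pair: do both queries return the same rows?"""
    try:
        pred = conn.execute(pred_sql).fetchall()
    except sqlite3.Error:
        return False  # a malformed prediction simply counts as an error
    return sorted(pred) == sorted(conn.execute(gold_sql).fetchall())

gold = "SELECT name FROM singer WHERE age > 28"
print(execution_match("SELECT name FROM singer WHERE age >= 29", gold))
print(execution_match("SELECT name FROM singer WHERE age > 40", gold))
```

Note that execution match only detects that a query is wrong, not why, which is where an error taxonomy like the one above comes in.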

Citations: 0
Man and the Machine: Effects of AI-assisted Human Labeling on Interactive Annotation of Real-Time Video Streams
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-02-29 | DOI: 10.1145/3649457
Marko Radeta, Ruben Freitas, Claudio Rodrigues, Agustin Zuniga, Ngoc Thi Nguyen, Huber Flores, Petteri Nurmi

AI-assisted interactive annotation is a powerful way to facilitate data annotation, a prerequisite for constructing robust AI models. While AI-assisted interactive annotation has been extensively studied in static settings, less is known about its usage in dynamic scenarios where the annotators operate under time and cognitive constraints, e.g., while detecting suspicious or dangerous activities from real-time surveillance feeds. Understanding how AI can assist annotators in these tasks and facilitate consistent annotation is paramount to ensure high performance for AI models trained on these data. We address this gap in interactive machine learning (IML) research, contributing an extensive investigation of the benefits, limitations, and challenges of AI-assisted annotation in dynamic application use cases. We address both the effects of AI on annotators and the effects of (AI) annotations on the performance of AI models trained on annotated data in real-time video annotation. We conduct extensive experiments that compare annotation performance at two annotator levels (expert and non-expert) and two interactive labelling techniques (with and without AI assistance). In a controlled study with N = 34 annotators and a follow-up study with 51,963 images and their annotation labels being input to the AI model, we demonstrate that the benefits of AI-assisted models are greatest for non-expert users and for cases where targets are only partially or briefly visible. The expert users tend to outperform or achieve performance similar to the AI model. Labels combining AI and expert annotations result in the best overall performance, as the AI reduces overflow and latency in the expert annotations. We derive guidelines for the use of AI-assisted human annotation in real-time dynamic use cases.
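One simple, hypothetical way to combine the two label sources studied above is a confidence-gated merge: accept the AI label when its confidence clears a threshold, otherwise defer to the human annotator. The labels, confidences, and threshold below are invented, and this is a sketch of the general idea rather than the paper’s pipeline:

```python
def combine(ai_label, ai_conf, human_label, threshold=0.9):
    """Trust the AI label only when its confidence clears the threshold."""
    return ai_label if ai_conf >= threshold else human_label

# Invented video frames: (ai_label, ai_confidence, human_label).
frames = [
    ("whale", 0.97, "whale"),
    ("boat",  0.55, "dolphin"),   # low confidence: defer to the human
    ("whale", 0.92, "dolphin"),   # high confidence: AI label wins
]
merged = [combine(a, c, h) for a, c, h in frames]
print(merged)
```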

Citations: 0
Talk2Data: A Natural Language Interface for Exploratory Visual Analysis via Question Decomposition
IF 3.4 | CAS Tier 4, Computer Science | Q2 Computer Science | Pub Date: 2024-02-07 | DOI: 10.1145/3643894
Yi Guo, Danqing Shi, Mingjuan Guo, Yanqiu Wu, Nan Cao, Qing Chen

Through a natural language interface (NLI) for exploratory visual analysis, users can directly “ask” analytical questions about the given tabular data. This process greatly improves user experience and lowers the technical barriers of data analysis. Existing techniques focus on generating a visualization from a concrete question. However, complex questions, requiring multiple data queries and visualizations to answer, are frequently asked in data exploration and analysis, and cannot be easily solved with the existing techniques. To address this issue, in this paper, we introduce Talk2Data, a natural language interface for exploratory visual analysis that supports answering complex questions. It leverages an advanced deep-learning model to resolve complex questions into a series of simple questions that could gradually elaborate on the users’ requirements. To present answers, we design a set of annotated and captioned visualizations to represent the answers in a form that supports interpretation and narration. We conducted an ablation study and a controlled user study to evaluate Talk2Data’s effectiveness and usefulness.
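The decomposition step can be illustrated, very roughly, with a rule-based toy (the actual system uses a deep-learning model): a comparison question is split into one simple question per compared entity. The pattern and question below are invented:

```python
import re

def decompose(question):
    """Split a 'compare X and Y by Z' question into two simple questions."""
    m = re.match(r"compare (\w+) and (\w+) by (\w+)", question.lower())
    if m:
        a, b, metric = m.groups()
        return [f"what is the {metric} of {a}?",
                f"what is the {metric} of {b}?"]
    return [question]  # already simple: pass through unchanged

subquestions = decompose("Compare Tesla and Toyota by revenue")
print(subquestions)
```

Each sub-question could then be answered with its own query and chart, which is what makes the composite answer narratable.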

Citations: 0
Entity Footprinting: Modeling Contextual User States via Digital Activity Monitoring
IF 3.4 CAS Zone 4 (Computer Science) Q2 Computer Science Pub Date: 2024-02-05 DOI: 10.1145/3643893
Zeinab R. Yousefi, Tung Vuong, Marie AlGhossein, Tuukka Ruotsalo, Giulio Jaccuci, Samuel Kaski

Our digital life consists of activities that are organized around tasks and exhibit different user states in the digital contexts around these activities. Previous work has shown that digital activity monitoring can be used to predict the entities that users will need to perform digital tasks. Methods have been developed to automatically detect a user’s tasks. However, these studies typically support only specific applications and tasks, and relatively little research has been conducted on real-life digital activities. This paper introduces user state modeling and prediction with contextual information captured as entities, recorded from real-world digital user behavior, called entity footprinting: a system that records users’ digital activities on their screens and proactively provides useful entities across application boundaries without requiring explicit query formulation. Our methodology detects contextual user states using latent representations of the entities occurring in digital activities. Using topic models and recurrent neural networks, the model learns the latent representation of concurrent entities and their sequential relationships. We report a field study in which the digital activities of thirteen people were recorded continuously for 14 days. The model learned from this data is used to 1) predict contextual user states, and 2) predict relevant entities for the detected states. The results show improved user state detection accuracy and entity prediction performance compared to static, heuristic, and basic topic models. Our findings have implications for the design of proactive recommendation systems that can implicitly infer users’ contextual state by monitoring their digital activities and proactively recommending the right information at the right time.
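As a minimal stand-in for the sequence model (the paper uses topic models and recurrent neural networks over latent entity representations), a first-order transition table over recorded entity streams already captures the "which entity tends to follow which" signal:

```python
from collections import Counter, defaultdict

# First-order (bigram) transition counts over entity streams -- a toy
# stand-in for the paper's topic-model + RNN pipeline.

def fit_transitions(streams):
    """Count how often each entity is followed by each other entity."""
    trans = defaultdict(Counter)
    for stream in streams:
        for cur, nxt in zip(stream, stream[1:]):
            trans[cur][nxt] += 1
    return trans

def predict_next(trans, entity):
    """Most frequent follower of `entity`, or None if never observed."""
    followers = trans.get(entity)
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical activity logs: one list of on-screen entities per session.
logs = [["editor", "terminal", "browser"],
        ["editor", "terminal", "docs"],
        ["mail", "calendar"]]
model = fit_transitions(logs)
print(predict_next(model, "editor"))  # 'terminal' (observed twice after 'editor')
```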

Citations: 0
Predicting Group Choices from Group Profiles
IF 3.4 CAS Zone 4 (Computer Science) Q2 Computer Science Pub Date: 2024-01-10 DOI: 10.1145/3639710
Hanif Emamgholizadeh, Amra Delić, Francesco Ricci

Group recommender systems (GRSs) identify items to recommend to a group of people by aggregating the group members’ individual preferences into a group profile and selecting the items with the largest scores in that profile. The GRS predicts that these recommendations would be chosen by the group, by assuming that the group applies the same preference aggregation strategy as the one adopted by the GRS. However, predicting the choice of a group is more complex, since the GRS does not know the exact preference aggregation strategy that the group is going to use.
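Two such aggregation strategies are easy to state (illustrative examples of the kind of standard strategy the paper refers to; its exact baselines may differ): average the members' ratings per item, or take each item's minimum rating ("least misery"), then recommend the item with the best aggregated score.

```python
# Two standard preference-aggregation strategies, sketched for a group
# whose members all rate the same candidate items (illustrative only).

def average_strategy(ratings):
    """Group profile = mean rating per item; recommend the argmax."""
    items = ratings[0]
    profile = {i: sum(r[i] for r in ratings) / len(ratings) for i in items}
    return max(profile, key=profile.get)

def least_misery_strategy(ratings):
    """Group profile = minimum rating per item; recommend the argmax."""
    items = ratings[0]
    profile = {i: min(r[i] for r in ratings) for i in items}
    return max(profile, key=profile.get)

# Three hypothetical members: the two strategies disagree here, which is
# exactly why the group's true aggregation strategy matters.
group = [{"a": 5, "b": 3}, {"a": 5, "b": 3}, {"a": 2, "b": 4}]
print(average_strategy(group), least_misery_strategy(group))  # a b
```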

To this end, the aim of this paper is to validate the research hypothesis that, by using a machine learning approach and a data set of observed group choices, it is possible to predict a group’s final choice, better than by using a standard preference aggregation strategy. Inspired by the Decision Scheme theory, which first tried to address the group choice prediction problem, we search for a group profile definition that, in conjunction with a machine learning model, can be used to accurately predict a group choice. Moreover, to cope with the data scarcity problem, we propose two data augmentation methods, which add synthetic group profiles to the training data, and we hypothesize they can further improve the choice prediction accuracy.
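One plausible way to add synthetic group profiles to the training data (a hypothetical illustration only; the paper defines its own two augmentation methods) is to jitter existing profiles with small random noise:

```python
import random

# Hypothetical augmentation sketch: create synthetic group profiles by
# adding small noise to real ones. The paper proposes two specific
# augmentation methods; this jittered variant only illustrates the idea.

def augment(profiles, copies=2, noise=0.5, seed=0):
    """Return the real profiles plus `copies` noisy clones of each."""
    rng = random.Random(seed)
    synthetic = [[v + rng.uniform(-noise, noise) for v in p]
                 for p in profiles for _ in range(copies)]
    return profiles + synthetic

train = [[4.0, 2.0, 5.0], [1.0, 3.0, 3.5]]
print(len(augment(train)))  # 6: 2 real profiles + 2 x 2 synthetic ones
```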

We validate our research hypotheses using a data set containing 282 participants organized in 79 groups. The experiments indicate that the proposed method outperforms baseline aggregation strategies when used for group choice prediction. The method is robust in the presence of missing preference data and achieves performance superior to what humans achieve on the group choice prediction task. Finally, the proposed data augmentation method can further improve the prediction accuracy. Our approach can be exploited in novel GRSs to identify the items that a group is likely to choose and to help groups make even better and fairer choices.

Citations: 0