Proceedings of the 28th International Conference on Intelligent User Interfaces: latest papers

An Empirical Study of Model Errors and User Error Discovery and Repair Strategies in Natural Language Database Queries

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-27 DOI: 10.1145/3581641.3584067

Zheng Ning, Zheng Zhang, Tianyi Sun, Yuan Tian, Tianyi Zhang, Toby Jia-Jun Li

Recent advances in machine learning (ML) and natural language processing (NLP) have led to significant improvement in natural language interfaces for structured databases (NL2SQL). Despite these strides, the overall accuracy of NL2SQL models is still far from perfect (about 75% on the Spider benchmark). In practice, this requires users of NL2SQL models to discern incorrect SQL queries generated by a model and fix them manually. Currently, there is no comprehensive understanding of the common errors in auto-generated SQL queries or of effective strategies to recognize and fix such errors. To bridge the gap, we (1) performed an in-depth analysis of errors made by three state-of-the-art NL2SQL models; (2) distilled a taxonomy of NL2SQL model errors; and (3) conducted a within-subjects user study with 26 participants to investigate the effectiveness of three representative interactive mechanisms for error discovery and repair in NL2SQL. Findings from this paper shed light on the design of future error discovery and repair strategies for natural language data query interfaces.
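As a rough illustration of what an incorrect generated query looks like in practice, the sketch below runs a model-generated query and a reference query against the same database and compares the returned rows. This is only one simple way to flag a wrong query, not the error taxonomy or the interactive mechanisms studied in the paper; the `singer` table, the three queries, and the helper `same_result` are all hypothetical.

```python
import sqlite3


def same_result(conn, predicted_sql, reference_sql):
    """Return True if both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A generated query that fails to execute is treated as incorrect.
        return False
    reference = conn.execute(reference_sql).fetchall()
    return sorted(predicted) == sorted(reference)


# Hypothetical toy database standing in for a benchmark schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (name TEXT, age INTEGER);
    INSERT INTO singer VALUES ('Ann', 30), ('Bob', 45), ('Cy', 52);
    """
)

reference = "SELECT name FROM singer WHERE age > 40"
good_prediction = "SELECT name FROM singer WHERE age >= 41"  # different text, same rows
bad_prediction = "SELECT name FROM singer WHERE age < 40"    # wrong comparison operator

print(same_result(conn, good_prediction, reference))
print(same_result(conn, bad_prediction, reference))
```

Comparing results rather than query text tolerates harmless syntactic differences, but it needs a reference query, which end users normally do not have; that gap is what makes user-facing error discovery necessary.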

Citations: 7

Interactive User Interface for Dialogue Summarization

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-27 DOI: 10.1145/3581641.3584057

Jeesu Jung, H. Seo, S. Jung, Riwoo Chung, Hwijung Ryu, Du-Seong Chang

Summarization is an important natural language processing task used to distill information. Recently, sequence-to-sequence methods have been generally applied to summarization tasks. The problem is that a large amount of domain-specific information must be learned in pre-training, and information other than the input statements cannot be utilized. To compensate for this shortcoming, controllable summarization has recently been in the spotlight. We introduced three properties into controllable summarization: 1) a new human-machine communication input format, 2) a robust constraint-sensitive summarization method for these formats, and 3) a practical interactive summarization interface available to the user. Experiments on the Wizard-of-Wikipedia dataset show that applying this input format and the constraint-sensitive method enhances summarization performance compared to the typical method. A user study shows that the interactive summarization interface is practical and that participants evaluated it positively.

Citations: 0

Enabling Goal-Focused Exploration of Podcasts in Interactive Recommender Systems

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-27 DOI: 10.1145/3581641.3584032

Yu Liang, Aditya Ponnada, Paul Lamere, Nediyana Daskalova

Content recommender systems often rely on modeling users’ past behavioral data to provide personalized recommendations - a practice that works well for suggesting more of the same and for media that require little time investment from users, such as music tracks. However, this approach can be further optimized for media where the user investment is higher, such as podcasts, because there is a broader space of user goals that might not be captured by the implicit signals of their past behavior. Allowing users to directly specify their goals might help narrow the space of possible recommendations. Thus, in this paper, we explore how we can enable goal-focused exploration in recommender systems by leveraging explicit input from users about their personal goals. Using podcast consumption as an example use-case, and informed by a large-scale survey (N=68k), we developed GoalPods, an interactive prototype that allows users to set personal goals and build playlists of podcast episode recommendations to meet those goals. We evaluated GoalPods with 14 participants where participants set a goal and spent a week listening to the episode playlist created for that goal. From the study, we identified two types of user goals: low-involvement (e.g. “combat boredom”) and high-involvement (e.g. “learn something new”) goals. Users found it easy to identify relevant recommendations for low-involvement goals, but they needed more structure and support to set high-involvement goals. By anchoring users on their personal goals to explore recommendations, GoalPods (and goal-focused podcast consumption) led to insightful content discovery outside the users’ filter bubbles. Based on our findings, we discuss opportunities for designing recommender systems that guide exploration via interactive goal-setting as well as implications for providing better recommendations by accounting for users’ personal goals.

Citations: 5

AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-27 DOI: 10.1145/3581641.3584049

Y. Prakash, Mohan Sunkara, H. Lee, S. Jayarathna, V. Ashok

Web data items such as shopping products, classifieds, and job listings are indispensable components of most e-commerce websites. The information on the data items is typically distributed over two or more webpages, e.g., a ‘Query-Results’ page showing the summaries of the items, and ‘Details’ pages containing full information about the items. While this organization of data mitigates information overload and visual clutter for sighted users, it increases the interaction overhead and effort for blind users, as back-and-forth navigation between webpages using screen reader assistive technology is tedious and cumbersome. Existing usability-enhancing solutions are unable to provide adequate support in this regard as they predominantly focus on enabling efficient content access within a single webpage, and as such are not tailored for content distributed across multiple webpages. As an initial step towards addressing this issue, we developed AutoDesc, a browser extension that leverages a custom extraction model to automatically detect and pull out additional item descriptions from the ‘Details’ pages, and then proactively inject the extracted information into the ‘Query-Results’ page, thereby reducing the amount of back-and-forth screen reader navigation between the two webpages. In a study with 16 blind users, we observed that within the same time duration, the participants were able to peruse significantly more data items on average with AutoDesc than with their preferred screen readers or with a state-of-the-art solution.

Citations: 0

Human-AI Collaboration: The Effect of AI Delegation on Human Task Performance and Task Satisfaction

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-16 DOI: 10.1145/3581641.3584052

Patrick Hemmer, Monika Westphal, M. Schemmer, S. Vetter, Michael Vossing, G. Satzger

Recent work has proposed artificial intelligence (AI) models that can learn to decide whether to make a prediction for an instance of a task or to delegate it to a human by considering both parties’ capabilities. In simulations with synthetically generated or context-independent human predictions, delegation can help improve the performance of human-AI teams—compared to humans or the AI model completing the task alone. However, so far, it remains unclear how humans perform and how they perceive the task when they are aware that an AI model delegated task instances to them. In an experimental study with 196 participants, we show that task performance and task satisfaction improve through AI delegation, regardless of whether humans are aware of the delegation. Additionally, we identify humans’ increased levels of self-efficacy as the underlying mechanism for these improvements in performance and satisfaction. Our findings provide initial evidence that allowing AI models to take over more management responsibilities can be an effective form of human-AI collaboration in workplaces.
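The delegation decision described above can be pictured with a deliberately simplified sketch. The models referred to in the abstract learn when to delegate; the fixed comparison below is only a stand-in for that learned decision, and the function `route_instance`, its inputs, and the example numbers are all hypothetical.

```python
def route_instance(model_confidence: float, human_accuracy: float) -> str:
    """Return "ai" if the model should predict, "human" if it should delegate."""
    # Hypothetical rule: delegate when the human is expected to be more accurate
    # on this instance than the model is confident in its own prediction.
    return "human" if human_accuracy > model_confidence else "ai"


# Hypothetical (model confidence, estimated human accuracy) pairs, one per task instance.
instances = [(0.95, 0.80), (0.55, 0.85), (0.70, 0.70)]
assignments = [route_instance(m, h) for m, h in instances]
print(assignments)
```

Ties go to the model here, an arbitrary choice made only to keep the rule deterministic.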

Citations: 1

IRIS: Interpretable Rubric-Informed Segmentation for Action Quality Assessment

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-03-16 DOI: 10.1145/3581641.3584048

Hitoshi Matsuyama, Nobuo Kawaguchi, Brian Y. Lim

AI-driven Action Quality Assessment (AQA) of sports videos can mimic Olympic judges to help score performances as a second opinion or for training. However, these AI methods are uninterpretable and do not justify their scores, even though justification is important for algorithmic accountability. Indeed, to account for their decisions, instead of scoring subjectively, sports judges apply a consistent set of criteria (a rubric) to multiple actions in each performance sequence. Therefore, we propose IRIS to perform Interpretable Rubric-Informed Segmentation on action sequences for AQA. We investigated IRIS for scoring videos of figure skating performance. IRIS predicts (1) action segments, (2) technical element score differences of each segment relative to base scores, (3) multiple program component scores, and (4) the summed final score. In a modeling study, we found that IRIS performs better than non-interpretable, state-of-the-art models. In a formative user study, practicing figure skaters agreed with the rubric-informed explanations, found them useful, and trusted AI judgments more. This work highlights the importance of using judgment rubrics to account for AI decisions.
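The four predicted quantities combine into a final score by summation, which can be sketched as follows. This is a minimal illustration and not the IRIS model: it assumes the final score is simply the sum of each segment's base score plus its predicted difference, plus the program component scores, and it ignores any weighting factors or deductions a real judging system may apply. The `Segment` class, the element names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str          # hypothetical name of the technical element
    base_score: float   # rubric base value for the element
    score_diff: float   # predicted difference relative to the base score


def final_score(segments, component_scores):
    """Sum technical element scores and program component scores."""
    technical = sum(s.base_score + s.score_diff for s in segments)
    components = sum(component_scores)
    return technical + components


# Hypothetical predictions for a two-element performance.
segments = [Segment("jump", 5.0, 0.5), Segment("spin", 3.0, -0.5)]
component_scores = [8.0, 7.5]
total = final_score(segments, component_scores)
print(total)  # 23.5
```

Because every term in the sum is attached to a named segment or component, the total can be traced back to individual rubric items, which is the sense in which a rubric-based score is interpretable.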

Citations: 1

Steering Recommendations and Visualising Its Impact: Effects on Adolescents’ Trust in E-Learning Platforms

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-02-28 DOI: 10.1145/3581641.3584046

Jeroen Ooge, L. Dereu, K. Verbert

Researchers have widely acknowledged the potential of control mechanisms with which end-users of recommender systems can better tailor recommendations. However, few e-learning environments so far incorporate such mechanisms, for example for steering recommended exercises. In addition, studies with adolescents in this context are rare. To address these limitations, we designed a control mechanism and a visualisation of the control’s impact through an iterative design process with adolescents and teachers. Then, we investigated how these functionalities affect adolescents’ trust in an e-learning platform that recommends maths exercises. A randomised controlled experiment with 76 middle school and high school adolescents showed that visualising the impact of exercised control significantly increases trust. Furthermore, having control over their mastery level seemed to inspire adolescents to reasonably challenge themselves and reflect upon the underlying recommendation algorithm. Finally, a significant increase in perceived transparency suggested that visualising steering actions can indirectly explain why recommendations are suitable, which opens interesting research tracks for the broader field of explainable AI.

Citations: 1

Addressing UX Practitioners’ Challenges in Designing ML Applications: an Interactive Machine Learning Approach

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-02-23 DOI: 10.1145/3581641.3584064

K. J. Kevin Feng, David W. Mcdonald

UX practitioners face novel challenges when designing user interfaces for machine learning (ML)-enabled applications. Interactive ML paradigms, like AutoML and interactive machine teaching, lower the barrier for non-expert end users to create, understand, and use ML models, but their application to UX practice is largely unstudied. We conducted a task-based design study with 27 UX practitioners where we asked them to propose a proof-of-concept design for a new ML-enabled application. During the task, our participants were given opportunities to create, test, and modify ML models as part of their workflows. Through a qualitative analysis of our post-task interview, we found that direct, interactive experimentation with ML allowed UX practitioners to tie ML capabilities and underlying data to user goals, compose affordances to enhance end-user interactions with ML, and identify ML-related ethical risks and challenges. We discuss our findings in the context of previously established human-AI guidelines. We also identify some limitations of interactive ML in UX processes and propose research-informed machine teaching as a supplement to future design tools alongside interactive ML.

Citations: 1

Directive Explanations for Monitoring the Risk of Diabetes Onset: Introducing Directive Data-Centric Explanations and Combinations to Support What-If Explorations

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-02-21 DOI: 10.1145/3581641.3584075

Aditya Bhattacharya, Jeroen Ooge, G. Štiglic, K. Verbert

Explainable artificial intelligence is increasingly used in machine learning (ML)-based decision-making systems in healthcare. However, little research has compared the utility of different explanation methods in guiding healthcare experts in patient care. Moreover, it is unclear how useful, understandable, actionable, and trustworthy these methods are for healthcare experts, as they often require technical ML knowledge. This paper presents an explanation dashboard that predicts the risk of diabetes onset and explains those predictions with data-centric, feature-importance, and example-based explanations. We designed an interactive dashboard to assist healthcare experts, such as nurses and physicians, in monitoring the risk of diabetes onset and recommending measures to minimize risk. We conducted a qualitative study with 11 healthcare experts and a mixed-methods study with 45 healthcare experts and 51 diabetic patients to compare the different explanation methods in our dashboard in terms of understandability, usefulness, actionability, and trust. Results indicate that our participants preferred our representation of data-centric explanations, which provide local explanations with a global overview, over other methods. Therefore, this paper highlights the importance of visually directive data-centric explanation methods in helping healthcare experts gain actionable insights from patient health records. Furthermore, we share our design implications for tailoring the visual representation of different explanation methods for healthcare experts.

Citations: 2

AutoDOViz: Human-Centered Automation for Decision Optimization

Proceedings of the 28th International Conference on Intelligent User Interfaces

Pub Date : 2023-02-19 DOI: 10.1145/3581641.3584094

D. Weidele, S. Afzal, Abel N. Valente, Cole Makuch, Owen Cornec, Long Vu, D. Subramanian, Werner Geyer, Rahul Nair, Inge Vejsbjerg, Radu Marinescu, Paulito Palmes, Elizabeth M. Daly, Loraine Franke, D. Haehn

We present AutoDOViz, an interactive user interface for automated decision optimization (AutoDO) using reinforcement learning (RL). Decision optimization (DO) has classically been practiced by dedicated DO researchers [43], where experts need to spend long periods of time fine-tuning a solution through trial and error. AutoML pipeline search has sought to make it easier for a data scientist to find the best machine learning pipeline by leveraging automation to search and tune the solution. More recently, these advances have been applied to the domain of AutoDO [36], with the similar goal of finding the best reinforcement learning pipeline through algorithm selection and parameter tuning. However, decision optimization requires a significantly more complex problem specification than an ML problem. AutoDOViz seeks to lower the barrier to entry for data scientists in problem specification for reinforcement learning problems, to leverage the benefits of AutoDO algorithms for RL pipeline search, and finally to create visualizations and policy insights that support the typically interactive communication of problem formulations and solution proposals between DO experts and domain experts. In this paper, we report our findings from semi-structured expert interviews with DO practitioners as well as business consultants, leading to design requirements for human-centered automation for DO with RL. We evaluate a system implementation with data scientists and find that they are significantly more open to engaging in DO after using our proposed solution. AutoDOViz further increases trust in RL agent models and makes the automated training and evaluation process more comprehensible. As shown for other automation in ML tasks [33, 59], we also conclude that automation of RL for DO can benefit from the user, and vice versa, when the interface promotes a human-in-the-loop approach.

Citations: 0