
ACM Transactions on Interactive Intelligent Systems: Latest Publications

Enabling Efficient Web Data-Record Interaction for People with Visual Impairments via Proxy Interfaces
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2023-01-10 · DOI: 10.1145/3579364
Javedul Ferdous, Hae-Na Lee, Sampath Jayarathna, Vikas Ashok

Web data records are usually accompanied by auxiliary webpage segments, such as filters, sort options, search forms, and multi-page links, that enhance interaction efficiency and convenience for end users. However, blind and visually impaired (BVI) persons are presently unable to fully exploit these auxiliary segments the way their sighted peers can, since the segments are scattered across the screen and the assistive technologies BVI users rely on, i.e., screen readers and screen magnifiers, are not geared for efficient interaction with such scattered content. Specifically, for blind screen reader users, content navigation is predominantly one-dimensional despite the support for skipping content, so navigating back and forth between different parts of a webpage is tedious and frustrating. Similarly, low vision screen magnifier users have to continuously pan between different portions of a webpage, given that content enlargement leaves only a portion of the screen viewable at any instant. Extant techniques for overcoming inefficient web interaction for BVI users have mostly focused on general web-browsing activities, and as such provide little to no support for data record-specific interaction activities such as filtering and sorting – activities that are equally important for facilitating quick and easy access to desired data records. To fill this void, we present InSupport, a browser extension that: (i) employs custom machine learning-based algorithms to automatically extract auxiliary segments on any webpage containing data records; and (ii) provides an instantly accessible proxy one-stop interface for easily navigating the extracted auxiliary segments using either basic keyboard shortcuts or mouse actions. Evaluation studies with 14 blind participants and 16 low vision participants showed significant improvement in web usability with InSupport, driven by reduced interaction time and user effort compared to state-of-the-art solutions.

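The paper itself includes no code, but to make the first component concrete, here is a minimal Python sketch of auxiliary-segment classification under stated assumptions: the feature set, labels, and classifier below are hypothetical placeholders, not InSupport's actual features or models.

```python
# Minimal sketch (not InSupport's actual code): classify candidate webpage
# segments into auxiliary-segment types from hand-crafted DOM features.
# Feature names, example values, and labels are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: features of one candidate DOM segment, e.g.
# [num_links, num_checkboxes, num_selects, avg_text_len, rel_x, rel_y]
X = np.array([
    [2, 8, 0, 12.0, 0.05, 0.30],   # looks like a filter panel
    [5, 0, 1, 6.0, 0.70, 0.10],    # looks like a sort dropdown
    [12, 0, 0, 3.0, 0.50, 0.95],   # looks like multi-page (pagination) links
    [1, 0, 0, 20.0, 0.50, 0.05],   # looks like a search form
])
y = np.array(["filter", "sort", "pagination", "search"])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)  # in practice, trained on many labeled segments

def classify_segment(features):
    """Return the predicted auxiliary-segment type for one DOM segment."""
    return clf.predict([features])[0]

print(classify_segment([10, 0, 0, 4.0, 0.5, 0.9]))  # likely "pagination"
```

A proxy interface like the one described would then group the classified segments into a single keyboard-navigable panel, rather than leaving them at their original scattered screen positions.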
Citations: 0
The Impact of Intelligent Pedagogical Agents’ Interventions on Student Behavior and Performance in Open-Ended Game Design Environments
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2023-01-04 · DOI: 10.1145/3578523
Özge Nilay Yalçın, Sébastien Lallé, Cristina Conati

Research has shown that free-form Game-Design (GD) environments can be very effective in fostering Computational Thinking (CT) skills at a young age. However, some students may still need guidance during the learning process due to the highly open-ended nature of these environments. Intelligent Pedagogical Agents (IPAs) can provide personalized assistance in real time to alleviate this challenge. This paper presents our results in evaluating such an agent deployed in Unity-CT, a real-world free-form GD learning environment for fostering CT in early K-12 education. We focus on the effect of repetition by comparing student behaviors across no-intervention, 1-shot, and repeated-intervention groups for two different errors known to be challenging in the online lessons of Unity-CT. Our findings showed that the agent was perceived very positively by the students, and the repeated intervention showed promising results in helping students make fewer errors and exhibit more correct behaviors, albeit only for one of the two target errors. Building on these results, we provide insights on how to provide IPA interventions in free-form GD environments.

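The abstract does not describe the agent's triggering logic, so the following Python sketch only illustrates the three intervention conditions being compared (none, 1-shot, repeated); the class, method names, and hint text are invented for illustration.

```python
# Hypothetical sketch of the three intervention conditions in the study.
# The actual Unity-CT agent logic is not given in the abstract; everything
# concrete here is an assumption for illustration.
from dataclasses import dataclass

@dataclass
class InterventionPolicy:
    mode: str            # "none", "one_shot", or "repeated"
    delivered: int = 0   # hints already shown for this error type

    def should_intervene(self, error_detected: bool) -> bool:
        if not error_detected or self.mode == "none":
            return False
        if self.mode == "one_shot":
            return self.delivered == 0   # intervene only on first detection
        return True                      # "repeated": intervene every time

    def deliver_hint(self, hint: str) -> None:
        self.delivered += 1
        print(f"[agent] {hint}")

policy = InterventionPolicy(mode="repeated")
for step, err in enumerate([False, True, True, False, True]):
    if policy.should_intervene(err):
        policy.deliver_hint(f"Step {step}: check the target error condition.")
```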
Citations: 0
Visualization and Visual Analytics Approaches for Image and Video Datasets: A Survey
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2023-01-02 · DOI: 10.1145/3576935
S. Afzal, Sohaib Ghani, M. Hittawe, Sheikh Faisal Rashid, O. Knio, M. Hadwiger, I. Hoteit
Image and video data analysis has become an increasingly important research area with applications in different domains such as security surveillance, healthcare, augmented and virtual reality, video and image editing, activity analysis and recognition, synthetic content generation, distance education, telepresence, remote sensing, sports analytics, art, non-photorealistic rendering, search engines, and social media. Recent advances in Artificial Intelligence (AI) and particularly deep learning have sparked new research challenges and led to significant advancements, especially in image and video analysis. These advancements have also resulted in significant research and development in other areas such as visualization and visual analytics, and have created new opportunities for future lines of research. In this survey article, we present the current state of the art at the intersection of visualization and visual analytics, and image and video data analysis. We categorize the visualization articles included in our survey based on different taxonomies used in visualization and visual analytics research. We review these articles in terms of task requirements, tools, datasets, and application areas. We also discuss insights based on our survey results, trends and patterns, the current focus of visualization research, and opportunities for future research.
Citations: 3
Special Issue on Highlights of IUI 2021: Introduction
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2022-12-31 · DOI: 10.1145/3561516
T. Hammond, Bart P. Knijnenburg, J. O’Donovan, Paul Taele
…degree of illocution results in the generation of more usable explanations. The authors evaluated their hypothesis on two…
Citations: 0
Learning and Understanding User Interface Semantics from Heterogeneous Networks with Multimodal and Positional Attributes
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2022-12-23 · DOI: 10.1145/3578522
Gary Ang, Ee-Peng Lim

User interfaces (UIs) of desktop, web, and mobile applications involve a hierarchy of objects (e.g., applications, screens, view classes, and other types of design objects) with multimodal (e.g., textual and visual) and positional (e.g., spatial location, sequence order, and hierarchy level) attributes. We can therefore represent a set of application UIs as a heterogeneous network with multimodal and positional attributes. Such a network not only represents how users understand the visual layout of UIs, but also influences how users interact with applications through those UIs. To model UI semantics well for different UI annotation, search, and evaluation tasks, this paper proposes the novel Heterogeneous Attention-based Multimodal Positional (HAMP) graph neural network model. HAMP combines graph neural networks with the scaled dot-product attention used in transformers to learn the embeddings of heterogeneous nodes and associated multimodal and positional attributes in a unified manner. HAMP is evaluated with classification and regression tasks conducted on three distinct real-world datasets. Our experiments demonstrate that HAMP significantly outperforms other state-of-the-art models on such tasks. To further interpret the contribution of heterogeneous network information to understanding the relationships between UI structure and prediction tasks, we propose Adaptive HAMP (AHAMP), which adaptively learns the importance of the different edges linking UI objects. Our experiments demonstrate AHAMP’s superior performance over HAMP on a number of tasks, and its ability to interpret the contribution of multimodal and positional attributes, as well as heterogeneous network information, to different tasks.

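As a reference point for the attention mechanism named above, here is a minimal Python (PyTorch) sketch of scaled dot-product attention applied to one UI node and its heterogeneous neighbors. It is an illustrative reimplementation, not the authors' code; the dimensions, random embeddings, and single-head projections are assumptions.

```python
# Sketch of transformer-style scaled dot-product attention over a node's
# heterogeneous neighbors (illustrative, not HAMP's published code).
import torch
import torch.nn.functional as F

d = 64                              # embedding dimension (assumed)
node = torch.randn(1, d)            # target UI object, e.g., a screen
neighbors = torch.randn(5, d)       # mixed neighbor types (text, image, ...)

# Projection matrices; in a trained model these are learned parameters.
W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))

Q = node @ W_q                      # (1, d) query from the target node
K = neighbors @ W_k                 # (5, d) keys from the neighbors
V = neighbors @ W_v                 # (5, d) values from the neighbors

scores = (Q @ K.T) / d ** 0.5       # scaled dot-product scores, shape (1, 5)
attn = F.softmax(scores, dim=-1)    # attention weights over neighbors
updated = attn @ V                  # aggregated neighbor message, (1, d)
print(attn)                         # per-neighbor contribution weights
```

The per-neighbor weights are also what makes an adaptive variant like AHAMP interpretable: inspecting them shows which edges (and thus which UI objects) drive a given prediction.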
Citations: 0
Detection and Recognition of Driver Distraction Using Multimodal Signals
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2022-12-12 · DOI: 10.1145/3519267
Kapotaksha Das, Michalis Papakostas, Kais Riani, Andrew Gasiorowski, Mohamed Abouelenien, Mihai Burzo, Rada Mihalcea

Distracted driving is a leading cause of accidents worldwide. The tasks of distraction detection and recognition have traditionally been addressed as computer vision problems. However, distracted behaviors are not always expressed in a visually observable way. In this work, we introduce a novel multimodal dataset of distracted driver behaviors, consisting of data collected over twelve information channels spanning visual, acoustic, near-infrared, thermal, physiological, and linguistic modalities. The data were collected from 45 subjects while they were exposed to four different distractions (three cognitive and one physical). For the purposes of this paper, we performed experiments with visual, physiological, and thermal information to explore the potential of multimodal modeling for distraction recognition. In addition, we analyze the value of the different modalities by identifying the specific visual, physiological, and thermal feature groups that contribute the most to distraction characterization. Our results highlight the advantage of multimodal representations and reveal valuable insights into the role the three modalities play in identifying different types of driving distractions.

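To illustrate one common way such modalities can be combined, here is a Python sketch of simple feature-level (concatenation) fusion on synthetic data; the paper's actual features and models may differ, and all shapes, feature meanings, and the classifier choice below are assumptions.

```python
# Illustrative feature-level fusion of visual, physiological, and thermal
# features for binary distraction classification (synthetic data only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200  # number of synthetic time windows

# Per-modality feature vectors extracted per window (dimensions invented).
visual = rng.normal(size=(n, 16))    # e.g., gaze / head-pose statistics
physio = rng.normal(size=(n, 8))     # e.g., heart rate, skin conductance
thermal = rng.normal(size=(n, 4))    # e.g., facial temperature regions
y = rng.integers(0, 2, size=n)       # 0 = attentive, 1 = distracted

# Early fusion: concatenate modality features, then train one classifier.
X = np.hstack([visual, physio, thermal])
clf = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
print("held-out accuracy:", clf.score(X[150:], y[150:]))
```

With real labeled windows, comparing this fused model against single-modality baselines is one straightforward way to quantify the advantage of multimodal representations the abstract reports.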
Citations: 0
On the Importance of User Backgrounds and Impressions: Lessons Learned from Interactive AI Applications
IF 3.4 · Computer Science (CAS Tier 4, Q2) · Pub Date: 2022-12-12 · DOI: 10.1145/3531066
Mahsan Nourani, Chiradeep Roy, Jeremy E. Block, Donald R. Honeycutt, Tahrima Rahman, Eric D. Ragan, Vibhav Gogate

While EXplainable Artificial Intelligence (XAI) approaches aim to improve human-AI collaborative decision-making by improving model transparency and mental model formation, experiential factors associated with human users can cause challenges in ways system designers do not anticipate. In this article, we first showcase a user study on how anchoring bias can affect mental model formation when users initially interact with an intelligent system, and on the role of explanations in addressing this bias. Using a video activity recognition tool in the cooking domain, we asked participants to verify whether a set of kitchen policies was being followed, with each policy focusing on a weakness or a strength. We controlled the order of the policies and the presence of explanations to test our hypotheses. Our main finding shows that those who observed system strengths early on were more prone to automation bias and made significantly more errors due to positive first impressions of the system, even though they built a more accurate mental model of the system’s competencies. In contrast, those who encountered weaknesses earlier made significantly fewer errors, since they tended to rely more on themselves, while also underestimating the model’s competencies due to a more negative first impression. Motivated by these findings and similar existing work, we formalize and present a conceptual model of users’ past experiences that examines the relations between users’ backgrounds, experiences, and human factors in XAI systems as a function of usage time. Our work presents strong findings and implications, aiming to raise AI designers’ awareness of biases associated with user impressions and backgrounds.

Citations: 0