Yasasi Abeysinghe, Bhanuka Mahanama, Gavindya Jayawardena, Yasith Jayawardana, Mohan Sunkara, Andrew T. Duchowski, Vikas Ashok, S. Jayarathna
Understanding how individuals focus and perform visual searches during collaborative tasks can help improve user engagement. Eye tracking measures provide informative cues for such understanding. This article presents A-DisETrac, an advanced analytic dashboard for distributed eye tracking. It uses off-the-shelf eye trackers to monitor multiple users in parallel, compute both traditional and advanced gaze measures in real time, and display them on an interactive dashboard. In two pilot studies, the system was evaluated in terms of user experience and utility and compared with existing work. Moreover, the system was used to study how advanced gaze measures such as the ambient-focal coefficient K and the real-time index of pupillary activity relate to collaborative behavior. The time a group took to complete a puzzle was related to the quantified ambient visual scanning behavior: groups that spent more time showed more scanning behavior. User experience questionnaire results suggest that the dashboard provides a comparatively good user experience.
{"title":"A-DisETrac Advanced Analytic Dashboard for Distributed Eye Tracking","authors":"Yasasi Abeysinghe, Bhanuka Mahanama, Gavindya Jayawardena, Yasith Jayawardana, Mohan Sunkara, Andrew T. Duchowski, Vikas Ashok, S. Jayarathna","doi":"10.4018/IJMDEM.341792","DOIUrl":"https://doi.org/10.4018/IJMDEM.341792","url":null,"abstract":"Understanding how individuals focus and perform visual searches during collaborative tasks can help improve user engagement. Eye tracking measures provide informative cues for such understanding. This article presents A-DisETrac, an advanced analytic dashboard for distributed eye tracking. It uses off-the-shelf eye trackers to monitor multiple users in parallel, compute both traditional and advanced gaze measures in real-time, and display them on an interactive dashboard. Using two pilot studies, the system was evaluated in terms of user experience and utility, and compared with existing work. Moreover, the system was used to study how advanced gaze measures such as ambient-focal coefficient K and real-time index of pupillary activity relate to collaborative behavior. It was observed that the time a group takes to complete a puzzle is related to the ambient visual scanning behavior quantified and groups that spent more time had more scanning behavior. User experience questionnaire results suggest that their dashboard provides a comparatively good user experience.","PeriodicalId":445080,"journal":{"name":"International Journal of Multimedia Data Engineering and Management","volume":"47 ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140753147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most existing wearable displays for augmented reality (AR) have only one fixed focal plane and hence can easily suffer from vergence-accommodation conflict (VAC). In contrast, light field displays allow users to focus at any depth free of VAC. This paper presents a series of text-based visual search tasks to systematically and quantitatively compare a near-eye light field AR display with a conventional AR display, specifically with regard to how participants wearing such displays perform on a virtual-real integration task. Task performance is evaluated by task completion rate and accuracy. The results show that the light field AR glasses lead to significantly higher user performance than the conventional AR glasses. In addition, 80% of the participants preferred the light field AR glasses over the conventional AR glasses for visual comfort.
{"title":"Comparison of Light Field and Conventional Near-Eye AR Displays in Virtual-Real Integration Efficiency","authors":"Wei-An Teng, Su-Ling Yeh, Homer H. Chen","doi":"10.4018/ijmdem.333609","DOIUrl":"https://doi.org/10.4018/ijmdem.333609","url":null,"abstract":"Most existing wearable displays for augmented reality (AR) have only one fixed focal plane and hence can easily suffer from vergence-accommodation conflict (VAC). In contrast, light field displays allow users to focus at any depth free of VAC. This paper presents a series of text-based visual search tasks to systematically and quantitatively compare a near-eye light field AR display with a conventional AR display, specifically in regards to how participants wearing such displays would perform on a virtual-real integration task. Task performance is evaluated by task completion rate and accuracy. The results show that the light field AR glasses lead to significantly higher user performance than the conventional AR glasses. In addition, 80% of the participants prefer the light field AR glasses over the conventional AR glasses for visual comfort.","PeriodicalId":445080,"journal":{"name":"International Journal of Multimedia Data Engineering and Management","volume":" 13","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135192165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Duleep Rathgamage Don, Jonathan Boardman, Sudhashree Sayenju, Ramazan Aygun, Yifan Zhang, Bill Franks, Sereres Johnston, George Lee, Dan Sullivan, Girish Modgil
Explainable AI (XAI) requires artificial intelligence systems to provide explanations for their decisions and actions for review. Nevertheless, for big data systems where decisions are made frequently, it is technically impossible to have an expert monitor every decision. To solve this problem, the authors propose an explainability auditing method for image recognition that assesses whether the explanations are relevant to the decision made by a black-box model and involves an expert as needed when explanations are doubtful. The explainability auditing system classifies explanations as weak or satisfactory using a local explainability model by analyzing the image segments that impacted the decision. This version of the proposed method uses LIME to generate the local explanations as superpixels. A bag of image patches is then extracted from the superpixels to determine their texture and evaluate the local explanations. Using a rooftop image dataset, the authors show that 95.7% of the cases to be audited can be detected by the proposed method.
{"title":"Automation of Explainability Auditing for Image Recognition","authors":"Duleep Rathgamage Don, Jonathan Boardman, Sudhashree Sayenju, Ramazan Aygun, Yifan Zhang, Bill Franks, Sereres Johnston, George Lee, Dan Sullivan, Girish Modgil","doi":"10.4018/ijmdem.332882","DOIUrl":"https://doi.org/10.4018/ijmdem.332882","url":null,"abstract":"XAI requires artificial intelligence systems to provide explanations for their decisions and actions for review. Nevertheless, for big data systems where decisions are made frequently, it is technically impossible to have an expert monitor every decision. To solve this problem, the authors propose an explainability auditing method for image recognition whether the explanations are relevant for the decision made by a black box model, and involve an expert as needed when explanations are doubtful. The explainability auditing system classifies explanations as weak or satisfactory using a local explainability model by analyzing the image segments that impacted the decision. This version of the proposed method uses LIME to generate the local explanations as superpixels. Then a bag of image patches is extracted from the superpixels to determine their texture and evaluate the local explanations. Using a rooftop image dataset, the authors show that 95.7% of the cases to be audited can be detected by the proposed method.","PeriodicalId":445080,"journal":{"name":"International Journal of Multimedia Data Engineering and Management","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135271392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The development of digital twins for smart city applications requires real-time monitoring and mapping of urban environments. This work develops a framework for real-time urban mapping using an airborne light detection and ranging (LIDAR) agent and a game engine. To improve the accuracy and efficiency of data acquisition and utilization, the framework focuses on the following aspects: (1) an optimal navigation strategy using Deep Q-Network (DQN) reinforcement learning, (2) multi-streamed game engines employed to visualize data of the urban environment and train the deep-learning-enabled data acquisition platform, (3) a dynamic mesh used to formulate and analyze the captured point cloud, and (4) a quantitative error analysis for points generated with the experimental aerial mapping platform, together with an accuracy analysis of post-processing. Experimental results show that the proposed DQN-enabled navigation strategy, rendering algorithm, and post-processing enable a game engine to efficiently generate a highly accurate digital twin of an urban environment.
{"title":"Adaptive Acquisition and Visualization of Point Cloud Using Airborne LIDAR and Game Engine","authors":"Chengxuan Huang, Evan Brock, Dalei Wu, Yu Liang","doi":"10.4018/ijmdem.332881","DOIUrl":"https://doi.org/10.4018/ijmdem.332881","url":null,"abstract":"The development of digital twin for smart city applications requires real-time monitoring and mapping of urban environments. This work develops a framework of real-time urban mapping using an airborne light detection and ranging (LIDAR) agent and game engine. In order to improve the accuracy and efficiency of data acquisition and utilization, the framework is focused on the following aspects: (1) an optimal navigation strategy using Deep Q-Network (DQN) reinforcement learning, (2) multi-streamed game engines employed in visualizing data of urban environment and training the deep-learning-enabled data acquisition platform, (3) dynamic mesh used to formulate and analyze the captured point-cloud, and (4) a quantitative error analysis for points generated with our experimental aerial mapping platform, and an accuracy analysis of post-processing. Experimental results show that the proposed DQN-enabled navigation strategy, rendering algorithm, and post-processing could enable a game engine to efficiently generate a highly accurate digital twin of an urban environment.","PeriodicalId":445080,"journal":{"name":"International Journal of Multimedia Data Engineering and Management","volume":"49 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136234977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhigang Zhu, Jin Chen, Lei Zhang, Yaohua Chang, Tyler Franklin, Hao Tang, Arber Ruci
The iASSIST is an iPhone-based assistive sensor solution for independent and safe travel for people who are blind or visually impaired, or those who simply face challenges in navigating an unfamiliar indoor environment. The solution integrates information from Bluetooth beacons, data connectivity, visual models, and user preferences. Hybrid models of interiors are created in a modeling stage from these multimodal data, which are collected and mapped to the floor plan as the modeler walks through the building. A client-server architecture allows scaling to large areas by lazy-loading models according to beacon signals and/or adjacent-region proximity. During the navigation stage, a user with the navigation app is localized within the floor plan using visual, connectivity, and user preference data, and guided along an optimal route to their destination. User interfaces for both modeling and navigation use multimedia channels, including visual, audio, and haptic feedback, for the targeted users. The design of human subject test experiments is also described, along with some preliminary experimental results.
{"title":"iASSIST","authors":"Zhigang Zhu, Jin Chen, Lei Zhang, Yaohua Chang, Tyler Franklin, Hao Tang, Arber Ruci","doi":"10.4018/ijmdem.2020100103","DOIUrl":"https://doi.org/10.4018/ijmdem.2020100103","url":null,"abstract":"The iASSIST is an iPhone-based assistive sensor solution for independent and safe travel for people who are blind or visually impaired, or those who simply face challenges in navigating an unfamiliar indoor environment. The solution integrates information of Bluetooth beacons, data connectivity, visual models, and user preferences. Hybrid models of interiors are created in a modeling stage with these multimodal data, collected, and mapped to the floor plan as the modeler walks through the building. Client-server architecture allows scaling to large areas by lazy-loading models according to beacon signals and/or adjacent region proximity. During the navigation stage, a user with the navigation app is localized within the floor plan, using visual, connectivity, and user preference data, along an optimal route to their destination. User interfaces for both modeling and navigation use multimedia channels, including visual, audio, and haptic feedback for targeted users. The design of human subject test experiments is also described, in addition to some preliminary experimental results.","PeriodicalId":445080,"journal":{"name":"International Journal of Multimedia Data Engineering and Management","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125343255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}