Fixation-identification in dynamic scenes: comparing an automated algorithm to manual coding

S. Munn, Leanne Stefano, J. Pelz
{"title":"Fixation-identification in dynamic scenes: comparing an automated algorithm to manual coding","authors":"S. Munn, Leanne Stefano, J. Pelz","doi":"10.1145/1394281.1394287","DOIUrl":null,"url":null,"abstract":"Video-based eye trackers produce an output video showing where a subject is looking, the subject's point-of-regard (POR), for each frame of a video of the scene. Fixation-identification algorithms simplify the long list of POR data into a more manageable set of data, especially for further analysis, by grouping PORs into fixations. Most current fixation-identification algorithms assume that the POR data are defined in static two-dimensional scene images and only use these raw POR data to identify fixations. The applicability of these algorithms to gaze data in dynamic scene videos is largely unexplored. We implemented a simple velocity-based, duration-sensitive fixation-identification algorithm and compared its performance to results obtained by three experienced users manually coding the eye tracking data displayed within the scene video such that these manual coders had knowledge of the scene motion. We performed this comparison for eye tracking data collected during two different tasks involving different types of scene motion. These two tasks included a subject walking around a building for about 100 seconds (Task 1) and a seated subject viewing a computer animation (approximately 90 seconds long, Task 2). It took our manual coders on average 75 minutes (stdev = 28) and 80 minutes (17) to code results from the first and second tasks, respectively. The automatic fixation-identification algorithm, implemented in MATLAB and run on an Apple 2.16 GHz MacBook, produced results in 0.26 seconds for Task 1 and 0.21 seconds for Task 2. For the first task (walking), the average percent difference among the three human manual coders was 9% (3.5) and the average percent difference between the automatically generated results and the three coders was 11% (2.0). For the second task (animation), the average percent difference among the three human coders was 4% (0.75) and the average percent difference between the automatically generated results and the three coders was 5% (0.9).","PeriodicalId":89458,"journal":{"name":"Proceedings APGV : ... Symposium on Applied Perception in Graphics and Visualization. Symposium on Applied Perception in Graphics and Visualization","volume":"5 1","pages":"33-42"},"PeriodicalIF":0.0000,"publicationDate":"2008-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"62","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings APGV : ... Symposium on Applied Perception in Graphics and Visualization. Symposium on Applied Perception in Graphics and Visualization","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1394281.1394287","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 62

Abstract

Video-based eye trackers produce an output video showing where a subject is looking, the subject's point-of-regard (POR), for each frame of a video of the scene. Fixation-identification algorithms simplify the long list of POR data into a more manageable set of data, especially for further analysis, by grouping PORs into fixations. Most current fixation-identification algorithms assume that the POR data are defined in static two-dimensional scene images and only use these raw POR data to identify fixations. The applicability of these algorithms to gaze data in dynamic scene videos is largely unexplored. We implemented a simple velocity-based, duration-sensitive fixation-identification algorithm and compared its performance to results obtained by three experienced users manually coding the eye tracking data displayed within the scene video such that these manual coders had knowledge of the scene motion. We performed this comparison for eye tracking data collected during two different tasks involving different types of scene motion. These two tasks included a subject walking around a building for about 100 seconds (Task 1) and a seated subject viewing a computer animation (approximately 90 seconds long, Task 2). It took our manual coders on average 75 minutes (stdev = 28) and 80 minutes (17) to code results from the first and second tasks, respectively. The automatic fixation-identification algorithm, implemented in MATLAB and run on an Apple 2.16 GHz MacBook, produced results in 0.26 seconds for Task 1 and 0.21 seconds for Task 2. For the first task (walking), the average percent difference among the three human manual coders was 9% (3.5) and the average percent difference between the automatically generated results and the three coders was 11% (2.0). For the second task (animation), the average percent difference among the three human coders was 4% (0.75) and the average percent difference between the automatically generated results and the three coders was 5% (0.9).
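The abstract describes the automated method only at a high level: a velocity-based, duration-sensitive fixation-identification algorithm applied to raw POR samples. The sketch below illustrates what such an algorithm typically looks like; it is not the authors' MATLAB implementation, and the sampling rate, velocity threshold, and minimum-duration values are illustrative assumptions rather than parameters reported in the paper.

```python
"""
Minimal sketch of velocity-based, duration-sensitive fixation identification.
This is an illustrative example, not the authors' MATLAB code; all threshold
values and the sampling rate are assumed.
"""
import numpy as np


def identify_fixations(por_xy, sample_rate_hz=60.0,
                       velocity_threshold=30.0, min_duration_s=0.1):
    """Group point-of-regard (POR) samples into fixations.

    por_xy             : (N, 2) array of POR coordinates, assumed to be in
                         degrees of visual angle (pixel data would need conversion).
    sample_rate_hz     : eye-tracker sampling rate (assumed value).
    velocity_threshold : samples moving slower than this (deg/s) are treated
                         as fixation samples (assumed value).
    min_duration_s     : candidate fixations shorter than this are discarded,
                         which is the "duration-sensitive" part (assumed value).

    Returns a list of (start_index, end_index, mean_x, mean_y) tuples.
    """
    por_xy = np.asarray(por_xy, dtype=float)
    dt = 1.0 / sample_rate_hz

    # Sample-to-sample velocity magnitude (deg/s), padded so it aligns
    # with the POR samples.
    displacement = np.linalg.norm(np.diff(por_xy, axis=0), axis=1)
    velocity = np.concatenate(([0.0], displacement / dt))

    is_slow = velocity < velocity_threshold
    min_samples = int(round(min_duration_s * sample_rate_hz))

    fixations = []
    start = None
    # Append a False sentinel so a run that reaches the last sample is closed.
    for i, slow in enumerate(np.append(is_slow, False)):
        if slow and start is None:
            start = i                      # a candidate fixation begins
        elif not slow and start is not None:
            if i - start >= min_samples:   # keep only long-enough runs
                seg = por_xy[start:i]
                fixations.append((start, i - 1, seg[:, 0].mean(), seg[:, 1].mean()))
            start = None
    return fixations
```

Note that a scheme like this thresholds the raw POR velocity only; as the abstract points out, it has no knowledge of scene motion, which is exactly what the manual coders (who watched the scene video) provide and why they serve as the reference in the comparison.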