An egocentric video and eye-tracking dataset for visual search in convenience stores

IF 4.3 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Computer Vision and Image Understanding Pub Date : 2024-08-28 DOI:10.1016/j.cviu.2024.104129
{"title":"An egocentric video and eye-tracking dataset for visual search in convenience stores","authors":"","doi":"10.1016/j.cviu.2024.104129","DOIUrl":null,"url":null,"abstract":"<div><p>We introduce an egocentric video and eye-tracking dataset, comprised of 108 first-person videos of 36 shoppers searching for three different products (orange juice, KitKat chocolate bars, and canned tuna) in a convenience store, along with the frame-centered eye fixation locations for each video frame. The dataset also includes demographic information about each participant in the form of an 11-question survey. The paper describes two applications using the dataset — an analysis of eye fixations during search in the store, and a training of a clustered saliency model for predicting saliency of viewers engaged in product search in the store. The fixation analysis shows that fixation duration statistics are very similar to those found in image and video viewing, suggesting that similar visual processing is employed during search in 3D environments and during viewing of imagery on computer screens. A clustering technique was applied to the questionnaire data, which resulted in two clusters being detected. Based on these clusters, personalized saliency prediction models were trained on the store fixation data, which provided improved performance in prediction saliency on the store video data compared to state-of-the art universal saliency prediction methods.</p></div>","PeriodicalId":50633,"journal":{"name":"Computer Vision and Image Understanding","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1077314224002108/pdfft?md5=dfee816a569ed0f626ef6b190cabb0bc&pid=1-s2.0-S1077314224002108-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Vision and Image Understanding","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1077314224002108","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

We introduce an egocentric video and eye-tracking dataset, comprised of 108 first-person videos of 36 shoppers searching for three different products (orange juice, KitKat chocolate bars, and canned tuna) in a convenience store, along with the frame-centered eye fixation locations for each video frame. The dataset also includes demographic information about each participant in the form of an 11-question survey. The paper describes two applications using the dataset — an analysis of eye fixations during search in the store, and a training of a clustered saliency model for predicting saliency of viewers engaged in product search in the store. The fixation analysis shows that fixation duration statistics are very similar to those found in image and video viewing, suggesting that similar visual processing is employed during search in 3D environments and during viewing of imagery on computer screens. A clustering technique was applied to the questionnaire data, which resulted in two clusters being detected. Based on these clusters, personalized saliency prediction models were trained on the store fixation data, which provided improved performance in prediction saliency on the store video data compared to state-of-the art universal saliency prediction methods.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用于便利店视觉搜索的自我中心视频和眼动跟踪数据集
我们介绍了一个以自我为中心的视频和眼动跟踪数据集,该数据集由 108 个第一人称视频组成,视频中的 36 名购物者在一家便利店中寻找三种不同的产品(橙汁、KitKat 巧克力棒和金枪鱼罐头),同时还包括每个视频帧的以帧为中心的眼球固定位置。数据集还包括以 11 个问题的调查形式提供的每位参与者的人口统计学信息。论文介绍了使用该数据集的两个应用--在商店搜索过程中的眼球定格分析,以及用于预测在商店中进行产品搜索的观众的显著性的聚类显著性模型的训练。定点分析表明,定点持续时间统计与图像和视频观看中的定点持续时间统计非常相似,这表明在三维环境中进行搜索和在计算机屏幕上观看图像时采用了类似的视觉处理方法。对问卷数据采用了聚类技术,结果发现了两个聚类。在这些聚类的基础上,对商店的固定数据进行了个性化的显著性预测模型训练,与最先进的通用显著性预测方法相比,这种方法在预测商店视频数据的显著性方面有更好的表现。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Computer Vision and Image Understanding
Computer Vision and Image Understanding 工程技术-工程:电子与电气
CiteScore
7.80
自引率
4.40%
发文量
112
审稿时长
79 days
期刊介绍: The central focus of this journal is the computer analysis of pictorial information. Computer Vision and Image Understanding publishes papers covering all aspects of image analysis from the low-level, iconic processes of early vision to the high-level, symbolic processes of recognition and interpretation. A wide range of topics in the image understanding area is covered, including papers offering insights that differ from predominant views. Research Areas Include: • Theory • Early vision • Data structures and representations • Shape • Range • Motion • Matching and recognition • Architecture and languages • Vision systems
期刊最新文献
Deformable surface reconstruction via Riemannian metric preservation Estimating optical flow: A comprehensive review of the state of the art A lightweight convolutional neural network-based feature extractor for visible images LightSOD: Towards lightweight and efficient network for salient object detection Triple-Stream Commonsense Circulation Transformer Network for Image Captioning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1