Semantically-Based Human Scanpath Estimation with HMMs

Huiying Liu, Dong Xu, Qingming Huang, Wen Li, Min Xu, Stephen Lin
2013 IEEE International Conference on Computer Vision, pp. 3232-3239. Published 2013-12-01.
DOI: 10.1109/ICCV.2013.401 (https://doi.org/10.1109/ICCV.2013.401)
Citations: 39

Abstract

We present a method for estimating human scan paths, which are sequences of gaze shifts that follow visual attention over an image. In this work, scan paths are modeled based on three principal factors that influence human attention, namely low-level feature saliency, spatial position, and semantic content. Low-level feature saliency is formulated as transition probabilities between different image regions based on feature differences. The effect of spatial position on gaze shifts is modeled as a Lévy flight with the shifts following a 2D Cauchy distribution. To account for semantic content, we propose to use a Hidden Markov Model (HMM) with a Bag-of-Visual-Words descriptor of image regions. An HMM is well-suited for this purpose in that 1) the hidden states, obtained by unsupervised learning, can represent latent semantic concepts, 2) the prior distribution of the hidden states describes visual attraction to the semantic concepts, and 3) the transition probabilities represent human gaze shift patterns. The proposed method is applied to task-driven viewing processes. Experiments and analysis performed on human eye gaze data verify the effectiveness of this method.
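The abstract names the three factors but gives no formulas, so the following is only a rough illustration of how such a model might be assembled. It builds a region-to-region transition matrix from a feature-difference term and an isotropic 2D Cauchy term, then samples a scan path from it. The function names, the product form of the combination, the scale `gamma`, and the `attraction` vector (a simple stand-in for the HMM's prior over semantic states) are all assumptions made for this sketch and are not taken from the paper; the HMM itself is not implemented here.

```python
import numpy as np

def cauchy_2d(dx, dy, gamma):
    """Density of an isotropic bivariate Cauchy distribution at offset (dx, dy)."""
    return gamma / (2.0 * np.pi * (dx**2 + dy**2 + gamma**2) ** 1.5)

def row_normalize(m):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    return m / m.sum(axis=1, keepdims=True)

def transition_matrix(centers, features, gamma=50.0, attraction=None):
    """Combine feature, spatial, and (placeholder) semantic terms.

    centers:    (n, 2) region centroids in pixels
    features:   (n, d) low-level feature vectors, one per region
    attraction: (n,) per-region weight; hypothetical stand-in for the
                HMM state prior described in the abstract
    """
    n = centers.shape[0]
    if attraction is None:
        attraction = np.ones(n)
    # Feature term (assumed form): a larger feature difference makes a
    # region a more likely gaze target. The epsilon avoids all-zero rows.
    diff = features[:, None, :] - features[None, :, :]
    feat = np.linalg.norm(diff, axis=2) + 1e-9
    # Spatial term: heavy-tailed Cauchy weighting of the gaze shift,
    # favouring short shifts while still allowing occasional long jumps.
    offsets = centers[:, None, :] - centers[None, :, :]
    spatial = cauchy_2d(offsets[..., 0], offsets[..., 1], gamma)
    combined = feat * spatial * attraction[None, :]
    np.fill_diagonal(combined, 0.0)  # no self-transitions in this sketch
    return row_normalize(combined)

def sample_scanpath(trans, start, length, rng):
    """Sample a sequence of region indices from the transition matrix."""
    path = [start]
    for _ in range(length - 1):
        path.append(int(rng.choice(trans.shape[0], p=trans[path[-1]])))
    return path

# Toy data: 6 regions with random centroids and 8-dimensional features.
rng = np.random.default_rng(0)
centers = rng.uniform(0, 640, size=(6, 2))
features = rng.normal(size=(6, 8))
trans = transition_matrix(centers, features)
path = sample_scanpath(trans, start=0, length=5, rng=rng)
print(path)
```

Multiplying the terms is only one possible way to combine them. A learned HMM would instead supply transition probabilities between hidden semantic states and emission probabilities over Bag-of-Visual-Words descriptors.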