A probabilistic topic approach for context-aware visual attention modeling

M. Fernandez-Torres, I. González-Díaz, F. Díaz-de-María
Published in: 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)
Publication date: 2016-06-15
DOI: 10.1109/CBMI.2016.7500272
URL: https://doi.org/10.1109/CBMI.2016.7500272
Citations: 1

Abstract

Visual attention modeling has attracted considerable interest in recent years because it makes it possible to efficiently direct complex visual processes to particular areas of images or video frames. Although the literature on bottom-up saliency models is vast, we still lack generic approaches for modeling top-down, task- and context-driven visual attention. Indeed, many top-down models simply modulate the weights associated with low-level descriptors to learn more accurate representations of visual attention than those of the generic fusion schemes used in bottom-up techniques. In this paper we propose a hierarchical, generic probabilistic framework that decomposes the complex process of context-driven visual attention into a mixture of latent subtasks, each of which is in turn modeled as a combination of specific distributions of low-level descriptors. The inclusion of this intermediate level bridges the gap between low-level features and visual attention and enables more comprehensive representations of the latter. Our experiments on a dataset in which videos are organized by genre demonstrate that, by learning specific distributions for each video category, we can notably enhance system performance.
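
The mixture idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not give the form of the per-subtask distributions, so a logistic function of weighted descriptors stands in for them here. The function name, the two descriptors (motion and color contrast), the weights, and the genre mixtures are all hypothetical values chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a stand-in per-subtask attention model."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_probability(features, subtask_weights, subtask_mixture):
    """Return sum_k P(subtask k | context) * P(attended | features, subtask k).

    features        : low-level descriptor values for one image location
    subtask_weights : one weight vector per latent subtask (hypothetical)
    subtask_mixture : P(subtask k | context), must sum to 1
    """
    if abs(sum(subtask_mixture) - 1.0) > 1e-9:
        raise ValueError("subtask_mixture must sum to 1")
    total = 0.0
    for p_k, weights in zip(subtask_mixture, subtask_weights):
        score = sum(w * f for w, f in zip(weights, features))
        total += p_k * sigmoid(score)
    return total


# Hypothetical setup: subtask 0 responds to motion, subtask 1 to color contrast.
SUBTASK_WEIGHTS = [[2.0, 0.0], [0.0, 2.0]]

# Hypothetical context-specific mixtures, one per video genre.
GENRE_MIXTURES = {
    "sports": [0.9, 0.1],
    "documentary": [0.2, 0.8],
}

if __name__ == "__main__":
    location = [0.8, 0.1]  # strong motion, weak color contrast
    for genre, mixture in GENRE_MIXTURES.items():
        p = attention_probability(location, SUBTASK_WEIGHTS, mixture)
        print(f"{genre}: {p:.3f}")
```

In this toy setup the same location receives a different attention probability depending on the genre, because each genre weights the latent subtasks differently. That is the role the abstract assigns to the intermediate level between low-level features and attention.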