将视觉对象识别置于上下文中。

Mengmi Zhang, Claire Tseng, Gabriel Kreiman
{"title":"将视觉对象识别置于上下文中。","authors":"Mengmi Zhang,&nbsp;Claire Tseng,&nbsp;Gabriel Kreiman","doi":"10.1109/CVPR42600.2020.01300","DOIUrl":null,"url":null,"abstract":"<p><p>Context plays an important role in visual recognition. Recent studies have shown that visual recognition networks can be fooled by placing objects in inconsistent contexts (e.g. a cow in the ocean). To understand and model the role of contextual information in visual recognition, we systematically and quantitatively investigated ten critical properties of where, when, and how context modulates recognition including amount of context, context and object resolution, geometrical structure of context, context congruence, time required to incorporate contextual information, and temporal dynamics of contextual modulation. The tasks involve recognizing a target object surrounded with context in a natural image. As an essential benchmark, we first describe a series of psychophysics experiments, where we alter one aspect of context at a time, and quantify human recognition accuracy. To computationally assess performance on the same tasks, we propose a biologically inspired context aware object recognition model consisting of a two-stream architecture. The model processes visual information at the fovea and periphery in parallel, dynamically incorporates both object and contextual information, and sequentially reasons about the class label for the target object. Across a wide range of behavioral tasks, the model approximates human level performance without retraining for each task, captures the dependence of context enhancement on image properties, and provides initial steps towards integrating scene and object information for visual recognition.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2020 ","pages":"12982-12991"},"PeriodicalIF":0.0000,"publicationDate":"2020-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/CVPR42600.2020.01300","citationCount":"36","resultStr":"{\"title\":\"Putting visual object recognition in context.\",\"authors\":\"Mengmi Zhang,&nbsp;Claire Tseng,&nbsp;Gabriel Kreiman\",\"doi\":\"10.1109/CVPR42600.2020.01300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Context plays an important role in visual recognition. Recent studies have shown that visual recognition networks can be fooled by placing objects in inconsistent contexts (e.g. a cow in the ocean). To understand and model the role of contextual information in visual recognition, we systematically and quantitatively investigated ten critical properties of where, when, and how context modulates recognition including amount of context, context and object resolution, geometrical structure of context, context congruence, time required to incorporate contextual information, and temporal dynamics of contextual modulation. The tasks involve recognizing a target object surrounded with context in a natural image. As an essential benchmark, we first describe a series of psychophysics experiments, where we alter one aspect of context at a time, and quantify human recognition accuracy. To computationally assess performance on the same tasks, we propose a biologically inspired context aware object recognition model consisting of a two-stream architecture. The model processes visual information at the fovea and periphery in parallel, dynamically incorporates both object and contextual information, and sequentially reasons about the class label for the target object. Across a wide range of behavioral tasks, the model approximates human level performance without retraining for each task, captures the dependence of context enhancement on image properties, and provides initial steps towards integrating scene and object information for visual recognition.</p>\",\"PeriodicalId\":74560,\"journal\":{\"name\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"volume\":\"2020 \",\"pages\":\"12982-12991\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/CVPR42600.2020.01300\",\"citationCount\":\"36\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR42600.2020.01300\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2020/8/5 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR42600.2020.01300","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2020/8/5 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 36

摘要

语境在视觉识别中起着重要的作用。最近的研究表明,视觉识别网络可以通过将物体放置在不一致的环境中(例如,海洋中的奶牛)而被愚弄。为了理解语境信息在视觉识别中的作用并建立模型,我们系统地、定量地研究了语境在何地、何时以及如何调节识别的十个关键属性,包括语境的数量、语境和物体分辨率、语境的几何结构、语境一致性、整合语境信息所需的时间以及语境调节的时间动态。这些任务包括在自然图像中识别被上下文包围的目标物体。作为基本基准,我们首先描述了一系列心理物理学实验,在这些实验中,我们一次改变上下文的一个方面,并量化人类识别的准确性。为了计算评估在相同任务上的性能,我们提出了一个由两流架构组成的生物启发的上下文感知对象识别模型。该模型在中央凹和外围并行处理视觉信息,动态地合并对象和上下文信息,并依次对目标对象的类标签进行推理。在广泛的行为任务中,该模型近似于人类水平的表现,而无需对每个任务进行再训练,捕获上下文增强对图像属性的依赖,并为视觉识别整合场景和对象信息提供了初步步骤。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Putting visual object recognition in context.

Context plays an important role in visual recognition. Recent studies have shown that visual recognition networks can be fooled by placing objects in inconsistent contexts (e.g. a cow in the ocean). To understand and model the role of contextual information in visual recognition, we systematically and quantitatively investigated ten critical properties of where, when, and how context modulates recognition including amount of context, context and object resolution, geometrical structure of context, context congruence, time required to incorporate contextual information, and temporal dynamics of contextual modulation. The tasks involve recognizing a target object surrounded with context in a natural image. As an essential benchmark, we first describe a series of psychophysics experiments, where we alter one aspect of context at a time, and quantify human recognition accuracy. To computationally assess performance on the same tasks, we propose a biologically inspired context aware object recognition model consisting of a two-stream architecture. The model processes visual information at the fovea and periphery in parallel, dynamically incorporates both object and contextual information, and sequentially reasons about the class label for the target object. Across a wide range of behavioral tasks, the model approximates human level performance without retraining for each task, captures the dependence of context enhancement on image properties, and provides initial steps towards integrating scene and object information for visual recognition.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
43.50
自引率
0.00%
发文量
0
期刊最新文献
MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling. Learned representation-guided diffusion models for large-image generation. SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology. Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations. Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1