Computational Methods for Integrating Vision and Language

Kobus Barnard
DOI: 10.2200/s00705ed1v01y201602cov007
Journal: Synthesis Lectures on Computer Vision
Published: 2016-04-21 (Journal Article)
Citations: 2

Abstract

"This is clearly the most comprehensive and thoughtful compendium of knowledge on language/vision integration out there, and I'm sure it will be a valuable resource to many researchers and instructors." - Sven Dickinson, Series Editor (University of Toronto)

Modeling data from visual and linguistic modalities together creates opportunities for better understanding of both, and supports many useful applications. Examples of dual visual-linguistic data include images with keywords, video with narrative, and figures in documents. We consider two key task-driven themes: translating from one modality to another (e.g., inferring annotations for images) and understanding the data using all modalities, where one modality can help disambiguate information in another. The multiple modalities can either be essentially semantically redundant (e.g., keywords provided by a person looking at the image) or largely complementary (e.g., metadata such as the camera used). Redundancy and complementarity are two ...