Harvesting Deep Models for Cross-Lingual Image Annotation

Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing. Pub Date: 2017-06-19. DOI: 10.1145/3095713.3095751

Qijie Wei, Xiaoxu Wang, Xirong Li

Citations: 5

Abstract

This paper considers cross-lingual image annotation: harvesting deep visual models from one language to annotate images with labels from another language. Machine translation alone cannot accomplish this task, because labels can be ambiguous and a translated vocabulary leaves limited freedom to annotate images with appropriate labels. Given non-overlapping vocabularies between the two languages, we formulate cross-lingual image annotation as a zero-shot learning problem. For cross-lingual label matching, we adapt zero-shot learning by replacing the monolingual semantic embedding space with a bilingual alternative. To reduce both label ambiguity and redundancy, we propose a simple yet effective approach called label-enhanced zero-shot learning. Using three state-of-the-art deep visual models, i.e., ResNet-152, GoogleNet-Shuffle and OpenImages, experiments on the test set of Flickr8k-CN demonstrate the viability of the proposed approach for cross-lingual image annotation.
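To make the idea of zero-shot label matching in a bilingual embedding space concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a generic convex-combination scheme in which an image is embedded as the score-weighted average of the embeddings of its top predicted source-language labels, and target-language labels are then ranked by cosine similarity. The function names, the `top_k` parameter, and all vectors and scores are hypothetical toy values chosen for illustration, and the label-enhanced variant is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def embed_image(source_scores, source_vectors, top_k=2):
    """Embed an image as the score-weighted average of its top-k source labels."""
    top = sorted(source_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(score for _, score in top)
    if total == 0:
        raise ValueError("all top-k source scores are zero")
    dim = len(next(iter(source_vectors.values())))
    image_vec = [0.0] * dim
    for label, score in top:
        for i, x in enumerate(source_vectors[label]):
            image_vec[i] += (score / total) * x
    return image_vec


def annotate(source_scores, source_vectors, target_vectors, top_k=2):
    """Rank target-language labels by similarity to the image embedding."""
    image_vec = embed_image(source_scores, source_vectors, top_k)
    return sorted(
        target_vectors,
        key=lambda label: cosine(image_vec, target_vectors[label]),
        reverse=True,
    )


# Hypothetical toy data: both vocabularies live in one shared (bilingual) space.
SOURCE_VECTORS = {
    "dog": [1.0, 0.0, 0.0],
    "puppy": [0.9, 0.1, 0.0],
    "car": [0.0, 1.0, 0.0],
}
TARGET_VECTORS = {
    "狗": [0.95, 0.05, 0.0],
    "汽车": [0.0, 0.9, 0.1],
    "草地": [0.0, 0.0, 1.0],
}

# Hypothetical classifier output over the source-language vocabulary.
scores = {"dog": 0.6, "puppy": 0.3, "car": 0.1}
ranking = annotate(scores, SOURCE_VECTORS, TARGET_VECTORS)
print(ranking)
```

In this sketch the two vocabularies never overlap; the only link between them is the shared embedding space, which is why replacing a monolingual space with a bilingual one is enough to match labels across languages.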
