
Latest publications from the 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI)

Audio and video cues for geo-tagging online videos in the absence of metadata
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269808
Xavier Sevillano, X. Valero, Francesc Alías
Tagging videos with the geo-coordinates of the place where they were filmed (i.e. geo-tagging) enables indexing online multimedia repositories using geographical criteria. However, millions of non-geo-tagged videos available online are invisible to geo-oriented applications, which calls for the development of automatic techniques for estimating the location where a video was filmed. The most successful approaches to this problem largely rely on exploiting the textual metadata associated with the video, but it is not rare to encounter videos with no title, description, or tags. This work focuses on this adverse scenario and proposes a purely audiovisual approach to geo-tagging. Using a subset of the MediaEval 2011 Placing task data set, we evaluate the ability of several visual and acoustic features to estimate a video's location, and demonstrate that the optimally configured version of the proposed system outperforms the only audiovisual participant in the MediaEval 2011 Placing task.
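For illustration, the core matching idea (audiovisual descriptors plus nearest-neighbour lookup against geo-tagged reference videos) can be sketched as follows. This is a minimal sketch under assumed placeholder features and data, not the authors' implementation.

```python
import numpy as np

def nearest_neighbour_geotag(query_feat, ref_feats, ref_coords):
    """Estimate (lat, lon) of a query video as the location of the closest
    reference video in audiovisual feature space (1-NN)."""
    dists = np.linalg.norm(ref_feats - query_feat, axis=1)
    return ref_coords[int(np.argmin(dists))]

# Toy reference set: each row is a concatenated visual+audio descriptor
# (e.g. a colour histogram plus averaged spectral features) with known coordinates.
rng = np.random.default_rng(0)
ref_feats = rng.random((100, 64))                          # 100 geo-tagged reference videos
ref_coords = rng.uniform([-90, -180], [90, 180], size=(100, 2))

query_feat = rng.random(64)                                # features of a video with no metadata
print(nearest_neighbour_geotag(query_feat, ref_feats, ref_coords))
```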
Citations: 4
A presentation of the REPERE challenge
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269851
Juliette Kahn, Olivier Galibert, L. Quintard, Matthieu Carré, Aude Giraudel, P. Joly
The REPERE Challenge aims to support research on people recognition in multimodal conditions. To assess technological progress, annual evaluation campaigns will be organized from 2012 to 2014. In this context, the REPERE corpus, a French video corpus with multimodal annotation, has been developed. The systems have to answer the following questions: Who is speaking? Who is present in the video? What names are cited? What names are displayed? The challenge is to combine the various sources of information coming from the speech and the images.
Citations: 55
Hierarchical clustering relevance feedback for content-based image retrieval
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269811
Ionut Mironica, B. Ionescu, C. Vertan
In this paper we address the issue of relevance feedback in the context of content-based image retrieval. We propose a method that uses a hierarchical cluster representation of the relevant and non-relevant images in a query. The main advantage of this strategy is that it operates on the initial set of retrieved images (user feedback is provided only once, for a small number of retrieved images) instead of performing additional queries as most approaches do. Experimental tests conducted on several standard image databases and using state-of-the-art content descriptors (e.g. MPEG-7, SURF) show that the proposed method provides a significant improvement in retrieval performance, outperforming some other classic approaches.
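A minimal sketch of the underlying idea: hierarchically cluster the relevant and non-relevant feedback images once, then re-rank the database by proximity to the relevant clusters. The descriptors, cluster count, and scoring rule below are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def rerank_with_feedback(db_feats, relevant, non_relevant, n_clusters=3):
    """Re-rank database images: score = distance to nearest relevant cluster
    centroid minus distance to nearest non-relevant one (lower = better)."""
    def centroids(feats):
        labels = fcluster(linkage(feats, method="ward"),
                          t=min(n_clusters, len(feats)), criterion="maxclust")
        return np.array([feats[labels == k].mean(axis=0) for k in np.unique(labels)])

    rel_c, non_c = centroids(relevant), centroids(non_relevant)
    d_rel = np.min(np.linalg.norm(db_feats[:, None] - rel_c[None], axis=2), axis=1)
    d_non = np.min(np.linalg.norm(db_feats[:, None] - non_c[None], axis=2), axis=1)
    return np.argsort(d_rel - d_non)           # indices, most relevant first

rng = np.random.default_rng(1)
db = rng.random((500, 32))                     # MPEG-7 / SURF-like descriptors
order = rerank_with_feedback(db, relevant=db[:5] + 0.01, non_relevant=db[5:10])
print(order[:10])
```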
Citations: 4
Toward an assisted context based collaborative annotation
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269852
Nesrine Ksentini, Mohamed Zarka, A. Ammar, A. Alimi
This paper introduces a novel approach to video annotation that provides context-based assistance to the annotator. The notion of context plays a significant role in multimedia content search and retrieval systems. In fact, a semantic interpretation (or concept) separated from its context produces incomplete information. Moreover, the interpretation of each concept varies according to context. The assistance we introduce uses intelligent structures such as a context ontology and previous annotations. The evaluation of the proposed assisted annotation prototype shows that using context in the annotation process leads to promising results.
Citations: 3
VOXALEADNEWS: A scalable content based video search engine
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269802
Julien Law-To, R. Landais, G. Grefenstette
Video search is still largely based on text search over human-supplied metadata, sometimes supplemented by extracted thumbnails. We have been developing a broadcast news search system based on recent progress in automatic speech recognition (ASR), natural language processing (NLP), and video and image processing, to provide rich content-based search over news. Our public online demonstrator of the Voxalead application, described here, currently indexes daily broadcast news content from 60 sources in English, French, Chinese, Arabic, Spanish, Dutch, Italian, German and Russian, and makes it searchable shortly after it has been published.
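The kind of time-coded transcript index such a system relies on can be pictured with a toy sketch: ASR output as (word, timestamp) pairs goes into an inverted index, so a text query returns videos together with the positions where the words were spoken. The data structures and sample transcripts below are invented for illustration only.

```python
from collections import defaultdict

# Inverted index: word -> list of (video_id, time_in_seconds)
index = defaultdict(list)

def add_transcript(video_id, words_with_times):
    """Index ASR output given as (word, start_time) pairs."""
    for word, t in words_with_times:
        index[word.lower()].append((video_id, t))

def search(query):
    """Return {video_id: [times]} hits for the query words."""
    hits = defaultdict(list)
    for word in query.lower().split():
        for video_id, t in index[word]:
            hits[video_id].append(t)
    return dict(hits)

add_transcript("news_fr_001", [("election", 12.4), ("results", 13.1)])
add_transcript("news_en_042", [("election", 95.0), ("debate", 96.2)])
print(search("election results"))
```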
Citations: 1
Automatic difference measure between movies using dissimilarity measure fusion and rank correlation coefficients
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269835
Nicolas Voiron, A. Benoît, P. Lambert
When considering multimedia database growth, one current challenging issue is to design accurate navigation tools. End users' basic needs, such as exploration, similarity search and favorite suggestions, lead us to investigate how to find semantically resembling media. One way is to build numerous continuous dissimilarity measures from low-level image features. In parallel, another way is to build discrete dissimilarities from the textual information that may be available with video sequences. However, how should such different measures be selected as relevant and fused? To this aim, the purpose of this paper is to compare all these various dissimilarities and to propose a suitable ranking fusion method for several of them. Subjective tests with human observers on the CITIA animation movie database have been carried out to validate the model.
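The two ingredients named in the abstract, rank correlation between the rankings induced by different dissimilarity measures and a rank-based fusion, can be sketched as below. The measures, threshold, and averaging rule are placeholders for illustration, not the paper's method.

```python
import numpy as np
from scipy.stats import kendalltau, rankdata

def fuse_rankings(dissimilarity_rows, min_tau=0.2):
    """Fuse several dissimilarity vectors (one per measure, lower = more similar)
    by averaging their ranks, keeping only measures whose ranking agrees with the
    consensus (Kendall's tau) above a threshold."""
    ranks = np.array([rankdata(row) for row in dissimilarity_rows])
    mean_rank = ranks.mean(axis=0)
    keep = []
    for i, r in enumerate(ranks):
        tau, _ = kendalltau(r, mean_rank)
        if tau >= min_tau:
            keep.append(i)
    if not keep:                                # fall back to using every measure
        keep = list(range(len(ranks)))
    return ranks[keep].mean(axis=0)             # fused rank, lower = more similar

rng = np.random.default_rng(2)
d_color   = rng.random(20)                      # e.g. colour-histogram dissimilarity
d_motion  = rng.random(20)                      # e.g. motion-feature dissimilarity
d_textual = rng.random(20)                      # e.g. synopsis-based dissimilarity
fused = fuse_rankings([d_color, d_motion, d_textual])
print(np.argsort(fused)[:5])                    # five most similar movies
```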
Citations: 3
Comparing segmentation strategies for efficient video passage retrieval
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269850
Christian Wartena
We compare the effect of different text segmentation strategies on speech-based passage retrieval of video. Passage retrieval has mainly been studied to improve document retrieval and to enable question answering. In these domains, the best results were obtained using passages defined by the paragraph structure of the source documents or by using arbitrary overlapping passages. For the retrieval of relevant passages in a video using speech transcripts, no author-defined segmentation is available. We compare retrieval results from four different types of segments based on the speech channel of the video: fixed-length segments, a sliding window, semantically coherent segments and prosodic segments. We evaluated the methods on the corpus of the MediaEval 2011 Rich Speech Retrieval task. Our main conclusion is that the retrieval results depend strongly on the right choice of segment length. However, results using the segmentation into semantically coherent parts depend much less on the segment length. In particular, the quality of fixed-length and sliding-window segmentation drops quickly when the segment length increases, while the quality of the semantically coherent segments is much more stable. Thus, if coherent segments are defined, longer segments can be used and consequently fewer segments have to be considered at retrieval time.
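The two baseline strategies compared here, fixed-length segments and an overlapping sliding window, are easy to sketch on a time-coded transcript; the segment length and hop are exactly the parameters the abstract says the results depend on. The toy transcript and parameter values below are illustrative assumptions.

```python
def fixed_length_segments(words, seg_len=60.0):
    """Split (word, time) pairs into consecutive segments of seg_len seconds."""
    segments, current, start = [], [], 0.0
    for word, t in words:
        if t - start >= seg_len and current:
            segments.append(current)
            current, start = [], t
        current.append((word, t))
    if current:
        segments.append(current)
    return segments

def sliding_window_segments(words, seg_len=60.0, hop=30.0):
    """Overlapping windows of seg_len seconds, advanced by hop seconds."""
    if not words:
        return []
    end = words[-1][1]
    starts = [i * hop for i in range(int(end // hop) + 1)]
    return [[(w, t) for w, t in words if s <= t < s + seg_len] for s in starts]

transcript = [("welcome", 0.5), ("to", 1.0), ("the", 1.2), ("news", 1.5),
              ("sports", 65.0), ("results", 66.0), ("weather", 130.0)]
print(len(fixed_length_segments(transcript)), len(sliding_window_segments(transcript)))
```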
Citations: 9
Examining the applicability of virtual reality technique for video retrieval
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269807
Kimiaki Shirahama, K. Uehara, M. Grzegorzek
The Query-By-Example (QBE) approach retrieves shots that are visually similar to example shots provided by a user. However, QBE cannot work if example shots are unavailable. To overcome this, this paper develops a Query-By-Virtual-Example (QBVE) approach in which example shots (virtual examples) for a query are created using virtual reality techniques. A virtual example is created by synthesizing the user's gesture, a 3D object and a background image. Using large-scale video data, we examine the effectiveness of virtual examples from the perspective of video retrieval. In particular, we study the comparison between virtual examples and example shots selected from real videos, the importance of camera movements, the strategy for combining gestures, 3D objects and backgrounds, and individual differences among users.
Citations: 2
Automatic chaptering of VoD content based on DVD content
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269812
F. Thudor, Ingrid Autier, B. Chupeau, F. Lefèbvre, Lionel Oisel
In this paper we propose a framework for automatically chaptering VoD content based on its DVD version. The idea is to benefit from the artistic work performed for DVD chapter creation, even if the DVD and VoD video content are not exactly the same. The framework is based on a sparse-to-dense frame synchronization that combines global and local image descriptions, together with adaptive video sequence splitting to enable the processing of very long sequences. A way to extract specific information from a DVD is also embedded in the framework. Results of the evaluation performed on official movie releases are presented in the paper.
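One way to picture the synchronization step: match each DVD chapter-start frame descriptor to the best VoD frame while forcing matches to stay in temporal order. The monotone nearest-match sketch below uses invented global descriptors and only illustrates the idea; it is not the paper's sparse-to-dense algorithm.

```python
import numpy as np

def map_chapters_to_vod(dvd_chapter_feats, vod_frame_feats):
    """Map each DVD chapter-start frame to a VoD frame index, enforcing that
    matches increase monotonically in time (chapters stay ordered)."""
    mapped, earliest = [], 0
    for chap in dvd_chapter_feats:
        dists = np.linalg.norm(vod_frame_feats[earliest:] - chap, axis=1)
        idx = earliest + int(np.argmin(dists))
        mapped.append(idx)
        earliest = idx + 1                     # next chapter must start later
    return mapped

rng = np.random.default_rng(3)
vod = rng.random((1000, 16))                   # one global descriptor per VoD frame
chapters = vod[[100, 400, 750]] + 0.01         # DVD chapter-start frames, slightly edited
print(map_chapters_to_vod(chapters, vod))      # expected to be near [100, 400, 750]
```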
Citations: 2
DCT sign based robust privacy preserving image copy detection for cloud-based systems
Pub Date : 2012-06-27 DOI: 10.1109/CBMI.2012.6269815
M. Diephuis, S. Voloshynovskiy, O. Koval, F. Beekhof
In this paper we propose an architecture for message-privacy-preserving copy detection and content identification for images based on the signs of the Discrete Cosine Transform (DCT) coefficients. The architecture allows for searching in encrypted data and places the computational burden on the server. Sign components of the low-frequency DCT coefficients of an image are used to generate a dual set of keys that in turn are used to encrypt the source image and serve as a robust hash that can be queried for content identification. The statistical properties of these DCT sign vectors are modelled and we analyse their robustness against real-world image distortions. Finally, the trade-off between the discriminative power of such vectors, the offered security and the resilience against errors is demonstrated.
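The hash construction described, taking the signs of an image's low-frequency DCT coefficients, can be sketched with scipy's DCT. The block size, test image, and Hamming-distance comparison are illustrative assumptions rather than the paper's exact parameters.

```python
import numpy as np
from scipy.fft import dctn

def dct_sign_hash(image, block=8):
    """Binary hash: signs of the top-left (low-frequency) DCT coefficients,
    skipping the DC term."""
    coeffs = dctn(image.astype(float), norm="ortho")[:block, :block].ravel()[1:]
    return (coeffs >= 0).astype(np.uint8)

def hamming(h1, h2):
    return int(np.count_nonzero(h1 != h2))

x = np.linspace(0, 1, 64)
img = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))     # smooth test image
noisy = img + np.random.default_rng(4).normal(scale=0.05, size=img.shape)
print(hamming(dct_sign_hash(img), dct_sign_hash(noisy)))          # small vs. 63-bit hash
```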
Citations: 10