
Latest publications: 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI)

Tonal-based retrieval of Arabic and middle-east music by automatic makam description
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972516
Leonidas Ioannidis, E. Gómez, P. Herrera
The automatic description of music from traditions that do not follow the Western notation and theory needs specifically designed tools. We investigate here the makams, which are scales in the modal music of Arabic and Middle East regions. We evaluate two approaches for classifying musical pieces from the ‘makam world’, according to their scale, by using chroma features extracted from polyphonic music signals. The first method compares the extracted features with a set of makam templates, while the second one uses trained classifiers. Both approaches provided good results (F-measure=0.69 and 0.73 respectively) on a collection of 302 pieces from 9 makam families. Furthermore, error analyses showed that certain confusions were musically coherent and that these techniques could complement each other in this particular context.
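As a rough illustration of the first (template-based) approach, a chroma vector can be matched against per-makam pitch-class templates by cosine similarity. The template names and 12-bin profiles below are toy stand-ins, not real makam scale data:

```python
import numpy as np

def classify_by_template(chroma, templates):
    """Return the template name with the highest cosine similarity to the chroma vector."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(templates, key=lambda name: cosine(chroma, templates[name]))

# Toy 12-bin pitch-class templates (hypothetical profiles, not actual makam scales).
templates = {
    "makam_A": np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1], dtype=float),
    "makam_B": np.array([1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0], dtype=float),
}
observed = np.array([0.9, 0.1, 0.8, 0.0, 1.0, 0.9, 0.1, 1.0, 0.0, 0.8, 0.1, 0.9])
print(classify_by_template(observed, templates))  # → makam_A
```

The trained-classifier alternative would replace the `max` over templates with a model fitted on labelled chroma features.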
Citations: 17
An efficient method for the unsupervised discovery of signalling motifs in large audio streams
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972536
Armando Muscariello, G. Gravier, F. Bimbot
Providing effective tools to navigate and access through long audio archives, or monitor and classify broadcast streams, proves to be an extremely challenging task. Main issues originate from the varied nature of patterns of interest in a composite audio environment, the massive size of such databases, and the capability of performing when prior knowledge on audio content is scarce or absent. This paper proposes a computational architecture aimed at discovering occurrences of repeating patterns in audio streams by means of unsupervised learning. The targeted repetitions (or motifs) are called signalling, by analogy with a biological nomenclature, as referring to a broad class of audio patterns (as jingles, songs, advertisements, etc…) frequently occurring in broadcast audio. We adapt a system originally developed for word discovery applications, and demonstrate its effectiveness in a song discovery scenario. The adaptation consists of speeding up critical parts of the computations, mostly based on audio feature coarsening, to deal with the large occurrence period of repeating songs in radio streams.
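A minimal sketch of the coarsening idea, under assumed details (block-averaged feature frames and a brute-force window comparison; the actual system uses far more sophisticated matching):

```python
import numpy as np

def coarsen(features, factor):
    """Average consecutive frames to shrink the sequence (the 'coarsening' speed-up)."""
    n = len(features) // factor * factor
    return features[:n].reshape(-1, factor, features.shape[1]).mean(axis=1)

def find_repeats(features, win, threshold):
    """Return (i, j) window-start pairs whose mean frame distance is below threshold."""
    hits = []
    for i in range(len(features) - win + 1):
        for j in range(i + win, len(features) - win + 1):
            d = np.abs(features[i:i + win] - features[j:j + win]).mean()
            if d < threshold:
                hits.append((i, j))
    return hits

rng = np.random.default_rng(0)
stream = rng.normal(size=(40, 4))
stream[24:32] = stream[4:12]            # plant a repeating "jingle"
coarse = coarsen(stream, factor=4)      # 40 frames -> 10 coarse frames
hits = find_repeats(coarse, win=2, threshold=0.1)
print(hits)
```

The planted repeat surfaces as the pair (1, 6) in the coarse index space; the coarsening shrinks the quadratic comparison cost by the square of the factor.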
Citations: 10
Automatic extraction of pornographic contents using radon transform based audio features
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972546
Myungjong Kim, Hoirin Kim
This paper focuses on the problem of classifying pornographic sounds, such as sexual screams or moans, to detect and block objectionable multimedia content. To represent the large temporal variations of pornographic sounds, we propose a novel feature extraction method based on the Radon transform. The Radon transform provides a way to extract the global trend of orientations in a 2-D region, and it is therefore applicable to time-frequency spectrograms over long-range segments to capture the large temporal variations of sexual sounds. Radon features are extracted using histograms and flux of Radon coefficients. We adopt a Gaussian mixture model to statistically represent the pornographic and non-pornographic sounds, and the test sounds are classified by using a likelihood ratio test. Evaluations on several hundred pornographic and non-pornographic sound clips indicate that the proposed features achieve satisfactory results, suggesting that this approach could be used as an alternative to image-based methods.
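The GMM/likelihood-ratio stage can be sketched with single diagonal-covariance Gaussians standing in for the mixtures; the synthetic vectors below replace the actual Radon-based descriptors:

```python
import numpy as np

def fit_gaussian(X):
    """Diagonal-covariance Gaussian fit (a 1-component stand-in for the paper's GMM)."""
    return X.mean(axis=0), X.var(axis=0) + 1e-6

def log_likelihood(x, mean, var):
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))

def likelihood_ratio_test(x, model_pos, model_neg, threshold=0.0):
    """Accept the 'pornographic' hypothesis when the log-likelihood ratio exceeds threshold."""
    return log_likelihood(x, *model_pos) - log_likelihood(x, *model_neg) > threshold

rng = np.random.default_rng(1)
pos = rng.normal(2.0, 1.0, size=(200, 8))   # synthetic stand-in features, class +
neg = rng.normal(-2.0, 1.0, size=(200, 8))  # class -
m_pos, m_neg = fit_gaussian(pos), fit_gaussian(neg)
print(likelihood_ratio_test(pos[0], m_pos, m_neg))  # → True
```

The decision threshold trades off false alarms against misses, exactly as in any likelihood-ratio detector.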
Citations: 14
On-line characters identification in movies
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972540
Bertrand Delezoide, D. Nouri, S. Hamlaoui
Character identification in video consists in assigning name labels to the persons present in a video. We explore here the online labeling of faces in movies. Previous work such as [12, 13] demonstrated promising results on learning and classifying characters using a manually annotated learning corpus. Some practical issues appear when applying this method to a large-scale movie database in permanent evolution, as the number of characters to recognize is large and continuously grows. In this paper we build on the first method, greatly extending its coverage by learning appearance models of new cast members online for each new movie. In addition, we make the following contributions: (1) we propose to apply the Active Appearance Models (AAM) tracking method in order to track local facial features over time and orientation changes; (2) we evaluate important parameters of the feature extraction, such as the position and size of local features. We report results on the movie I Am Legend, demonstrating the relevance of our on-line approach to the problem.
Citations: 2
Query log simulation for long-term learning in image retrieval
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972520
Donn Morrison, S. Marchand-Maillet, E. Bruno
In this paper we formalise a query simulation framework for the evaluation of long-term learning systems for image retrieval. Long-term learning relies on historical queries and associated relevance judgements, usually stored in query logs, in order to improve search results presented to users of the retrieval system. Evaluation of long-term learning methods requires access to query logs, preferably in large quantity. However, real-world query logs are notoriously difficult to acquire due to legitimate efforts of safeguarding user privacy. Query log simulation provides a useful means of evaluating long-term learning approaches without the need for real-world data. We introduce a query log simulator that is based on a user model of long-term learning that explains the observed relevance judgements contained in query logs. We validate simulated queries against a real-world query log of an image retrieval system and demonstrate that for evaluation purposes, the simulator is accurate on a global level.
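A toy version of such a simulator, under a hypothetical user model (each query targets one topic; on-topic items are judged relevant with high probability, off-topic ones rarely, as noise):

```python
import random

def simulate_query_log(items_by_topic, n_queries, seed=0):
    """Toy user model: each simulated query targets one topic; items of that
    topic are judged relevant with high probability, others rarely (noise)."""
    rng = random.Random(seed)
    topics = list(items_by_topic)
    log = []
    for _ in range(n_queries):
        topic = rng.choice(topics)
        judgements = {
            item: rng.random() < (0.9 if t == topic else 0.05)
            for t, items in items_by_topic.items()
            for item in items
        }
        log.append({"query": topic, "judgements": judgements})
    return log

log = simulate_query_log({"cats": ["img1", "img2"], "cars": ["img3"]}, n_queries=3)
print(len(log))  # → 3
```

A long-term learning method can then be evaluated on arbitrarily many such simulated logs without touching any real user data.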
Citations: 1
Detecting the long-tail of Points of Interest in tagged photo collections
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972551
Christos Zigkolis, S. Papadopoulos, Y. Kompatsiaris, A. Vakali
The paper tackles the problem of matching the photos of a tagged photo collection to a list of “long-tail” Points Of Interest (PoIs), that is PoIs that are not very popular and thus not well represented in the photo collection. Despite the significance of improving “long-tail” PoI photo retrieval for travel applications, most landmark detection methods to date have been tested on very popular landmarks. In this paper, we conduct a thorough empirical analysis comparing four baseline matching methods that rely on photo metadata, three variants of an approach that uses cluster analysis in order to discover PoI-related photo clusters, and a real-world retrieval mechanism (Flickr search) on a set of less popular PoIs. A user-based evaluation of the aforementioned methods is conducted on a Flickr photo collection of over 100,000 photos from 10 well-known touristic destinations in Greece. A set of 104 “long-tail” PoIs is collected for these destinations from Wikipedia, Wikimapia and OpenStreetMap. The results demonstrate that two of the baseline methods outperform Flickr search in terms of precision and F-measure, whereas two of the cluster-based methods outperform it in terms of recall and PoI coverage. We consider the results of this study valuable for enhancing the indexing of pictorial content in social media sites.
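Metadata-based matching plausibly rests on geo-distance; a minimal sketch assigns a geotagged photo to the nearest PoI within a radius. The coordinates below are approximate positions of two Thessaloniki landmarks, used purely as illustration:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def match_photo(photo, pois, max_km=0.5):
    """Assign a geotagged photo (lat, lon) to the nearest PoI within max_km, else None."""
    best, best_d = None, max_km
    for name, (lat, lon) in pois.items():
        d = haversine_km(photo[0], photo[1], lat, lon)
        if d <= best_d:
            best, best_d = name, d
    return best

pois = {"White Tower": (40.6264, 22.9484), "Rotunda": (40.6333, 22.9530)}
print(match_photo((40.6266, 22.9486), pois))  # → White Tower
```

For long-tail PoIs, tuning the radius and combining the geo cue with tag text is what separates the stronger baselines from plain keyword search.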
Citations: 3
Interactive video search and browsing systems
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972543
M. Bertini, A. Bimbo, Andrea Ferracani, Daniele Pezzatini
In this paper we present two interactive systems for video search and browsing; one is a web application based on the Rich Internet Application paradigm, designed to obtain the levels of responsiveness and interactivity typical of a desktop application, while the other exploits multi-touch devices to implement a multi-user collaborative application. Both systems use the same ontology-based video search engine, which is capable of expanding user queries through ontology reasoning, and lets users search for specific video segments that contain a semantic concept, or browse the content of video collections when it is too difficult to express a specific query.
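Ontology-driven query expansion can be sketched as a transitive closure over subclass links; the toy concept hierarchy below is hypothetical, standing in for the paper's video-concept ontology:

```python
def expand_query(concept, subclasses):
    """Expand a query concept to itself plus all (transitive) subclasses —
    a minimal stand-in for ontology reasoning in a concept-based search engine."""
    result, stack = set(), [concept]
    while stack:
        c = stack.pop()
        if c not in result:
            result.add(c)
            stack.extend(subclasses.get(c, []))
    return result

# Hypothetical toy ontology for video concepts.
subclasses = {
    "vehicle": ["car", "bicycle"],
    "car": ["taxi", "police_car"],
}
print(sorted(expand_query("vehicle", subclasses)))
# → ['bicycle', 'car', 'police_car', 'taxi', 'vehicle']
```

A query for "vehicle" thus also retrieves segments annotated only with the more specific concepts.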
Citations: 3
Medical image modality classification and retrieval
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972544
G. Csurka, S. Clinchant, Guillaume Jacquet
The aim of this paper is to explore different medical image modality classification and retrieval strategies. First, we analyze how current state-of-the-art image representations (bags of visual words and Fisher Vectors) perform when used for medical modality classification. Then we integrate these representations in a content-based image retrieval system and test them on a medical image retrieval task. Finally, in both cases, we explore how performance can be improved by combining visual with textual information. To show the performance of different systems, we compare our approaches to the systems that participated in the Medical Task of the latest ImageCLEF Challenge [16].
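One of the compared representations, the bag of visual words, reduces to quantising local descriptors against a codebook and histogramming the assignments. A minimal numpy sketch with a toy 2-D codebook (real systems use e.g. 128-D SIFT descriptors and codebooks learned by k-means):

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Quantise local descriptors to their nearest codeword and return the
    normalised codeword histogram (the bag-of-visual-words image signature)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                      # nearest codeword per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # toy 3-word codebook
descs = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [0.1, 0.9]])
print(bovw_histogram(descs, codebook))  # → [0.25 0.5  0.25]
```

Fisher Vectors replace this hard assignment with gradient statistics of a GMM, yielding a richer but higher-dimensional signature.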
Citations: 27
A content-based system for music recommendation and visualization of user preferences working on semantic notions
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972554
D. Bogdanov, Martín Haro, Ferdinand Fuhrmann, Anna Xambó, E. Gómez, P. Herrera
The amount of digital music has grown unprecedentedly in recent years and requires the development of effective methods for search and retrieval. In particular, content-based preference elicitation for music recommendation is a challenging problem that is effectively addressed in this paper. We present a system which automatically generates recommendations and visualizes a user's musical preferences, given her/his accounts on popular online music services. Using these services, the system retrieves a set of tracks preferred by a user, and further computes a semantic description of musical preferences based on raw audio information. For the audio analysis we used the capabilities of the Canoris API. Thereafter, the system generates music recommendations, using a semantic music similarity measure, and a user's preference visualization, mapping semantic descriptors to visual elements.
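The semantic-similarity step can be illustrated by ranking candidate tracks by distance between their semantic descriptors and a user preference vector; the descriptor axes, track names, and values below are invented for illustration:

```python
import numpy as np

def recommend(user_profile, track_descriptors, k=2):
    """Rank candidate tracks by Euclidean distance between their semantic
    descriptors and the user's preference vector; return the k closest."""
    names = list(track_descriptors)
    dists = {n: float(np.linalg.norm(track_descriptors[n] - user_profile)) for n in names}
    return sorted(names, key=dists.get)[:k]

# Hypothetical semantic axes: [danceability, acousticness, aggressiveness]
tracks = {
    "track_a": np.array([0.9, 0.1, 0.2]),
    "track_b": np.array([0.2, 0.9, 0.1]),
    "track_c": np.array([0.8, 0.2, 0.3]),
}
profile = np.array([0.9, 0.1, 0.2])  # e.g. the mean descriptor of the user's tracks
print(recommend(profile, tracks))  # → ['track_a', 'track_c']
```

The same descriptor space also drives the visualization: each semantic axis is mapped to a visual element of the preference display.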
Citations: 16
Text detection and recognition for person identification in videos
Pub Date : 2011-06-13 DOI: 10.1109/CBMI.2011.5972553
Johann Poignant, F. Thollard, G. Quénot, L. Besacier
This article presents a demo of person search in audiovisual broadcast using only the text available in a video and in resources external to the video. We also present the different steps used to recognize characters in video for multi-modal person recognition systems. Text detection is realized using the text features (texture, color, contrast, geometry, temporal information). The text recognition itself is performed by the Google Tesseract free software. The method was successfully evaluated on a broadcast news corpus that contains 59 videos from the France 2 French TV channel.
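The contrast/texture cues for text detection can be illustrated by flagging image blocks with dense horizontal gradients, a classic overlay-text heuristic; this is a crude sketch on synthetic data, not the authors' detector:

```python
import numpy as np

def text_blocks(gray, block=8, thresh=0.5):
    """Flag image blocks whose mean horizontal gradient magnitude is high —
    a crude contrast/texture cue for overlay-text candidate regions."""
    grad = np.abs(np.diff(gray.astype(float), axis=1))
    h, w = grad.shape
    mask = np.zeros((h // block, w // block), dtype=bool)
    for i in range(h // block):
        for j in range(w // block):
            cell = grad[i * block:(i + 1) * block, j * block:(j + 1) * block]
            mask[i, j] = cell.mean() > thresh
    return mask

img = np.zeros((16, 32))
img[4:12, 8:24:2] = 1.0  # synthetic stripe pattern mimicking text strokes
mask = text_blocks(img, block=8, thresh=0.3)
print(mask)
```

Blocks flagged this way would then be cropped, binarised, and passed to an OCR engine such as Tesseract, as in the paper's pipeline.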
Citations: 8