
2009 Seventh International Workshop on Content-Based Multimedia Indexing: Latest Publications

Compressed Domain Copy Detection of Scalable SVC Videos
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.26
Christian Käs, H. Nicolas
We propose a novel approach for compressed domain copy detection of scalable videos stored in a database. We analyze compressed H.264/SVC streams and form different scalable low-level and mid-level feature vectors that are robust to multiple transformations. The features are based on easily available information such as the encoding bit rate over time and the motion vectors found in the stream. The focus of this paper is the scalability and robustness of the features. A combination of different descriptors is used to perform copy detection on a database containing scalable, SVC-coded High-Definition (HD) video clips.
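The abstract names two compressed-domain features: the encoding bit rate over time and the stream's motion vectors. As a hedged illustration of the first, the sketch below (all function names and the sliding-correlation matcher are assumptions, not the authors' method) builds a bitrate-over-time profile from per-frame sizes and scores a query clip against a reference by a sliding dot product of z-normalized profiles, which tolerates a constant bitrate offset or scale change:

```python
import numpy as np

def bitrate_profile(frame_sizes_bits, fps=25.0, window_s=1.0):
    """Average bit rate (bits/s) in consecutive windows, from per-frame sizes."""
    frames_per_win = int(fps * window_s)
    n = len(frame_sizes_bits) // frames_per_win
    wins = np.asarray(frame_sizes_bits[: n * frames_per_win]).reshape(n, frames_per_win)
    return wins.sum(axis=1) / window_s

def znorm(x):
    """Zero-mean, unit-variance normalization (guards against flat profiles)."""
    return (x - x.mean()) / (x.std() + 1e-9)

def profile_similarity(query, reference):
    """Best alignment score of the shorter profile slid along the longer one."""
    q, r = znorm(np.asarray(query, float)), znorm(np.asarray(reference, float))
    if len(q) > len(r):
        q, r = r, q
    return max(np.dot(q, r[i:i + len(q)]) / len(q)
               for i in range(len(r) - len(q) + 1))
```

A detection threshold on this score would have to be tuned on transformed copies, in line with the robustness focus the abstract describes.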
Cited by: 3
AudioCycle: Browsing Musical Loop Libraries
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.19
S. Dupont, Thomas Dubuisson, J. Urbain, R. Sebbe, N. D'Alessandro, Christian Frisson
This paper presents AudioCycle, a prototype application for browsing through music loop libraries. AudioCycle provides the user with a graphical view where the audio extracts are visualized and organized according to their similarity in terms of musical properties such as timbre, harmony, and rhythm. The user is able to navigate in this visual representation and listen to individual audio extracts, searching for those of interest. AudioCycle draws on a range of technologies, including audio analysis from music information retrieval research, 3D visualization, spatial auditory rendering, and audio time-scaling and pitch modification. The proposed approach extends previously described music and audio browsers. The concepts developed here will be of interest to DJs, remixers, musicians, and soundtrack composers, but also to sound designers and foley artists. Possible extensions to multimedia libraries are also suggested.
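The abstract does not specify AudioCycle's descriptors or layout algorithm, so the following is only a minimal sketch of the general idea: summarize each loop's timbre and project the loops onto a 2D map where similar loops land close together. The use of mean and standard deviation of MFCCs as a timbre proxy, and of PCA for the projection, are assumptions for illustration:

```python
import numpy as np
import librosa                       # assumed available for audio analysis
from sklearn.decomposition import PCA

def timbre_vector(path, n_mfcc=13):
    """Crude timbre summary: per-coefficient MFCC mean and std over time."""
    y, sr = librosa.load(path, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def loop_map(paths):
    """2D coordinates for a browsable map of the loop library."""
    feats = np.stack([timbre_vector(p) for p in paths])
    return PCA(n_components=2).fit_transform(feats)
```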
Cited by: 13
Sperm Whales Records Indexation Using Passive Acoustics Localization
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.39
F. Bénard, H. Glotin
This paper focuses on the robust indexing of sperm whale hydrophone recordings based on a set of features extracted from a real-time passive underwater acoustic tracking algorithm for multiple emitting whales. In recent years, interest in marine mammals has increased, leading to the development of robust, real-time systems. Acoustic localization permits the study of whales' behavior in deep water (several hundreds of meters) without interfering with the environment. In this paper, we recall and use a recently developed real-time multiple-tracking algorithm, which provides a localization of one or several sperm whales. Given the position coordinates, we are able to analyse different features such as speed, energy of the clicks, Inter-Click-Interval (ICI).... These features allow us to construct different markers which lead to the indexing and structuring of the audio files. Thus, behavior studies are facilitated by choosing and accessing the corresponding index in the audio file. The complete indexing algorithm is run on real data from the NUWC (Naval Undersea Warfare Center of the US Navy) and the AUTEC (Atlantic Undersea Test & Evaluation Center, Bahamas). Our model is validated by similar results from the US Navy (NUWC) and SOEST (School of Ocean and Earth Science and Technology, University of Hawaii) labs in a single-whale case. Finally, as an illustration, we index a single whale sound file using the whale features extracted by the tracking, and we present an example of an XML script structuring it.
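Two of the features listed (speed and Inter-Click-Interval) follow directly from the tracker's output; the short sketch below shows the arithmetic, assuming timestamped 3D positions in meters and detected click times in seconds (both input formats are assumptions, not the paper's data layout):

```python
import numpy as np

def speeds(times_s, positions_m):
    """Speed (m/s) between consecutive localizations; positions_m is (n, 3)."""
    dt = np.diff(np.asarray(times_s, float))
    dp = np.linalg.norm(np.diff(np.asarray(positions_m, float), axis=0), axis=1)
    return dp / dt

def inter_click_intervals(click_times_s):
    """ICI series (s) between successive detected clicks."""
    return np.diff(np.sort(np.asarray(click_times_s, float)))
```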
Cited by: 2
Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.24
H. Lachambre, R. André-Obrecht, J. Pinquier
Our contribution takes place in the context of music indexing. In many applications, such as multipitch estimation, it can be useful to know the number of notes played at a time. In this work, we aim to distinguish monophonies (one note at a time) from polyphonies (several notes at a time). We analyze an indicator which gives the confidence in the estimated pitch. In the case of a monophony, the pitch is relatively easy to determine, so this indicator is low. In the case of a polyphony, the pitch is much more difficult to determine, so the indicator is higher and varies more. Considering these two facts, we compute the short-term mean and variance of the indicator and model the joint distribution of these two parameters with a bivariate Weibull distribution for each class (monophony and polyphony). The classification is made by computing the likelihood over one second for each class and taking the best one. Models are learned with 25 seconds of each kind of signal. Our best results give a global error rate of 6.3%, obtained on a balanced corpus containing approximately 18 minutes of signal.
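A hedged sketch of the pipeline as the abstract describes it: short-term mean and variance of the pitch-confidence indicator over one-second windows, per-class Weibull models, and a per-second likelihood comparison. For simplicity, the sketch replaces the authors' bivariate Weibull with independent Weibull marginals per dimension, a weaker assumption than their joint model:

```python
import numpy as np
from scipy.stats import weibull_min

def mean_var_features(indicator, frames_per_second):
    """Short-term (mean, variance) of the confidence indicator, one row per second."""
    n = len(indicator) // frames_per_second
    w = np.asarray(indicator[: n * frames_per_second], float).reshape(n, frames_per_second)
    return np.column_stack([w.mean(axis=1), w.var(axis=1)])

def fit_class(features):
    """One Weibull per dimension (independence assumption, not the paper's joint fit)."""
    return [weibull_min.fit(features[:, d], floc=0) for d in range(features.shape[1])]

def classify_second(x, mono_model, poly_model):
    """Pick the class whose model gives the higher log-likelihood for one second."""
    def ll(model):
        return sum(weibull_min.logpdf(x[d], *model[d]) for d in range(len(x)))
    return "monophony" if ll(mono_model) >= ll(poly_model) else "polyphony"
```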
Cited by: 3
Biometric Responses to Music-Rich Segments in Films: The CDVPlex
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.21
A. Smeaton, S. Rothwell
Summarising or generating trailers for films involves finding the highlights within those films: the segments where we become most afraid, happy, sad, annoyed, excited, etc. In this paper we explore three questions related to automatic detection of film highlights by measuring the physiological responses of viewers of those films: firstly, whether emotional highlights can be detected through viewer biometrics; secondly, whether individuals watching a film in a group experience emotional reactions similar to others in the group; and thirdly, whether the presence of music in a film correlates with the occurrence of emotional highlights. We analyse the results of an experiment known as the CDVPlex, where we monitored and recorded physiological reactions from people as they viewed films in a controlled cinema-like environment. A selection of films was manually annotated for the locations of their emotive content. We then studied the physiological peaks identified among participants while viewing the same film and how these correlated with emotion tags and with music. We conclude that these are highly correlated and that music-rich segments of a film do act as a catalyst in stimulating viewer response, though we do not know exactly which emotions the viewers were experiencing. The results of this work could affect the way in which we index movie content on PVRs, for example by attaching special significance to the movie segments that are most likely to be highlights.
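The abstract does not give the peak-detection or correlation method; as a hedged illustration of one plausible analysis, the sketch below detects salient peaks in a physiological signal and measures what fraction fall inside manually annotated music-rich segments (the signal format, prominence threshold, and segment representation are all assumptions):

```python
import numpy as np
from scipy.signal import find_peaks

def fraction_of_peaks_in_segments(signal, fs_hz, segments_s):
    """segments_s: (start, end) times in seconds of annotated music-rich parts."""
    sig = np.asarray(signal, float)
    peaks, _ = find_peaks(sig, prominence=sig.std())   # keep only salient responses
    peak_times = peaks / fs_hz
    inside = sum(any(a <= t <= b for a, b in segments_s) for t in peak_times)
    return inside / max(len(peak_times), 1)
```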
Cited by: 26
Combining Cohort and UBM Models in Open Set Speaker Identification
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.30
Anthony Brew, P. Cunningham
In open set speaker identification it is important to build an alternative model against which to compare scores from the 'target' speaker model. Two strategies for building such a model are to build a single global model by sampling from a pool of training data, the Universal Background Model (UBM), or to build a cohort of models from selected individuals in the training data for the target speaker. The main contribution of this paper is to show that these approaches can be unified by using a Support Vector Machine (SVM) to learn a decision rule in the score space made up of the output scores of the client, cohort, and UBM models.
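A minimal sketch of that unification: each trial becomes a vector of output scores (client, UBM, then sorted cohort scores), and an SVM learns the accept/reject rule in this score space. The scoring callback and the linear kernel are assumptions; the abstract only states that an SVM is trained on the score space:

```python
import numpy as np
from sklearn.svm import SVC

def score_vector(utterance, client_model, ubm_model, cohort_models, score_fn):
    """score_fn(model, utterance) -> score (e.g. a log-likelihood); any scorer fits."""
    scores = [score_fn(client_model, utterance), score_fn(ubm_model, utterance)]
    # Sorting makes the vector invariant to the ordering of cohort members.
    scores += sorted((score_fn(m, utterance) for m in cohort_models), reverse=True)
    return np.asarray(scores)

def train_decision_rule(X, y):
    """X: stacked score vectors; y: 1 for genuine trials, 0 for impostor trials."""
    return SVC(kernel="linear").fit(X, y)
```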
Cited by: 5
A Database Architecture for Real-Time Motion Retrieval
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.20
Charly Awad, N. Courty, S. Gibet
Over the past decade, many research fields have realized the benefits of motion capture data, leading to an exponential growth in the size of motion databases. Consequently, indexing, querying, and retrieving motion capture data have become important considerations in the usability of such databases. Our aim is to efficiently retrieve motion from such databases in order to produce real-time animation. For that purpose, we propose a new database architecture which structures both the semantic and the raw data contained in motion data. The performance of the overall architecture is evaluated by measuring the efficiency of the motion retrieval process in terms of the mean access time to the data.
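The abstract does not reproduce the schema, so the sketch below only illustrates the stated split between semantic data (small, heavily queried) and raw data (bulky, fetched only once a clip is chosen). All table and column names are assumptions:

```python
import sqlite3

conn = sqlite3.connect("motions.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS clip (         -- semantic level: small and indexable
    id INTEGER PRIMARY KEY,
    name TEXT, skeleton TEXT, fps REAL, n_frames INTEGER
);
CREATE TABLE IF NOT EXISTS clip_tag (     -- semantic annotations used for querying
    clip_id INTEGER REFERENCES clip(id),
    tag TEXT
);
CREATE INDEX IF NOT EXISTS idx_tag ON clip_tag(tag);
CREATE TABLE IF NOT EXISTS raw_motion (   -- raw level: one packed blob per clip
    clip_id INTEGER PRIMARY KEY REFERENCES clip(id),
    frames BLOB
);
""")

# Query the cheap semantic level first; fetch raw frames for the chosen clip only.
row = conn.execute(
    "SELECT c.id FROM clip c JOIN clip_tag t ON t.clip_id = c.id "
    "WHERE t.tag = ? LIMIT 1", ("walk",)).fetchone()
if row is not None:
    frames = conn.execute(
        "SELECT frames FROM raw_motion WHERE clip_id = ?", (row[0],)).fetchone()
```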
Cited by: 6
Music Mood Annotator Design and Integration
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.45
C. Laurier, O. Meyers, J. Serrà, Martin Blech, P. Herrera
A robust and efficient technique for automatic music mood annotation is presented. A song's mood is predicted by a supervised machine learning approach based on musical features extracted from the raw audio signal. A ground truth, used for training, is created using both social network information systems and individual experts. Tests of 7 different classification configurations have been performed, showing that Support Vector Machines perform best for the task at hand. Moreover, we evaluate the algorithm's robustness to different audio compression schemes. This aspect, often neglected, is fundamental to building a system that is usable in real conditions. In addition, the integration of a fast and scalable version of this technique with the European Project PHAROS is discussed.
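A hedged sketch of the supervised pipeline the abstract describes: features extracted from raw audio, an SVM classifier, and a robustness check against re-encoded audio. The feature matrix, mood labels, and RBF kernel are placeholders, not the paper's exact configuration:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_mood_classifier(X, moods):
    """X: (n_songs, n_features) audio descriptors; moods: labels such as 'happy'."""
    return make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, moods)

def compression_robustness(clf, X_compressed, moods):
    """Accuracy on features recomputed from re-encoded versions of the same songs."""
    return clf.score(X_compressed, moods)
```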
Cited by: 22
BlockWeb: An IR Model for Block Structured Web Pages
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.36
Emmanuel Bruno, Nicolas Faessel, J. Maitre, M. Scholl
BlockWeb is a model that we have developed for indexing and querying web pages according to their content as well as their visual rendering. Pages are split up into blocks, which has several advantages in terms of page indexing and querying: (i) the blocks of a page most similar to a query may be returned instead of the page as a whole; (ii) the importance of a block can be taken into account; and (iii) so can the permeability of a block to the content of its neighbor blocks. In this paper, we present the BlockWeb model and demonstrate its usefulness for indexing images of Web pages, through an experiment performed on electronic versions of French daily newspapers. We also present the engine we have implemented for block extraction, indexing, and querying according to the BlockWeb model.
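A minimal sketch of the two notions the abstract highlights, importance and permeability: a block's effective content is taken here as its own terms plus a permeability-weighted share of its neighbors' terms, and its score is scaled by its importance. The linear mixing rule is an assumption, not the paper's exact formula:

```python
from collections import Counter

def effective_terms(block_terms, neighbor_terms, permeability):
    """block_terms: Counter of the block's own terms; neighbor_terms: list of
    Counters; permeability in [0, 1] controls how much neighbor content leaks in."""
    eff = Counter(block_terms)
    for nb in neighbor_terms:
        for term, freq in nb.items():
            eff[term] += permeability * freq
    return eff

def block_score(query_terms, eff_terms, importance):
    """Score a block: term overlap with the query, weighted by block importance."""
    return importance * sum(eff_terms.get(t, 0.0) for t in query_terms)
```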
Cited by: 9
Structured Named Entity Retrieval in Audio Broadcast News
Pub Date: 2009-06-03 | DOI: 10.1109/CBMI.2009.41
Azeddine Zidouni, M. Quafafou, H. Glotin
This paper focuses on the role of structure in named entity retrieval from audio transcriptions. We consider the transcription document structures that guide the parsing process, and from them we deduce an optimal hierarchical structure for the space of concepts. A concept (named entity) is therefore represented by a node or any sub-path in this hierarchy. We show the value of such a structure in the recognition of named entities using Conditional Random Fields (CRFs). The comparison of our approach with the Hidden Markov Model (HMM) method shows a significant improvement in recognition when combining CRFs. We also show the impact of the time axis on the prediction process.
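A hedged sketch of the "node or sub-path" idea: encode each named-entity label as a path in the concept hierarchy (e.g. "pers" vs "pers.politician", both hypothetical labels) and train a linear-chain CRF over token features. sklearn-crfsuite and the feature set are illustration choices, not the authors' setup:

```python
import sklearn_crfsuite

def word_features(tokens, i):
    """Simple per-token features; real systems would add much richer context."""
    w = tokens[i]
    return {
        "word.lower": w.lower(),
        "prefix3": w[:3],
        "is_first": i == 0,
        "prev.lower": tokens[i - 1].lower() if i > 0 else "<bos>",
    }

def to_crf_input(sentences):
    return [[word_features(s, i) for i in range(len(s))] for s in sentences]

def train(sentences, label_sequences):
    """label_sequences use hierarchical tags, e.g. 'pers', 'pers.politician', 'O'."""
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=100)
    return crf.fit(to_crf_input(sentences), label_sequences)
```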
Cited by: 3