
2010 IEEE International Workshop on Multimedia Signal Processing: Latest Papers

Encoder and decoder side global and local motion estimation for Distributed Video Coding
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662043
F. Dufaux, T. Ebrahimi
In this paper, we propose a new Distributed Video Coding (DVC) architecture where motion estimation is performed at both the encoder and the decoder, effectively combining global and local motion models. We show that the proposed approach significantly improves the quality of the Side Information (SI), especially for sequences with complex motion patterns. In turn, it leads to rate-distortion gains of up to 1 dB compared to the state-of-the-art DISCOVER DVC codec.
Citations: 15
An objective metric for assessing quality of experience on stereoscopic images
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662049
Liyuan Xing, Junyong You, T. Ebrahimi, A. Perkis
Most quality models for stereoscopic presentations are dedicated to measuring quality degradation caused by compression artefacts. However, non-compression distortions induced during acquisition and presentation usually have significant influence on 3D viewing experience. In this paper, we propose an objective metric for viewing experience assessment by taking camera baseline and binocular distortion crosstalk into consideration. In particular, the proposed metric is based on our previous work on both subjective evaluation and objective assessment of crosstalk perception. Results on a publicly available stereoscopic quality database demonstrate that the proposed metric can achieve more than 87% correlation with subjective assessment of viewing experience.
Citations: 11
Efficient error control in 3D mesh coding
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662035
D. Cernea, A. Munteanu, A. Alecu, J. Cornelis, P. Schelkens, F. Morán
Our recently proposed wavelet-based L-infinite-constrained coding approach for meshes guarantees that the maximum error between the vertex positions in the original and decoded meshes is lower than a given upper bound. Instantiations of both L-2 and L-infinite coding approaches are demonstrated for MESHGRID, a scalable 3D object encoding system that is part of MPEG-4 AFX. In this survey paper, we compare the novel L-infinite distortion estimator against the L-2 distortion estimator typically employed in 3D mesh coding systems. In addition, we show that, under certain conditions, the L-infinite estimator can be exploited to approximate the Hausdorff distance in real-time implementations.
Citations: 3
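The difference between the L-infinite and L-2 distortion measures contrasted in the abstract above can be illustrated with a minimal Python sketch. It measures the two errors directly on vertex positions; it is not the wavelet-domain estimator described in the paper, and the function names and sample meshes are hypothetical, chosen only for illustration.

```python
import math

def vertex_errors(original, decoded):
    """Per-vertex Euclidean distances between two equally sized vertex lists."""
    if len(original) != len(decoded):
        raise ValueError("meshes must have the same number of vertices")
    return [math.dist(p, q) for p, q in zip(original, decoded)]

def l_inf_distortion(original, decoded):
    """Maximum per-vertex error: the quantity an L-infinite bound constrains."""
    return max(vertex_errors(original, decoded))

def l2_distortion(original, decoded):
    """Root-mean-square per-vertex error: an average-case measure."""
    errs = vertex_errors(original, decoded)
    return math.sqrt(sum(e * e for e in errs) / len(errs))

# Hypothetical tetrahedron in which only one vertex is displaced by decoding.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
decoded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.5)]

print(l_inf_distortion(original, decoded))  # worst-case vertex error
print(l2_distortion(original, decoded))     # averaged vertex error
```

In this toy case a single badly placed vertex dominates the maximum error while the averaged error stays smaller, which is why a bound on the maximum is the stricter guarantee.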
Side information enhancement using an adaptive hash-based genetic algorithm in a Wyner-Ziv context
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662036
Thomas Maugey, C. Yaacoub, J. Farah, Marco Cagnazzo, B. Pesquet-Popescu
Side information construction in Wyner-Ziv video coding is a sensitive task which strongly influences the final rate-distortion performance of the scheme. This side information is usually generated through an interpolation of the previous and next images. However, some zones of a scene, such as occlusions, cannot be estimated from other frames. In this paper we propose to avoid this problem by sending some hash information for these unpredictable zones of the image. The resulting algorithm is described and tested here. The obtained results show the advantages of using localized hash information for the high-error zones in distributed video coding.
Citations: 12
A new image projection method for panoramic image stitching
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662006
Beom Su Kim, H. Koo, N. Cho
We propose a new image projection method in an attempt to reduce the perceptual distortion in panoramic image mosaics. Specifically, we reduce the stretching distortion of some image patches and the bending of straight lines. Since stretching distortion usually occurs when a viewing sphere is projected onto the cylindrical image surface in an oblique direction, we propose to use an adjustable cylindrical surface so that the viewing direction matches the equator of the cylindrical surface. In addition, to find the trade-off between stretching distortion and the bending of straight lines, we adjust the curvature of the cylindrical surface according to the object of interest in the image. The warping function from the viewing sphere to the adjustable image surface is derived, and the amount of distortion caused by this warping function is defined. From this measure of distortion, the optimal pose of the cylindrical image plane and its curvature are determined, and the image on the viewing sphere is projected onto the optimal plane. The experimental results show that the proposed method produces panoramic images with less distortion than existing methods.
Citations: 1
On joint distribution modeling in distributed video coding systems
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662037
Y. Priziment, D. Malah
The performance of a distributed video coding system depends, to a large extent, on the accuracy of the joint source and side information distribution modeling. In this work we first examine a family of stationary joint distribution models. As one of our findings, we propose to use the double-Gamma model as an alternative to the widely adopted Laplace model, due to its superior performance. In addition, we suggest a new spatially adaptive model, which makes it possible to follow the spatially varying joint statistics of the source and side information. We present two methods, class-based and neighborhood-based, for estimating the spatially varying model parameters. We then show how the obtained pixel-domain model can be used in the transform domain to facilitate the utilization of frame spatial redundancy. Integration of the proposed models into a distributed video coding system resulted in improved performance.
Citations: 1
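As a point of reference for the abstract above, the widely adopted Laplace baseline can be sketched in a few lines of Python. The sketch assumes a zero-mean Laplace density f(r) = (α/2)·exp(−α|r|) on the residual between the source and the side information, and fits α by maximum likelihood (α is the reciprocal of the mean absolute residual). It does not reproduce the double-Gamma model or the spatially adaptive estimation proposed in the paper; the function names and sample values are hypothetical.

```python
import math

def fit_laplace_alpha(source, side_info):
    """Maximum-likelihood alpha for a zero-mean Laplace model of the residual."""
    residual = [x - y for x, y in zip(source, side_info)]
    mean_abs = sum(abs(r) for r in residual) / len(residual)
    if mean_abs == 0:
        raise ValueError("residual is identically zero; alpha is unbounded")
    return 1.0 / mean_abs

def laplace_pdf(r, alpha):
    """Density of the zero-mean Laplace distribution with parameter alpha."""
    return 0.5 * alpha * math.exp(-alpha * abs(r))

# Hypothetical pixel values: source frame samples and their side information.
source = [10, 12, 9, 15]
side_info = [11, 12, 7, 14]

alpha = fit_laplace_alpha(source, side_info)
print(alpha, laplace_pdf(0, alpha))
```

A spatially adaptive variant in the spirit of the abstract would fit a separate parameter per class or per neighborhood rather than one per frame; that step is left out here.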
Spectral EEG features and tasks selection process: Some considerations toward BCI applications
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662010
Monica-Claudia Dobrea, D. Dobrea, D. Alexa
In this paper, we further develop the idea of a subject-specific mental task selection process as a necessary prerequisite in any EEG-based brain-computer interface (BCI) application. In two previous studies, using EEG-extracted auto-regressive (AR) parameters and twelve different mental tasks, we showed the major gains in task classification performance that can be obtained solely by selecting the proper tasks. Here we investigate the putative relation between each (subject, given EEG features) pair and the corresponding individual optimum set of cognitive tasks. To this end, a set of three different relative spectral power parameters was considered. The classification performances achieved with these spectral EEG features are compared for two subjects and for two sets of tasks: i) the Keirn and Aunon set of tasks, frequently used in the BCI field, and ii) the previously determined (AR-based) optimum individual set of tasks.
Citations: 13
Face hallucination using Bayesian global estimation and local basis selection
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662063
Chih-Chung Hsu, Chia-Wen Lin, Chiou-Ting Hsu, H. Liao, Jen-Yu Yu
This paper proposes a two-step prototype-face-based scheme for hallucinating the high-resolution detail of a low-resolution input face image. The proposed scheme is composed of two steps: a global estimation step and a local facial-parts refinement step. In the global estimation step, the initial high-resolution face image is hallucinated via a linear combination of the global prototype faces with a coefficient vector. Instead of estimating the coefficient vector in the high-dimensional raw image domain, we propose a maximum a posteriori (MAP) estimator to estimate the optimum set of coefficients in the low-dimensional coefficient domain. In the local refinement step, the facial parts (i.e., eyes, nose and mouth) are further refined using a basis selection method based on overcomplete nonnegative matrix factorization (ONMF). Experimental results demonstrate that the proposed method can achieve significant subjective and objective improvement over state-of-the-art face hallucination methods, especially when an input face does not belong to a person in the training data set.
Citations: 8
Probabilistic framework for template-based chord recognition
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662016
L. Oudre, C. Févotte, Y. Grenier
This paper describes a method for chord recognition from audio signals. Our method provides a coherent and relevant probabilistic framework for template-based transcription. The only information needed for the transcription is the definition of the chords: in particular, neither annotated audio data nor music theory knowledge is required. We extract from the signal a succession of chroma vectors, which are our model observations. We propose a generative model for these observations from chord distribution probabilities and fixed chord templates. The parameters are estimated through an EM algorithm. In order to capture the temporal structure, we apply some post-processing filtering methods before detecting the chords. Our method is evaluated on two audio corpora. Results show that our method outperforms state-of-the-art chord recognition methods and also gives more relevant chord transcriptions.
Citations: 5
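The notion of fixed chord templates matched against chroma vectors, mentioned in the abstract above, can be illustrated with a deliberately simplified Python sketch. It uses plain nearest-template matching by cosine similarity, not the generative model or the EM estimation described in the paper, and it omits the temporal post-processing. The 24 major/minor triad vocabulary, the function names, and the sample chroma vector are assumptions made for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Build binary 12-bin templates for the 24 major and minor triads."""
    templates = {}
    for root in range(12):
        # A major triad has a third of 4 semitones, a minor triad 3 semitones.
        for suffix, third in (("", 4), ("m", 3)):
            bins = [0.0] * 12
            for interval in (0, third, 7):
                bins[(root + interval) % 12] = 1.0
            templates[NOTE_NAMES[root] + suffix] = bins
    return templates

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recognize(chroma, templates):
    """Return the name of the template most similar to the chroma vector."""
    return max(templates, key=lambda name: cosine(chroma, templates[name]))

templates = chord_templates()
# Hypothetical chroma frame with most energy on C, E and G.
chroma = [0.9, 0.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.7, 0.0, 0.1, 0.0, 0.0]
print(recognize(chroma, templates))
```

As in the abstract, the only knowledge built into this sketch is the definition of the chords themselves; no annotated audio is used.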
Content identification based on digital fingerprint: What can be done if ML decoding fails?
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5661995
F. Farhadzadeh, S. Voloshynovskiy, O. Koval
In this paper, the performance of content identification based on digital fingerprinting and order statistic list decoding is analyzed by evaluating the probabilities of correct identification and false acceptance, and the probability mass function of the position of the queried binary fingerprint on the list of candidates. Particular attention is paid to the cases in which the traditional maximum likelihood decoder fails to produce a reliable content identification. Maximum likelihood decoding is shown to be a particular case of order statistic list decoding in which the list size equals 1. We demonstrate the efficiency of the proposed content identification system by investigating the behavior of the probability mass function and imposing a constraint on the list size.
Citations: 1
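The statement in the abstract above that maximum likelihood decoding is list decoding with a list size of 1 can be sketched as follows. The sketch assumes that candidates are ranked by Hamming distance to the query, which coincides with maximum likelihood ranking when bit flips are independent and occur with probability below 0.5; the paper's exact decoder and thresholds are not reproduced, and the database contents and function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def list_decode(query, database, list_size):
    """Return the list_size identifiers whose fingerprints are closest to the query."""
    ranked = sorted(database, key=lambda item_id: hamming(query, database[item_id]))
    return ranked[:list_size]

def ml_decode(query, database):
    """Single best candidate: list decoding with a list size of 1."""
    return list_decode(query, database, 1)[0]

# Hypothetical database of 8-bit binary fingerprints.
database = {
    "a": "10110010",
    "b": "01101100",
    "c": "11010001",
}
query = "10110011"  # fingerprint "a" with its last bit flipped

print(ml_decode(query, database))
print(list_decode(query, database, 2))
```

Returning a longer list keeps the correct item among the candidates in cases where distortion pushes it out of first place, at the cost of a later verification step over the list.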