2008 IEEE 10th Workshop on Multimedia Signal Processing: Latest Articles

Increased flexibility in inter picture partitioning

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665090

Kenneth Vermeirsch, J. D. Cock, S. Notebaert, P. Lambert, R. Walle

To attain efficient coding of sequences with complex motion activity, modern video coding standards allow variable block sizes to be employed in temporal prediction. A block with complex motion can be partitioned into two equal-sized halves or into four quadrants. In this paper we study the impact of allowing blocks to be partitioned into two unequal subpartitions. Additionally, we allow block partitioning along diagonal edges. These provisions allow encoders to better adapt to the local characteristics of the motion activity in a video sequence. We verified experimentally that the presence of partitioning edges that do not coincide with transform boundaries does not negatively impact the decorrelating strength of the residual transform, so the proposed extended partitioning strategies can be applied regardless of the details of the residual coder. Implementing the proposed extended partitioning modes in an H.264/AVC coder at the macroblock and submacroblock level, we observe that coding efficiency gains are greatest in low-resolution sequences, where moving features in the video sequence tend to be more spatially localized. For CIF and QCIF sequences we achieve a bit rate reduction of about 3-6% over a wide fidelity range.

Citations: 4

Optimal server bandwidth allocation for streaming multiple streams via P2P multicast

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665124

Aditya Mavlankar, Jeonghun Noh, Pierpaolo Baccichet, B. Girod

We consider the general scenario where content hosted by the server comprises streams and each peer can subscribe to one or more streams. Multiple multicast trees are built to deliver the streams to respective peers while exploiting the overlap of their interests for efficient and scalable delivery. We propose an optimization framework for allocating server bandwidth to minimize distortion across the peer population. We apply the framework to a novel application, peer-to-peer (P2P) multicast live video streaming with virtual pan/tilt/zoom functionality. In this application, each user can watch arbitrary regions of a high-spatial-resolution scene, yet the system exploits overlapping interests by building multicast trees. Experimental results indicate that optimal server bandwidth allocation enhances the delivered quality across the peer population.

Citations: 6

Dynamic chroma feature vectors with applications to cover song identification

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665217

Samuel Kim, Shrikanth S. Narayanan

A new chroma-based dynamic feature vector is proposed, inspired by psychophysical observations that the human auditory system detects relative pitch changes rather than absolute pitch values. The proposed chroma-based dynamic feature vector describes the relative pitch change intervals. The utility of the proposed feature vector, combined with a music fingerprint extraction algorithm, is experimentally explored within a cover song identification framework. The results with a classical music database suggest that the proposed biologically plausible dynamic chroma feature vector can be successfully added to the conventional chroma feature vector as a complementary feature; it provides a 5.8% relative performance improvement.

Citations: 31
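
The abstract above does not give the feature definition, so the following is only an illustrative sketch of why relative pitch changes suit cover song identification: they survive a change of key. It assumes 12-bin chroma frames, takes the strongest bin per frame, and records the interval (modulo 12) between consecutive frames. The helper names `one_hot`, `dominant_pitch_classes`, `relative_intervals`, and `transpose` are hypothetical and are not taken from the paper.

```python
def one_hot(pitch_class):
    """Build a toy 12-bin chroma frame with all energy in one pitch class."""
    return [1.0 if pc == pitch_class else 0.0 for pc in range(12)]


def dominant_pitch_classes(chroma_frames):
    """Return the index of the strongest chroma bin in each frame."""
    return [max(range(12), key=lambda pc: frame[pc]) for frame in chroma_frames]


def relative_intervals(pitch_classes):
    """Return the pitch-class interval (mod 12) between consecutive frames."""
    return [(b - a) % 12 for a, b in zip(pitch_classes, pitch_classes[1:])]


def transpose(chroma_frames, semitones):
    """Shift every frame up by the given number of semitones (circular rotation)."""
    k = semitones % 12
    return [frame[12 - k:] + frame[:12 - k] for frame in chroma_frames]


# Toy melody C, E, G, F expressed as pitch classes 0, 4, 7, 5.
melody = [one_hot(pc) for pc in [0, 4, 7, 5]]
original = relative_intervals(dominant_pitch_classes(melody))
shifted = relative_intervals(dominant_pitch_classes(transpose(melody, 5)))
print(original, shifted)
```

In this toy case the absolute pitch classes change under transposition while the interval sequence does not, which is the property the abstract attributes to relative pitch change features.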

Adaptive wavelet coding of the depth map for stereoscopic view synthesis

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665114

Ismaël Daribo, C. Tillier, B. Pesquet-Popescu

Multi-view video and 3D television are emerging applications that raise the problem of efficiently encoding a depth map in addition to classical texture images. This paper investigates depth image coding via an adaptive wavelet lifting scheme. Switching between long filters in homogeneous areas and short filters over the edges of the depth map is decided based on the contours detected in the texture image. The method thus takes into consideration the correlation between the edges in the texture image and those in the depth image, leading to improved encoding of the latter.

Citations: 40

Periodic signal extraction with frequency-selective amplitude modulation and global time-warping for music signal decomposition

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665215

M. Triki, D. Slock, A. Triki

A key building block in music transcription and indexing operations is the decomposition of music signals into notes. We model a note signal as a periodic signal with (slow) frequency-selective amplitude modulation and global time warping. Time-varying frequency-selective amplitude modulation allows the various harmonics of the periodic signal to decay at different speeds. Time warping allows for some limited global frequency modulation. The bandlimited variation of the frequency-selective amplitude modulation and of the global time warping is expressed through a subsampled representation and parametrization of the corresponding signals. Assuming additive white Gaussian noise, a maximum likelihood approach is proposed for the estimation of the model parameters, and the optimization is performed in an iterative (cyclic) fashion that leads to a sequence of simple least-squares problems.

Citations: 9

SPSA based feature relevance estimation for video retrieval

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665147

S. Velusamy, S. Bhatnagar, S. Basavaraja, V. Sridhar

With the availability of a huge amount of video data from various sources, efficient video retrieval tools are increasingly in demand. Because video is multi-modal data, the perception of "relevance" between the user-provided query video (in the case of Query-By-Example video search) and the retrieved video clips is subjective in nature. We present an efficient video retrieval method that takes the user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple but powerful feature weight optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching, and relevance feedback (RF) phases is built to demonstrate the performance of the proposed method. The query and database videos are indexed using conventional video features such as color and texture. However, we use comprehensive and novel methods of feature representation, and a spatio-temporal distance measure, to retrieve the top M videos that are similar to the query. In the feedback phase, the user-activated iterative feedback on the previously retrieved videos is used to automatically reformulate the QFV weights (a measure of importance) so that they reflect the user's preference. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA-based RF to user-oriented feature weight optimization distinguishes the proposed method from existing ones. The experimental results show that the proposed RF-based video retrieval exhibits good performance.

Citations: 5
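
For readers unfamiliar with SPSA, the sketch below shows the generic update that the abstract above builds on: all weights are perturbed at once by a random ±1 vector, and two loss evaluations yield a gradient estimate for every weight. It is a minimal illustration, not the authors' system. The function name `spsa_minimize`, the gain constants, and the quadratic toy loss (standing in for a feedback-derived relevance cost) are all assumptions.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) with simultaneous perturbation stochastic approximation."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha  # decaying step size
        c_k = c / (k + 1) ** gamma      # decaying perturbation size
        # Random +/-1 perturbation applied to all weights simultaneously.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations give a gradient estimate for every coordinate.
        grad = [diff / (2.0 * c_k * d) for d in delta]
        theta = [t - a_k * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical stand-in for user feedback: the "preferred" feature weights.
preferred = [0.7, 0.2, 0.1]


def toy_loss(weights):
    """Squared distance between the current weights and the preferred ones."""
    return sum((w - p) ** 2 for w, p in zip(weights, preferred))


start = [1.0 / 3.0] * 3
tuned = spsa_minimize(toy_loss, start)
print(toy_loss(start), toy_loss(tuned))
```

The appeal of SPSA in this setting is that each iteration needs only two evaluations of the cost regardless of how many feature weights there are, which matters when every evaluation corresponds to a retrieval round.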

Image interpolation using visual attention model and particle swarm optimization

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665078

Hsuan-Ying Chen, Jin-Jang Leou

In this study, a new edge-directed image interpolation approach using a visual attention model and particle swarm optimization (PSO) is proposed. First, a high-quality saliency map of the image to be interpolated is generated by the proposed visual attention model. Then, based on the saliency map, bilinear interpolation and the proposed PSO interpolation are employed for non-saliency blocks (non-ROIs) and saliency blocks (ROIs), respectively, to obtain the final interpolation results. The proposed approach is applicable to image interpolation with arbitrary magnification factors (MFs). Based on the experimental results obtained in this study, the interpolation results of the proposed approach are better than those of three comparison methods.

Citations: 2
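
The bilinear step that the abstract above assigns to non-saliency blocks can be sketched as follows. This covers only plain bilinear interpolation with an arbitrary magnification factor. The PSO-based interpolation for saliency blocks is not described in enough detail to reproduce. The function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions made for illustration.

```python
def bilinear_resize(image, factor):
    """Resize a 2D grayscale image (a list of rows) by a magnification factor."""
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * factor))
    dst_w = max(1, round(src_w * factor))
    out = []
    for i in range(dst_h):
        # Map the output row to a fractional source row (corners aligned).
        y = i * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, src_h - 1)
        wy = y - y0
        row = []
        for j in range(dst_w):
            x = j * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, src_w - 1)
            wx = x - x0
            # Blend horizontally on the two neighboring rows, then vertically.
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


small = [[0, 10],
         [20, 30]]
for row in bilinear_resize(small, 1.5):
    print(row)
```

Because the output size is computed from `factor` directly, non-integer magnification factors are handled the same way as integer ones.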

Distortion-aware retransmission and concealment of video packets using a Wyner-Ziv-coded thumbnail

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665116

Zhi Li, Y. Lin, D. Varodayan, Pierpaolo Baccichet, B. Girod

We investigate retransmission-based robust video streaming over lossy packet networks in this paper. We propose to send a thumbnail video along with the video packets. The thumbnail video is Wyner-Ziv-coded to exploit its correlation with the primary video. The receiver decodes the primary video with the help of error concealment to mitigate packet losses. Upon receiving and successfully decoding the thumbnail, the receiver can estimate the local distortion due to packet losses and make intelligent decisions on which packets are needed for retransmission. Additional gain in video quality can be achieved by using the thumbnail to aid error concealment. Our experimental results demonstrate gains over previously proposed distortion-unaware heuristic methods.

Citations: 5

Multimedia search: Past and current approaches

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665037

HongJiang Zhang

Summary form only given. After 15 years of extensive research efforts, multimedia retrieval has finally come to its prime time, when everything becomes accessible on the Web. However, Web search both provides a new paradigm for multimedia retrieval research and poses a challenge to it. It calls for a rethinking of the traditional content-based approaches, especially in how to make use of the massive but noisy metadata associated with Web pages and links. In this talk, we will first review some familiar approaches in content-based multimedia retrieval and examine their limits. We will then present a few new efforts in Web multimedia search to illustrate some new thoughts in this space.

Citations: 2

Genre classification of compressed audio data

2008 IEEE 10th Workshop on Multimedia Signal Processing

Pub Date : 2008-11-05 DOI: 10.1109/MMSP.2008.4665157

A. Rizzi, Nicola Maurizio Buccino, M. Panella, A. Uncini

This paper deals with the musical genre classification problem, starting from a set of features extracted directly from MPEG-1 Layer III compressed audio data. The automatic classification of compressed audio signals into a short hierarchy of musical genres is explored. More specifically, three feature sets representing timbre, rhythmic content, and energy content are proposed for a four-leaf tree genre hierarchy. The adopted set of features is computed from the spectral information available in the MPEG decoding stage. The performance and relative importance of the proposed approach are investigated by training a classification model using the audio collections proposed in musical genre contests. We also used an optimization strategy based on genetic algorithms. The results are comparable to those obtained by PCM-based musical genre classification systems.

Citations: 28
