
Latest publications from the 2010 IEEE International Workshop on Multimedia Signal Processing

Unequal error protection random linear coding for multimedia communications
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662033
D. Vukobratović, V. Stanković
This paper focuses on recent research on unequal error protection random linear coding (UEP RLC) for applications in network coded (NC) multimedia communications. We define a class of UEP RLC called expanding window random linear coding (EW-RLC) and provide an exact decoding probability analysis for the different importance classes of the source data, assuming a Gaussian Elimination (GE) decoder is applied at the receiver. Using this analysis, we provide a detailed investigation of the EW-RLC design for distortion-optimized scalable H.264/SVC coded video transmission over packet networks with erasures, across a range of heterogeneous receivers with varying reception overhead capabilities.
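The expanding-window idea can be illustrated with a small GF(2) sketch, not the paper's optimized design: window selection here is uniform and all parameters are toy values, whereas the paper optimizes the window probabilities per receiver class. Each coded packet XORs a random subset of a prefix window of the source, and the receiver decodes by Gaussian elimination:

```python
import numpy as np

rng = np.random.default_rng(0)

def ew_rlc_encode(packets, window_sizes, n_coded):
    """Generate coded packets; each combines a random prefix window of the source.
    packets: (k, plen) uint8 array of source packet bits.
    window_sizes: cumulative class boundaries, e.g. [4, 8] -> windows of 4 and 8 packets,
    so the most important class (first 4 packets) appears in every window."""
    k, _ = packets.shape
    coded = []
    for _ in range(n_coded):
        w = window_sizes[rng.integers(len(window_sizes))]   # pick a window uniformly (toy choice)
        coeffs = np.zeros(k, dtype=np.uint8)
        coeffs[:w] = rng.integers(0, 2, size=w, dtype=np.uint8)  # random GF(2) coefficients
        payload = ((coeffs[:, None] * packets).sum(axis=0) % 2).astype(np.uint8)  # XOR of selected packets
        coded.append((coeffs, payload))
    return coded

def ge_decode(coded, k):
    """Gaussian elimination over GF(2); returns all k packets, or None if rank < k."""
    A = np.array([c for c, _ in coded], dtype=np.uint8)
    B = np.array([p for _, p in coded], dtype=np.uint8)
    row = 0
    for col in range(k):
        piv = next((r for r in range(row, len(A)) if A[r, col]), None)
        if piv is None:
            return None
        A[[row, piv]], B[[row, piv]] = A[[piv, row]], B[[piv, row]]  # swap pivot row up
        for r in range(len(A)):
            if r != row and A[r, col]:  # eliminate the pivot column everywhere else
                A[r] ^= A[row]
                B[r] ^= B[row]
        row += 1
    return B[:k]
```

Because small-window packets only span the most important class, that class can often be recovered even when the full-rank condition for all k packets fails, which is the UEP effect the analysis quantifies.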
Citations: 44
Person recognition using a bag of facial soft biometrics (BoFSB)
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662074
A. Dantcheva, J. Dugelay, P. Elia
This work introduces the novel idea of using a bag of facial soft biometrics for person verification and identification. The novel tool inherits the non-intrusiveness and computational efficiency of soft biometrics, which allow for fast and enrolment-free biometric analysis, even in the absence of consent and cooperation of the surveillance subject. In conjunction with the proposed system design and detection algorithms, we also proceed to shed some light on the statistical properties of different parameters that are pertinent to the proposed system, as well as provide insight on general design aspects in soft-biometric systems, and different aspects regarding efficient resource allocation.
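Conceptually, a bag of soft biometrics maps each face to a coarse category of the population; a toy sketch of that matching step, in which the trait names and values are hypothetical and stand in for the paper's actual bag and detection algorithms:

```python
# Hypothetical soft-biometric traits; the paper's system defines its own bag
# and extracts the values with face-analysis detectors.
TRAITS = ("hair_color", "eye_color", "skin_color", "glasses")

def bag_signature(subject):
    """Collapse a subject's detected soft traits into a single category tuple."""
    return tuple(subject[t] for t in TRAITS)

def identify(probe, gallery):
    """Return the gallery subjects whose soft-biometric category matches the probe.
    Because soft traits only partition the population into coarse categories,
    the result is a candidate set rather than a unique identity."""
    sig = bag_signature(probe)
    return [name for name, subj in gallery.items() if bag_signature(subj) == sig]
```

The statistical analysis in the paper concerns exactly how discriminative such a partition is, i.e. how often a category contains more than one subject.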
Citations: 37
A Bayesian image annotation framework integrating search and context
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662072
Rui Zhang, Kui Wu, Kim-Hui Yap, L. Guan
Conventional approaches to image annotation tackle the problem using low-level visual information alone. Since the constrained interactions among the objects in a real-world scene carry important information, contextual information has been utilized to recognize scene and object categories. In this paper, we propose a Bayesian approach to region-based image annotation, which integrates content-based search and context into a unified framework. The content-based search selects representative keywords by matching an unlabeled image with the labeled ones, followed by a weighted keyword ranking; these keywords are in turn used by the context model to calculate the a priori probabilities of the object categories. Finally, a Bayesian framework integrates the a priori probabilities and the visual properties of image regions. The framework was evaluated using two databases and several performance measures, which demonstrated its superiority to both visual content-based and context-based approaches.
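The final fusion step is a direct application of Bayes' rule; a minimal sketch, with an illustrative category vocabulary and numbers standing in for the learned context prior and region-appearance likelihood:

```python
import numpy as np

# Illustrative categories; the paper learns the prior from search-ranked
# keywords and the likelihood from region appearance models.
CATEGORIES = ("sky", "sea", "sand")

def annotate_region(prior, likelihood):
    """Bayes' rule for one region: the posterior over object categories is
    proportional to the context-model prior times the visual likelihood."""
    posterior = prior * likelihood
    return posterior / posterior.sum()
```

Even when the visual likelihood alone is ambiguous, a context prior derived from the retrieved keywords can tip the posterior toward the contextually plausible category.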
Citations: 0
Adaptive semi-regular remeshing: A Voronoi-based approach
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662045
Aymen Kammoun, F. Payan, M. Antonini
We propose an adaptive semi-regular remeshing algorithm for surface meshes. Our algorithm uses Voronoi tessellations during both simplification and refinement stages. During simplification, the algorithm constructs a first centroidal Voronoi tessellation of the vertices of the input mesh. The sites of the Voronoi cells are the vertices of the base mesh of the semi-regular output. During refinement, the new vertices added at each resolution level by regular subdivision are considered as new Voronoi sites. We then use the Lloyd relaxation algorithm to update their position, and finally we obtain uniform semi-regular meshes. Our algorithm also enables adaptive remeshing by tuning a threshold based on the probability mass of the Voronoi sites added by subdivision. Experimentation shows that our technique produces semi-regular meshes of high quality, with significantly fewer triangles than state-of-the-art techniques.
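The site-update step can be sketched as discrete Lloyd relaxation over a vertex set: each site repeatedly moves to the centroid of the points in its Voronoi cell. This is a simplified stand-in, since the paper operates on mesh surfaces rather than the free point clouds assumed here:

```python
import numpy as np

def lloyd_relaxation(points, sites, n_iters=20):
    """Discrete Lloyd relaxation: move each site to the centroid of the points
    assigned to it, yielding an approximately centroidal Voronoi tessellation.
    points: (n, d) array approximating the surface by its vertices.
    sites:  (m, d) array of initial Voronoi sites."""
    sites = sites.copy()
    for _ in range(n_iters):
        # Assign every point to its nearest site (its discrete Voronoi cell).
        d = np.linalg.norm(points[:, None, :] - sites[None, :, :], axis=2)
        owner = d.argmin(axis=1)
        for i in range(len(sites)):
            cell = points[owner == i]
            if len(cell):
                sites[i] = cell.mean(axis=0)  # centroid update
    return sites
```

After relaxation the cells have similar sizes, which is what makes the resulting semi-regular mesh uniform.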
Citations: 10
Strategies of buffering schedule in P2P VoD streaming
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662056
Zhi Wang, Lifeng Sun, Shiqiang Yang
Compared to live peer-to-peer (P2P) streaming, modern P2P video-on-demand (VoD) systems bring much larger volumes of video and more interactive controls to Internet users. With the increasing bitrate of videos and the full VCR controls of P2P VoD, the “buffering” behavior motivates us to design different scheduling and service strategies for peers, to improve playback performance and relieve the dedicated streaming server by making the best use of the bandwidth and cache capacities of these buffering peers. In our design, peers strategically decide which segments of the video to download first, and which requests to serve first. We conduct extended simulations to evaluate the performance of the strategies, and the results show that our design outperforms the conventional sequential scheme with respect to improving playback quality and reducing server load.
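One hedged illustration of such a download schedule, not the paper's exact policy: segments close to the playback point are fetched in order to keep playback smooth, while segments outside that urgent window are ranked by rarity so the swarm's caches are used efficiently:

```python
def schedule_downloads(missing, playhead, rarity, window=10):
    """Illustrative buffering schedule for one peer.
    missing:  segment indices not yet buffered.
    playhead: index of the segment currently being played.
    rarity:   copies of each segment available in the neighborhood; segments
              with no entry are treated as rarest (default 0).
    Returns the download order: urgent segments sequentially, then the rest
    rarest-first."""
    urgent = sorted(s for s in missing if 0 <= s - playhead < window)
    prefetch = sorted((s for s in missing if s - playhead >= window),
                      key=lambda s: rarity.get(s, 0))
    return urgent + prefetch
```

A pure sequential scheme is the `urgent` rule applied everywhere; the hybrid above is one way to trade playback continuity against swarm-wide cache diversity.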
Citations: 0
Encoder rate control for block-based distributed video coding
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662042
Chen Fu, Joohee Kim
Distributed video coding is a new paradigm for video compression based on the Slepian-Wolf and Wyner-Ziv theorems. Wyner-Ziv video coding, a lossy compression scheme with receiver side information, enables low-complexity video encoding at the expense of a complex decoder. Most existing distributed video coding techniques require a feedback channel to determine the number of parity bits for decoding Wyner-Ziv frames at the decoder. However, in some applications a feedback channel is unavailable, or feedback-based decoder rate control cannot be used due to delay constraints, as in wireless video sensor networks. In this paper, an encoder-based rate control method for distributed video coding is proposed. The proposed solution consists of a low-complexity side information generation method at the encoder and a rate estimation algorithm that determines the number of parity bits to be transmitted to the decoder. The performance of the proposed algorithm is compared with existing encoder-based rate control methods and a decoder-based rate control algorithm that relies on a feedback channel.
Citations: 9
Audio-haptic physically-based simulation of walking on different grounds
Pub Date : 2010-12-10 DOI: 10.1109/MMSP.2010.5662031
L. Turchet, R. Nordahl, S. Serafin, Amir Berrezag, Smilen Dimitrov, V. Hayward
We describe a system which simulates in realtime the auditory and haptic sensations of walking on different surfaces. The system is based on a pair of sandals enhanced with pressure sensors and actuators. The pressure sensors detect the interaction force during walking, and control several physically based synthesis algorithms, which drive both the auditory and haptic feedback. The different hardware and software components of the system are described, together with possible uses and possibilities for improvements in future design iterations.
Citations: 49
An efficient framework on large-scale video genre classification
Pub Date : 2010-10-01 DOI: 10.1109/MMSP.2010.5662069
Ning Zhang, L. Guan
Efficient data mining and indexing is important for multimedia analysis and retrieval. In the field of large-scale video analysis, effective genre categorization plays an important role and serves as one of the fundamental steps for data mining. Existing works utilize domain-knowledge-dependent feature extraction, which limits both genre diversification and data-volume scalability. In this paper, we propose a systematic framework for automatically classifying video genres using domain-knowledge-independent descriptors in feature extraction, and a bag-of-visual-words (BoW) based model for compact video representation. The scale-invariant feature transform (SIFT) local descriptor, accelerated by GPU hardware, is adopted for feature extraction. A BoW model with an innovative codebook generation using bottom-up two-layer K-means clustering is proposed to abstract the video characteristics. Besides the histogram-based distribution for summarizing video data, a modified latent Dirichlet allocation (mLDA) based distribution is also introduced. At the classification stage, a k-nearest neighbor (k-NN) classifier is employed. Compared with the state-of-the-art large-scale genre categorization in [1], the experimental results on a 23-sports dataset demonstrate that our proposed framework achieves comparable classification accuracy with 27% and 64% expansion in data volume and diversity, respectively.
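A compact sketch of the BoW pipeline under stated simplifications: plain CPU Lloyd k-means stands in for the GPU-accelerated stage, a simple hierarchical two-layer clustering stands in for the paper's bottom-up scheme, and toy vectors stand in for SIFT descriptors:

```python
import numpy as np

def kmeans(X, k, n_iters=10, seed=0):
    """Plain Lloyd k-means (stand-in for the paper's GPU-accelerated stage)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for i in range(k):
            if (labels == i).any():
                centers[i] = X[labels == i].mean(axis=0)
    return centers

def two_layer_codebook(descriptors, k1=4, k2=2):
    """Two-layer clustering: split descriptors into k1 coarse groups, then
    cluster each group into up to k2 visual words (illustrative hierarchy)."""
    top = kmeans(descriptors, k1)
    coarse = np.linalg.norm(descriptors[:, None] - top[None], axis=2).argmin(axis=1)
    words = [kmeans(descriptors[coarse == i], min(k2, (coarse == i).sum()))
             for i in range(k1) if (coarse == i).any()]
    return np.vstack(words)

def bow_histogram(descriptors, codebook):
    """Quantize descriptors to nearest visual words; normalized histogram."""
    words = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2).argmin(axis=1)
    h = np.bincount(words, minlength=len(codebook)).astype(float)
    return h / h.sum()

def knn_classify(hist, train_hists, train_labels, k=3):
    """Majority vote among the k nearest training histograms."""
    nearest = train_labels[np.argsort(np.linalg.norm(train_hists - hist, axis=1))[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[counts.argmax()]
```

Each video becomes one fixed-length histogram regardless of its duration, which is what makes the k-NN stage scale to large collections.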
Citations: 8
An N-gram model for unstructured audio signals toward information retrieval
Pub Date : 2010-10-01 DOI: 10.1109/MMSP.2010.5662068
Samuel Kim, Shiva Sundaram, P. Georgiou, Shrikanth S. Narayanan
An N-gram modeling approach for unstructured audio signals is introduced with applications to audio information retrieval. The proposed N-gram approach aims to capture local dynamic information in acoustic words within the acoustic topic model framework which assumes an audio signal consists of latent acoustic topics and each topic can be interpreted as a distribution over acoustic words. Experimental results on classifying audio clips from BBC Sound Effects Library according to both semantic and onomatopoeic labels indicate that the proposed N-gram approach performs better than using only a bag-of-words approach by providing complementary local dynamic information.
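For the N = 2 case, the model reduces to smoothed bigram statistics over acoustic-word ids; a minimal sketch, where add-alpha smoothing is an assumption and the ids stand in for the vector-quantized audio frames the paper builds on:

```python
import math
from collections import Counter, defaultdict

def train_bigram(sequences, vocab_size, alpha=1.0):
    """Train an add-alpha smoothed bigram model over acoustic-word sequences.
    Each sequence is a list of acoustic-word ids; vocab_size is the size of
    the acoustic vocabulary. Returns P(cur | prev) as a function."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    def prob(prev, cur):
        c = counts[prev]
        return (c[cur] + alpha) / (sum(c.values()) + alpha * vocab_size)
    return prob

def sequence_loglik(seq, prob):
    """Log-likelihood of a sequence of acoustic words under a bigram model."""
    return sum(math.log(prob(p, c)) for p, c in zip(seq, seq[1:]))
```

Classifying a clip then amounts to scoring its acoustic-word sequence under each class's model, which is where the local dynamic information beyond a bag-of-words enters.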
Citations: 10
Toward realtime side information decoding on multi-core processors
Pub Date : 2010-10-01 DOI: 10.1109/MMSP.2010.5662040
S. Momcilovic, Yige Wang, S. Rane, A. Vetro
Most distributed source coding schemes involve the application of a channel code to the signal and transmission of the resulting syndromes. For low-complexity encoding with superior compression performance, graph-based channel codes such as LDPC codes are used to generate the syndromes. The encoder performs simple XOR operations, while the decoder uses belief propagation (BP) decoding to recover the signal of interest using the syndromes and some correlated side information. We consider parallelization of BP decoding on general-purpose multi-core CPUs. The motivation is to make BP decoding fast enough for realtime applications. We consider three different BP decoding algorithms: Sum-Product BP, Min-Sum BP and Algorithm E. The speedup obtained by parallelizing these algorithms is examined along with the tradeoff against decoding performance. Parallelization is achieved by dividing the received syndrome vectors among different cores, and by using vector operations to simultaneously process multiple check nodes in each core. While Min-Sum BP has intermediate decoding complexity, a “vectorized” version of Min-Sum BP performs nearly as fast as the much simpler Algorithm E with significantly fewer decoding errors. Our experiments indicate that, for the best compromise between speed and performance, the decoder should use Min-Sum BP when the side information is of good quality and Sum-Product BP otherwise.
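The Min-Sum check-node update at the heart of the middle algorithm can be sketched for a single check node as follows; this assumes nonzero LLRs, and the paper additionally vectorizes such updates across many check nodes and syndrome vectors per core:

```python
import numpy as np

def min_sum_check_update(msgs):
    """Min-Sum check-node update for one check node.
    msgs: incoming LLR messages on the node's edges (assumed nonzero here).
    The outgoing message on edge i takes the sign of the product of the other
    incoming LLRs and the magnitude of the smallest other incoming |LLR|."""
    msgs = np.asarray(msgs, dtype=float)
    signs = np.sign(msgs)
    total_sign = signs.prod()          # sign of product over ALL edges
    mags = np.abs(msgs)
    order = np.argsort(mags)
    min1, min2 = mags[order[0]], mags[order[1]]
    out_mag = np.full_like(mags, min1)
    out_mag[order[0]] = min2           # the minimum edge receives the second minimum
    # Dividing out edge i's own sign: total_sign / sign_i == total_sign * sign_i.
    return total_sign * signs * out_mag
```

Only the two smallest magnitudes and a running sign are needed per check node, which is why Min-Sum is both cheaper than Sum-Product and easy to express with the vector operations the paper exploits.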
Citations: 8