
2011 IEEE 13th International Workshop on Multimedia Signal Processing: Latest Publications

Image quality assessment based on multiple watermarking approach
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093787
N. Baaziz, Dong Zheng, Demin Wang
Automatic monitoring of image/video quality is very important in modern multimedia communication services. We are interested in digital watermarking as a promising approach to image quality assessment without reference to the original image. The proposed methodology makes use of wavelet-based embedding of multiple watermarks with robustness control in order to capture the degree of degradation undergone by a received image. Watermark robustness is controlled through 1) embedding and detection of multiple watermarks, 2) multi-resolution and directional subband selection, 3) perceptual watermark weighting, and 4) a fine watermark strength adjustment process. At the receiver end, the detection or non-detection of the watermarks in a received image is used to estimate the image's PSNR range and determine its associated quality attribute. Simulation results show the efficiency of such a watermarking scheme in assessing the quality level of test images under JPEG compression.
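The receiver-side logic can be illustrated compactly. Below is a minimal sketch of the general multiple-watermark idea, in the pixel domain rather than the paper's wavelet subbands: marks embedded at increasing strengths fail in order as distortion grows, so the number of surviving marks indexes a PSNR range. The strengths, detection threshold, and PSNR buckets are illustrative assumptions, not values from the paper.

```python
import numpy as np

def embed_multiple(image, strengths, seed=42):
    """Additively embed one pseudo-random +/-1 watermark per strength level
    (a pixel-domain stand-in for the paper's wavelet-subband embedding)."""
    r = np.random.default_rng(seed)
    marks = [r.choice([-1.0, 1.0], size=image.shape) for _ in strengths]
    marked = image.astype(float)
    for w, a in zip(marks, strengths):
        marked += a * w
    return marked, marks

def detect_surviving(received, marks, threshold=0.02):
    """Normalised-correlation detector: True where a watermark survives."""
    x = received - received.mean()
    return [np.mean(x * w) / (x.std() + 1e-12) > threshold for w in marks]

# Weak marks fail first as distortion grows, so the count of surviving
# marks indexes a PSNR range (bucket boundaries are made up for the demo).
PSNR_BUCKETS = ["< 25 dB", "25-30 dB", "30-35 dB", "> 35 dB"]

rng = np.random.default_rng(0)
image = rng.uniform(0, 255, (128, 128))
marked, marks = embed_multiple(image, strengths=[3.0, 6.0, 12.0])
degraded = marked + rng.normal(0, 8.0, marked.shape)   # simulated channel
print(PSNR_BUCKETS[sum(detect_surviving(degraded, marks))])
```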
Citations: 18
Angular intra prediction in High Efficiency Video Coding (HEVC)
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093806
J. Lainema, K. Ugur
New video coding solutions, such as the HEVC (High Efficiency Video Coding) standard being developed by JCT-VC (Joint Collaborative Team on Video Coding), are typically designed for high-resolution video content. Increasing video resolution creates two basic requirements for practical video codecs: they need to provide compression efficiency superior to prior video coding solutions, and their computational requirements need to be aligned with foreseeable hardware platforms. This paper proposes an intra prediction method which is designed to provide high compression efficiency and which can be implemented efficiently in resource-constrained environments, making it applicable to a wide range of use cases. When designing the method, special attention was given to the algorithmic definition of prediction sample generation, so that the same reconstruction process can be utilized at different block sizes. The proposed method outperforms earlier variations of the same family of technologies significantly and consistently across different classes of video material, and has recently been adopted as the directional intra prediction method for the draft HEVC standard. Experimental results show that the proposed method outperforms the H.264/AVC intra prediction approach by 4.8% on average. For sequences with dominant directional structures, the coding efficiency gains become more significant and exceed 10%.
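The essence of HEVC angular prediction is that, for vertical modes, every predicted row is a copy of the top reference row displaced by a per-row amount in 1/32-sample units and bilinearly interpolated between the two nearest reference samples; this is what lets the same reconstruction process run at any block size. A minimal numpy sketch of the positive-angle vertical case follows (negative angles, which also draw on the left reference column, are omitted for brevity):

```python
import numpy as np

def angular_predict_vertical(ref_top, block_size, angle):
    """HEVC-style vertical angular intra prediction (positive angles only).

    ref_top : 1-D array of reconstructed samples above the block,
              length >= 2 * block_size + 1 (ref_top[0] sits above-left).
    angle   : displacement per row in 1/32-sample units (the vertical-
              positive HEVC modes use values such as 0, 2, 5, ..., 32).
    """
    pred = np.empty((block_size, block_size), dtype=np.int32)
    for y in range(block_size):
        idx = ((y + 1) * angle) >> 5     # integer part of the displacement
        fact = ((y + 1) * angle) & 31    # fractional part (1/32 units)
        for x in range(block_size):
            a = int(ref_top[x + idx + 1])
            b = int(ref_top[x + idx + 2])
            # 1/32-accuracy bilinear interpolation between two references
            pred[y, x] = ((32 - fact) * a + fact * b + 16) >> 5
    return pred

# 8x8 block predicted from a synthetic reference row with angle 26.
ref = np.arange(17, dtype=np.int32) * 3 + 100   # 2*8+1 reference samples
print(angular_predict_vertical(ref, 8, 26))
```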
Citations: 35
ViewMark: An interactive videoconferencing system for mobile devices
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093792
Shu Shi, Zhengyou Zhang
ViewMark, a server-client based interactive mobile videoconferencing system, is proposed in this paper to enhance the remote meeting experience for mobile users. Compared with state-of-the-art mobile videoconferencing technology, ViewMark is novel in allowing a mobile user to interactively change the viewpoint of the remote video, create viewmarks, and hear spatial audio. In addition, ViewMark streams the screen of the presentation slides to mobile devices. In this paper, we introduce the system design of ViewMark in detail, compare the devices that can be used to implement interactive videoconferencing, and demonstrate the prototype system we have built on the Windows Mobile platform.
Citations: 1
GPU based fast algorithm for Tanner graph based image interpolation
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093783
Wei Lei, Ruiqin Xiong, Siwei Ma, Luhong Liang
In image/video processing software and hardware products, low-complexity interpolation algorithms, such as cubic and spline methods, are commonly used. However, these methods tend to blur textures and produce a jaggy effect compared with adaptive methods such as NEDI and SAI. The Tanner graph based image interpolation algorithm handles edges and textures better, but at high computational complexity. Thanks to the high-performance parallel processing capability of today's GPUs, using complex algorithms in real-time applications is becoming possible. In this paper, we present a fast algorithm for Tanner graph based image interpolation and its implementation on the GPU. In our algorithm, the image model training process of Tanner graph based image interpolation is greatly simplified. Experimental results show that the GPU implementation can be more than 47 times as fast as the CPU implementation.
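The abstract does not detail the Tanner graph algorithm itself, so the sketch below shows only the fixed-kernel cubic baseline it is compared against: a separable Catmull-Rom 2x upsampler. Every output pixel depends on a small fixed neighbourhood, which is also why interpolators in this family (and per-pixel adaptive ones) map naturally onto one-thread-per-pixel GPU kernels.

```python
import numpy as np

# 4-tap Catmull-Rom kernel evaluated at the half-sample position: the
# fixed, content-independent weights that cubic interpolation applies
# everywhere (adaptive methods instead derive weights per pixel).
W = np.array([-1.0, 9.0, 9.0, -1.0]) / 16.0

def upsample2x_1d(x):
    """Interleave original samples with half-pel ones (edges replicated)."""
    n = len(x)
    p = np.pad(x, 2, mode="edge")
    half = W[0]*p[1:n] + W[1]*p[2:n+1] + W[2]*p[3:n+2] + W[3]*p[4:n+3]
    out = np.empty(2 * n - 1)
    out[0::2], out[1::2] = x, half
    return out

def upsample2x(img):
    """Separable 2x upsampling: rows first, then columns."""
    tmp = np.apply_along_axis(upsample2x_1d, 1, img.astype(float))
    return np.apply_along_axis(upsample2x_1d, 0, tmp)

img = np.arange(16, dtype=float).reshape(4, 4)
print(upsample2x(img).shape)   # (7, 7)
```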
Citations: 2
Separation of speech sources using an Acoustic Vector Sensor
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093797
M. Shujau, C. Ritz, I. Burnett
This paper investigates how the directional characteristics of an Acoustic Vector Sensor (AVS) can be used to separate speech sources. The technique described in this work takes advantage of frequency-domain direction of arrival estimates to identify the location, relative to the AVS array, of each individual speaker in a group of speakers and separate them accordingly into individual speech signals. Results presented in this work show that the technique can be used for real-time separation of speech sources using a single 20 ms frame of speech. Furthermore, the proposed algorithm achieves an average improvement of 15.1 dB in Signal to Interference Ratio (SIR) and of 5.4 dB in Signal to Distortion Ratio (SDR) over the unprocessed recordings. In addition to the SIR and SDR results, Perceptual Evaluation of Speech Quality (PESQ) and listening tests both show an improvement in perceptual quality of 1 Mean Opinion Score (MOS) point over the unprocessed recordings.
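One common way to exploit an AVS for separation, consistent with the description above, is to estimate a direction of arrival for every time-frequency bin from the acoustic intensity vector and assign each bin to the nearest speaker direction. The sketch below assumes a 2-D AVS (pressure plus x/y particle velocity), source azimuths known in advance, and 20 ms frames at 16 kHz; the hard binary masking is an assumed detail, not necessarily the paper's exact reconstruction.

```python
import numpy as np
from scipy.signal import stft, istft

def separate_avs(p, vx, vy, fs, source_azimuths_deg, nperseg=320):
    """Mask-based separation from one Acoustic Vector Sensor.

    p: pressure channel; vx, vy: particle-velocity channels.
    nperseg=320 gives 20 ms frames at fs=16000.
    """
    _, _, P = stft(p, fs, nperseg=nperseg)
    _, _, Vx = stft(vx, fs, nperseg=nperseg)
    _, _, Vy = stft(vy, fs, nperseg=nperseg)

    # Per-bin DOA from the active intensity vector I = Re{P conj(V)}.
    doa = np.degrees(np.arctan2(np.real(P * np.conj(Vy)),
                                np.real(P * np.conj(Vx))))

    az = np.asarray(source_azimuths_deg, dtype=float)
    # Circular angular distance of every bin to every source direction.
    diff = np.abs((doa[None] - az[:, None, None] + 180.0) % 360.0 - 180.0)
    nearest = np.argmin(diff, axis=0)

    outputs = []
    for k in range(len(az)):
        _, xk = istft(P * (nearest == k), fs, nperseg=nperseg)
        outputs.append(xk)
    return outputs

# Synthetic check: two plane waves; for azimuth a, vx = p*cos(a), vy = p*sin(a).
fs = 16000
t = np.arange(fs) / fs
s1, s2 = np.sin(2*np.pi*300*t), np.sign(np.sin(2*np.pi*470*t))
p = s1 + s2
vx = s1 * np.cos(0.0) + s2 * np.cos(np.pi / 2)
vy = s1 * np.sin(0.0) + s2 * np.sin(np.pi / 2)
est1, est2 = separate_avs(p, vx, vy, fs, [0, 90])
```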
Citations: 23
Region of interest determination using human computation
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093839
Flavio P. Ribeiro, D. Florêncio
The ability to identify and track visually interesting regions has many practical applications — for example, in image and video compression, visual marketing and foveal machine vision. Due to challenges in modeling the peculiarities of human physiological and psychological responses, automatic detection of fixation points is an open problem. Indeed, no objective methods are currently capable of fully modeling the human perception of regions of interest (ROIs). Thus, research often relies on user studies with eye tracking systems. In this paper we propose a cost-effective and convenient alternative, obtained by having internet workers annotate videos with ROI coordinates. The workers use an interactive video player with a simulated mouse-driven fovea, which models the fall-off in resolution of the human visual system. Since this approach is not supervised, we implement methods for identifying inaccurate or malicious results. Using this proposal, one can collect ROI data in an automated fashion, and at a much lower cost than laboratory studies.
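Because the collection is unsupervised, screening out inaccurate or malicious workers is essential. The sketch below shows one plausible consistency check, an assumption rather than the paper's actual method: keep only workers whose fovea traces agree with the per-frame crowd median, then take the consensus of the rest.

```python
import numpy as np

def filter_workers(traces, max_dist=0.2):
    """traces: (n_workers, n_frames, 2) normalised fovea (x, y) per frame.

    A worker is kept if the mean distance of their trace to the per-frame
    median of all workers stays below `max_dist`; the consensus ROI track
    is then the median over the retained workers.
    """
    median = np.median(traces, axis=0)                  # (n_frames, 2)
    dist = np.linalg.norm(traces - median, axis=2)      # (n_workers, n_frames)
    keep = dist.mean(axis=1) < max_dist
    return keep, np.median(traces[keep], axis=0)

rng = np.random.default_rng(1)
good = 0.5 + 0.05 * rng.standard_normal((9, 100, 2))    # 9 consistent workers
bad = rng.uniform(0.0, 1.0, (1, 100, 2))                # 1 random clicker
keep, roi = filter_workers(np.concatenate([good, bad]))
print(keep)   # the random worker is flagged False
```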
Citations: 5
Low-complexity, near-lossless coding of depth maps from Kinect-like depth cameras
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093803
S. Mehrotra, Zhengyou Zhang, Q. Cai, Cha Zhang, P. Chou
Depth cameras are rapidly gaining interest in the market, as depth plus RGB is being used for a variety of applications ranging from foreground/background segmentation, face tracking, and activity detection to free-viewpoint video rendering. In this paper, we present a low-complexity, near-lossless codec for coding depth maps. The codec requires no buffering of video frames, is table-less, can encode or decode a frame in about 5 ms with little code optimization, and provides between 7:1 and 16:1 compression ratio for near-lossless coding of 16-bit depth maps generated by the Kinect camera.
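Near-lossless means the reconstruction error of every depth sample is bounded. A minimal sketch of the standard way to obtain such a bound follows, with closed-loop DPCM and residual quantization; the paper's entropy coding stage (which would compress the residuals) is omitted:

```python
import numpy as np

def near_lossless_code(depth, delta=2):
    """Closed-loop DPCM with residual quantisation: |error| <= delta.

    Returns the quantised residuals `q` (what an entropy coder such as
    adaptive Golomb-Rice would compress) and the decoder reconstruction.
    """
    step = 2 * delta + 1
    h, w = depth.shape
    q = np.zeros((h, w), dtype=np.int32)
    recon = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        pred = 0                        # restart prediction at each row
        for x in range(w):
            e = int(depth[y, x]) - pred
            q[y, x] = round(e / step)   # never exactly .5 since step is odd
            pred += q[y, x] * step      # decoder-side reconstructed value
            recon[y, x] = pred
    return q, recon

depth = np.random.default_rng(0).integers(500, 4000, (64, 64))
q, recon = near_lossless_code(depth, delta=2)
print(np.abs(depth - recon).max())      # <= 2 by construction
```

Predicting from the already-reconstructed value, rather than the original, is what keeps the error bound tight: quantisation errors never accumulate along the scan.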
Citations: 28
Block-based codebook model with oriented-gradient feature for real-time foreground detection
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093830
Jiu Xu, Ning Jiang, S. Goto
In this paper, a novel approach is proposed to achieve foreground object detection in video surveillance systems using the codebook method. The block-based background model upgrades the pixel-based codebook model to the block level, which can exploit the dependencies and relationships between neighbouring pixels, thus improving processing speed and reducing memory during model construction and foreground detection. Moreover, by adding the orientation and magnitude of the block gradient, the codebook model contains not only color information but also a texture feature. The texture information can further reduce noise and refine more complete foreground regions. Experimental results prove that our method has better performance compared with the standard codebook and some other earlier algorithms.
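A minimal sketch of a block-level codebook in this spirit: each block keeps a short list of codewords holding a mean intensity and a magnitude-weighted gradient-orientation histogram, and a block whose current feature matches no codeword is declared foreground. The thresholds are illustrative, and grayscale stands in for the paper's color channels.

```python
import numpy as np

def block_features(frame, block=8, bins=8):
    """Per-block feature: mean intensity plus a magnitude-weighted
    gradient-orientation histogram (the texture cue)."""
    gy, gx = np.gradient(frame.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    h, w = frame.shape
    feats = np.empty((h // block, w // block, 1 + bins))
    for by in range(h // block):
        for bx in range(w // block):
            sl = np.s_[by*block:(by+1)*block, bx*block:(bx+1)*block]
            hist, _ = np.histogram(ang[sl], bins=bins,
                                   range=(-np.pi, np.pi), weights=mag[sl])
            feats[by, bx] = np.concatenate(([frame[sl].mean()], hist))
    return feats

class BlockCodebook:
    """A codeword list per block; a block is background if any stored
    codeword matches in both mean intensity and orientation histogram."""
    def __init__(self, tol_mean=10.0, tol_hist=0.5):
        self.books, self.tol_mean, self.tol_hist = {}, tol_mean, tol_hist

    def _match(self, cw, f):
        hd = np.linalg.norm(cw[1:] - f[1:]) / (np.linalg.norm(cw[1:]) + 1e-6)
        return abs(cw[0] - f[0]) < self.tol_mean and hd < self.tol_hist

    def train(self, frame):
        F = block_features(frame)
        for by, bx in np.ndindex(*F.shape[:2]):
            cws = self.books.setdefault((by, bx), [])
            if not any(self._match(c, F[by, bx]) for c in cws):
                cws.append(F[by, bx].copy())

    def foreground(self, frame):
        F = block_features(frame)
        return np.array([[not any(self._match(c, F[by, bx])
                                  for c in self.books.get((by, bx), []))
                          for bx in range(F.shape[1])]
                         for by in range(F.shape[0])])

rng = np.random.default_rng(0)
bg = rng.uniform(0, 50, (32, 32))
cb = BlockCodebook()
for _ in range(5):
    cb.train(bg + rng.normal(0, 1, bg.shape))   # noisy background frames
test = bg.copy(); test[8:24, 8:24] += 120        # bright moving object
print(cb.foreground(test).astype(int))           # object blocks flagged 1
```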
Citations: 9
An MDC-based video streaming architecture for mobile networks
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093829
C. Greco, G. Petrazzuoli, Marco Cagnazzo, B. Pesquet-Popescu
Multiple description coding (MDC) is a framework designed to improve the robustness of video content transmission in lossy environments. In this work, we propose an MDC technique using a legacy coder to produce two descriptions, based on the separation of even and odd frames. If only one description is received, the missing frames are reconstructed using temporal high-order motion interpolation (HOMI), a technique originally proposed for distributed video coding. If both descriptions are received, the frames are reconstructed as a block-wise linear combination of the two descriptions, with the coefficients computed at the encoder in an RD-optimised fashion, encoded with a context-adaptive arithmetic coder, and sent as side information. We integrated the proposed technique into a mobile ad-hoc streaming protocol and tested it using a group mobility model. The results show a non-negligible gain in expected video quality with respect to the reference technique.
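The decoder structure is easy to sketch. Below, plain neighbour averaging stands in for HOMI and one global alpha stands in for the per-block RD-optimised coefficients, so this is a structural sketch of the even/odd scheme only, assuming an even-length sequence:

```python
import numpy as np

def encode(frames):
    """Split a sequence into two descriptions: even and odd frames."""
    return frames[0::2], frames[1::2]

def decode(even=None, odd=None, alpha=0.7):
    """Central/side decoding for the even-odd MDC scheme."""
    if even is not None and odd is not None:
        # Central decoder: blend each odd frame with its temporal
        # interpolation from the even description (per-block RD-optimised
        # coefficients in the paper; one global alpha here).
        interp = [(even[i] + even[min(i + 1, len(even) - 1)]) / 2
                  for i in range(len(odd))]
        rec_odd = [alpha * o + (1 - alpha) * p for o, p in zip(odd, interp)]
        out = np.empty((len(even) + len(odd),) + even[0].shape)
        out[0::2], out[1::2] = even, rec_odd
        return out
    # Side decoder: rebuild the lost description by temporal interpolation
    # (neighbour average as a stand-in for high-order motion interpolation).
    recv, got_even = (even, True) if even is not None else (odd, False)
    if got_even:
        miss = [(recv[i] + recv[min(i + 1, len(recv) - 1)]) / 2
                for i in range(len(recv))]
    else:
        miss = [(recv[max(i - 1, 0)] + recv[i]) / 2 for i in range(len(recv))]
    out = np.empty((2 * len(recv),) + recv[0].shape)
    if got_even:
        out[0::2], out[1::2] = recv, miss
    else:
        out[0::2], out[1::2] = miss, recv
    return out

video = np.random.default_rng(0).random((8, 16, 16))   # toy 8-frame sequence
d_even, d_odd = encode(video)
both = decode(d_even, d_odd)     # central reconstruction
one = decode(even=d_even)        # description with odd frames lost
```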
Citations: 7
L1-norm multi-frame super-resolution from images with zooming motion
Pub Date: 2011-12-01 DOI: 10.1109/MMSP.2011.6093847
Yushuang Tian, Kim-Hui Yap, Li Chen
This paper proposes a new image super-resolution (SR) approach that reconstructs a high-resolution (HR) image by fusing multiple low-resolution (LR) images with zooming motion. Most conventional SR image reconstruction methods assume that the motion among different images consists of only translation and possibly rotation. This in-plane motion model, however, is not practical in applications where relative zooming exists among the acquired LR images. In view of this, this paper presents a new SR method that addresses a motion model including both in-plane motion (e.g. translation and rotation) and zooming motion. Based on this model, a maximum a posteriori (MAP) based SR algorithm using L1-norm optimization is proposed. Experimental results show that the proposed algorithm based on the new motion model performs well in terms of visual evaluation and quantitative measurement.
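For the data term, an L1-norm MAP objective of the form sum_k ||D_k(x) - y_k||_1 can be minimised by subgradient descent, since the subgradient of an L1 residual is its sign mapped back through the adjoint of the observation operator. The sketch below assumes plain average-pool decimation for every frame (the paper's operator additionally models translation, rotation, and zoom) and omits the image prior:

```python
import numpy as np

def down(x, s):
    """Average-pool decimation by integer factor s (stand-in for the full
    warp + blur + decimate observation model with zooming motion)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def down_T(e, s):
    """Adjoint of `down`: spread each value over its s*s block."""
    return np.kron(e, np.ones((s, s))) / (s * s)

def sr_l1(lr_frames, s, n_iter=300, step=0.05):
    """Minimise sum_k ||down(x) - y_k||_1 over the HR image x by
    subgradient descent (no regulariser in this sketch)."""
    x = np.kron(lr_frames[0], np.ones((s, s)))   # initial HR guess
    for _ in range(n_iter):
        g = sum(down_T(np.sign(down(x, s) - y), s) for y in lr_frames)
        x -= step * g / len(lr_frames)
    return x

rng = np.random.default_rng(0)
truth = rng.random((32, 32))
frames = [down(truth, 2) + 0.01 * rng.standard_normal((16, 16))
          for _ in range(4)]
est = sr_l1(frames, 2)
```

The sign() in the gradient is what gives the L1 data term its robustness: each observation contributes a bounded update regardless of how large its residual is, so outlier frames cannot dominate the reconstruction.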
Citations: 0