
Latest publications: 2013 Visual Communications and Image Processing (VCIP)

Backward view synthesis prediction using virtual depth map for multiview video plus depth map coding
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706435
S. Shimizu, Shiori Sugimoto, H. Kimata, Akira Kojima
View synthesis prediction has been studied as an efficient inter-view prediction scheme. Existing view synthesis prediction schemes fall into two types according to the pixel warping direction: backward warping based view synthesis prediction enables block-based processing, while forward warping based view synthesis prediction handles occlusions properly. This paper proposes a two-step warping based view synthesis prediction: a virtual depth map is first generated by forward warping, and prediction signals are then generated by block-based backward warping using the virtual depth map. A backward-warping-aware depth inpainting technique is also proposed. Experiments show that the proposed VSP scheme achieves decoder runtime reductions of about 37% on average, with slight bitrate reductions, relative to conventional forward warping based VSP. Compared to conventional backward warping based VSP, the proposed method reduces the bitrate for the synthesized views by up to 2.9%, and by about 2.2% on average.
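The two-step idea can be sketched in a toy 1-D disparity model (depth values used directly as integer horizontal disparities; this is an illustration of the warping order, not the authors' implementation):

```python
import numpy as np

def forward_warp_depth(ref_depth):
    """Step 1: forward-warp the reference depth map into the virtual view,
    with a z-buffer test so the foreground wins on collisions."""
    h, w = ref_depth.shape
    virt = np.zeros_like(ref_depth)
    for y in range(h):
        for x in range(w):
            d = ref_depth[y, x]
            xv = x + d                      # toy disparity model: shift = depth
            if 0 <= xv < w and d > virt[y, xv]:
                virt[y, xv] = d
    return virt

def backward_warp_texture(ref_tex, virt_depth):
    """Step 2: pixel/block-wise backward warping -- every virtual pixel
    fetches its prediction from the reference view via the virtual depth."""
    h, w = ref_tex.shape
    pred = np.zeros_like(ref_tex)
    for y in range(h):
        for x in range(w):
            xr = x - virt_depth[y, x]
            if 0 <= xr < w:
                pred[y, x] = ref_tex[y, xr]
    return pred
```

Because the second step is a pure lookup per virtual pixel, it can be executed block by block at the decoder, which is where the runtime saving comes from.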
Citations: 2
Robust texture representation by using binary code ensemble
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706357
Tiecheng Song, Fanman Meng, Bing Luo, Chao Huang
In this paper, we present a robust texture representation that exploits an ensemble of binary codes. The proposed method, called Locally Enhanced Binary Coding (LEBC), is training-free and needs no costly data-to-cluster assignments. Given an input image, a set of features describing different pixel-wise properties is first extracted so as to be robust to rotation and illumination changes. These features are then binarized and jointly encoded into specific pixel labels. Meanwhile, the Local Binary Pattern (LBP) operator is used to encode the neighboring relationship. Finally, a joint histogram is built from the statistics of these pixel labels and LBP labels and used for texture representation. Extensive experiments have been conducted on the Outex, CUReT and UIUC texture databases, achieving impressive classification results compared with state-of-the-art LBP-based and even learning-based algorithms.
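The LBP building block mentioned above can be sketched as follows (basic 3x3 operator and a normalised label histogram; the paper's joint encoding of additional pixel-wise features is not reproduced here):

```python
import numpy as np

def lbp_labels(img):
    """Basic 3x3 LBP: compare the 8 neighbours of each interior pixel with
    the centre and pack the comparison bits into an 8-bit label."""
    h, w = img.shape
    centre = img[1:h-1, 1:w-1]
    labels = np.zeros(centre.shape, dtype=np.uint8)
    # clockwise neighbour offsets starting at the top-left corner
    offsets = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1+dy:h-1+dy, 1+dx:w-1+dx]
        labels |= ((neigh >= centre).astype(np.uint8) << bit).astype(np.uint8)
    return labels

def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP labels, the statistic from
    which the texture representation is built."""
    hist = np.bincount(lbp_labels(img).ravel(), minlength=256)
    return hist / hist.sum()
```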
Citations: 4
Exploration of Generalized Residual Prediction in scalable HEVC
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706449
E. François, Christophe Gisquet, Jonathan Taquet, G. Laroche, P. Onno
Having issued version 1 of the new video coding standard HEVC, the ISO MPEG and ITU-T VCEG groups are now specifying its scalable extension. The candidate schemes are based on a multi-layer, multi-loop coding framework that exploits inter-layer texture and motion prediction and full base-layer picture decoding. Several inter-layer prediction tools have been explored, implemented either through high-level syntax or through block-level changes to the core HEVC design. One of these tools, Generalized Residual Prediction (GRP), has been studied extensively over several meeting cycles. It is based on second-order residual prediction, exploiting the motion-compensated prediction residual of the base layer. This paper focuses on this new mode. The principle of GRP is described, together with an analysis of several implementation variants, complemented by a complexity analysis. Performance figures for these implementations show that noticeable gains can be obtained without a significant complexity increase compared to a simple scalable design comprising only texture and motion inter-layer prediction.
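The second-order prediction at the heart of GRP can be written as P = MC(el_ref) + w * (bl_cur - MC(bl_ref)): the motion-compensated enhancement-layer reference is corrected by the base layer's own motion-compensation residual, scaled by a weight w. A minimal sketch (full-pel motion compensation simulated with np.roll; the weight values and MC details here are illustrative, not the standard's):

```python
import numpy as np

def grp_prediction(el_ref, bl_cur, bl_ref, mv=(0, 0), w=1.0):
    """Generalized Residual Prediction sketch:
    P = MC(el_ref) + w * (bl_cur - MC(bl_ref))."""
    dy, dx = mv
    mc = lambda f: np.roll(f, (dy, dx), axis=(0, 1))  # toy full-pel MC
    return mc(el_ref) + w * (bl_cur - mc(bl_ref))
```

A sanity check of the formula: when the two layers carry identical pictures and w = 1, the base-layer residual exactly cancels the motion-compensation error and the prediction reproduces the current frame.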
Citations: 3
Quality assessment of 3D synthesized views with depth map distortion
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706348
Chang-Ting Tsai, H. Hang
Most existing 3D image quality metrics use 2D image quality assessment (IQA) models to predict 3D subjective quality. In a free-viewpoint television (FTV) system, however, depth map errors often produce object shifting or ghost artifacts in the synthesized pictures due to the use of the Depth Image Based Rendering (DIBR) technique. These artifacts are very different from ordinary 2D distortions such as blur, Gaussian noise, and compression errors. We therefore propose a new 3D quality metric to evaluate the quality of stereo images that may contain artifacts introduced by the rendering process due to depth map errors. We first eliminate the consistent pixel shifts inside an object before applying the usual 2D metric. The experimental results show that the proposed method enhances the correlation of the objective quality score with the 3D subjective scores.
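The shift-elimination step can be illustrated with a global variant: search a small range of consistent horizontal shifts (the typical DIBR artifact) and score the best-aligned pair with an ordinary 2-D metric. This is a simplification — the paper removes shifts per object, not per image:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Plain 2-D PSNR in dB (infinite for identical images)."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def shift_compensated_psnr(ref, syn, max_shift=3):
    """Best PSNR over small horizontal shifts, ignoring the
    wrapped-around border columns introduced by np.roll."""
    best = -float('inf')
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(syn, s, axis=1)
        if s > 0:
            score = psnr(ref[:, s:], shifted[:, s:])
        elif s < 0:
            score = psnr(ref[:, :s], shifted[:, :s])
        else:
            score = psnr(ref, syn)
        best = max(best, score)
    return best
```

A pure object shift is heavily penalised by plain PSNR but, after alignment, is (correctly) judged as near-perfect.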
Citations: 20
No reference image quality assessment based on local binary pattern statistics
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706418
Min Zhang, Jin Xie, Xiang Zhou, H. Fujita
Multimedia content, including audio, images and video, is a ubiquitous part of modern life, and quality evaluation, both objective and subjective, is of fundamental importance for numerous multimedia applications. In this paper, based on statistics of the local binary pattern (LBP), we propose a novel and efficient quality similarity index for no-reference (NR) image quality assessment (IQA). First, the image is decomposed into multi-scale sub-band images with Laplacian of Gaussian (LOG) filters. Then, LBP maps are encoded for these sub-band images across the different scales, and the LBP histograms are formed as quality-aware features. Finally, the extracted features are mapped to the image's subjective quality score by support vector regression (SVR). The experimental results on the LIVE IQA database show that the proposed method correlates strongly with subjective quality evaluations and is competitive with most state-of-the-art NR IQA methods.
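The feature-extraction pipeline (LoG sub-bands, then a per-band LBP histogram) can be sketched as below; the kernel size, sigmas and the final SVR regression stage are assumptions or omitted, and the naive loop convolution is for clarity only:

```python
import numpy as np

def log_kernel(sigma, size=9):
    """Laplacian-of-Gaussian kernel sampled on a size x size grid,
    adjusted to zero sum so flat regions respond with zero."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    k = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()

def lbp_hist(img):
    """Normalised 256-bin histogram of basic 3x3 LBP labels."""
    h, w = img.shape
    c = img[1:h-1, 1:w-1]
    lab = np.zeros(c.shape, dtype=np.uint8)
    offs = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    for bit, (dy, dx) in enumerate(offs):
        lab |= ((img[1+dy:h-1+dy, 1+dx:w-1+dx] >= c).astype(np.uint8)
                << bit).astype(np.uint8)
    hist = np.bincount(lab.ravel(), minlength=256)
    return hist / hist.sum()

def nr_iqa_features(img, sigmas=(0.5, 1.0, 2.0)):
    """Concatenated per-scale LBP histograms of the LoG sub-bands; this
    vector would then be regressed to a quality score (e.g. by SVR)."""
    feats = []
    for s in sigmas:
        k = log_kernel(s)
        kh, kw = k.shape
        h, w = img.shape
        # 'valid' 2-D convolution via explicit loops (kept simple on purpose)
        band = np.array([[np.sum(img[y:y+kh, x:x+kw] * k)
                          for x in range(w - kw + 1)]
                         for y in range(h - kh + 1)])
        feats.append(lbp_hist(band))
    return np.concatenate(feats)
```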
Citations: 24
Single image super-resolution via phase congruency analysis
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706398
Licheng Yu, Yi Xu, Bo Zhang
Single image super-resolution (SR) is a severely unconstrained task. While self-example-based methods are able to reproduce sharp edges, they perform poorly on textures. To recover fine details, example-based SR methods employ higher-level image segmentation and a corresponding external texture database, but they involve too much human interaction. In this paper, we discuss the existing problems of example-based techniques using scale-space analysis. Accordingly, a robust pixel classification method is designed based on the phase congruency model in scale space, which can effectively divide images into edges, textures and flat regions. We then propose a super-resolution framework that adaptively emphasizes the importance of high-frequency residuals in structural examples and of the scale-invariant fractal property in textural regions. Experimental results show that our SR approach presents both sharp edges and vivid textures with few artifacts.
Citations: 2
Learning non-negative locality-constrained Linear Coding for human action recognition
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706432
Yuanbo Chen, Xin Guo
Description methods based on interest points and the Bag-of-Words (BOW) model have achieved remarkable success in human action recognition. Despite their popularity, existing interest point detectors come with high computational complexity and lose their power when the camera is moving. Additionally, the vector quantization procedure in the BOW model ignores the relationship between bases and often incurs large reconstruction errors. In this paper, a spatio-temporal interest point detector based on flow vorticity is used, which not only suppresses most effects of camera motion but also provides prominent interest points around key positions of the moving foreground. Furthermore, by combining non-negativity constraints on the patterns with an average pooling function, a Non-negative Locality-constrained Linear Coding (NLLC) model is introduced into action recognition to provide a better feature representation than the traditional BOW model. Experimental results on two widely used action datasets demonstrate the effectiveness of the proposed approach.
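A minimal sketch of non-negative locality-constrained coding (projected gradient descent on the k nearest bases; the solver, step size and penalty weight are assumptions, not the paper's algorithm):

```python
import numpy as np

def nllc_encode(x, B, k=2, lam=0.1, lr=0.05, iters=2000):
    """Encode x over codebook B (one basis per row): keep only the k
    nearest bases, penalise codes on distant bases via the locality
    adaptor d, and project onto the non-negative orthant after every
    gradient step."""
    d = np.linalg.norm(B - x, axis=1)     # locality adaptor
    idx = np.argsort(d)[:k]               # k nearest bases
    Bk, dk = B[idx], d[idx]
    c = np.full(k, 1.0 / k)
    for _ in range(iters):
        # gradient of 0.5*||x - Bk.T c||^2 + 0.5*lam*||dk * c||^2
        grad = Bk @ (Bk.T @ c - x) + lam * (dk ** 2) * c
        c = np.maximum(c - lr * grad, 0.0)  # non-negativity projection
    code = np.zeros(len(B))
    code[idx] = c
    return code
```

Unlike hard vector quantization, a sample lying on one of the bases is reconstructed exactly, and nearby samples get smoothly varying codes.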
Citations: 7
Temporally consistent adaptive depth map preprocessing for view synthesis
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706436
Martin Köppel, Mehdi Ben Makhlouf, Marcus Müller, P. Ndjiki-Nya
In this paper, a novel Depth Image-based Rendering (DIBR) method, which generates virtual views from a video sequence and its associated Depth Maps (DMs), is presented. The proposed approach is especially designed to close holes in extrapolation scenarios, where only one original camera is available or the virtual view is placed outside the range of a set of original cameras. In such scenarios, large image regions become uncovered in the virtual view and need to be filled in a visually pleasing way. In order to handle such disocclusions, a depth preprocessing method is proposed, which is applied prior to 3-D image warping. As a first step, adaptive cross-trilateral median filtering is used to align depth discontinuities in the DM with color discontinuities in the textured image and to further reduce estimation errors in the DM. Then, a temporally consistent, adaptive, asymmetric smoothing filter is designed and applied to the DM. The filter is adaptively weighted such that only the DM regions that may reveal uncovered areas are filtered, preventing strong distortions in other parts of the virtual textured image. By smoothing the depth map, objects are slightly distorted and disocclusions in the virtual view are completely or partially covered. The proposed method shows considerable objective and subjective gains compared to the state of the art.
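The asymmetric-smoothing idea can be shown with a toy 1-D stand-in (not the paper's cross-trilateral or temporally consistent filter): only foreground-to-background depth steps, which would expose disocclusions when warping towards one side, are replaced by a smooth ramp, while all other regions are left untouched:

```python
import numpy as np

def asymmetric_depth_smoothing(depth, width=5):
    """Replace each fg->bg depth step (scanning left to right) with a
    linear ramp of the given width; flat regions and bg->fg steps are
    kept as-is, so distortions stay local to potential disocclusions."""
    out = depth.astype(float).copy()
    h, w = depth.shape
    for y in range(h):
        for x in range(1, w):
            if depth[y, x - 1] > depth[y, x]:  # fg -> bg step exposes a hole
                hi = min(w, x + width)
                out[y, x:hi] = np.linspace(depth[y, x - 1],
                                           depth[y, hi - 1], hi - x)
    return out
```

Ramping the depth makes the warped background stretch over the would-be hole instead of leaving it uncovered, at the cost of a slight object distortion.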
Citations: 9
An effective computer aided diagnosis system using B-Mode and color Doppler flow imaging for breast cancer
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706400
Songbo Liu, Heng-Da Cheng, Yan Liu, Jianhua Huang, Yingtao Zhang, Xianglong Tang
To improve the diagnostic accuracy of breast ultrasound classification, a novel computer-aided diagnosis (CAD) system based on B-Mode and color Doppler flow imaging is proposed. Several new features are modeled and extracted from the static images and color Doppler image sequences to study blood flow characteristics. Moreover, we propose a novel classifier ensemble strategy that obtains the benefit of mutual compensation among classifiers with different characteristics. Experimental results demonstrate that the proposed CAD system can improve the true-positive rate and decrease the false-positive detection rate, which is useful for reducing unnecessary biopsies and the death rate.
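The abstract does not spell out the ensemble strategy; a generic weighted soft-voting combiner is one common way classifiers with different characteristics compensate for each other, sketched here purely as a stand-in:

```python
import numpy as np

def ensemble_predict(prob_list, weights=None):
    """Average the class posteriors of several classifiers (optionally
    weighted) and pick the arg-max class per sample."""
    avg = np.average(np.stack(prob_list), axis=0, weights=weights)
    return avg.argmax(axis=1)
```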
Citations: 2
Adaptive rounding operator for efficient Wyner-Ziv video coding
Pub Date : 2013-11-01 DOI: 10.1109/VCIP.2013.6706368
Jeffrey J. Micallef, R. Farrugia, C. J. Debono
The Distributed Video Coding (DVC) paradigm can theoretically reach the same coding efficiency as predictive block-based video coding schemes such as H.264/AVC. However, current DVC architectures are still far from this ideal performance, mainly due to inaccuracies in the Side Information (SI) predicted at the decoder. This paper presents a coding scheme that tries to avoid mismatch in the SI predictions caused by small variations in light intensity. By using the appropriate rounding operator for every coefficient, the proposed method significantly reduces the correlation noise between the Wyner-Ziv (WZ) frame and the corresponding SI, achieving higher coding efficiency. Experimental results demonstrate that the average Peak Signal-to-Noise Ratio (PSNR) is improved by up to 0.56 dB relative to the DISCOVER codec.
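Why the choice of rounding operator matters can be shown with a uniform quantiser whose rounding offset is adjustable (an illustration only; the paper's per-coefficient selection rule is not reproduced here): a coefficient sitting near a decision boundary flips its quantisation index under a tiny intensity variation, whereas shifting the boundary away keeps the WZ frame and its side information in agreement.

```python
import math

def quantize(coeff, step=1.0, offset=0.0):
    """Uniform quantiser with an adjustable rounding offset; the offset
    moves the decision boundaries of the quantisation intervals."""
    return math.floor(coeff / step + offset)
```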
Citations: 0