
Latest publications from IVMSP 2013

Adaptive loop filtering based inter-view video coding in a hybrid video codec with MPEG-2 and HEVC for stereoscopic video coding
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611926
Sangsoo Ahn, Munchurl Kim
In this paper, a hybrid stereoscopic video codec is proposed based on MPEG-2 and an extended HEVC with an inter-view coding scheme for stereoscopic TV services over heterogeneous networks. The left-view sequences are encoded with an MPEG-2 video encoder for conventional 2D TV services over traditional terrestrial broadcasting networks. The right-view sequences are encoded by an extended HEVC with the proposed inter-view coding scheme, and the resulting bitstreams are transmitted over the Internet. A 3D TV terminal supporting the hybrid stereoscopic video streams thus receives the MPEG-2 data for the left-view sequences via the terrestrial broadcasting networks, and the extended-HEVC streams for the right-view sequences over the Internet. The proposed inter-view coding scheme in the extended HEVC uses the reconstructed MPEG-2 frames of the left-view sequences as reference frames for predictive coding of the current frames of the right-view sequences. To enhance the texture quality of the reference frames, an adaptive loop filter (ALF) is applied to the reconstructed MPEG-2 frames as well as within HEVC. The ALF ON/OFF signaling map and the ALF coefficients for the MPEG-2 reconstructed frames are transmitted alongside the HEVC bitstreams over the Internet. Experimental results show that the proposed hybrid stereoscopic codec with ALF-based inter-view coding improves coding efficiency by an average BD-rate gain of 16.81%, compared to a hybrid codec using independent MPEG-2 and HEVC without inter-view coding.
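To make the ALF step concrete, here is a minimal Python sketch of the general idea: a Wiener-style filter is derived by least squares against the original frame, applied to the reconstruction, and switched on or off per block wherever it reduces the error, yielding the ON/OFF map that would be signaled. This is an illustrative simplification under our own assumptions, not the codec's actual ALF design (filter shape, coefficient quantization, and signaling are far more elaborate); all function names are ours.

```python
import numpy as np
from scipy.signal import correlate2d

def derive_alf_coefficients(recon, orig, taps=5, step=4):
    """Least-squares (Wiener-style) taps x taps filter mapping recon toward orig."""
    r = taps // 2
    rows, targets = [], []
    for y in range(r, recon.shape[0] - r, step):      # subsample pixels for speed
        for x in range(r, recon.shape[1] - r, step):
            rows.append(recon[y - r:y + r + 1, x - r:x + r + 1].ravel())
            targets.append(orig[y, x])
    coeffs, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets), rcond=None)
    return coeffs.reshape(taps, taps)

def apply_alf(recon, coeffs):
    """Apply the derived filter (correlation matches the raveled-window weights)."""
    return correlate2d(recon, coeffs, mode="same", boundary="symm")

def alf_block_map(recon, filtered, orig, block=64):
    """ON/OFF map: keep ALF only in blocks where it reduces squared error."""
    h, w = orig.shape
    on = np.zeros((h // block + 1, w // block + 1), dtype=bool)
    out = recon.copy()
    for by in range(0, h, block):
        for bx in range(0, w, block):
            s = np.s_[by:by + block, bx:bx + block]
            if np.sum((filtered[s] - orig[s]) ** 2) < np.sum((recon[s] - orig[s]) ** 2):
                on[by // block, bx // block] = True
                out[s] = filtered[s]
    return out, on  # 'on' plays the role of the signaled ON/OFF map
```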
Citations: 2
Depth estimation from monocular color images using natural scene statistics models
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611900
Che-Chun Su, L. Cormack, A. Bovik
We consider the problem of estimating a dense depth map from a single monocular image. Inspired by psychophysical evidence of visual processing in the human visual system (HVS) and by natural scene statistics (NSS) models of image and range, we propose a Bayesian framework that recovers detailed 3D scene structure by exploiting the statistical relationships between local image features and the depth variations inherent in natural images. Observing that similar depth structures may occur in different types of luminance/chrominance textured regions in natural scenes, we build a dictionary of canonical range patterns as the prior, and fit a multivariate Gaussian mixture (MGM) model that associates local image features with the different range patterns as the likelihood. Compared with the state-of-the-art depth estimation method, we achieve similar performance in terms of pixel-wise range-estimation error, but a superior ability to recover the relative distance relationships between different parts of the image.
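As a rough illustration of the dictionary-plus-likelihood construction, the sketch below clusters depth patches into canonical range patterns (the prior) and fits one Gaussian model per pattern over co-located image features (the likelihood), then makes a MAP choice per patch. The feature choice, model sizes, and all names are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

def learn_dictionary(depth_patches, n_patterns=32, seed=0):
    """Cluster depth patches into canonical range patterns (the prior dictionary)."""
    return KMeans(n_clusters=n_patterns, random_state=seed).fit(depth_patches)

def fit_feature_likelihoods(image_features, pattern_labels, n_patterns=32):
    """One Gaussian model per pattern over the co-located image features."""
    models = []
    for k in range(n_patterns):
        feats = image_features[pattern_labels == k]
        models.append(GaussianMixture(n_components=1, covariance_type="full").fit(feats))
    return models

def map_depth_patch(feature_vec, km, models):
    """MAP choice: pattern frequency (prior) times feature likelihood."""
    log_prior = np.log(np.bincount(km.labels_, minlength=len(models)) + 1.0)
    log_like = np.array([m.score_samples(feature_vec[None, :])[0] for m in models])
    k_star = int(np.argmax(log_prior + log_like))
    return km.cluster_centers_[k_star]  # predicted canonical depth patch
```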
Citations: 6
A Bayesian methodology for visual object tracking on stereo sequences
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611932
G. Chantas, N. Nikolaidis, I. Pitas
A general Bayesian post-processing methodology for improving the performance of object tracking in stereo video sequences is proposed in this paper. The results of any single-channel visual object tracker are incorporated into a Bayesian framework in order to refine the tracking accuracy in both stereo video channels. Within this framework, a variational Bayesian algorithm is employed, where prior knowledge about the object displacement (movement) is incorporated via a prior distribution. The displacement information is obtained in a preprocessing step, where object displacement is estimated via feature extraction and matching. In parallel, disparity information is extracted and utilized in the same framework. The improvements in tracking accuracy introduced by the proposed methodology are quantified through experimental analysis.
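A deliberately simplified stand-in for the refinement step is sketched below: the base tracker's output is treated as a noisy measurement and fused with a Gaussian prior centered at the previous position plus the feature-matched displacement. The closed-form Gaussian product here replaces the paper's variational Bayesian algorithm, and the covariances are assumed known.

```python
import numpy as np

def fuse_gaussian(tracker_pos, tracker_cov, prior_pos, prior_cov):
    """Posterior mean/cov of the product of two 2D Gaussians (Kalman-style update)."""
    K = tracker_cov @ np.linalg.inv(tracker_cov + prior_cov)  # gain
    post_pos = tracker_pos + K @ (prior_pos - tracker_pos)
    post_cov = (np.eye(2) - K) @ tracker_cov
    return post_pos, post_cov

# Usage: previous refined position plus a matched-feature displacement gives the
# prior; the raw tracker position in the current frame is the measurement.
prev_pos = np.array([120.0, 80.0])
displacement = np.array([3.2, -1.1])          # from feature extraction/matching
prior_pos = prev_pos + displacement
tracker_pos = np.array([125.0, 78.0])         # single-channel tracker output
pos, cov = fuse_gaussian(tracker_pos, np.eye(2) * 4.0, prior_pos, np.eye(2) * 2.0)
```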
Citations: 4
3D video quality metric for 3D video compression
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611930
Amin Banitalebi-Dehkordi, M. Pourazad, P. Nasiopoulos
As the evolution of multiview display technology brings glasses-free 3DTV closer to reality, MPEG and VCEG are preparing an extension to HEVC for encoding multiview video content. View synthesis in the current version of the 3D video codec is evaluated using PSNR as the quality metric. In this paper, we propose a full-reference, human-visual-system-based 3D video quality metric to be used in multiview encoding as an alternative to PSNR. The performance of our metric is tested in a two-view scenario: the quality of the compressed stereo pair, formed from a decoded view and a synthesized view, is evaluated at the encoder side. The performance is verified through a series of subjective tests and compared with that of the PSNR, SSIM, MS-SSIM, VIFp, and VQM metrics. Experimental results show that our 3D quality metric has the highest correlation with Mean Opinion Scores (MOS) among the tested metrics.
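The sketch below only shows where such a metric plugs into the two-view evaluation loop described above, using plain per-view PSNR as a placeholder score; the paper's actual metric is HVS-based and substantially richer, so the combination rule and weight here are purely illustrative.

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio between a reference and a distorted image."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def stereo_pair_score(ref_left, dec_left, ref_right, syn_right, w=0.5):
    """Encoder-side score for a stereo pair built from a decoded view and a
    synthesized view; swap psnr() for an HVS-based score to follow the paper."""
    return w * psnr(ref_left, dec_left) + (1.0 - w) * psnr(ref_right, syn_right)
```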
Citations: 12
Hybrid segmentation of depth images using a watershed and region merging based method for tree species recognition
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611901
A. Othmani, A. Piboule, L. Voon
Tree species recognition from Terrestrial Light Detection and Ranging (T-LiDAR) scanner data is essential for estimating forest inventory attributes in a mixed planting. In this paper, we propose a new method for individual tree species recognition based on the analysis of the 3D geometric texture of tree barks. Our method transforms the 3D point cloud of a 30 cm segment of the tree trunk into a depth image on which a hybrid segmentation method using watershed and region merging techniques is applied in order to reveal bark shape characteristics. Finally, shape and intensity features are calculated on the segmented depth image and used to classify five different tree species using a Random Forest (RF) classifier. Our method has been tested using two datasets acquired in two different French forests with different terrain characteristics. The accuracy and precision rates obtained for both datasets are over 89%.
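A sketch of this pipeline using common scikit-image/scikit-learn building blocks might look as follows; the thresholds, merging criterion, and feature set are placeholders rather than the paper's values.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
from skimage.measure import regionprops
from skimage import graph  # skimage >= 0.20; earlier versions: skimage.future.graph
from sklearn.ensemble import RandomForestClassifier

def segment_bark_depth(depth):
    """Watershed on the depth gradient, then merge regions of similar mean depth."""
    gradient = sobel(depth)
    coords = peak_local_max(-gradient, min_distance=5)  # seeds at gradient minima
    seeds = np.zeros(depth.shape, dtype=bool)
    seeds[tuple(coords.T)] = True
    markers, _ = ndi.label(seeds)
    labels = watershed(gradient, markers)
    rag = graph.rag_mean_color(np.dstack([depth] * 3), labels)
    return graph.cut_threshold(labels, rag, thresh=0.05)  # region merging

def region_features(depth, labels):
    """Simple shape and intensity features per segmented region."""
    feats = []
    for p in regionprops(labels, intensity_image=depth):
        feats.append([p.area, p.eccentricity, p.solidity,
                      p.mean_intensity, p.max_intensity - p.min_intensity])
    return np.asarray(feats)

# Per-trunk descriptors aggregated from region_features() would then train and
# evaluate a Random Forest over the five species:
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(X_train, y_train); clf.predict(X_test)
```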
Citations: 4
Structure optimization for multi-view acquisition and stereo display system
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611933
Hao Cheng, Zhixiang You, P. An, Zhaoyang Zhang
This paper introduces several models of a multi-view acquisition/stereo display system. With these models, the factors affecting such a system, such as the stereo angle, the number of views, and the stereo image resolution, can be analyzed easily. To use these factors in constructing a better multi-view acquisition/stereo display system, a strategy for optimizing them is needed. This paper proposes a structure optimization for the multi-view acquisition/stereo display system. With this structure optimization, the factors can be adjusted conveniently, and a real multi-view acquisition/stereo display system can be set up easily and with good results.
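As an example of the kind of relations such system models expose, the sketch below computes the stereo (parallax) angle and on-screen disparity from a baseline, focal length, and scene depth, so the effect of changing the number of views can be tabulated. These are the standard pinhole/parallel-camera approximations, not the paper's own models, and all parameter values are hypothetical.

```python
import math

def stereo_angle_deg(baseline_m, depth_m):
    """Angle subtended at a scene point by two neighboring viewpoints."""
    return math.degrees(2.0 * math.atan2(baseline_m / 2.0, depth_m))

def disparity_px(baseline_m, focal_px, depth_m):
    """Parallel-camera disparity in pixels: d = f * B / Z."""
    return focal_px * baseline_m / depth_m

# Example: 8 views spanning a 0.5 m rig, focal length 1500 px, object at 3 m.
views, rig_width = 8, 0.5
baseline = rig_width / (views - 1)  # baseline between adjacent views
print(stereo_angle_deg(baseline, 3.0), disparity_px(baseline, 1500.0, 3.0))
```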
Citations: 0
The MPEG-7 Audiovisual Description Profile (AVDP) and its application to multi-view video
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611928
Masanori Sano, W. Bailer, A. Messina, J. Evain, M. Matton
This paper describes a new MPEG-7 profile called the Audiovisual Description Profile (AVDP). First, some problems with conventional MPEG-7 profiles are described, and the motivation behind the development of AVDP is explained based on requirements from broadcasters and other actors in the media industry. Second, the scope and functionalities of AVDP are described, and the differences from existing profiles as well as the basic AVDP structure and components are explained. Software tools for handling AVDP, including tools for validation and visualization, are discussed. Finally, the use of AVDP to represent multi-view and panoramic video content is described.
Citations: 3
Precision enhancement of 3D surfaces from multiple quantized depth maps
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611913
Pengfei Wan, Gene Cheung, P. Chou, D. Florêncio, Cha Zhang, O. Au
Transmitting compressed texture and depth maps of multiple viewpoints from the sender enables image synthesis at the receiver from any intermediate virtual viewpoint via depth-image-based rendering (DIBR). We observe that quantized depth maps from different viewpoints of the same 3D scene constitute multiple descriptions (MD) of the same signal, so the 3D scene can be reconstructed at higher precision at the receiver when multiple depth maps are considered jointly. In this paper, we cast the precision enhancement of 3D surfaces from multiple quantized depth maps as a combinatorial optimization problem. First, we derive a lemma that allows us to increase the precision of a subset of 3D points with certainty, simply by discovering special intersections of quantization bins (QBs) from both views. Then, we identify the most probable voxel-containing QB intersections using a shortest-path formulation. Experimental results show that our method can significantly increase the precision of decoded depth maps compared with standard decoding schemes.
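The core observation can be illustrated in one dimension: each view's quantized depth pins the true value inside a quantization bin, and when bins from two views overlap only partially, their intersection localizes the value more tightly than either bin alone. The sketch below shows just this interval intersection, with made-up bin grids; the actual method operates on 3D surfaces and adds a shortest-path search over candidate intersections.

```python
def qb_interval(q_index, step, offset=0.0):
    """Quantization bin [low, high) for a given bin index on a given grid."""
    low = offset + q_index * step
    return (low, low + step)

def intersect(b1, b2):
    """Intersection of two bins, or None if they do not overlap."""
    lo, hi = max(b1[0], b2[0]), min(b1[1], b2[1])
    return (lo, hi) if lo < hi else None

# Two views quantize the same depth with equal steps but shifted grids:
view_a = qb_interval(q_index=12, step=0.25)              # [3.00, 3.25)
view_b = qb_interval(q_index=11, step=0.25, offset=0.1)  # [2.85, 3.10)
print(intersect(view_a, view_b))  # (3.0, 3.1): tighter than either bin alone
```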
Citations: 3
Effectiveness of 3VQM in capturing depth inconsistencies
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611918
Dogancan Temel, G. Al-Regib
The 3D video quality metric (3VQM) was proposed to evaluate the temporal and spatial variation of depth errors that lead to inconsistencies between the left and right views, fast-changing disparities, and geometric distortions. Previously, we evaluated 3VQM against subjective scores. In this paper, we show the effectiveness of 3VQM in capturing the errors and inconsistencies that exist in rendered depth-based 3D videos. We further investigate how 3VQM measures excessive disparities, fast-changing disparities, geometric distortions, and temporal flickering and/or spatial noise in the form of depth-cue inconsistency. Results show that 3VQM best captures depth inconsistencies caused by errors in the reference views. However, the metric is not sensitive to mild depth-map errors such as those resulting from blur. We also performed a subjective quality test and showed that 3VQM performs better than PSNR, weighted PSNR, and SSIM in terms of accuracy, coherency, and consistency.
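As a stand-in for the raw quantities such a metric builds on, the sketch below computes the per-frame spatial spread of the depth error and its frame-to-frame variation from a reference ("ideal") depth sequence; the real 3VQM combines terms of this kind in a specific calibrated way, which is not reproduced here.

```python
import numpy as np

def depth_error_stats(ideal, distorted):
    """ideal, distorted: (T, H, W) depth sequences. Returns per-frame statistics:
    spatial variance of the depth error, and variance of its temporal change."""
    err = distorted.astype(np.float64) - ideal.astype(np.float64)
    spatial_var = err.reshape(err.shape[0], -1).var(axis=1)
    temporal_var = np.diff(err, axis=0).reshape(err.shape[0] - 1, -1).var(axis=1)
    return spatial_var, temporal_var  # large values flag flicker/inconsistency
```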
Citations: 1
Camera trajectory recovery for image-based city street modeling
Pub Date : 2013-06-10 DOI: 10.1109/IVMSPW.2013.6611919
F. Huang, A. Tsai, Meng-Tsan Li, Jui-Yang Tsai
A semi-automatic image-based approach for city street modeling is proposed, which takes two types of images as input: an orthogonal aerial image of the area of interest and a set of street-view spherical panoramic images. This paper focuses on enhancing the accuracy of camera trajectory recovery, which is crucial for registering the two types of image sources. Scale-Invariant Feature Transform (SIFT) feature detection and matching are employed to identify corresponding image points between each pair of successive panoramic images. Due to the wide field of view of spherical panoramic images and the high image recording frequency, the number of resulting matches is generally very large. Instead of directly applying RANSAC, which is very time-consuming, we propose a method to preprocess those matches, and we claim that the majority of incorrect or insignificant matches are successfully removed. Several real-world experiments demonstrate that our method achieves higher accuracy in estimating camera extrinsic parameters, and consequently leads to a more accurate camera trajectory recovery.
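A sketch of this matching stage with OpenCV is shown below, using Lowe's ratio test as a familiar stand-in for the paper's own match-preprocessing method: weak matches are discarded cheaply so that the expensive RANSAC stage runs on far fewer candidates.

```python
import cv2
import numpy as np

def filtered_matches(img1, img2, ratio=0.75):
    """SIFT keypoints + kNN matching, prefiltered with Lowe's ratio test."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(d1, d2, k=2)
    good = [m for p in pairs if len(p) == 2
            for m, n in [p] if m.distance < ratio * n.distance]
    pts1 = np.float32([k1[m.queryIdx].pt for m in good])
    pts2 = np.float32([k2[m.trainIdx].pt for m in good])
    return pts1, pts2

# RANSAC then runs on the surviving matches only, e.g.:
# F, inliers = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
```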
Citations: 1