Real-time free-viewpoint viewer from multiview video plus depth representation coded by H.264/AVC MVC extension
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069656
S. Shimizu, H. Kimata, Y. Ohtani
This paper presents a real-time video-based rendering system that uses multiview video data with depth representation for free-viewpoint navigation. The proposed rendering algorithm not only achieves high quality rendering but also increases viewpoint flexibility to cover viewpoints that do not lie on the camera baselines. The proposed system achieves real-time decoding of multiple videos and depth maps that are encoded by the H.264/AVC Multiview Video Coding Extension on a regular CPU. The rendering process is fully implemented on a commercial GPU. A performance evaluation shows that our system can generate XGA free-viewpoint images at 30 fps.
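The abstract does not include pseudocode; as an illustration of the core 3D-warping step that multiview-plus-depth renderers of this kind typically execute per pixel (here on the CPU in NumPy rather than on a GPU), consider the minimal sketch below. The camera matrices and the metric-depth convention are assumptions for the example, not details taken from the paper.

```python
import numpy as np

def warp_to_virtual_view(depth, K_src, K_dst, R, t):
    """Forward-warp source-view pixels into a virtual view.

    depth : (H, W) per-pixel depth of the source view (assumed metric)
    K_src : (3, 3) source camera intrinsics
    K_dst : (3, 3) virtual camera intrinsics
    R, t  : rotation (3, 3) and translation (3,) from source to virtual frame
    Returns an (H, W, 2) map of each source pixel's position in the virtual view.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N

    # Back-project to 3D in the source camera frame, then move to the virtual frame.
    rays = np.linalg.inv(K_src) @ pix          # viewing rays, 3 x N
    X_src = rays * depth.reshape(1, -1)        # scale each ray by its depth
    X_dst = R @ X_src + t.reshape(3, 1)

    # Project into the virtual image plane.
    p = K_dst @ X_dst
    return (p[:2] / p[2]).T.reshape(H, W, 2)
```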
{"title":"Real-time free-viewpoint viewer from multiview video plus depth representation coded by H.264/AVC MVC extension","authors":"S. Shimizu, H. Kimata, Y. Ohtani","doi":"10.1109/3DTV.2009.5069656","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069656","url":null,"abstract":"This paper presents a real-time video-based rendering system that uses multiview video data with depth representation for free-viewpoint navigation. The proposed rendering algorithm not only achieves high quality rendering but also increases viewpoint flexibility to cover viewpoints that do not lie on the camera baselines. The proposed system achieves real-time decoding of multiple videos and depth maps that are encoded by the H.264/AVC Multiview Video Coding Extension on a regular CPU. The rendering process is fully implemented on a commercial GPU. A performance evaluation shows that our system can generate XGA free-viewpoint images at 30 fps.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"229 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127532677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An improved multiview stereo video FGS scalable scheme
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069658
Lei Yang, Xiaowei Song, Chunping Hou, Jichang Guo, Sumei Li, Yuan Zhou
A multiview stereo video FGS (Fine Granular Scalability) scalable scheme is presented in this paper. The similarity among adjacent views is fully utilized, and a tradeoff scheme is presented to adapt to the decoder's differing demands of Quality First (QF) and View First (VF). The scheme covers three cases: I, P, and B frames. The middle view is encoded as the base layer, while the other views are predicted from the partly retrieved FGS enhancement layers of adjacent views; the FGS enhancement layer of the current view is then generated from this prediction. Experimental results show that the presented scheme offers more flexible and extensive scalability, which better adapts to different users' demands on view image quality and stereo immersion.
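FGS obtains its fine granularity by coding the enhancement-layer residual bit-plane by bit-plane, so the stream can be truncated at any point; the "partly retrieved" enhancement layers above correspond to keeping only the most significant planes. A minimal sketch of that truncation idea (a generic illustration, not the authors' codec):

```python
import numpy as np

def bitplane_truncate(residual, keep_planes, total_planes=8):
    """Keep only the `keep_planes` most significant bit-planes of a
    non-negative integer residual, emulating FGS stream truncation."""
    mask = ((1 << total_planes) - 1) ^ ((1 << (total_planes - keep_planes)) - 1)
    return residual & mask

residual = np.array([[200, 37], [90, 5]], dtype=np.uint8)
print(bitplane_truncate(residual, keep_planes=3))  # coarse reconstruction: [[192 32] [64 0]]
```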
{"title":"An improved multiview stereo video FGS scalable scheme","authors":"Lei Yang, Xiaowei Song, Chunping Hou, Jichang Guo, Sumei Li, Yuan Zhou","doi":"10.1109/3DTV.2009.5069658","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069658","url":null,"abstract":"A multiview stereo video FGS (Fine Granular Scalability) scalable scheme is presented in this paper. The similarity among adjacent views is fully utilized, A tradeoff scheme is presented in order to adapt to different demands of Quality First (QF) and View First (VF) of the decoder. The scheme is composed of three cases: I, P, B frame. The middle view is encoded as the basic layer, while the other views are predicted from the partly retrieved FGS enhancement layers of adjacent views. The FGS enhancement layer of the current view is generated based on that. Experimental results show that the presented scheme is of more flexible and extensive scalable characteristic, which could better adapt different demands on view image quality and stereo immersion of different users.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114416001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Compression of depth information for 3D rendering
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069669
P. Zanuttigh, G. Cortelazzo
This paper presents a novel strategy for the compression of depth maps. The proposed scheme starts with a segmentation step that identifies and extracts edges and the main objects, and then applies an efficient compression strategy to the shapes of the segmented regions. In the subsequent step, a novel algorithm predicts the surface shape from the segmented regions and a set of regularly spaced samples. Finally, the few prediction residuals are efficiently compressed using standard image compression techniques. Experimental results show that the proposed scheme not only offers a significant gain over JPEG2000 on various types of depth maps but also produces depth maps without edge artifacts, making them particularly suited to 3D warping and free viewpoint video applications.
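To make the prediction-plus-residual structure concrete, the toy sketch below predicts a depth surface from a regular subsampling grid by bilinear interpolation and keeps only the residual; the actual paper predicts within segmented regions, which this simplified version omits, and the step size is an assumption.

```python
import numpy as np
from scipy.ndimage import zoom

def predict_and_residual(depth, step=8):
    """Predict depth from regularly spaced samples; return prediction and residual.

    Assumes depth.shape is a multiple of `step` in both dimensions.
    """
    samples = depth[::step, ::step].astype(np.float32)
    # Bilinear upsampling (order=1) of the sparse grid back to full resolution.
    pred = zoom(samples, step, order=1)[: depth.shape[0], : depth.shape[1]]
    residual = depth.astype(np.float32) - pred  # small values, cheap to compress
    return pred, residual
```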
{"title":"Compression of depth information for 3D rendering","authors":"P. Zanuttigh, G. Cortelazzo","doi":"10.1109/3DTV.2009.5069669","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069669","url":null,"abstract":"This paper presents a novel strategy for the compression of depth maps. The proposed scheme starts with a segmentation step which identifies and extracts edges and main objects, then it introduces an efficient compression strategy for the segmented regions' shape. In the subsequent step a novel algorithm is used to predict the surface shape from the segmented regions and a set of regularly spaced samples. Finally the few prediction residuals are efficiently compressed using standard image compression techniques. Experimental results show that the proposed scheme not only offers a significant gain over JPEG2000 on various types of depth maps but also produces depth maps without edge artifacts particularly suited to 3D warping and free viewpoint video applications.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128980970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate multi-view depth reconstruction with occlusions handling
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069638
Cédric Niquin, S. Prévost, Y. Rémion
We present an offline method for stereo matching using a large number of views. Our method is based on occlusion detection and is composed of two steps, one global and one local. In the first step we formulate an energy function that combines data, occlusion, and smoothness terms in a global graph-cuts optimization. In the second step we introduce a local cost that exploits the occlusions detected in the first step to refine the result; this cost takes advantage of both the multi-view setting and the occlusions. The experimental results show how our algorithm combines the advantages of global and local methods, and how accurate it is on boundary detection and on fine details.
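The abstract does not reproduce the energy itself; a generic form of such an occlusion-aware multi-view energy over a disparity labeling f, with data, occlusion, and smoothness terms, would be (our notation, not necessarily the authors'):

```latex
E(f) \;=\; \sum_{p} D_p(f_p)
      \;+\; \lambda_{\mathrm{occ}} \sum_{p} O_p(f)
      \;+\; \lambda_{\mathrm{s}} \sum_{(p,q) \in \mathcal{N}} V_{p,q}(f_p, f_q)
```

Here D_p measures photo-consistency of pixel p across the views, O_p accounts for pixels that become occluded under f, and V_{p,q} enforces smoothness between neighboring pixels; graph cuts minimize such energies when the pairwise terms satisfy the usual submodularity conditions.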
{"title":"Accurate multi-view depth reconstruction with occlusions handling","authors":"Cédric Niquin, S. Prévost, Y. Rémion","doi":"10.1109/3DTV.2009.5069638","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069638","url":null,"abstract":"We present an offline method for stereo matching using a large number of views. Our method is based on occlusions detection. It is composed of two steps, one global and one local. In the first step we formulate an energy function that handles data, occlusions, and smooth terms through a global graph-cuts optimization. In our second step we introduce a local cost that handles occlusions from the first step in order to refine the result. This cost takes advantage of both the multi-view aspect and the occlusions. The experimental results show how our algorithm joins the advantages of both global and local methods, and how much it is accurate on boundaries detection and on details.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"187 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116322117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interactive free viewpoint video from multiple stereo
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069663
C. Weigel, S. Schwarz, T. Korn, Martin Wallebohr
We present a system for rendering free viewpoint video from data acquired in advance by one or more stereo camera pairs. The free viewpoint video can be viewed standalone or embedded in a synthetic computer graphics scene. Compared to state-of-the-art free viewpoint video applications, fewer cameras are required. The system is scalable in that more stereo pairs can be added to increase the viewing latitude around the object, and it is therefore adaptable to different kinds of applications such as quality assessment tasks or virtual fairs. The main contributions of this paper are i) the scalable extension of the system with additional stereo pairs and ii) the embedding of the object into a synthetic scene in a pseudo-3D manner. We implement the application using a highly customizable software framework for image processing tasks.
{"title":"Interactive free viewpoint video from multiple stereo","authors":"C. Weigel, S. Schwarz, T. Korn, Martin Wallebohr","doi":"10.1109/3DTV.2009.5069663","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069663","url":null,"abstract":"We present a system for rendering free viewpoint video from data acquired by one or more stereo camera pairs in advance. The free viewpoint video can be observed standalone or shown embedded in a synthetic computer graphics scene. Compared to state-of-the art free viewpoint video applications less cameras are required. The system is scalable in terms of adding more stereo pairs in order to increase the viewing latitude around the object and is therefore adaptable to different kinds of application such as quality assessment tasks or virtual fairs. The main contribution of this paper are i) the scalable extension of the system by additional stereo pairs and ii) the embedding of the object into a synthetic scene in a pseudo 3D manner. We implement the application using a highly customizable software framework for image processing tasks.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127704190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accurate 3D reconstruction via surface-consistency
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069625
Chenglei Wu, Xun Cao, Qionghai Dai
We present an algorithm that fuses multi-view stereo (MVS) and photometric stereo to reconstruct 3D models of objects filmed by multiple cameras under varying illumination. First, we obtain the albedo-scaled surface normal for each view through photometric stereo techniques. Then, based on the scaled normals, a new correspondence matching method, a surface-consistency metric, is proposed to recover accurate 3D pixel positions through triangulation. After filtering the point cloud, Poisson surface reconstruction is applied to obtain a watertight mesh. The algorithm has been implemented on our multi-camera, multi-light acquisition system. We validate the method by completely reconstructing challenging real objects and show experimentally that this technique can greatly improve on previous MVS results.
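The "surface normal scaled by albedo" comes from classical Lambertian photometric stereo: with at least three images under known light directions, the per-pixel intensities give a linear system for the product of albedo and normal. A minimal sketch of that standard recovery step (the Lambertian model and known lights are the usual prerequisites, not details specific to this paper):

```python
import numpy as np

def photometric_stereo(I, L):
    """Recover albedo-scaled normals under the Lambertian model I = L @ (rho * n).

    I : (m, N) stack of m images, one row per light, N pixels each
    L : (m, 3) unit light directions
    Returns g = rho * n as a (3, N) array.
    """
    # Least-squares solution; all pixels are solved in one call.
    g, *_ = np.linalg.lstsq(L, I, rcond=None)
    return g  # per pixel: albedo = ||g||, normal = g / ||g||
```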
{"title":"Accurate 3D reconstruction via surface-consistency","authors":"Chenglei Wu, Xun Cao, Qionghai Dai","doi":"10.1109/3DTV.2009.5069625","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069625","url":null,"abstract":"We present an algorithm that fuses Multi-view stereo (MVS) and photometric stereo to reconstruct 3D model of objects filmed by multiple cameras under varying illuminations. Firstly, we obtain the surface normal scaled by albedo for each view through photometric stereo techniques. Then, based on the scaled normal, a new correspondence matching method, namely surface-consistency metric, is proposed to acquire accurate 3D positions of pixels through triangulation. After filtering the point cloud, a Poisson surface reconstruction is applied to obtain a watertight mesh. The algorithm has been implemented based on our multi-camera and multi-light acquisition system. We validate the method by complete reconstruction of challenging real objects and show experimentally that this technique can greatly improve on previous MVS results.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131227687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-view stereo using multi-luminance images
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069640
Xiaoduan Feng, Yebin Liu, Qionghai Dai
More and more multi-luminance image acquisition systems are designed for relighting. Beyond that basic purpose, multi-luminance images can also be used to enhance the performance of multi-view stereo. By fusing the point clouds obtained from images under different luminance setups, a good model of the object can be achieved, with high robustness to image noise, shadows, and highlights. This is the basic idea of our novel multi-view stereo method. Supported by our own multi-view, multi-luminance image acquisition system, our method produces good models of real-world objects.
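A minimal sketch of the kind of fusion step described above: points reconstructed under each luminance setup are pooled and consolidated on a voxel grid, so outliers caused by shadows or highlights in any single setup are voted down. The voxel size and the per-voxel median rule are our assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def fuse_point_clouds(clouds, voxel=0.005):
    """Fuse per-luminance point clouds by voxel-grid median consolidation.

    clouds : list of (Ni, 3) arrays, one per luminance setup
    Returns an (M, 3) array with one representative point per occupied voxel.
    """
    pts = np.vstack(clouds)
    keys = np.floor(pts / voxel).astype(np.int64)
    # Group points falling into the same voxel; keep the per-voxel median.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    return np.array([np.median(pts[inverse == i], axis=0)
                     for i in range(inverse.max() + 1)])
```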
{"title":"Multi-view stereo using multi-luminance images","authors":"Xiaoduan Feng, Yebin Liu, Qionghai Dai","doi":"10.1109/3DTV.2009.5069640","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069640","url":null,"abstract":"More and more multi-luminance image acquisition systems are designed for relighting. Besides the basic purpose of the multi-luminance images, they can also be adopted to enhance the performance of multi-view stereo. By fusing the point-clouds from images under different luminance setups, a good model of the object can be achieved, with high robustness to image noise, shadows and high-lights. This is the basic idea of our novel multi-view stereo method. Supported by our own multi-view multi-luminance image acquisition system, our method can produce good models for the real world objects.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123321419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Horizontal parallax distortion correction method in toed-in camera with wide-angle lens
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069617
Wooseong Kang, Seunghyun Lee
One effect of the toed-in camera configuration is keystone distortion, which introduces both vertical and horizontal parallax errors into the stereoscopic image. When a stereoscopic image captured by a toed-in camera system with fish-eye lenses is displayed on a mobile device, it is uncomfortable to view, because the wide field of view of the lenses adds a horizontal parallax distortion. In this paper, we therefore propose a novel correction method for the horizontal parallax distortion, one component of keystone distortion. We conducted experiments to validate the proposed method: the captured stereoscopic image was corrected for both the barrel distortion and the horizontal parallax distortion. The proposed method thus corrects the horizontal parallax distortion of a toed-in camera system so that users can enjoy three-dimensional effects without visual fatigue.
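The barrel-distortion correction mentioned above is conventionally done with a polynomial radial model; the sketch below remaps pixels using a single coefficient k1 under the model x_d = x_u(1 + k1·r²). The model, coefficient, and nearest-neighbour sampling are illustrative choices, not the paper's calibration.

```python
import numpy as np

def undistort_barrel(img, k1, cx=None, cy=None):
    """Correct barrel distortion with a one-coefficient radial model.

    For every undistorted output pixel we compute where it maps to in the
    distorted input and sample it by nearest neighbour (inverse mapping).
    """
    H, W = img.shape[:2]
    cx = W / 2 if cx is None else cx
    cy = H / 2 if cy is None else cy
    yu, xu = np.mgrid[0:H, 0:W].astype(np.float32)
    xn, yn = (xu - cx) / W, (yu - cy) / W       # normalized coordinates
    r2 = xn ** 2 + yn ** 2
    xd = xn * (1 + k1 * r2) * W + cx            # source position in the input
    yd = yn * (1 + k1 * r2) * W + cy
    xs = np.clip(np.round(xd).astype(int), 0, W - 1)
    ys = np.clip(np.round(yd).astype(int), 0, H - 1)
    return img[ys, xs]
```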
{"title":"Horizontal parallax distortion correction method in toed-in camera with wide-angle lens","authors":"Wooseong Kang, Seunghyun Lee","doi":"10.1109/3DTV.2009.5069617","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069617","url":null,"abstract":"An effect of the toed-in camera configuration is keystone distortion, which causes vertical and horizontal parallax in the stereoscopic image. However if the stereoscopic image captured by the toed-in camera system with fish-eye lens is displayed on mobile device, it is uncomfortable to view because the horizontal parallax contain horizontal parallax distortion occurred by the wide field of view of the lenses. Therefore, in this paper, we propose a novel correction method of the horizontal parallax distortion, which is one of the keystone distortions. We have experimented to attest the proposed method. The captured stereoscopic image was corrected for the barrel distortion and the horizontal parallax distortion. Therefore, the proposed method provides correcting of the horizontal parallax distortion from a toed-in camera system in order that users can enjoy three-dimensional effects without the visual fatigue.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123491393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Free-View TV watermark selection based on the distribution characteristics
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069639
Evlambios E. Apostolidis, G. Triantafyllidis
In Free-View Television (FTV), the user can interactively control the viewpoint and generate new arbitrary views of a dynamic scene from any 3D position. These new views might be recorded and misused, so the problem of copyright and copy protection in FTV must be solved. Among the many rights management methods, the copyright problem for visual data can be approached by embedding hidden, imperceptible information, called a watermark, into the image and video content. This setting differs from simple watermarking, however: a watermark in FTV must not only resist common video processing and multi-view video processing operations, it must also remain easily extractable from video generated at an arbitrary view. In this paper, we evaluate the performance of several watermarks according to their distribution characteristics, with respect to their ability to survive in newly generated arbitrary views of FTV.
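To make the setting concrete, a standard additive spread-spectrum embedder with a correlation detector is sketched below as a generic baseline; the strength alpha and the PRNG-seeded pattern are illustrative assumptions, not the watermarks evaluated in the paper.

```python
import numpy as np

def embed(img, key, alpha=2.0):
    """Additively embed a pseudo-random +/-1 watermark: I' = I + alpha * W."""
    w = np.random.default_rng(key).choice([-1.0, 1.0], size=img.shape)
    return img.astype(np.float64) + alpha * w, w

def detect(img, w):
    """Normalized correlation between a (possibly re-rendered) image and W;
    a value well above the noise floor indicates the watermark survived."""
    x = img.astype(np.float64) - img.mean()
    return float((x * w).sum() / (np.linalg.norm(x) * np.linalg.norm(w)))
```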
{"title":"Free-View TV watermark selection based on the distribution characteristics","authors":"Evlambios E. Apostolidis, G. Triantafyllidis","doi":"10.1109/3DTV.2009.5069639","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069639","url":null,"abstract":"In Free-View Television (FTV), the user can interactively control the viewpoint and generate new arbitrary views of a dynamic scene from any 3D position. The new views might be recorded and misused. Therefore the problem of copyright and copy protection in FTV should be solved. Among many alternative rights management methods, the copyright problem for visual data can be approached by means of embedding hidden imperceptible information, called watermark, into the image and video content. But this approach differs from the simple watermarking technique, since watermark in FTV should not only be resistant to common video processing and multi-view video processing operations, it should also be easily extracted from a generated video of an arbitrary view. In this paper, we focus on the evaluation of the performance of several watermarks according to their distribution characteristics in order to survive in the new generated arbitrary views of FTV.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125940064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Temporally consistent layer depth ordering via pixel voting for pseudo 3D representation
Pub Date : 2009-05-04 | DOI: 10.1109/3DTV.2009.5069679
Engin Turetken, A. Alatan
A new region-based depth ordering algorithm is proposed, based on segmented motion layers with affine motion models. Starting from an initial set of layers extracted independently for each frame of an input sequence, the relative depth order of every layer is determined bottom-up, from local pair-wise relations to a global ordering. Layer sets of consecutive time instants are warped in two opposite directions in time to capture the pair-wise occlusion relations of neighboring layers in the form of pixel voting statistics. The global depth order of the layers is estimated by mapping the pair-wise relations onto a directed acyclic graph and solving the longest path problem with a breadth-first search strategy. Temporal continuity is enforced at both the region segmentation and depth ordering stages to achieve temporally coherent layer support maps and depth order relations. Experimental results show that the proposed algorithm yields quite promising results even on dynamic scenes with multiple motions.
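For the global ordering stage, longest paths in a directed acyclic graph can be computed in linear time by relaxing edges in topological order; the sketch below uses Kahn's algorithm, a BFS over in-degrees, which matches the breadth-first formulation on a DAG. The edge-list graph representation is our assumption for illustration.

```python
from collections import deque

def longest_path_lengths(n, edges):
    """Longest path length to each node of a DAG given as an edge list (u -> v).

    Nodes are processed in topological order and each edge is relaxed once;
    ranking nodes by the returned length yields a global depth order.
    """
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    dist = [0] * n
    q = deque(i for i in range(n) if indeg[i] == 0)
    while q:
        u = q.popleft()
        for v in adj[u]:
            dist[v] = max(dist[v], dist[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return dist

# e.g. layer 0 occludes layer 1, which occludes layer 2:
# longest_path_lengths(3, [(0, 1), (1, 2)]) -> [0, 1, 2]
```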
{"title":"Temporally consistent layer depth ordering via pixel voting for pseudo 3D representation","authors":"Engin Turetken, A. Alatan","doi":"10.1109/3DTV.2009.5069679","DOIUrl":"https://doi.org/10.1109/3DTV.2009.5069679","url":null,"abstract":"A new region-based depth ordering algorithm is proposed based on the segmented motion layers with affine motion models. Starting from an initial set of layers that are independently extracted for each frame of an input sequence, relative depth order of every layer is determined following a bottom-to-top approach from local pair-wise relations to a global ordering. Layer sets of consecutive time instants are warped in two opposite directions in time to capture pair-wise occlusion relations of neighboring layers in the form of pixel voting statistics. Global depth order of layers is estimated by mapping the pair-wise relations to a directed acyclic graph and solving the longest path problem via a breadth-first search strategy. Temporal continuity is enforced both at the region segmentation and depth ordering stages to achieve temporally coherent layer support maps and depth order relations. Experimental results show that the proposed algorithm yields quite promising results even on dynamic scenes with multiple motions.","PeriodicalId":230128,"journal":{"name":"2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126625343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}