A Novel Randomize Hierarchical Extension of MV-HEVC for Improved Light Field Compression
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975905
Mansi Sharma, Gowtham Ragavan
This paper presents a novel scheme for light field compression based on a randomized hierarchical multi-view extension of High Efficiency Video Coding (dubbed RH-MVHEVC). Specifically, the light field data are arranged as multiple pseudo-temporal video sequences that are efficiently compressed with an MV-HEVC encoder, following an integrated random coding technique and hierarchical prediction scheme. The critical advantage of the proposed RH-MVHEVC scheme is that it utilizes not just temporal and inter-view prediction, but also efficiently exploits the strong intrinsic similarities within each sub-aperture image and among neighboring sub-aperture images in both horizontal and vertical directions. Experimental results show that the scheme consistently outperforms state-of-the-art compression methods on the benchmark ICME 2016 and ICIP 2017 grand challenge data sets. It achieves up to a 33.803% average BD-rate reduction and a 1.7978 dB BD-PSNR improvement compared with the advanced JEM video encoder, and an average 20.4156% BD-rate reduction and 2.0644 dB BD-PSNR improvement compared with the latest image-based JEM-anchor coding scheme.
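For context, the BD-rate and BD-PSNR figures above follow the standard Bjøntegaard metric. A minimal numpy sketch of the BD-rate computation between a test and an anchor rate-distortion curve (reference code, not the authors' evaluation scripts) is:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta bit-rate (%) of a test codec against an anchor.

    Each argument is a sequence of at least four rate/PSNR points.
    A negative result means the test codec saves bit-rate on average.
    """
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit third-order polynomials: log-rate as a function of PSNR.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate both fits over the overlapping PSNR interval.
    lo, hi = max(pa.min(), pt.min()), min(pa.max(), pt.max())
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Average log-rate difference, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10 ** avg_diff - 1) * 100
```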
{"title":"A Novel Randomize Hierarchical Extension of MV-HEVC for Improved Light Field Compression","authors":"Mansi Sharma, Gowtham Ragavan","doi":"10.1109/IC3D48390.2019.8975905","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975905","url":null,"abstract":"This paper presents a novel scheme for light field compression based on a randomize hierarchical multi-view extension of high efficiency video coding (dubbed as RH-MVHEVC). Specifically, a light field data are arranged as a multiple pseudo-temporal video sequences which are efficiently compressed with MV-HEVC encoder, following an integrated random coding technique and hierarchical prediction scheme. The critical advantage of proposed RH-MVHEVC scheme is that it utilizes not just a temporal and inter-view prediction, but efficiently exploits the strong intrinsic similarities within each sub-aperture image and among neighboring sub-aperture images in both horizontal and vertical directions. Experimental results consistently outperform the state-of-the-art compression methods on benchmark ICME 2016 and ICIP 2017 grand challenge data sets. It achieves an average up to 33.803% BD-rate reduction and 1.7978 dB BD-PSNR improvement compared with an advanced JEM video encoder, and an average 20.4156% BD-rate reduction and 2.0644 dB BD-PSNR improvement compared with a latest image-based JEM-anchor coding scheme.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"323 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115872088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Semantic Web3d: Towards Comprehensive Representation of 3d Content on the Semantic Web
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975906
J. Flotyński, D. Brutzman, Felix G. Hamza-Lup, A. Malamos, Nicholas F. Polys, L. Sikos, K. Walczak
One of the main obstacles to wide dissemination of immersive virtual and augmented reality environments on the Web is the lack of integration between 3D technologies and web technologies, which are increasingly focused on collaboration, annotation and semantics. This gap can be filled by combining VR and AR with the Semantic Web, which is a significant trend in the development of the Web. The use of the Semantic Web may improve creation, representation, indexing, searching and processing of 3D web content by linking the content with formal and expressive descriptions of its meaning. Although several semantic approaches have been developed for 3D content, they are not explicitly linked to the available well-established 3D technologies, cover a limited set of 3D components and properties, and do not combine domain-specific and 3D-specific semantics. In this paper, we present the main motivations, concepts and development of the Semantic Web3D approach. It enables semantic, ontology-based representation of 3D content built upon the Extensible 3D (X3D) standard. The approach can integrate the Semantic Web with interactive 3D technologies within different domains, thereby serving as a step towards building the next generation of the Web that incorporates semantic 3D content.
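To make the idea of linking X3D scene elements with domain semantics concrete, here is a small, purely illustrative Python sketch using rdflib; the ontology URIs, showroom vocabulary and property names are hypothetical placeholders, not those defined by the Semantic Web3D approach.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Hypothetical namespaces: neither URI comes from the paper or from the X3D standard.
X3D = Namespace("http://example.org/x3d-ontology#")
SHOP = Namespace("http://example.org/virtual-showroom#")

g = Graph()
g.bind("x3d", X3D)
g.bind("shop", SHOP)

# Link a concrete X3D node (identified by its DEF name) to a domain concept.
chair = URIRef("http://example.org/scene#ChairTransform")
g.add((chair, RDF.type, X3D.Transform))
g.add((chair, RDF.type, SHOP.OfficeChair))
g.add((chair, X3D.translation, Literal("0 0.45 -2")))
g.add((chair, SHOP.priceEUR, Literal(129.0)))

# Serialize the combined 3D-specific and domain-specific description as Turtle.
print(g.serialize(format="turtle"))
```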
{"title":"The Semantic Web3d: Towards Comprehensive Representation of 3d Content on the Semantic Web","authors":"J. Flotyński, D. Brutzman, Felix G. Hamza-Lup, A. Malamos, Nicholas F. Polys, L. Sikos, K. Walczak","doi":"10.1109/IC3D48390.2019.8975906","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975906","url":null,"abstract":"One of the main obstacles for wide dissemination of immersive virtual and augmented reality environments on the Web is the lack of integration between 3D technologies and web technologies, which are increasingly focused on collaboration, annotation and semantics. This gap can be filled by combining VR and AR with the Semantic Web, which is a significant trend in the development of theWeb. The use of the Semantic Web may improve creation, representation, indexing, searching and processing of 3D web content by linking the content with formal and expressive descriptions of its meaning. Although several semantic approaches have been developed for 3D content, they are not explicitly linked to the available well-established 3D technologies, cover a limited set of 3D components and properties, and do not combine domain-specific and 3D-specific semantics. In this paper, we present the main motivations, concepts and development of the Semantic Web3D approach. It enables semantic ontology-based representation of 3D content built upon the Extensible 3D (X3D) standard. The approach can integrate the Semantic Web with interactive 3D technologies within different domains, thereby serving as a step towards building the next generation of the Web that incorporates semantic 3D contents.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126385711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Novel Image Fusion Scheme for FTV View Synthesis Based on Layered Depth Scene Representation & Scale Periodic Transform
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975902
Mansi Sharma, Gowtham Ragavan
This paper presents a novel image fusion scheme for view synthesis based on a layered depth profile of the scene and the scale periodic transform. To create a layered depth profile of the scene, we utilize the unique properties of the scale transform, treating depth map computation from reference images as a shift-variant problem. The problem of depth computation is solved without deterministic stereo correspondences and without representing image signals in terms of shifts. Instead, we pose the problem with image signals represented as scale-periodic functions, and compute appropriate depth estimates by determining the scalings of a basis function. The rendering process is formulated as a novel image fusion in which the textures of all probable matching points are adaptively determined, implicitly leveraging the geometric information. The results demonstrate the superiority of the proposed approach in suppressing geometric, blurring and flicker artifacts in rendered wide-baseline virtual videos.
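The adaptive texture fusion step can be pictured, in generic DIBR terms, as a per-pixel weighted blend of the candidate textures gathered from all probable matching points. The numpy sketch below illustrates such a blend under that simplifying assumption; the weighting criterion is a placeholder and not the paper's actual formulation.

```python
import numpy as np

def fuse_candidates(candidates, weights, eps=1e-8):
    """Blend candidate textures per pixel.

    candidates: (K, H, W, 3) textures warped from the K probable matching points.
    weights:    (K, H, W) non-negative confidences (e.g. depth-layer consistency);
                set to zero where a candidate is occluded or missing.
    """
    candidates = np.asarray(candidates, dtype=float)
    weights = np.asarray(weights, dtype=float)[..., None]  # broadcast over RGB
    fused = (weights * candidates).sum(axis=0) / (weights.sum(axis=0) + eps)
    return fused
```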
{"title":"A Novel Image Fusion Scheme for FTV View Synthesis Based on Layered Depth Scene Representation & Scale Periodic Transform","authors":"Mansi Sharma, Gowtham Ragavan","doi":"10.1109/IC3D48390.2019.8975902","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975902","url":null,"abstract":"This paper presents a novel image fusion scheme for view synthesis based on a layered depth profile of the scene and scale periodic transform. To create a layered depth profile of the scene, we utilize the unique properties of scale transform considering the problem of depth map computation from reference images as a certain shift-variant problem. The problem of depth computation is solved without deterministic stereo correspondences or rather than representing image signals in terms of shifts. Instead, we pose the problem of image signals being representable as scale periodic function, and compute appropriate depth estimates determining the scalings of a basis function. The rendering process is formulated as a novel image fusion in which the textures of all probable matching points are adaptively determined, leveraging implicitly the geometric information. The results demonstrate superiority of the proposed approach in suppressing geometric, blurring or flicker artifacts in rendered wide-baseline virtual videos.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125077401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of Intended Viewing Area vs Estimated Saliency on Narrative Plot Structures in VR Film
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975990
Colm O. Fearghail, S. Knorr, A. Smolic
In cinematic virtual reality film, one of the primary challenges from a storytelling perspective is leading the attention of viewers to ensure that the narrative is understood as desired. Methods from traditional cinema have been applied with varying levels of success. This paper explores the use of a saliency convolutional neural network model and measures its results against the intended viewing area, as denoted by the creators, and the ground truth of where the viewers actually looked. This information could then be used to further increase the effectiveness of a director's ability to focus attention in cinematic VR.
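As a rough illustration of how an estimated saliency map can be scored against the creators' intended viewing area, the sketch below computes a linear correlation and the fraction of predicted saliency mass falling inside the intended region; this is a generic evaluation sketch, not the protocol used in the paper.

```python
import numpy as np

def pearson_cc(sal_a, sal_b, eps=1e-8):
    """Linear correlation coefficient between two saliency maps of equal shape."""
    a = (sal_a - sal_a.mean()) / (sal_a.std() + eps)
    b = (sal_b - sal_b.mean()) / (sal_b.std() + eps)
    return float((a * b).mean())

def mass_inside_area(saliency, area_mask, eps=1e-8):
    """Fraction of predicted saliency mass that falls inside the intended viewing area."""
    saliency = saliency / (saliency.sum() + eps)
    return float(saliency[area_mask.astype(bool)].sum())
```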
{"title":"Analysis of Intended Viewing Area vs Estimated Saliency on Narrative Plot Structures in VR Film","authors":"Colm O. Fearghail, S. Knorr, A. Smolic","doi":"10.1109/IC3D48390.2019.8975990","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975990","url":null,"abstract":"In cinematic virtual reality film one of the primary challenges from a storytelling perceptive is that of leading the attention of the viewers to ensure that the narrative is understood as desired. Methods from traditional cinema have been applied to varying levels of success. This paper explores the use of a saliency convolutional neural network model and measures it’s results against the intending viewing area as denoted by the creators and the ground truth as to where the viewers actually looked. This information could then be used to further increase the effectiveness of a director’s ability to focus attention in cinematic VR.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130318998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IC3D 2019 Conference Program
Pub Date: 2019-12-01 | DOI: 10.1109/ic3d48390.2019.8975904
{"title":"IC3D 2019 Conference Program","authors":"","doi":"10.1109/ic3d48390.2019.8975904","DOIUrl":"https://doi.org/10.1109/ic3d48390.2019.8975904","url":null,"abstract":"","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131018601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessing Spatial and Temporal Reliability of the Vive System as a Tool for Naturalistic Behavioural Research
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975994
G. Verdelet, R. Salemme, C. Désoche, F. Volland, A. Farnè, A. Coudert, R. Hermann, Eric Truy, V. Gaveau, F. Pavani
Behavioral and cognitive neuroscience studies have recently turned 'naturalistic', aiming to understand brain functions while keeping complexity close to everyday life. Many scholars have started using commercially available VR devices, which were not conceived as research tools. It is therefore important to assess their spatio-temporal reliability and inform scholars about the basic resolutions they can achieve. Here we provide such an assessment for the HTC Vive by comparing it with a VICON (BONITA 10) system. Results show sub-millimeter Vive precision (0.237 mm) and near-centimeter accuracy (8.7 mm static, 8.5 mm dynamic). We also report the Vive's reaction to a tracking loss: the system takes 319.5 ± 16.8 ms to detect the loss and can still be perturbed for about 3 seconds after tracking recovery. The Vive allows fairly accurate and reliable spatio-temporal measurements and may be well suited for studies of typical human behavior, provided tracking loss is prevented.
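For readers who want to derive comparable numbers, a tracker's precision and accuracy are commonly computed as below; this numpy sketch assumes the Vive and VICON coordinate frames have already been rigidly aligned and is not the authors' analysis code.

```python
import numpy as np

def precision_mm(samples):
    """Static precision: dispersion of repeated samples of a fixed target.

    samples: (N, 3) positions in millimetres. Returned as the RMS distance to
    the mean position (one common definition; the paper may use another).
    """
    samples = np.asarray(samples, dtype=float)
    centered = samples - samples.mean(axis=0)
    return float(np.sqrt((np.linalg.norm(centered, axis=1) ** 2).mean()))

def accuracy_mm(vive_positions, vicon_positions):
    """Accuracy: mean Euclidean error against the VICON reference positions."""
    err = np.linalg.norm(
        np.asarray(vive_positions, dtype=float) - np.asarray(vicon_positions, dtype=float),
        axis=1,
    )
    return float(err.mean())
```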
{"title":"Assessing Spatial and Temporal Reliability of the Vive System as a Tool for Naturalistic Behavioural Research","authors":"G. Verdelet, R. Salemme, C. Désoche, F. Volland, A. Farnè, A. Coudert, R. Hermann, Eric Truy, V. Gaveau, F. Pavani","doi":"10.1109/IC3D48390.2019.8975994","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975994","url":null,"abstract":"Nowadays behavioral and cognitive neuroscience studies have turned ‘naturalistic’, aiming at understanding brain functions by maintaining complexity close to everyday life. Many scholars started using commercially available VR devices, which, were not conceived as research tools. It is therefore important to assess their spatio-temporal reliability and inform scholars about the basic resolutions they can achieve. Here we provide such an assessment for the VIVE (HTC Vive) by comparing it with a VICON (BONITA 10) system. Results show a submillimeter Vive precision (0.237mm) and a nearest centimeter accuracy (8.7mm static, 8.5mm dynamic). we also report the Vive reaction to a tracking loss: the system takes 319.5 +/− 16.8 ms to detect the loss and can still be perturbed for about 3 seconds after tracking recovery. The Vive device allows for fairly accurate and reliable spatiotemporal measurements and may be well-suited for studies with typical human behavior, provided tracking loss is prevented.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"61 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131874522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
[IC3D 2019 Title Page]
Pub Date: 2019-12-01 | DOI: 10.1109/ic3d48390.2019.8975998
{"title":"[IC3D 2019 Title Page]","authors":"","doi":"10.1109/ic3d48390.2019.8975998","DOIUrl":"https://doi.org/10.1109/ic3d48390.2019.8975998","url":null,"abstract":"","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124181288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Annotation-Based Development of Explorable Immersive VR/AR Environments
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975907
J. Flotyński, Adrian Nowak
Virtual and augmented reality environments consist of objects that typically interact with other objects and users, leading to evolution of 3D objects and scenes over time. In many VR/AR applications in different domains, interactions and temporal properties of 3D content may be represented using general or domain knowledge, which makes them comprehensible to average users or domain experts without expertise in IT. Logging interactions and their results can be especially useful in VR/AR environments that are intended to monitor and gain knowledge about the system's behavior as well as users' behavior and preferences. However, the available approaches to development of VR/AR environments do not enable logging interactions in an explorable way. The main contribution of this paper is a method of developing explorable VR/AR environments on the basis of existing environments developed using well-established tools, such as game engines and imperative programming languages. In the approach, interactions can be represented with general or domain knowledge. The method is discussed in the context of an immersive car showroom, which enables acquisition of knowledge about customers' interests and preferences for marketing and merchandising purposes.
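One way to picture annotation-based logging of interactions is a decorator that records each call on a scene object as a timestamped, domain-level event. The Python sketch below is purely illustrative: the file sink, vocabulary and class names are hypothetical, and the paper itself targets existing game engines and imperative languages in general.

```python
import functools
import json
import time

LOG_PATH = "interaction_log.jsonl"  # hypothetical sink; the paper targets an explorable knowledge base

def explorable(activity, **domain_props):
    """Annotate a method so that each call is logged as a domain-level event."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            result = func(self, *args, **kwargs)
            record = {
                "timestamp": time.time(),
                "object": getattr(self, "name", type(self).__name__),
                "activity": activity,
                **domain_props,  # e.g. the domain concept the interaction belongs to
            }
            with open(LOG_PATH, "a") as log:
                log.write(json.dumps(record) + "\n")
            return result
        return wrapper
    return decorator

class CarExhibit:
    def __init__(self, name):
        self.name = name

    @explorable("openDoor", domain="showroom")
    def open_door(self):
        pass  # here the engine would trigger the door animation

CarExhibit("SedanModelA").open_door()  # appends one explorable event to the log
```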
{"title":"Annotation-Based Development of Explorable Immersive VR/AR Environments","authors":"J. Flotyński, Adrian Nowak","doi":"10.1109/IC3D48390.2019.8975907","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975907","url":null,"abstract":"Virtual and augmented reality environments consist of objects that typically interact with other objects and users, leading to evolution of 3D objects and scenes over time. In multiple VR/AR applications in different domains, interactions and temporal properties of 3D content may be represented using general or domain knowledge, which makes them comprehensible to average users or domain experts without an expertise in IT. Logging interactions and their results can be especially useful in VR/AR environments that are intended to monitor and gain knowledge about the system behavior as well as users’ behavior and preferences. However, the available approaches to development of VR/AR environments do not enable logging interactions in an explorable way. The main contribution of this paper is a method of developing explorable VR/AR environments on the basis of existing environments developed using well established tools, such as game engines and imperative programming languages. In the approach, interactions can be represented with general or domain knowledge. The method is discussed in the context of an immersive car showroom, which enables acquisition of knowledge about customers’ interests and preferences for marketing and merchandising purposes.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125047711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Hybrid Approach to Wide Baseline View Synthesis with Convolutional Neural Networks
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8976000
Nour Hobloss, Andrei I. Purica, A. Fiandrotti, Marco Cagnazzo, R. Cozot, W. Hamidouche
Convolutional Neural Networks (CNNs) have recently been employed to implement complete end-to-end view synthesis architectures, from reference view warping to target view blending, while also dealing with occlusions. However, the sizes of the convolutional filters must increase with the distance between reference views, making all-convolutional approaches prohibitively complex for wide-baseline setups. In this work we propose a hybrid approach to view synthesis in which we first warp the reference views, resolving occlusions, and then train a simpler convolutional architecture to blend the preprocessed views. By warping the reference views, we reduce the equivalent distance between them, allowing the use of smaller convolutional filters and thus lower network complexity. We experimentally show that our method performs favorably against both traditional and convolutional synthesis methods while retaining lower complexity than the latter.
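A minimal sketch of the blending stage, assuming the reference views have already been warped to the target camera by a separate DIBR step, could look like the small PyTorch module below; the layer count and widths are illustrative and not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class BlendNet(nn.Module):
    """Small convolutional blender for two pre-warped reference views.

    Because warping has already compensated the wide baseline, 3x3 kernels
    suffice; depth and channel counts here are placeholders.
    """
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, warped_left, warped_right):
        x = torch.cat([warped_left, warped_right], dim=1)  # (B, 6, H, W)
        return self.net(x)

# Usage: both inputs are views already warped to the target viewpoint.
blender = BlendNet()
target_view = blender(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
```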
{"title":"A Hybrid Approach to Wide Baseline View Synthesis with Convolutional Neural Networks","authors":"Nour Hobloss, Andrei I. Purica, A. Fiandrotti, Marco Cagnazzo, R. Cozot, W. Hamidouche","doi":"10.1109/IC3D48390.2019.8976000","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8976000","url":null,"abstract":"Convolutional Neural Networks (CNN) have been recently employed for implementing complete end-to-end view synthesis architectures, from reference view warping to target view blending while dealing with occlusions as well. However, the convolutional sizes filters must increase with the distance between reference views, making all-convolutional approaches prohibitively complex for wide baseline setups. In this work we propose a hybrid approach to view synthesis where we first warp the reference views resolving the occlusions, and then we train a simpler convolutional architecture for blending the preprocessed views. By warping the reference views, we reduce the equivalent distance between reference views, allowing the use of smaller convolutional filters and thus lower network complexity. We experimentally show that our method performs favorably against both traditional and convolutional synthesis methods while retaining lower complexity with respect to the latter.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130114438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MPEG-I Depth Estimation Reference Software
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975995
Ségolène Rogge, Daniele Bonatto, Jaime Sancho, R. Salvador, E. Juárez, A. Munteanu, G. Lafruit
For enabling virtual reality on natural content, Depth Image-Based Rendering (DIBR) techniques have been steadily developed over the past decade, but their quality highly depends on that of the depth estimation. This paper is an attempt to deliver good-quality Depth Estimation Reference Software (DERS) that is well-structured for further use in the worldwide MPEG standardization committee. The existing DERS has been refactored, debugged and extended to any number of input views for generating accurate depth maps. Their quality has been validated by synthesizing DIBR virtual views with the Reference View Synthesizer (RVS) and the Versatile View Synthesizer (VVS), using the available MPEG test sequences. Resulting images and runtimes are reported.
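The depth maps produced by such estimators ultimately rest on the standard disparity-to-depth relation for a rectified camera pair; the numpy sketch below shows that conversion and is not part of the DERS code base.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a disparity map (in pixels) to metric depth for a rectified pair.

    depth = focal * baseline / disparity; invalid (near-zero) disparities map to 0.
    """
    disparity = np.asarray(disparity, dtype=float)
    depth = np.where(
        disparity > eps,
        focal_px * baseline_m / np.maximum(disparity, eps),
        0.0,
    )
    return depth
```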
{"title":"MPEG-I Depth Estimation Reference Software","authors":"Ségolène Rogge, Daniele Bonatto, Jaime Sancho, R. Salvador, E. Juárez, A. Munteanu, G. Lafruit","doi":"10.1109/IC3D48390.2019.8975995","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975995","url":null,"abstract":"For enabling virtual reality on natural content, Depth Image-Based Rendering (DIBR) techniques have been steadily developed over the past decade, but their quality highly depends on that of the depth estimation. This paper is an attempt to deliver good-quality Depth Estimation Reference Software (DERS) that is well-structured for further use in the worldwide MPEG standardization committee.The existing DERS has been refactored, debugged and extended to any number of input views for generating accurate depth maps. Their quality has been validated by synthesizing DIBR virtual views with the Reference View Synthesizer (RVS) and the Versatile View Synthesizer (VVS), using the available MPEG test sequences. Resulting images and runtimes are reported.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122841344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}