A Rich Stereoscopic 3D High Dynamic Range Image & Video Database of Natural Scenes
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975903
Aditya Wadaskar, Mansi Sharma, Rohan Lal
The consumer market for High Dynamic Range (HDR) displays and cameras is growing rapidly with the advent of 3D video and display technologies. Specialised agencies such as the Moving Picture Experts Group and the International Telecommunication Union are calling for the standardization of the latest display advancements. A lack of sufficient experimental data is a major bottleneck for preliminary research efforts in 3D HDR video technology. We propose to make publicly available to the research community a diversified database of stereoscopic 3D HDR images and videos, captured within the beautiful campus of the Indian Institute of Technology Madras, which is blessed with rich flora and fauna and is home to several rare wildlife species. Further, we describe the procedure for capturing, aligning, calibrating, and post-processing the 3D images and videos. We also discuss research opportunities and challenges, and potential use cases of stereo 3D HDR applications and depth-from-HDR aspects.
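The abstract does not name the tools used for the HDR post-processing, so the snippet below is only an illustrative sketch of one common way to realise the per-view merge step: bracketed exposures for each eye of a stereo pair are merged into a radiance map with OpenCV's Debevec calibration and merge, then tone-mapped for preview. The file names and exposure times are hypothetical.

```python
# Minimal sketch, assuming an OpenCV-based multi-exposure merge; the authors'
# exact capture and post-processing chain may differ.
import cv2
import numpy as np

def merge_exposures_to_hdr(image_paths, exposure_times_s):
    """Merge a bracketed LDR exposure stack into a single HDR radiance map."""
    images = [cv2.imread(p) for p in image_paths]
    times = np.asarray(exposure_times_s, dtype=np.float32)

    # Recover the camera response curve, then merge radiance (Debevec method).
    response = cv2.createCalibrateDebevec().process(images, times)
    hdr = cv2.createMergeDebevec().process(images, times, response)
    return hdr  # float32 radiance map

# Hypothetical file names: one stereo pair, three exposures per eye.
left_hdr = merge_exposures_to_hdr(
    ["left_1_60.jpg", "left_1_250.jpg", "left_1_1000.jpg"], [1/60, 1/250, 1/1000])
right_hdr = merge_exposures_to_hdr(
    ["right_1_60.jpg", "right_1_250.jpg", "right_1_1000.jpg"], [1/60, 1/250, 1/1000])

cv2.imwrite("left.hdr", left_hdr)    # Radiance .hdr keeps the full dynamic range
cv2.imwrite("right.hdr", right_hdr)

# Tone-map one view for preview on a standard display.
ldr = cv2.createTonemapReinhard(gamma=2.2).process(left_hdr)
cv2.imwrite("left_preview.png", np.clip(ldr * 255, 0, 255).astype(np.uint8))
```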
{"title":"A Rich Stereoscopic 3D High Dynamic Range Image & Video Database of Natural Scenes","authors":"Aditya Wadaskar, Mansi Sharma, Rohan Lal","doi":"10.1109/IC3D48390.2019.8975903","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975903","url":null,"abstract":"The consumer market of High Dynamic Range (HDR) displays and cameras is blooming rapidly with the advent of 3D video and display technologies. Specialised agencies like Moving Picture Experts Group and International Telecommunication Union are demanding the standardization of latest display advancements. Lack of sufficient experimental data is a major bottleneck for the development of preliminary research efforts in 3D HDR video technology. We propose to make publicly available to the research community, a diversified database of Stereoscopic 3D HDR images and videos, captured within the beautiful campus of Indian Institute of Technology, Madras, which is blessed with rich flora and fauna, and is home to several rare wildlife species. Further, we have described the procedure of capturing, aligning, calibrating and post-processing of 3D images and videos. We have discussed research opportunities and challenges, and the potential use cases of HDR stereo 3D applications and depth-from-HDR aspects.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132459683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Process for the Semi-Automated Generation of Life-Sized, Interactive 3D Character Models for Holographic Projection
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975993
Xinyu Huang, J. Twycross, Fridolin Wild
By mixing digital data into the real world, Augmented Reality (AR) can deliver a potent immersive and interactive experience to its users. In many application contexts, this requires the capability to deploy animated, high-fidelity 3D character models. In this paper, we propose a novel approach to efficiently transform an actor, via 3D scanning, into a photorealistic, animated character. The generated 3D assistant must be able to move according to recorded motion-capture data, and it must be able to generate dialogue with lip sync in order to interact naturally with users. The approach we propose for creating these virtual AR assistants combines photogrammetric scanning, motion capture, and free-viewpoint video, integrated in Unity. We use the Occipital Structure sensor to acquire static high-resolution textured surfaces and a Vicon motion-capture system to track a series of movements. The proposed capture process consists of scanning; reconstruction with Wrap 3 and Maya; texture-map editing in Photoshop to reduce artefacts; and rigging with Maya and Motion Builder to make the models fit for animation and lip sync using LipSyncPro. We test the approach in Unity by scanning two human models with 23 captured animations each. Our findings indicate that the major factors affecting result quality are the environment setup, lighting, and processing constraints.
{"title":"A Process for the Semi-Automated Generation of Life-Sized, Interactive 3D Character Models for Holographic Projection","authors":"Xinyu Huang, J. Twycross, Fridolin Wild","doi":"10.1109/IC3D48390.2019.8975993","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975993","url":null,"abstract":"By mixing digital data into the real world, Augmented Reality (AR) can deliver potent immersive and interactive experience to its users. In many application contexts, this requires the capability to deploy animated, high fidelity 3D character models. In this paper, we propose a novel approach to efficiently transform – using 3D scanning – an actor to a photorealistic, animated character. This generated 3D assistant must be able to move to perform recorded motion capture data, and it must be able to generate dialogue with lip sync to naturally interact with the users. The approach we propose for creating these virtual AR assistants utilizes photogrammetric scanning, motion capture, and free viewpoint video for their integration in Unity. We deploy the Occipital Structure sensor to acquire static high-resolution textured surfaces, and a Vicon motion capture system to track series of movements. The proposed capturing process consists of the steps scanning, reconstruction with Wrap 3 and Maya, editing texture maps to reduce artefacts with Photoshop, and rigging with Maya and Motion Builder to render the models fit for animation and lip-sync using LipSyncPro. We test the approach in Unity by scanning two human models with 23 captured animations each. Our findings indicate that the major factors affecting the result quality are environment setup, lighting, and processing constraints.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"271 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133274399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Hybrid Approach to Wide Baseline View Synthesis with Convolutional Neural Networks
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8976000
Nour Hobloss, Andrei I. Purica, A. Fiandrotti, Marco Cagnazzo, R. Cozot, W. Hamidouche
Convolutional Neural Networks (CNNs) have recently been employed to implement complete end-to-end view-synthesis architectures, from reference-view warping to target-view blending, while also handling occlusions. However, the sizes of the convolutional filters must increase with the distance between reference views, making all-convolutional approaches prohibitively complex for wide-baseline setups. In this work, we propose a hybrid approach to view synthesis in which we first warp the reference views, resolving the occlusions, and then train a simpler convolutional architecture to blend the preprocessed views. By warping the reference views, we reduce the equivalent distance between them, allowing the use of smaller convolutional filters and thus lower network complexity. We show experimentally that our method performs favorably against both traditional and convolutional synthesis methods while retaining lower complexity with respect to the latter.
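As a rough illustration of the two-stage idea (warp first, then blend with a small CNN), the PyTorch sketch below aligns each reference view to the target with a dense flow field and fuses the pre-warped views with a small convolutional blender. The layer sizes and the flow-based warping helper are assumptions made for illustration, not the authors' exact architecture.

```python
# Minimal sketch of "warp the references, then blend with a simpler CNN".
import torch
import torch.nn as nn
import torch.nn.functional as F

def backward_warp(reference, flow):
    """Warp a reference view toward the target view given a dense flow field.

    reference: (N, 3, H, W) image tensor
    flow:      (N, 2, H, W) per-pixel displacement from target to reference
    """
    n, _, h, w = reference.shape
    # Build a normalized sampling grid in [-1, 1] as required by grid_sample.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(reference.device)   # (2, H, W)
    coords = grid.unsqueeze(0) + flow                                  # (N, 2, H, W)
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    sampling_grid = torch.stack((coords_x, coords_y), dim=-1)          # (N, H, W, 2)
    return F.grid_sample(reference, sampling_grid, align_corners=True)

class BlendingCNN(nn.Module):
    """Small convolutional blender operating on pre-warped reference views."""
    def __init__(self, in_views=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * in_views, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, warped_views):
        # warped_views: list of (N, 3, H, W) tensors already aligned to the target.
        return self.net(torch.cat(warped_views, dim=1))

# Example: identity warp (zero flow) of one reference, then blend two views.
ref = torch.randn(1, 3, 64, 64)
warped = backward_warp(ref, torch.zeros(1, 2, 64, 64))
target = BlendingCNN(in_views=2)([warped, torch.randn(1, 3, 64, 64)])
```

Because the heavy geometric work is done by the warping step, the blender only needs a small receptive field, which is the complexity argument the abstract makes.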
{"title":"A Hybrid Approach to Wide Baseline View Synthesis with Convolutional Neural Networks","authors":"Nour Hobloss, Andrei I. Purica, A. Fiandrotti, Marco Cagnazzo, R. Cozot, W. Hamidouche","doi":"10.1109/IC3D48390.2019.8976000","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8976000","url":null,"abstract":"Convolutional Neural Networks (CNN) have been recently employed for implementing complete end-to-end view synthesis architectures, from reference view warping to target view blending while dealing with occlusions as well. However, the convolutional sizes filters must increase with the distance between reference views, making all-convolutional approaches prohibitively complex for wide baseline setups. In this work we propose a hybrid approach to view synthesis where we first warp the reference views resolving the occlusions, and then we train a simpler convolutional architecture for blending the preprocessed views. By warping the reference views, we reduce the equivalent distance between reference views, allowing the use of smaller convolutional filters and thus lower network complexity. We experimentally show that our method performs favorably against both traditional and convolutional synthesis methods while retaining lower complexity with respect to the latter.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130114438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IC3D 2019 Technical Program Committee
Pub Date: 2019-12-01 | DOI: 10.1109/ic3d48390.2019.8976002
{"title":"IC3D 2019 Technical Program Committee","authors":"","doi":"10.1109/ic3d48390.2019.8976002","DOIUrl":"https://doi.org/10.1109/ic3d48390.2019.8976002","url":null,"abstract":"","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122530808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting Forward & Backward Facial Depth Maps From a Single RGB Image For Mobile 3D AR Application
Pub Date: 2019-12-01 | DOI: 10.1109/IC3D48390.2019.8975899
P. Avinash, Mansi Sharma
Cheap and fast 3D asset creation to enable AR/VR applications is a fast-growing domain. This paper addresses the significant problem of reconstructing complete 3D information of a face at near real-time speed on a mobile phone. We propose a novel deep-learning-based solution that predicts robust depth maps of a face, one forward facing and the other backward facing, from a single in-the-wild image. A critical contribution is that the proposed network is also capable of learning the depth of the occluded part of the face. This is achieved by training a fully convolutional neural network to learn the dual (forward and backward) depth maps, with a common encoder and two separate decoders. The 300W-LP point-cloud dataset is used to compute the required dual depth maps from the training data. The code and results will be made available on the project page.
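The shared-encoder, dual-decoder design lends itself to a compact sketch. The PyTorch model below is an illustration under stated assumptions (layer widths, depths, and the L1 loss are ours, not the authors'): one encoder processes the RGB image, and two structurally identical decoders predict the forward-facing and backward-facing depth maps.

```python
# Minimal sketch of a fully convolutional network with one shared encoder
# and two separate depth decoders; not the paper's exact architecture.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=2, padding=1),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

def deconv_block(cin, cout):
    return nn.Sequential(
        nn.ConvTranspose2d(cin, cout, 4, stride=2, padding=1),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class DualDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder: RGB image -> compact feature map.
        self.encoder = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64), conv_block(64, 128),
        )
        # Two separate decoders with identical structure.
        self.decoder_front = self._make_decoder()
        self.decoder_back = self._make_decoder()

    @staticmethod
    def _make_decoder():
        return nn.Sequential(
            deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # 1-channel depth
        )

    def forward(self, rgb):
        feats = self.encoder(rgb)
        return self.decoder_front(feats), self.decoder_back(feats)

# Example training step with an L1 depth loss on both outputs (random tensors
# stand in for real images and ground-truth depth maps).
model = DualDepthNet()
rgb = torch.randn(2, 3, 128, 128)
gt_front, gt_back = torch.randn(2, 1, 128, 128), torch.randn(2, 1, 128, 128)
pred_front, pred_back = model(rgb)
loss = (nn.functional.l1_loss(pred_front, gt_front)
        + nn.functional.l1_loss(pred_back, gt_back))
loss.backward()
```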
{"title":"Predicting Forward & Backward Facial Depth Maps From a Single RGB Image For Mobile 3d AR Application","authors":"P. Avinash, Mansi Sharma","doi":"10.1109/IC3D48390.2019.8975899","DOIUrl":"https://doi.org/10.1109/IC3D48390.2019.8975899","url":null,"abstract":"Cheap and fast 3D asset creation to enable AR/VR applications is a fast growing domain. This paper addresses a significant problem of reconstructing complete 3D information of a face in near real-time speed on a mobile phone. We propose a novel deep learning based solution to predict robust depth maps of a face, one forward facing and the other backward facing, from a single image from the wild. A critical contribution is that the proposed network is capable of learning the depths of the occluded part of the face too. This is achieved by training a fully convolutional neural network to learn the dual (forward and backward) depth maps, with a common encoder and two separate decoders. The 300W-LP, a cloud point dataset, is used to compute the required dual depth maps from the training data. The code and results will be made available at project page.","PeriodicalId":344706,"journal":{"name":"2019 International Conference on 3D Immersion (IC3D)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123818483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}