Tangible and Visible 3D Object Reconstruction in Augmented Reality
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00-30
Yinchen Wu, Liwei Chan, Wen-Chieh Lin
Many crucial applications in fields such as filmmaking, game design, education, and cultural preservation involve the modeling, authoring, or editing of 3D objects and scenes. The two major methods of creating 3D models are 1) modeling with computer software and 2) reconstruction, generally with high-quality 3D scanners. Scanners of sufficient quality to support the latter method remain unaffordable for the general public. Since the emergence of consumer-grade RGBD cameras, there has been growing interest in 3D reconstruction systems based on depth cameras. However, most such systems are not user-friendly, and they require intense effort and practice to obtain good reconstruction results. In this paper, we propose to increase the accessibility of depth-camera-based 3D reconstruction by assisting its users with augmented reality (AR) technology. Specifically, the proposed approach allows users to rotate and move a target object freely with their hands and to see the object overlapped with its reconstructed model as the reconstruction progresses. As well as being more intuitive than conventional reconstruction systems, our system provides useful hints for completing the 3D reconstruction of an object, including the best capturing range, a reminder to move and rotate the object at a steady speed, and which model regions are complex enough to require zooming in. We evaluated our system via a user study that compared its performance against those of three other state-of-the-art approaches and found that our system outperforms them. Specifically, participants rated it highest in usability, understandability, and model satisfaction.
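The abstract does not detail how the capturing-range and steady-speed hints are computed. A minimal sketch of one plausible per-frame heuristic, assuming object poses from the tracker and hypothetical threshold values, might look like this:

```python
import numpy as np

# Hypothetical thresholds; the paper does not specify its actual values.
BEST_RANGE_M = (0.4, 0.8)        # assumed optimal depth-camera capturing range
MAX_STEADY_DEG_PER_S = 30.0      # assumed upper bound for a "steady" rotation

def rotation_speed_deg(R_prev, R_curr, dt):
    """Angular speed (deg/s) between two consecutive 3x3 rotation matrices."""
    R_delta = R_curr @ R_prev.T
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle)) / dt

def reconstruction_hints(R_prev, R_curr, obj_distance_m, dt):
    """Return AR hint messages for the current frame (illustrative only)."""
    hints = []
    if not (BEST_RANGE_M[0] <= obj_distance_m <= BEST_RANGE_M[1]):
        hints.append("Move the object into the best capturing range.")
    if rotation_speed_deg(R_prev, R_curr, dt) > MAX_STEADY_DEG_PER_S:
        hints.append("Rotate the object more slowly and steadily.")
    return hints
```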
{"title":"Tangible and Visible 3D Object Reconstruction in Augmented Reality","authors":"Yinchen Wu, Liwei Chan, Wen-Chieh Lin","doi":"10.1109/ISMAR.2019.00-30","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00-30","url":null,"abstract":"Many crucial applications in the fields of filmmaking, game design, education, and cultural preservation — among others — involve the modeling, authoring, or editing of 3D objects and scenes. The two major methods of creating 3D models are 1) modeling, using computer software, and 2) reconstruction, generally using high-quality 3D scanners. Scanners of sufficient quality to support the latter method remain unaffordable to the general public. Since the emergence of consumer-grade RGBD cameras, there has been a growing interest in 3D reconstruction systems using depth cameras. However, most such systems are not user-friendly, and require intense efforts and practice if good reconstruction results are to be obtained. In this paper, we propose to increase the accessibility of depth-camera-based 3D reconstruction by assisting its users with augmented reality (AR) technology. Specifically, the proposed approach will allow users to rotate/move a target object freely with their hands and see the object being overlapped with its reconstructing model during the reconstruction process. As well as being more instinctual than conventional reconstruction systems, our proposed system will provide useful hints on complete 3D reconstruction of an object, including the best capturing range; reminder of moving and rotating the object at a steady speed; and which model regions are complex enough to require zooming-in. We evaluated our system via a user study that compared its performance against those of three other stateof- the-art approaches, and found our system outperforms the other approaches. Specifically, the participants rated it highest in usability, understandability, and model satisfaction.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121830169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Annotation vs. Virtual Tutor: Comparative Analysis on the Effectiveness of Visual Instructions in Immersive Virtual Reality
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00030
Hyeopwoo Lee, Hyejin Kim, D. Monteiro, Youngnoh Goh, Daseong Han, Hai-Ning Liang, H. Yang, Jinki Jung
In this paper, we present a comparative study of visual instructions in Immersive Virtual Reality (IVR): annotation (ANN), which employs 3D text and objects for instructions, and a virtual tutor (TUT), which demonstrates a task with a 3D character. The comparison is based on three tasks, maze escape (ME), stretching exercise (SE), and crane manipulation (CM), defined by the type of their unit instructions. We conducted an automated evaluation of users' memory recall performance (recall time, accuracy, and error) by mapping a sequence of user behaviors and events to a string. Results revealed that the ANN group performed significantly more accurately (1.3 times) in ME and significantly faster (1.64 times) in SE than the TUT group, while no statistically significant difference was found in CM. Interestingly, although ANN showed statistically shorter execution times, the recall-time pattern of the TUT group converged steeply after the initial trial. These results can inform designers of IVR about which types of visual instruction are best suited to different task purposes.
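The abstract describes scoring recall by mapping behavior and event sequences to strings, but the scoring function itself is not given. One common way to derive accuracy and an error count from such strings, offered here only as an assumed reading, is an edit-distance comparison against the instructed sequence:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def recall_scores(instructed: str, recalled: str):
    """Accuracy in [0, 1] and raw error count for a recalled behavior string."""
    errors = levenshtein(instructed, recalled)
    accuracy = 1.0 - errors / max(len(instructed), len(recalled), 1)
    return accuracy, errors

# Example: each character encodes one unit instruction (e.g., a maze turn).
print(recall_scores("LRRLU", "LRLU"))  # one deletion -> accuracy 0.8, errors 1
```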
{"title":"Annotation vs. Virtual Tutor: Comparative Analysis on the Effectiveness of Visual Instructions in Immersive Virtual Reality","authors":"Hyeopwoo Lee, Hyejin Kim, D. Monteiro, Youngnoh Goh, Daseong Han, Hai-Ning Liang, H. Yang, Jinki Jung","doi":"10.1109/ISMAR.2019.00030","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00030","url":null,"abstract":"In this paper we present a comparative study of visual instructions in Immersive Virtual Reality (IVR), i.e., annotation (ANN) that employs 3D texts and objects for instructions and virtual tutor (TUT) that demonstrates a task with a 3D character. The comparison is based on three tasks, maze escape (ME), stretching exercise (SE), and crane manipulation (CM), defined by the types of a unit instruction. We conducted an automated evaluation of user's memory recall performances (recall time, accuracy, and error) by mapping a sequence of user's behaviors and events as a string. Results revealed that ANN group showed significantly more accurate performance (1.3 times) in ME and time performance (1.64 times) in SE than TUT group, while no statistical main difference was found in CM. Interestingly, although ANN showed statistically shorter execution time, the recalling time pattern of TUT group showed a steep convergence after initial trial. The results can be used in the field in terms of informing designers of IVR on what types of visual instruction are best for different task purpose.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121849228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards SLAM-Based Outdoor Localization using Poor GPS and 2.5D Building Models
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00016
Ruyu Liu, Jianhua Zhang, Shengyong Chen, Clemens Arth
In this paper, we address the topic of outdoor localization and tracking using monocular camera setups with poor GPS priors. We leverage 2.5D building maps, which are freely available from open-source databases such as OpenStreetMap. The main contributions of our work are a fast initialization method and a non-linear optimization scheme. The initialization upgrades a visual SLAM reconstruction with an absolute scale. The non-linear optimization uses the 2.5D building model footprint, which further improves the tracking accuracy and the scale estimation. A pose optimization step relates the vision-based camera pose estimation from SLAM to the position information received through GPS, in order to fix the common problem of drift. We evaluate our approach on a set of challenging scenarios. The experimental results show that our approach achieves improved accuracy and robustness with an advantage in run-time over previous setups.
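As an illustration of how an absolute scale can be recovered from noisy GPS positions, here is a sketch (not the paper's actual initialization) that fits a similarity transform between SLAM keyframe positions and the corresponding GPS fixes using the standard Umeyama alignment:

```python
import numpy as np

def align_similarity(slam_pts, gps_pts):
    """Least-squares similarity transform (s, R, t) mapping SLAM points onto GPS points.

    slam_pts, gps_pts: (N, 3) arrays of corresponding positions (N >= 3).
    Returns scale s, rotation R (3x3), translation t such that gps ~ s * R @ slam + t.
    """
    mu_s, mu_g = slam_pts.mean(axis=0), gps_pts.mean(axis=0)
    xs, xg = slam_pts - mu_s, gps_pts - mu_g
    cov = xg.T @ xs / len(slam_pts)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                      # keep R a proper rotation
    R = U @ S @ Vt
    var_s = (xs ** 2).sum() / len(slam_pts)
    s = np.trace(np.diag(D) @ S) / var_s    # absolute scale of the SLAM map
    t = mu_g - s * R @ mu_s
    return s, R, t
```

With the scale fixed this way, the GPS residuals could then enter a non-linear optimization as weak priors against drift, in the spirit of the pose optimization step described above.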
{"title":"Towards SLAM-Based Outdoor Localization using Poor GPS and 2.5D Building Models","authors":"Ruyu Liu, Jianhua Zhang, Shengyong Chen, Clemens Arth","doi":"10.1109/ISMAR.2019.00016","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00016","url":null,"abstract":"In this paper, we address the topic of outdoor localization and tracking using monocular camera setups with poor GPS priors. We leverage 2.5D building maps, which are freely available from open-source databases such as OpenStreetMap. The main contributions of our work are a fast initialization method and a non-linear optimization scheme. The initialization upgrades a visual SLAM reconstruction with an absolute scale. The non-linear optimization uses the 2.5D building model footprint, which further improves the tracking accuracy and the scale estimation. A pose optimization step relates the vision-based camera pose estimation from SLAM to the position information received through GPS, in order to fix the common problem of drift. We evaluate our approach on a set of challenging scenarios. The experimental results show that our approach achieves improved accuracy and robustness with an advantage in run-time over previous setups.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"29 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114113162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coherent Rendering of Virtual Smile Previews with Fast Neural Style Transfer
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00-25
Valentin Vasiliu, Gábor Sörös
Coherent rendering in augmented reality deals with synthesizing virtual content that seamlessly blends in with the real content. Unfortunately, capturing or modeling every real aspect in the virtual rendering process is often unfeasible or too expensive. We present a post-processing method that improves the look of rendered overlays in a dental virtual try-on application. We combine the original frame and the default rendered frame in an autoencoder neural network in order to obtain a more natural output, inspired by artistic style transfer research. Specifically, we apply the original frame as style on the rendered frame as content, repeating the process with each new pair of frames. Our method requires only a single forward pass, our shallow architecture ensures fast execution, and our internal feedback loop inherently enforces temporal consistency.
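The paper's network is not reproduced here, but a minimal PyTorch sketch of the general idea, a shallow encoder-decoder that takes the original frame (style) and the default rendered frame (content) and predicts the blended output in a single forward pass, could look as follows; layer sizes are assumptions, and the internal temporal-feedback loop is omitted:

```python
import torch
import torch.nn as nn

class BlendAutoencoder(nn.Module):
    """Shallow encoder-decoder: input is the original and rendered frames stacked
    along the channel axis; output is the post-processed, more natural frame."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, original_frame, rendered_frame):
        x = torch.cat([original_frame, rendered_frame], dim=1)  # (B, 6, H, W)
        return self.decoder(self.encoder(x))

# Per-frame usage: feed each new (original, rendered) pair through the network.
model = BlendAutoencoder().eval()
with torch.no_grad():
    out = model(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
print(out.shape)  # torch.Size([1, 3, 256, 256])
```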
{"title":"Coherent Rendering of Virtual Smile Previews with Fast Neural Style Transfer","authors":"Valentin Vasiliu, Gábor Sörös","doi":"10.1109/ISMAR.2019.00-25","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00-25","url":null,"abstract":"Coherent rendering in augmented reality deals with synthesizing virtual content that seamlessly blends in with the real content. Unfortunately, capturing or modeling every real aspect in the virtual rendering process is often unfeasible or too expensive. We present a post-processing method that improves the look of rendered overlays in a dental virtual try-on application. We combine the original frame and the default rendered frame in an autoencoder neural network in order to obtain a more natural output, inspired by artistic style transfer research. Specifically, we apply the original frame as style on the rendered frame as content, repeating the process with each new pair of frames. Our method requires only a single forward pass, our shallow architecture ensures fast execution, and our internal feedback loop inherently enforces temporal consistency.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130832878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pointing and Selection Methods for Text Entry in Augmented Reality Head Mounted Displays
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00026
Wenge Xu, Hai-Ning Liang, Anqi He, Zifan Wang
Augmented reality (AR) is on the rise, with consumer-level head-mounted displays (HMDs) becoming available in recent years. Text entry is an essential activity for AR systems, but it is still relatively underexplored. Although it is possible to use a physical keyboard to enter text in AR systems, it is not an ideal solution because it confines the user to a stationary position and to indoor environments. Instead, a virtual keyboard seems more suitable. Text entry via virtual keyboards requires a pointing method and a selection mechanism. Although various combinations of pointing and selection mechanisms exist, it is not well understood how well suited each combination is to supporting fast text entry with low error rates and positive usability (regarding workload, user experience, motion sickness, and immersion). In this research, we perform an empirical study to investigate user preference and text entry performance for four pointing methods (Controller, Head, Hand, and Hybrid) in combination with two input mechanisms (Swype and Tap). Our research represents a first systematic investigation of these eight possible combinations. Our results show that Controller outperforms the three device-free methods in both text entry performance and user experience. However, device-free pointing methods can be usable depending on task requirements and on users' preferences and physical condition.
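As a concrete illustration of how a device-free pointing method such as Head can map onto a virtual keyboard, the sketch below intersects the pointing ray with the keyboard plane and snaps the hit point to the nearest key; the key layout and sizes are hypothetical, not those used in the study:

```python
import numpy as np

def ray_plane_hit(origin, direction, plane_point, plane_normal):
    """Intersection of the pointing ray with the virtual keyboard plane (or None)."""
    denom = np.dot(direction, plane_normal)
    if abs(denom) < 1e-6:
        return None                      # ray parallel to the keyboard plane
    t = np.dot(plane_point - origin, plane_normal) / denom
    return origin + t * direction if t > 0 else None

def pointed_key(hit, key_centers):
    """Return the key whose center is closest to the hit point."""
    return min(key_centers, key=lambda k: np.linalg.norm(hit - key_centers[k]))

# Hypothetical mini-layout: key name -> 3D center on the keyboard plane.
keys = {"Q": np.array([-0.05, 0.0, 1.0]), "W": np.array([0.0, 0.0, 1.0]),
        "E": np.array([0.05, 0.0, 1.0])}
hit = ray_plane_hit(origin=np.zeros(3), direction=np.array([0.01, 0.0, 1.0]),
                    plane_point=np.array([0.0, 0.0, 1.0]),
                    plane_normal=np.array([0.0, 0.0, 1.0]))
if hit is not None:
    print(pointed_key(hit, keys))        # selection (Tap or Swype) is applied on top
```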
{"title":"Pointing and Selection Methods for Text Entry in Augmented Reality Head Mounted Displays","authors":"Wenge Xu, Hai-Ning Liang, Anqi He, Zifan Wang","doi":"10.1109/ISMAR.2019.00026","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00026","url":null,"abstract":"Augmented reality (AR) is on the rise with consumer-level head-mounted displays (HMDs) becoming available in recent years. Text entry is an essential activity for AR systems, but it is still relatively underexplored. Although it is possible to use a physical keyboard to enter text in AR systems, it is not the most optimal and ideal way because it confines the uses to a stationary position and within indoor environments. Instead, a virtual keyboard seems more suitable. Text entry via virtual keyboards requires a pointing method and a selection mechanism. Although there exist various combinations of pointing+selection mechanisms, it is not well understood how well suited each combination is to support fast text entry speed with low error rates and positive usability (regarding workload, user experience, motion sickness, and immersion). In this research, we perform an empirical study to investigate user preference and text entry performance of four pointing methods (Controller, Head, Hand, and Hybrid) in combination with two input mechanisms (Swype and Tap). Our research represents a first systematic investigation of these eight possible combinations. Our results show that Controller outperforms all the other device-free methods in both text entry performance and user experience. However, device-free pointing methods can be usable depending on task requirements and users' preferences and physical condition.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129950340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Is Any Room Really OK? The Effect of Room Size and Furniture on Presence, Narrative Engagement, and Usability During a Space-Adaptive Augmented Reality Game
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00-11
Jae-eun Shin, Hayun Kim, Callum Parker, Hyung-il Kim, Seoyoung Oh, Woontack Woo
One of the main challenges in creating narrative-driven Augmented Reality (AR) content for Head Mounted Displays (HMDs) is making it equally accessible and enjoyable in different types of indoor environments. However, little has been studied regarding whether such content can indeed provide similar, if not the same, levels of experience across different spaces. To better understand this issue, we examine the effect of room size and furniture on the player experience of Fragments, a space-adaptive, indoor AR crime-solving game created for the Microsoft HoloLens. The study compares factors of player experience in four types of spatial conditions: (1) Large Room - Fully Furnished; (2) Large Room - Scarcely Furnished; (3) Small Room - Fully Furnished; and (4) Small Room - Scarcely Furnished. Our results show that while large spaces facilitate a higher sense of presence and narrative engagement, fully furnished rooms raise perceived workload. Based on our findings, we propose design suggestions that can help narrative-driven, space-adaptive, indoor HMD-based AR content deliver optimal experiences across various types of rooms.
{"title":"Is Any Room Really OK? The Effect of Room Size and Furniture on Presence, Narrative Engagement, and Usability During a Space-Adaptive Augmented Reality Game","authors":"Jae-eun Shin, Hayun Kim, Callum Parker, Hyung-il Kim, Seoyoung Oh, Woontack Woo","doi":"10.1109/ISMAR.2019.00-11","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00-11","url":null,"abstract":"One of the main challenges in creating narrative-driven Augmented Reality (AR) content for Head Mounted Displays (HMDs) is to make them equally accessible and enjoyable in different types of indoor environments. However, little has been studied in regards to whether such content can indeed provide similar, if not the same, levels of experience across different spaces. To gain more understanding towards this issue, we examine the effect of room size and furniture on the player experience of Fragments, a space-adaptive, indoor AR crime-solving game created for the Microsoft HoloLens. The study compares factors of player experience in four types of spatial conditions: (1) Large Room - Fully Furnished; (2) Large Room - Scarcely Furnished; (3) Small Room - Fully Furnished; and (4) Small Room - Scarcely Furnished. Our results show that while large spaces facilitate a higher sense of presence and narrative engagement, fully-furnished rooms raise perceived workload. Based on our findings, we propose design suggestions that can support narrative-driven, space-adaptive indoor HMD-based AR content in delivering optimal experiences for various types of rooms.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131477952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Investigating Cyclical Stereoscopy Effects Over Visual Discomfort and Fatigue in Virtual Reality While Learning
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00031
Alexis D. Souchet, Stéphanie Philippe, Floriane Ober, Aurélien Léveque, Laure Leroy
Purpose: It is hypothesized that cyclical stereoscopy (displaying stereoscopy or 2D cyclically) has an effect on visual fatigue, learning curves, and quality of experience, and that those effects differ from those of regular stereoscopy. Materials and Methods: 59 participants played a serious game simulating a job interview with a Samsung Gear VR Head Mounted Display (HMD). Participants were randomly assigned to three groups: HMD with regular stereoscopy (S3D) and HMD with cyclical stereoscopy with a cycle of either 1 or 3 minutes. Participants played the game three times (the third time on a PC one month later). Visual discomfort, flow, and presence were measured with questionnaires. Visual fatigue was assessed pre- and post-exposure with optometric measures. Learning traces were obtained in-game. Results: Visual discomfort and flow are lower with cyclical S3D than with S3D, but presence is not. Cyclical stereoscopy every 1 minute is more tiring than stereoscopy. Cyclical stereoscopy every 3 minutes tends to be more tiring than stereoscopy. The cyclical stereoscopy groups improved during short-term learning. None of the statistical tests showed a difference between groups in either the short-term or the long-term learning curves. Conclusion: Cyclical stereoscopy had a positive impact on visual comfort and flow, but not on presence. It affects oculomotor functions in an HMD while learning with a serious game with low disparities and easy visual tasks. Other visual tasks should be tested, and eye-tracking should be considered to assess visual fatigue during exposure. Results in ecological conditions seem to support models suggesting that activating stereopsis cyclically in an HMD is more tiring than maintaining it.
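For clarity, the cyclical conditions can be read as a simple timer that alternates the renderer between stereoscopic and flat (2D) presentation; a sketch under that reading (not the authors' code) is:

```python
def use_stereoscopy(elapsed_s: float, cycle_minutes: float) -> bool:
    """Alternate between S3D and 2D every cycle_minutes, starting in S3D."""
    phase = int(elapsed_s // (cycle_minutes * 60.0))
    return phase % 2 == 0   # even phases: stereoscopic, odd phases: flat (2D)

# Example: the 1-minute cyclical group at t = 90 s is in its 2D phase.
print(use_stereoscopy(90.0, 1.0))   # False
print(use_stereoscopy(90.0, 3.0))   # True (3-minute group still in S3D)
```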
{"title":"Investigating Cyclical Stereoscopy Effects Over Visual Discomfort and Fatigue in Virtual Reality While Learning","authors":"Alexis D. Souchet, Stéphanie Philippe, Floriane Ober, Aurélien Léveque, Laure Leroy","doi":"10.1109/ISMAR.2019.00031","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00031","url":null,"abstract":"Purpose: It is hypothesized that cyclical stereoscopy (displaying stereoscopy or 2D cyclically) has effect over visual fatigue, learning curves and quality of experience, and that those effects are different from regular stereoscopy. Materials and Methods: 59 participants played a serious game simulating a job interview with a Samsung Gear VR Head Mounted Display (HMD). Participants were randomly assigned to 3 groups: HMD with regular stereoscopy (S3D) and HMD with cyclical stereoscopy (cycles of 1 or 3 minutes). Participants played the game thrice (third try on a PC one month later). Visual discomfort, Flow, Presence, were measured with questionnaires. Visual Fatigue was assessed pre-and post-exposure with optometric measures. Learning traces were obtained in-game. Results: Visual discomfort and flow are lower with cyclical-S3D than S3D but not Presence. Cyclical stereoscopy every 1 minute is more tiring than stereoscopy. Cyclical stereoscopy every 3 minutes tends to be more tiring than stereoscopy. Cyclical stereoscopy groups improved during Short-Term Learning. None of the statistical tests showed a difference between groups in either Short-Term Learning or Long-Term Learning curves. Conclusion: cyclical stereoscopy displayed cyclically had a positive impact on Visual Comfort and Flow, but not Presence. It affects oculomotor functions in a HMD while learning with a serious game with low disparities and easy visual tasks. Other visual tasks should be tested, and eye-tracking should be considered to assess visual fatigue during exposure. Results in ecological conditions seem to support models suggesting that activating cyclically stereopsis in a HMD is more tiring than maintaining it.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133462364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VR Props: An End-to-End Pipeline for Transporting Real Objects Into Virtual and Augmented Environments
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00-22
Catherine Taylor, Chris Mullany, Robin McNicholas, D. Cosker
Improvements in both software and hardware, as well as an increase in consumer-grade equipment, have resulted in great advances in the fields of virtual and augmented reality. Typically, systems use controllers or hand gestures to interact with virtual objects. However, these motions are often unnatural and diminish the immersion of the experience. Moreover, these approaches offer limited tactile feedback. There does not currently exist a platform for bringing an arbitrary physical object into the virtual world without additional peripherals or the use of expensive motion capture systems. Such a system could be used for immersive experiences within the entertainment industry as well as for VR or AR training experiences in the fields of health and engineering. We propose an end-to-end pipeline for creating an interactive virtual prop from rigid and non-rigid physical objects. This includes a novel method for tracking the deformations of rigid and non-rigid objects at interactive rates using a single RGBD camera. We scan the physical object and process the point cloud to produce a triangular mesh. A range of possible deformations is obtained with a finite element method simulation, and these are reduced to a low-dimensional basis using principal component analysis. Machine learning approaches, in particular neural networks, have become key tools in computer vision and have been used on a range of tasks; moreover, there has been an increasing trend of training networks on synthetic data. To this end, we use a convolutional neural network, trained on synthetic data, to track the movement and potential deformations of an object in unlabelled RGB images from a single RGBD camera. We demonstrate our results for several objects with different sizes and appearances.
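The dimensionality-reduction step can be illustrated with a short sketch: given a set of simulated FEM displacement fields (one flattened vector per simulated deformation), PCA yields a low-dimensional deformation basis. The number of components and the placeholder data below are assumptions, not values from the paper:

```python
import numpy as np
from sklearn.decomposition import PCA

# deformations: (num_samples, num_vertices * 3) vertex displacements from FEM runs.
rng = np.random.default_rng(0)
deformations = rng.normal(size=(200, 3000))      # placeholder for simulated data

pca = PCA(n_components=10)                       # assumed basis size
weights = pca.fit_transform(deformations)        # low-dimensional coordinates
basis = pca.components_                          # (10, num_vertices * 3) basis

def reconstruct(w):
    """Displacement field for basis weights w, added back to the rest-pose mesh."""
    return pca.mean_ + w @ basis

print(reconstruct(weights[0]).shape)             # (3000,)
```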
{"title":"VR Props: An End-to-End Pipeline for Transporting Real Objects Into Virtual and Augmented Environments","authors":"Catherine Taylor, Chris Mullany, Robin McNicholas, D. Cosker","doi":"10.1109/ISMAR.2019.00-22","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00-22","url":null,"abstract":"Improvements in both software and hardware, as well as an increase in consumer suitable equipment, have resulted in great advances in the fields of virtual and augmented reality. Typically, systems use controllers or hand gestures to interact with virtual objects. However, these motions are often unnatural and diminish the immersion of the experience. Moreover, these approaches offer limited tactile feedback. There does not currently exist a platform to bring an arbitrary physical object into the virtual world without additional peripherals or the use of expensive motion capture systems. Such a system could be used for immersive experiences within the entertainment industry as well as being applied to VR or AR training experiences, in the fields of health and engineering. We propose an end-to-end pipeline for creating an interactive virtual prop from rigid and non-rigid physical objects. This includes a novel method for tracking the deformations of rigid and non-rigid objects at interactive rates using a single RGBD camera. We scan our physical object and process the point cloud to produce a triangular mesh. A range of possible deformations can be obtained by using a finite element method simulation and these are reduced to a low dimensional basis using principal component analysis. Machine learning approaches, in particular neural networks, have become key tools in computer vision and have been used on a range of tasks. Moreover, there has been an increased trend in training networks on synthetic data. To this end, we use a convolutional neural network, trained on synthetic data, to track the movement and potential deformations of an object in unlabelled RGB images from a single RGBD camera. We demonstrate our results for several objects with different sizes and appearances.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128639614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Measuring Cognitive Load and Insight: A Methodology Exemplified in a Virtual Reality Learning Context
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00033
J. Collins, H. Regenbrecht, T. Langlotz, Y. Can, Cem Ersoy, Russell Butson
Recent improvements in Virtual Reality (VR) technology have enabled researchers to investigate the benefits VR may provide for various domains such as health, entertainment, training, and education. A significant proportion of VR system evaluations rely on perception-based measures such as pre- and post-questionnaires and interviews. While these self-reports provide valuable insights into users' perceptions of VR environments, recent developments in digital sensors and data collection techniques give researchers access to measures of physiological response. This work explores the merits of physiological measures in the evaluation of emotional responses in virtual environments (ERVE). At the center of our ERVE methodology we place emotional response data, obtained through electrodermal activity and heart-rate detection, which are analyzed in conjunction with event-driven data to derive further measures. In this paper, we present our ERVE methodology together with a case study in the context of VR-based learning, in which we derive measures of cognitive load and moments of insight. We discuss our methodology and its potential for use in many other application and research domains to provide more in-depth and objective analyses of experiences within VR.
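The paper's exact feature extraction is not reproduced here, but a hedged sketch of how electrodermal activity and heart rate can be summarized over event-aligned windows (the window length and peak parameters are assumptions) is:

```python
import numpy as np
from scipy.signal import find_peaks

def window_features(eda, heart_rate, fs, event_time_s, window_s=10.0):
    """Summarize EDA and heart rate in a window following one logged event.

    eda, heart_rate: 1-D signals sampled at fs Hz; event_time_s: event onset (s).
    """
    start = int(event_time_s * fs)
    stop = int((event_time_s + window_s) * fs)
    eda_win, hr_win = eda[start:stop], heart_rate[start:stop]
    # Skin-conductance responses: local peaks above an assumed prominence threshold.
    peaks, _ = find_peaks(eda_win, prominence=0.05)
    return {
        "mean_scl": float(np.mean(eda_win)),        # tonic skin conductance level
        "scr_count": int(len(peaks)),               # phasic responses in the window
        "mean_hr": float(np.mean(hr_win)),
        "hr_change": float(hr_win[-1] - hr_win[0]),
    }
```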
{"title":"Measuring Cognitive Load and Insight: A Methodology Exemplified in a Virtual Reality Learning Context","authors":"J. Collins, H. Regenbrecht, T. Langlotz, Y. Can, Cem Ersoy, Russell Butson","doi":"10.1109/ISMAR.2019.00033","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00033","url":null,"abstract":"Recent improvements of Virtual Reality (VR) technology have enabled researchers to investigate the benefits VR may provide for various domains such as health, entertainment, training, and education. A significant proportion of VR system evaluations rely on perception-based measures such as user pre-and post-questionnaires and interviews. While these self-reports provide valuable insights into users' perceptions of VR environments, recent developments in digital sensors and data collection techniques afford researchers access to measures of physiological response. This work explores the merits of physiological measures in the evaluation of emotional responses in virtual environments (ERVE). We include and place at the center of our ERVE methodology emotional response data by way of electrodermal activity and heart-rate detection which are analyzed in conjunction with event-driven data to derive further measures. In this paper, we present our ERVE methodology together with a case study within the context of VR-based learning in which we derive measures of cognitive load and moments of insight. We discuss our methodology, and its potential for use in many other application and research domains to provide more in-depth and objective analyses of experiences within VR.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"245 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116150951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmented Environment Mapping for Appearance Editing of Glossy Surfaces
Pub Date: 2019-10-01  DOI: 10.1109/ISMAR.2019.00-26
Takumi Kaminokado, D. Iwai, Kosuke Sato
We propose a novel spatial augmented reality (SAR) framework for editing the appearance of physical glossy surfaces. The key idea is to utilize specular reflection, which has been a major distractor in conventional SAR systems. Namely, we spatially manipulate the appearance of an environmental surface that is observed through the specular reflection. We use a stereoscopic display to present two appearances with disparity on the environmental surface, so that the depth of the specularly reflected visual information corresponds to that of the glossy surface. We refer to this method as augmented environment mapping (AEM). The paper describes its principle, followed by three different implementation approaches inspired by typical virtual and augmented reality techniques. We confirmed the feasibility of AEM through both quantitative and qualitative experiments using prototype systems.
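To make the reflected content appear at the depth of the glossy surface, each eye's image on the environmental display can be placed using planar-mirror geometry: reflect the eye across the glossy surface plane and project the target surface point from that reflected eye onto the display plane. A sketch of this geometric reading (ours, not necessarily the paper's implementation) is:

```python
import numpy as np

def reflect_across_plane(point, plane_point, plane_normal):
    """Mirror a 3D point across the plane of the glossy surface."""
    n = plane_normal / np.linalg.norm(plane_normal)
    return point - 2.0 * np.dot(point - plane_point, n) * n

def ray_plane(origin, direction, plane_point, plane_normal):
    """Intersect a ray with the environmental display plane."""
    t = np.dot(plane_point - origin, plane_normal) / np.dot(direction, plane_normal)
    return origin + t * direction

def display_point_for_eye(eye, target_on_glossy, glossy_point, glossy_normal,
                          display_point, display_normal):
    """Where the environmental display must draw content so that, seen via the
    specular reflection, it appears at target_on_glossy for this eye."""
    mirrored_eye = reflect_across_plane(eye, glossy_point, glossy_normal)
    direction = target_on_glossy - mirrored_eye
    return ray_plane(mirrored_eye, direction, display_point, display_normal)

# Computing this separately for the left and right eyes yields the disparity that
# places the reflected content at the depth of the glossy surface.
```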
{"title":"Augmented Environment Mapping for Appearance Editing of Glossy Surfaces","authors":"Takumi Kaminokado, D. Iwai, Kosuke Sato","doi":"10.1109/ISMAR.2019.00-26","DOIUrl":"https://doi.org/10.1109/ISMAR.2019.00-26","url":null,"abstract":"We propose a novel spatial augmented reality (SAR) framework to edit the appearance of physical glossy surfaces. The key idea is utilizing the specular reflection, which was a major distractor in conventional SAR systems. Namely, we spatially manipulate the appearance of an environmental surface, which is observed through the specular reflection. We use a stereoscopic display to present two appearances with disparity on the environmental surface, by which the depth of the specularly reflected visual information corresponds to the glossy surface. We refer to this method as augmented environment mapping (AEM). The paper describes its principle, followed by three different implementation approaches inspired by typical virtual and augmented reality approaches. We confirmed the feasibility of AEM through both quantitative and qualitative experiments using prototype systems.","PeriodicalId":348216,"journal":{"name":"2019 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116161796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}