Lightweight Scene-aware Rain Sound Simulation for Interactive Virtual Environments
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00038
Haonan Cheng, Shiguang Liu, Jiawan Zhang
We present a lightweight and efficient rain sound synthesis method for interactive virtual environments. Existing rain sound simulation methods require the massive superposition of scene-specific precomputed rain sounds, which entails excessive memory consumption for virtual reality systems (e.g., video games) with limited audio memory budgets. To address this issue, we reduce the audio memory footprint by introducing a lightweight rain sound synthesis method based on only eight physically inspired basic rain sounds. First, to generate sufficiently varied rain sounds from this limited sound data, we propose an exponential-moving-average-based frequency-domain additive (FDA) synthesis method that extends and modifies the precomputed basic rain sounds. Each rain sound is generated in the frequency domain before conversion back to the time domain, allowing us to extend the rain sound free of temporal distortions and discontinuities. Next, we introduce an efficient binaural rendering method, based on a set of Near-Field Transfer Functions (NFTFs), to simulate 3D auditory perception that coheres with the visual scene. Various results demonstrate that the proposed method drastically decreases the memory cost (a 77-fold compression) and overcomes the limitations of existing methods in terms of interaction.
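The FDA idea can be illustrated with a rough signal-processing sketch. The code below is a hypothetical, simplified interpretation rather than the authors' implementation: it extends a short basic rain clip by resynthesizing frames in the frequency domain and smoothing the magnitude envelope with an exponential moving average so consecutive frames evolve without discontinuities. The function name, frame size, smoothing factor, and jitter amount are all assumptions.

```python
# Minimal sketch, assuming a mono float array `basic_rain` at sample rate `sr`.
import numpy as np

def extend_rain_sound(basic_rain, sr, out_seconds=10.0,
                      frame=1024, hop=512, alpha=0.3, jitter=0.2):
    # Magnitude spectra of the short source clip act as additive "partial" sets.
    window = np.hanning(frame)
    n_src = 1 + (len(basic_rain) - frame) // hop
    src_mags = np.array([
        np.abs(np.fft.rfft(basic_rain[i * hop:i * hop + frame] * window))
        for i in range(n_src)
    ])

    n_out = int(out_seconds * sr / hop)
    out = np.zeros(n_out * hop + frame)
    ema = src_mags[0].copy()
    rng = np.random.default_rng(0)

    for i in range(n_out):
        # Jitter a randomly chosen source frame, then EMA-smooth the magnitude
        # envelope so consecutive synthesized frames evolve gradually.
        target = src_mags[rng.integers(n_src)]
        target = np.maximum(target * (1 + jitter * rng.standard_normal(target.shape)), 0)
        ema = alpha * target + (1 - alpha) * ema
        # Rain is noise-like, so independent random phases per frame are acceptable.
        phases = rng.uniform(0.0, 2.0 * np.pi, ema.shape)
        out[i * hop:i * hop + frame] += np.fft.irfft(ema * np.exp(1j * phases), frame) * window

    return out / (np.max(np.abs(out)) + 1e-9)

# Example use (illustrative): extended = extend_rain_sound(basic_rain, sr=44100, out_seconds=30.0)
```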
{"title":"Lightweight Scene-aware Rain Sound Simulation for Interactive Virtual Environments","authors":"Haonan Cheng, Shiguang Liu, Jiawan Zhang","doi":"10.1109/VR55154.2023.00038","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00038","url":null,"abstract":"We present a lightweight and efficient rain sound synthesis method for interactive virtual environments. Existing rain sound simulation methods require massive superposition of scene-specific precomputed rain sounds, which is excessive memory consumption for virtual reality systems (e.g. video games) with limited audio memory budgets. Facing this issue, we reduce the audio memory budgets by introducing a lightweight rain sound synthesis method which is only based on eight physically-inspired basic rain sounds. First, in order to generate sufficiently various rain sounds with limited sound data, we propose an exponential moving average based frequency domain additive (FDA) synthesis method to extend and modify the pre-computed basic rain sounds. Each rain sound is generated in the frequency domain before conversion back to the time domain, allowing us to extend the rain sound which is free of temporal distortions and discontinuities. Next, we introduce an efficient binaural rendering method to simulate the 3D perception that coheres with the visual scene based on a set of Near-Field Transfer Functions (NFTF). Various results demonstrate that the proposed method drastically decreases the memory cost (77 times compressed) and overcomes the limitations of existing methods in terms of interaction.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129570369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
You Make Me Sick! The Effect of Stairs on Presence, Cybersickness, and Perception of Embodied Conversational Agents
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00071
Samuel Ang, Amanda Fernandez, Michael Rushforth, J. Quarles
Virtual reality (VR) technologies are used in a diverse range of applications. Many of these involve an embodied conversational agent (ECA), a virtual human who exchanges information with the user. Unfortunately, VR technologies remain inaccessible to many users due to the phenomenon of cybersickness: a collection of negative symptoms, such as nausea and headache, that can appear when immersed in a simulation. Many factors are believed to affect a user's level of cybersickness, but little is known about how these factors may influence a user's opinion of an ECA. In this study, we examined the effects of virtual stairs, a factor associated with increased levels of cybersickness. We recruited 39 participants to complete a simulated airport experience, which involved a simple navigation task followed by a brief conversation in Spanish with a virtual airport customs agent. Participants completed the experience twice, once walking across flat hallways and once traversing a series of staircases. We collected self-reported ratings of cybersickness, presence, and perception of the ECA, along with physiological data on heart rate and galvanic skin response. Results indicate that the virtual staircases increased users' levels of cybersickness and reduced the perceived realism of the ECA, but increased levels of presence.
{"title":"You Make Me Sick! The Effect of Stairs on Presence, Cybersickness, and Perception of Embodied Conversational Agents","authors":"Samuel Ang, Amanda Fernandez, Michael Rushforth, J. Quarles","doi":"10.1109/VR55154.2023.00071","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00071","url":null,"abstract":"Virtual reality (VR) technologies are used in a diverse range of applications. Many of these involve an embodied conversational agent (ECA), a virtual human who exchanges information with the user. Unfortunately, VR technologies remain inaccessible to many users due to the phenomenon of cybersickness: a collection of negative symptoms such as nausea and headache that can appear when immersed in a simulation. Many factors are believed to affect a user's level of cybersickness, but little is known regarding how these factors may influence a user's opinion of an ECA. In this study, we examined the effects of virtual stairs, a factor associated with increased levels of cybersickness. We recruited 39 participants to complete a simulated airport experience. This involved a simple navigation task followed by a brief conversation with a virtual airport customs agent in Spanish. Participants completed the experience twice, once walking across flat hallways, and once traversing a series of staircases. We collected self-reported ratings of cybersickness, presence, and perception of the ECA. We additionally collected physiological data on heart rate and galvanic skin response. Results indicate that the virtual staircases increased user level's of cybersickness and reduced their perceived realism of the ECA, but increased levels of presence.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"259 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123081110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Investigating Spatial Representation of Learning Content in Virtual Reality Learning Environments
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00019
Manshul Belani, Harsh Vardhan Singh, Aman Parnami, Pushpendra Singh
A recent surge in the application of virtual reality to education has made VR Learning Environments (VRLEs) prevalent in fields ranging from aviation, medicine, and skill training to teaching factual and conceptual content. Despite the multiple 3D affordances provided by VR, the placement of learning content in VRLEs has mostly been limited to static placement in the environment. We conducted two studies to investigate the effect of different spatial representations of learning content in virtual environments on learning outcomes and user experience. In the first study, with forty-two participants learning how to operate a laser cutting machine through an immersive tutorial, we examined four placements of content in the VR environment: world-anchored (a TV screen placed in the environment), user-anchored (a panel anchored to the user's wrist or to the head-mounted display), and object-anchored (a panel anchored to the object associated with the current content). In a follow-up study, twenty-two participants from the first study were given the option to choose among these four placements so that we could understand their preferences. The effects of placement were examined on learning outcome measures: knowledge gain, knowledge transfer, cognitive load, user experience, and user preferences. We found that participants preferred the user-anchored (controller condition) and object-anchored placements. While knowledge gain, knowledge transfer, and cognitive load did not differ significantly between the four conditions, the object-anchored placement scored significantly better than the TV screen and head-mounted display conditions on the user experience scales of attractiveness, stimulation, and novelty.
{"title":"Investigating Spatial Representation of Learning Content in Virtual Reality Learning Environments","authors":"Manshul Belani, Harsh Vardhan Singh, Aman Parnami, Pushpendra Singh","doi":"10.1109/VR55154.2023.00019","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00019","url":null,"abstract":"A recent surge in the application of Virtual Reality in education has made VR Learning Environments (VRLEs) prevalent in fields ranging from aviation, medicine, and skill training to teaching factual and conceptual content. In spite of multiple 3D affordances provided by VR, learning content placement in VRLEs has been mostly limited to a static placement in the environment. We conduct two studies to investigate the effect of different spatial representations of learning content in virtual environments on learning outcomes and user experience. In the first study, we studied the effects of placing content at four different places - world-anchored (TV screen placed in the environment), user-anchored (panel anchored to the wrist or head-mounted display of the user) and object-anchored (panel anchored to the object associated with current content) - in the VR environment with forty-two participants in the context of learning how to operate a laser cutting machine through an immersive tutorial. In the follow-up study, twenty-two participants from this study were given the option to choose from these four placements to understand their preferences. The effects of placements were examined on learning outcome measures - knowledge gain, knowledge transfer, cognitive load, user experience, and user preferences. We found that participants preferred user-anchored (controller condition) and object-anchored placement. While knowledge gain, knowledge transfer, and cognitive load were not found to be significantly different between the four conditions, the object-anchored placement scored significantly better than the TV screen and head-mounted display conditions on the user experience scales of attractiveness, stimulation, and novelty.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127060853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the Effects of Augmented Reality Notification Type and Placement in AR HMD while Walking
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00067
Hyunjin Lee, Woontack Woo
Augmented reality (AR) helps users take in information while walking by presenting virtual information directly in front of their eyes. However, it remains unclear how AR notifications should be presented given the user reaction each interruption is expected to require. We therefore investigated appropriate placement methods for two notification types: those that must be handled immediately (high) and those that can be handled later (low). We compared two coordinate systems (display-fixed and body-fixed) and three positions (top, right, and bottom) for notification placement. We found significant effects of notification type and placement on how notifications are perceived during the AR notification experience. Participants responded faster to high notifications with the display-fixed coordinate system, whereas the body-fixed coordinate system yielded faster walking for low notifications. Regarding position, high notifications achieved better notification performance at the bottom position, while low notifications achieved better walking performance at the right position. Based on these findings, we offer recommendations for the future design of AR notifications while walking.
{"title":"Exploring the Effects of Augmented Reality Notification Type and Placement in AR HMD while Walking","authors":"Hyunjin Lee, Woontack Woo","doi":"10.1109/VR55154.2023.00067","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00067","url":null,"abstract":"Augmented reality (AR) helps users easily accept information when they are walking by providing virtual information in front of their eyes. However, it remains unclear how to present AR notifications considering the expected user reaction to interruption. Therefore, we investigated to confirm appropriate placement methods for each type by dividing it into notification types that are handled immediately (high) or that are performed later (low). We compared two coordinate systems (display-fixed and body-fixed) and three positions (top, right, and bottom) for the notification placement. We found significant effects of notification type and placement on how notifications are perceived during the AR notification experience. Using a display-fixed coordinate system responded faster for high notification types, whereas using a body-fixed coordinate system resulted in quick walking speed for low ones. As for the position, the high types had a higher notification performance at the bottom position, but the low types had enhanced walking performance at the right position. Based on the finding of our experiment, we suggest some recommendations for the future design of AR notification while walking.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126943294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Large-Scale Study of Proxemics and Gaze in Groups
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00056
M. R. Miller, C. Deveaux, Eugy Han, Nilam Ram, J. Bailenson
Scholars who study nonverbal behavior have devoted a great deal of work to proxemics, how close people stand to one another, and mutual gaze, whether or not they are looking at one another. Many studies have demonstrated a correlation between gaze and distance, and so-called equilibrium theory posits that people modulate gaze and distance to maintain proper levels of nonverbal intimacy. Virtual reality scholars have also focused on these two constructs, both for theoretical reasons, as distance and gaze are often used as proxies for psychological constructs such as social presence, and for methodological reasons, as head orientation and body position are automatically produced by most VR tracking systems. To date, however, studies of distance and gaze in VR have largely been conducted in laboratory settings, observing the behavior of a small number of participants for short periods of time. In this experimental field study, we analyze the proxemics and gaze of 232 participants across two experimental studies, each of whom contributed up to about 240 minutes of tracking data over eight weekly 30-minute social virtual reality sessions. Participants' nonverbal behaviors changed in conjunction with context manipulations and over time. Interpersonal distance increased with the size of the virtual room, and both mutual gaze and interpersonal distance increased over time. Overall, participants oriented their heads toward the centers of walls rather than toward the corners of rectangularly aligned environments. Finally, statistical models demonstrated that individual differences matter, with pairs and groups maintaining more consistent differences over time than would be predicted by chance. Implications for theory and practice are discussed.
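As a rough illustration of how such measures can be derived from headset tracking data, the sketch below computes interpersonal distance and a mutual-gaze indicator from head positions and forward vectors. This is an assumed, simplified formulation (including the 30-degree gaze-cone threshold), not the analysis pipeline used in the paper.

```python
# Minimal sketch: proxemics and mutual gaze from tracked head poses.
import numpy as np

GAZE_CONE_DEG = 30.0  # assumed half-angle threshold for "looking at" someone

def interpersonal_distance(pos_a, pos_b):
    """Euclidean distance between two head positions (metres)."""
    return float(np.linalg.norm(np.asarray(pos_a) - np.asarray(pos_b)))

def is_looking_at(pos_a, fwd_a, pos_b, cone_deg=GAZE_CONE_DEG):
    """True if B lies within A's gaze cone, using head orientation as a gaze proxy."""
    to_b = np.asarray(pos_b) - np.asarray(pos_a)
    to_b = to_b / (np.linalg.norm(to_b) + 1e-9)
    fwd = np.asarray(fwd_a) / (np.linalg.norm(fwd_a) + 1e-9)
    return float(np.dot(fwd, to_b)) >= np.cos(np.radians(cone_deg))

def mutual_gaze(pos_a, fwd_a, pos_b, fwd_b):
    """Mutual gaze: each participant falls inside the other's gaze cone."""
    return is_looking_at(pos_a, fwd_a, pos_b) and is_looking_at(pos_b, fwd_b, pos_a)
```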
{"title":"A Large-Scale Study of Proxemics and Gaze in Groups","authors":"M. R. Miller, C. Deveaux, Eugy Han, Nilam Ram, J. Bailenson","doi":"10.1109/VR55154.2023.00056","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00056","url":null,"abstract":"Scholars who study nonverbal behavior have focused an incredible amount of work on proxemics, how close people stand to one another, and mutual gaze, whether or not they are looking at one another. Moreover, many studies have demonstrated a correlation between gaze and distance, and so-called equilibrium theory posits that people modulate gaze and distance to maintain proper levels of nonverbal intimacy. Virtual reality scholars have also focused on these two constructs, both for theoretical reasons, as distance and gaze are often used as proxies for psychological constructs such as social presence, and for methodological reasons, as head orientation and body position are automatically produced by most VR tracking systems. However, to date, the studies of distance and gaze in VR have largely been conducted in laboratory settings, observing behavior of a small number of participants for short periods of time. In this experimental field study, we analyze the proxemics and gaze of 232 participants over two experimental studies who each contributed up to about 240 minutes of tracking data during eight weekly 30-minute social virtual reality sessions. Participants' non-verbal behaviors changed in conjunction with context manipulations and over time. Interpersonal distance increased with the size of the virtual room; and both mutual gaze and interpersonal distance increased over time. Overall, participants oriented their heads toward the center of walls rather than to corners of rectangularly-aligned environments. Finally, statistical models demonstrated that individual differences matter, with pairs and groups maintaining more consistent differences over time than would be predicted by chance. Implications for theory and practice are discussed.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"25 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114020766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IEEE VR 2023 Table of Contents
Pub Date: 2023-03-01 | DOI: 10.1109/vr55154.2023.00004
{"title":"EEE VR 2023 Table of Contents","authors":"","doi":"10.1109/vr55154.2023.00004","DOIUrl":"https://doi.org/10.1109/vr55154.2023.00004","url":null,"abstract":"","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123133105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing Scatterplot Variants for Temporal Trends Visualization in Immersive Virtual Environments
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00082
Carlos Quijano-Chavez, L. Nedel, C. Freitas
Trends are changes in variables or attributes over time, often represented by line plots or scatterplot variants with time as one of the axes. Interpreting tendencies and estimating trends require observing the behavior of the lines or points with respect to increments, decrements, or both (reversals) in the value of the observed variable. Previous work assessed scatterplot variants such as Animation, Small Multiples, and Overlaid Trails to compare how effectively they represent trends on large and small displays, and found differences between them. In this work, we study how best to enable the analyst to explore and perform temporal trend tasks with these same techniques in immersive virtual environments. We designed and conducted a user study based on the approaches followed by previous work regarding visualization and interaction techniques, as well as tasks for comparisons in three-dimensional settings. Results show that Overlaid Trails are the fastest overall, followed by Animation and Small Multiples, while accuracy is task-dependent. We also report results from interaction measures and questionnaires.
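For concreteness, the trend categories mentioned above (increments, decrements, reversals) can be expressed as a small classifier over a variable's successive values. This is an illustrative sketch under an assumed encoding of the data, not part of the study's materials.

```python
# Minimal sketch: classify a time-ordered list of values by trend direction.
def classify_trend(values, tol=1e-9):
    diffs = [b - a for a, b in zip(values, values[1:])]
    rising = any(d > tol for d in diffs)
    falling = any(d < -tol for d in diffs)
    if rising and falling:
        return "reversal"     # value both increases and decreases over time
    if rising:
        return "increasing"
    if falling:
        return "decreasing"
    return "flat"

# Example: classify_trend([1.0, 2.5, 2.1, 3.0]) -> "reversal"
```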
{"title":"Comparing Scatterplot Variants for Temporal Trends Visualization in Immersive Virtual Environments","authors":"Carlos Quijano-Chavez, L. Nedel, C. Freitas","doi":"10.1109/VR55154.2023.00082","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00082","url":null,"abstract":"Trends are changes in variables or attributes over time, often represented by line plots or scatterplot variants, with time being one of the axes. Interpreting tendencies and estimating trends require observing the lines or points behavior regarding increments, decrements, or both (reversals) in the value of the observed variable. Previous work assessed variants of scatterplots like Animation, Small Multiples, and Overlaid Trails for comparing the effectiveness of trends representation using large and small displays and found differences between them. In this work, we study how best to enable the analyst to explore and perform temporal trend tasks with these same techniques in immersive virtual environments. We designed and conducted a user study based on the approaches followed by previous works regarding visualization and interaction techniques, as well as tasks for comparisons in three-dimensional settings. Results show that Overlaid Trails are the fastest overall, followed by Animation and Small Multiples, while accuracy is task-dependent. We also report results from interaction measures and questionnaires.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131183736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tell Me Where To Go: Voice-Controlled Hands-Free Locomotion for Virtual Reality Systems
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00028
J. Hombeck, Henrik Voigt, Timo Heggemann, R. Datta, K. Lawonn
As locomotion is an important factor in improving Virtual Reality (VR) immersion and usability, research in this area has been, and continues to be, crucial to the success of VR applications. In recent years, a variety of techniques have been developed and evaluated, ranging from abstract control, vehicle, and teleportation techniques to more realistic techniques such as motion, gestures, and gaze. However, in hands-free scenarios, for example to increase the overall accessibility of an application or in medical settings under sterile conditions, most of the aforementioned techniques cannot be applied. This is where speech as an intuitive means of navigation comes in handy. As systems become more capable of understanding and producing speech, voice interfaces become a valuable alternative for input on all types of devices, taking the quality of hands-free interaction to a new level. However, intuitive user-assisted speech interaction is difficult to realize due to semantic ambiguities in natural language utterances as well as the high real-time requirements of these systems. In this paper, we investigate steering-based and selection-based locomotion using three speech-based, hands-free methods and compare them with leaning as an established alternative. Our results show that landmark-based locomotion is a convenient, fast, and intuitive way to move between locations in a VR scene. Furthermore, we show that in scenarios where landmarks are not available, number-grid-based navigation is a successful solution. Based on this, we conclude that speech is a suitable alternative in hands-free scenarios, and exciting ideas are emerging for future work focused on developing hands-free ad hoc navigation systems for scenes where landmarks do not exist or are difficult to articulate or recognize.
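To make the landmark-based idea concrete, the sketch below maps a recognized utterance to a teleport target by matching landmark names in the command. The landmark table, command matching, and engine call are assumptions made for illustration and do not reflect the paper's implementation.

```python
# Minimal sketch: landmark-based voice locomotion.
import re

# Assumed landmark table: name -> (x, y, z) position in the VR scene.
LANDMARKS = {
    "reception": (0.0, 0.0, 5.0),
    "operating table": (3.5, 0.0, -2.0),
    "window": (-4.0, 0.0, 1.5),
}

def parse_locomotion_command(utterance):
    """Return a target position for commands like 'go to the window',
    or None if no known landmark is mentioned."""
    text = utterance.lower()
    for name, position in LANDMARKS.items():
        if re.search(rf"\b{re.escape(name)}\b", text):
            return position
    return None

def teleport(player, utterance):
    target = parse_locomotion_command(utterance)
    if target is not None:
        player.set_position(target)  # assumed engine API, illustrative only
    return target
```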
{"title":"Tell Me Where To Go: Voice-Controlled Hands-Free Locomotion for Virtual Reality Systems","authors":"J. Hombeck, Henrik Voigt, Timo Heggemann, R. Datta, K. Lawonn","doi":"10.1109/VR55154.2023.00028","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00028","url":null,"abstract":"As locomotion is an important factor in improving Virtual Reality (VR) immersion and usability, research in this area has been and continues to be a crucial aspect for the success of VR applications. In recent years, a variety of techniques have been developed and evaluated, ranging from abstract control, vehicle, and teleportation techniques to more realistic techniques such as motion, gestures, and gaze. However, when it comes to hands-free scenarios, for example to increase the overall accessibility of an application or in medical scenarios under sterile conditions, most of the announced techniques cannot be applied. This is where the use of speech as an intuitive means of navigation comes in handy. As systems become more capable of understanding and producing speech, voice interfaces become a valuable alternative for input on all types of devices. This takes the quality of hands-free interaction to a new level. However, intuitive user-assisted speech interaction is difficult to realize due to semantic ambiguities in natural language utterances as well as the high real-time requirements of these systems. In this paper, we investigate steering-based locomotion and selection-based locomotion using three speech-based, hands-free methods and compare them with leaning as an established alternative. Our results show that landmark-based locomotion is a convenient, fast, and intuitive way to move between locations in a VR scene. Furthermore, we show that in scenarios where landmarks are not available, number grid-based navigation is a successful solution. Based on this, we conclude that speech is a suitable alternative in hands-free scenar-ios, and exciting ideas are emerging for future work focused on developing hands-free ad hoc navigation systems for scenes where landmarks do not exist or are difficult to articulate or recognize.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"285 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122473598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-View Visual Geo-Localization for Outdoor Augmented Reality
Pub Date: 2023-03-01 | DOI: 10.1109/VR55154.2023.00064
Niluthpol Chowdhury Mithun, Kshitij Minhas, Han-Pang Chiu, T. Oskiper, Mikhail Sizintsev, S. Samarasekera, Rakesh Kumar
Precise estimation of global orientation and location is critical to ensure a compelling outdoor Augmented Reality (AR) experience. We address the problem of geo-pose estimation by cross-view matching of query ground images against a geo-referenced aerial satellite image database. Recently, neural network-based methods have shown state-of-the-art performance in cross-view matching. However, most prior works focus only on location estimation and ignore orientation, which does not meet the requirements of outdoor AR applications. We propose a new transformer-based neural network model and a modified triplet ranking loss for joint location and orientation estimation. Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance. Furthermore, we extend the single-image query-based geo-localization approach by utilizing temporal information from a navigation pipeline for robust, continuous geo-localization. Experiments on several large-scale real-world video sequences demonstrate that our approach enables high-precision and stable AR insertion.
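As background for the loss mentioned above, a standard triplet ranking loss for cross-view matching can be sketched as follows. The paper's modified variant for joint location and orientation is not reproduced here; the embedding layout and margin value are assumptions.

```python
# Minimal sketch: triplet ranking loss pulling a ground-view embedding toward its
# matching aerial embedding and pushing it away from a non-matching one.
import torch
import torch.nn.functional as F

def triplet_ranking_loss(ground_emb, aerial_pos_emb, aerial_neg_emb, margin=0.2):
    """Standard margin-based triplet loss over cosine distances."""
    d_pos = 1.0 - F.cosine_similarity(ground_emb, aerial_pos_emb, dim=-1)
    d_neg = 1.0 - F.cosine_similarity(ground_emb, aerial_neg_emb, dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()

# Usage (illustrative): loss = triplet_ranking_loss(g, a_match, a_other); loss.backward()
```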
{"title":"Cross-View Visual Geo-Localization for Outdoor Augmented Reality","authors":"Niluthpol Chowdhury Mithun, Kshitij Minhas, Han-Pang Chiu, T. Oskiper, Mikhail Sizintsev, S. Samarasekera, Rakesh Kumar","doi":"10.1109/VR55154.2023.00064","DOIUrl":"https://doi.org/10.1109/VR55154.2023.00064","url":null,"abstract":"Precise estimation of global orientation and location is critical to ensure a compelling outdoor Augmented Reality (AR) experience. We address the problem of geo-pose estimation by cross-view matching of query ground images to a geo-referenced aerial satellite image database. Recently, neural network-based methods have shown state-of-the-art performance in cross-view matching. However, most of the prior works focus only on location estimation, ignoring orientation, which cannot meet the requirements in outdoor AR applications. We propose a new transformer neural network-based model and a modified triplet ranking loss for joint location and orientation estimation. Experiments on several benchmark cross-view geo-localization datasets show that our model achieves state-of-the-art performance. Furthermore, we present an approach to extend the single image query-based geo-localization approach by utilizing temporal information from a navigation pipeline for robust continuous geo-localization. Experimentation on several large-scale real-world video sequences demonstrates that our approach enables high-precision and stable AR insertion.","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127180150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
International Program Supercommittee
Pub Date: 2023-03-01 | DOI: 10.1109/vr55154.2023.00009
{"title":"International Program Supercommittee","authors":"","doi":"10.1109/vr55154.2023.00009","DOIUrl":"https://doi.org/10.1109/vr55154.2023.00009","url":null,"abstract":"","PeriodicalId":346767,"journal":{"name":"2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123063103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}