Evolutionary augmented reality at the Natural History Museum
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092400
P. Debenham, G. Thomas, Jonathan Trout
In this paper we describe the development of an augmented reality system designed to provide an exciting new way for the Natural History Museum in London to present evolutionary history to their visitors. The system uses a through-the-lens tracker and infrared LED markers to provide an unobtrusive and robust system that can operate for multiple users across a wide area.
{"title":"Evolutionary augmented reality at the Natural History Museum","authors":"P. Debenham, G. Thomas, Jonathan Trout","doi":"10.1109/ISMAR.2011.6092400","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092400","url":null,"abstract":"In this paper we describe the development of an augmented reality system designed to provide an exciting new way for the Natural History Museum in London to present evolutionary history to their visitors. The system uses a through-the-lens tracker and infrared LED markers to provide an unobtrusive and robust system that can operate for multiple users across a wide area.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"191 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132469333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Homography-based planar mapping and tracking for mobile phones
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092367
Christian Pirchheim, Gerhard Reitmayr
We present a real-time camera pose tracking and mapping system which uses the assumption of a planar scene to implement a highly efficient mapping algorithm. Our lightweight mapping approach is based on keyframes and plane-induced homographies between them. We solve the planar reconstruction problem of estimating the keyframe poses with an efficient image rectification algorithm. Camera pose tracking uses continuously extended and refined planar point maps and delivers robustly estimated 6DOF poses. We compare our system with bundle adjustment and monocular SLAM on synthetic and indoor image sequences, and demonstrate large savings in computational effort compared to the monocular SLAM system while the reduction in accuracy remains acceptable.
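As an illustrative aside (not the authors' implementation): one way to work with plane-induced homographies between keyframes is to estimate the homography from matched points and decompose it into candidate relative poses, assuming calibrated intrinsics. The function and variable names below are hypothetical.

```python
import cv2
import numpy as np

def relative_pose_from_plane(pts_a, pts_b, K):
    """pts_a, pts_b: Nx2 matched points in two keyframes, assumed coplanar in the world."""
    H, inliers = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 3.0)
    if H is None:
        return None, None
    # The decomposition yields up to four (R, t, n) candidates; a real mapper
    # would prune them using visibility and plane-orientation constraints.
    _, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    return list(zip(rotations, translations, normals)), inliers

# Hypothetical usage with synthetic, exactly coplanar correspondences.
K = np.array([[525.0, 0.0, 320.0], [0.0, 525.0, 240.0], [0.0, 0.0, 1.0]])
pts_a = np.random.rand(20, 2) * [640.0, 480.0]
H_true = np.array([[1.0, 0.02, 5.0], [0.01, 1.0, -3.0], [1e-5, 0.0, 1.0]])
pts_b = cv2.perspectiveTransform(pts_a.reshape(-1, 1, 2), H_true).reshape(-1, 2)
candidates, _ = relative_pose_from_plane(pts_a, pts_b, K)
```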
{"title":"Homography-based planar mapping and tracking for mobile phones","authors":"Christian Pirchheim, Gerhard Reitmayr","doi":"10.1109/ISMAR.2011.6092367","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092367","url":null,"abstract":"We present a real-time camera pose tracking and mapping system which uses the assumption of a planar scene to implement a highly efficient mapping algorithm. Our light-weight mapping approach is based on keyframes and plane-induced homographies between them. We solve the planar reconstruction problem of estimating the keyframe poses with an efficient image rectification algorithm. Camera pose tracking uses continuously extended and refined planar point maps and delivers robustly estimated 6DOF poses. We compare system and method with bundle adjustment and monocular SLAM on synthetic and indoor image sequences. We demonstrate large savings in computational effort compared to the monocular SLAM system while the reduction in accuracy remains acceptable.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127989781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gravity-aware handheld Augmented Reality
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092376
Daniel Kurz, Selim Benhimane
This paper investigates how different stages in handheld Augmented Reality (AR) applications can benefit from knowing the direction of gravity, measured with inertial sensors. It presents approaches to improve the description and matching of feature points, the detection and tracking of planar templates, and the visual quality of the rendering of virtual 3D objects by incorporating the gravity vector. In handheld AR, both the camera and the display are located in the user's hand and can therefore be freely moved. The pose of the camera is generally determined with respect to piecewise planar objects that have a known static orientation with respect to gravity. In the presence of (close to) vertical surfaces, we show how gravity-aligned feature descriptors (GAFD) improve the initialization of tracking algorithms that rely on feature point descriptors, in terms of both quality and performance. For (close to) horizontal surfaces, we propose to use the gravity vector to rectify the camera image and to detect and describe features in the rectified image. The resulting gravity-rectified feature descriptors (GREFD) provide an improved precision-recall characteristic and enable faster initialization, in particular under steep viewing angles. Gravity-rectified camera images also allow for real-time 6 DoF pose estimation using an edge-based object detection algorithm that handles only 4 DoF similarity transforms. Finally, the rendering of virtual 3D objects can be made more realistic and plausible by taking into account the orientation of the gravitational force in addition to the relative pose between the handheld device and a real object.
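A rough sketch of the gravity-aligned descriptor idea, not the authors' code: overwrite each keypoint's orientation with the locally projected gravity direction before computing descriptors, so matching no longer depends on image-gradient orientation. The intrinsics, gravity vector, and helper name are assumptions.

```python
import cv2
import numpy as np

def gravity_aligned_keypoints(keypoints, K, g_cam, eps=0.05):
    """Replace keypoint orientations with the locally projected gravity direction.

    K: 3x3 camera intrinsics; g_cam: unit gravity direction in camera coordinates.
    """
    K_inv = np.linalg.inv(K)
    out = []
    for kp in keypoints:
        # Back-project the keypoint to unit depth, offset it along gravity,
        # and re-project; the 2D difference is the local gravity direction.
        p = K_inv @ np.array([kp.pt[0], kp.pt[1], 1.0])
        q = p + eps * g_cam
        a = (K @ p)[:2] / (K @ p)[2]
        b = (K @ q)[:2] / (K @ q)[2]
        d = b - a
        angle = float(np.degrees(np.arctan2(d[1], d[0])) % 360.0)
        out.append(cv2.KeyPoint(kp.pt[0], kp.pt[1], kp.size, angle))
    return out

# Hypothetical usage: detect keypoints, replace their orientations, then let
# ORB compute descriptors with the gravity-derived angles.
K = np.array([[525.0, 0.0, 320.0], [0.0, 525.0, 240.0], [0.0, 0.0, 1.0]])
g_cam = np.array([0.0, 1.0, 0.0])          # gravity pointing "down" in the image
img = np.random.randint(0, 256, (480, 640), np.uint8)
orb = cv2.ORB_create()
kps = gravity_aligned_keypoints(orb.detect(img, None), K, g_cam)
kps, descriptors = orb.compute(img, kps)
```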
{"title":"Gravity-aware handheld Augmented Reality","authors":"Daniel Kurz, Selim Benhimane","doi":"10.1109/ISMAR.2011.6092376","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092376","url":null,"abstract":"This paper investigates how different stages in handheld Augmented Reality (AR) applications can benefit from knowing the direction of the gravity measured with inertial sensors. It presents approaches to improve the description and matching of feature points, detection and tracking of planar templates, and the visual quality of the rendering of virtual 3D objects by incorporating the gravity vector. In handheld AR, both the camera and the display are located in the user's hand and therefore can be freely moved. The pose of the camera is generally determined with respect to piecewise planar objects that have a known static orientation with respect to gravity. In the presence of (close to) vertical surfaces, we show how gravity-aligned feature descriptors (GAFD) improve the initialization of tracking algorithms relying on feature point descriptor-based approaches in terms of quality and performance. For (close to) horizontal surfaces, we propose to use the gravity vector to rectify the camera image and detect and describe features in the rectified image. The resulting gravity-rectified feature descriptors (GREFD) provide an improved precision-recall characteristic and enable faster initialization, in particular under steep viewing angles. Gravity-rectified camera images also allow for real-time 6 DoF pose estimation using an edge-based object detection algorithm handling only 4 DoF similarity transforms. Finally, the rendering of virtual 3D objects can be made more realistic and plausible by taking into account the orientation of the gravitational force in addition to the relative pose between the handheld device and a real object.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131420935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Texture-less object tracking with online training using an RGB-D camera
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092377
Youngmin Park, V. Lepetit, Woontack Woo
We propose a texture-less object detection and 3D tracking method which automatically extracts, on the fly, the information it needs from color images and the corresponding depth maps. While texture-less 3D tracking is not new, existing methods require a prior CAD model, and real-time detection methods still have to be developed for robust tracking. To detect the target, we propose to rely on a fast template-based method, which provides an initial estimate of its 3D pose, and we refine this estimate using depth and image contour information. We automatically extract a 3D model of the target from the depth information. To this end, we developed methods to enhance the depth map and to stabilize the 3D pose estimation. We demonstrate our method on challenging sequences exhibiting partial occlusions and fast motions.
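For illustration only (the paper's own preprocessing may differ): a minimal depth-map cleanup and back-projection step of the kind an online RGB-D model builder needs, assuming pinhole intrinsics. Names and parameters are hypothetical.

```python
import cv2
import numpy as np

def depth_to_points(depth_m, K, max_depth=2.0):
    """depth_m: HxW float32 depth in meters (0 where invalid); K: 3x3 intrinsics."""
    # Median filtering suppresses isolated depth speckle before model building.
    depth = cv2.medianBlur(depth_m, 5)
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = (depth > 0) & (depth < max_depth)
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)     # Nx3 points in camera coordinates

# Hypothetical usage with a synthetic depth map.
K = np.array([[570.0, 0.0, 320.0], [0.0, 570.0, 240.0], [0.0, 0.0, 1.0]])
depth = np.random.uniform(0.5, 1.5, (480, 640)).astype(np.float32)
points = depth_to_points(depth, K)
```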
{"title":"Texture-less object tracking with online training using an RGB-D camera","authors":"Youngmin Park, V. Lepetit, Woontack Woo","doi":"10.1109/ISMAR.2011.6092377","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092377","url":null,"abstract":"We propose a texture-less object detection and 3D tracking method which automatically extracts on the fly the information it needs from color images and the corresponding depth maps. While texture-less 3D tracking is not new, it requires a prior CAD model, and real-time methods for detection still have to be developed for robust tracking. To detect the target, we propose to rely on a fast template-based method, which provides an initial estimate of its 3D pose, and we refine this estimate using the depth and image contours information. We automatically extract a 3D model for the target from the depth information. To this end, we developed methods to enhance the depth map and to stabilize the 3D pose estimation. We demonstrate our method on challenging sequences exhibiting partial occlusions and fast motions.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127673416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Outdoor mobile localization from panoramic imagery
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092399
Jonathan Ventura, Tobias Höllerer
We describe an end-to-end system for mobile, vision-based localization and tracking in urban environments. Our system uses panoramic imagery which is processed and indexed to provide localization coverage over a large area using few capture points. We utilize a client-server model which allows for remote computation and data storage while maintaining real-time tracking performance. Previous search results are cached and re-used by the mobile client to minimize communication overhead. We evaluate the use of the system for flexible real-time camera tracking in large outdoor spaces.
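A minimal sketch, not the paper's protocol: caching server responses per coarse position cell so that nearby localization queries reuse previously fetched reference data instead of repeating the network round trip. The cell size and fetch interface are assumptions.

```python
import math

class LocalizationCache:
    """Client-side cache of localization reference data, keyed by a coarse position cell."""

    def __init__(self, fetch_fn, cell_size_m=50.0):
        self.fetch_fn = fetch_fn          # callable: cell_key -> reference data from the server
        self.cell_size_m = cell_size_m
        self._cache = {}

    def _cell_key(self, x_m, y_m):
        # Quantize a local metric position into a grid cell.
        return (math.floor(x_m / self.cell_size_m),
                math.floor(y_m / self.cell_size_m))

    def reference_data(self, x_m, y_m):
        key = self._cell_key(x_m, y_m)
        if key not in self._cache:        # only contact the server on a cache miss
            self._cache[key] = self.fetch_fn(key)
        return self._cache[key]

# Hypothetical usage: the fetch function would normally issue a network request.
cache = LocalizationCache(fetch_fn=lambda key: {"cell": key, "features": []})
data = cache.reference_data(123.4, 567.8)
```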
{"title":"Outdoor mobile localization from panoramic imagery","authors":"Jonathan Ventura, Tobias Höllerer","doi":"10.1109/ISMAR.2011.6092399","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092399","url":null,"abstract":"We describe an end-to-end system for mobile, vision-based localization and tracking in urban environments. Our system uses panoramic imagery which is processed and indexed to provide localization coverage over a large area using few capture points. We utilize a client-server model which allows for remote computation and data storage while maintaining real-time tracking performance. Previous search results are cached and re-used by the mobile client to minimize communication overhead. We evaluate the use of the system for flexible real-time camera tracking in large outdoor spaces.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130358294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating the impact of recovery density on augmented reality tracking
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092374
Christopher Coffin, Cha Lee, Tobias Höllerer
Natural feature tracking systems for augmented reality are highly accurate, but can suffer from lost tracking. When registration is lost, the system must be able to re-localize and recover tracking. Likewise, when a camera is new to a scene, it must be able to perform the related task of localization. Localization and re-localization can only be performed at certain points or when viewing particular objects or parts of the scene with a sufficient number and quality of recognizable features to allow for tracking recovery. We explore how the density of such recovery locations/poses influences the time it takes users to resume tracking. We focus our evaluation on two generalized techniques for localization: keyframe-based and model-based. For the keyframe-based approach we assume a constant collection rate for keyframes. We find that at practical collection rates, the task of localization to a previously acquired keyframe that is shown to the user does not become more time-consuming as the interval between keyframes increases. For a localization approach using model data, we consider a grid of points around the model at which localization is guaranteed to succeed. We find that the user interface is crucial to successful localization. Localization can occur quickly if users do not need to orient themselves to marked localization points. When users are forced to mentally register themselves with a map of the scene, localization quickly becomes impractical as the distance to the next localization point increases. We contend that our results will help future designers of localization techniques to better plan for the effects of their proposed solutions.
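For illustration, and not the system evaluated in the study: a bare-bones keyframe re-localization check that matches the current frame against stored keyframe descriptors and accepts the best candidate above a match threshold. Feature choice and thresholds are assumptions.

```python
import cv2

orb = cv2.ORB_create(nfeatures=500)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def best_keyframe(current_gray, keyframes, min_matches=30):
    """keyframes: list of (keyframe_id, descriptors) pairs collected during tracking."""
    _, desc = orb.detectAndCompute(current_gray, None)
    if desc is None:
        return None
    best_id, best_count = None, 0
    for kf_id, kf_desc in keyframes:
        matches = matcher.match(desc, kf_desc)
        if len(matches) > best_count:
            best_id, best_count = kf_id, len(matches)
    # Recovery succeeds only near a keyframe with enough matching features.
    return best_id if best_count >= min_matches else None
```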
{"title":"Evaluating the impact of recovery density on augmented reality tracking","authors":"Christopher Coffin, Cha Lee, Tobias Höllerer","doi":"10.1109/ISMAR.2011.6092374","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092374","url":null,"abstract":"Natural feature tracking systems for augmented reality are highly accurate, but can suffer from lost tracking. When registration is lost, the system must be able to re-localize and recover tracking. Likewise, when a camera is new to a scene, it must be able to perform the related task of localization. Localization and re-localization can only be performed at certain points or when viewing particular objects or parts of the scene with a sufficient number and quality of recognizable features to allow for tracking recovery. We explore how the density of such recovery locations/poses influences the time it takes users to resume tracking. We focus our evaluation on two generalized techniques for localization: keyframe-based and model-based. For the keyframe-based approach we assume a constant collection rate for keyframes. We find that at practical collection rates, the task of localization to a previously acquired keyframe that is shown to the user does not become more time-consuming as the interval between keyframes increases. For a localization approach using model data, we consider a grid of points around the model at which localization is guaranteed to succeed. We find that the user interface is crucial to successful localization. Localization can occur quickly if users do not need to orient themselves to marked localization points. When users are forced to mentally register themselves with a map of the scene, localization quickly becomes impractical as the distance to the next localization point increases. We contend that our results will help future designers of localization techniques to better plan for the effects of their proposed solutions.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124785518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Transformative reality: Augmented reality for visual prostheses
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092402
W. Lui, D. Browne, L. Kleeman, T. Drummond, Wai Ho Li
Visual prostheses such as retinal implants provide bionic vision that is limited in spatial and intensity resolution. This limitation is a fundamental challenge of bionic vision as it severely truncates salient visual information. We propose to address this challenge by performing real-time transformations of visual and non-visual sensor data into symbolic representations that are then rendered as low-resolution vision; a concept we call Transformative Reality. For example, a depth camera allows the detection of empty ground in cluttered environments, which is then visually rendered as bionic vision to enable indoor navigation. Such symbolic representations are similar to virtual content overlays used in Augmented Reality but are registered to the 3D world via the user's sense of touch. Preliminary user trials, in which a head-mounted display artificially constrains vision to a 25×25 grid of binary dots, suggest that Transformative Reality provides practical and significant improvements over traditional bionic vision in tasks such as indoor navigation, object localisation and people detection.
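An editor's sketch, not the paper's pipeline: collapsing a binary free-ground mask (e.g. derived from a depth camera) into a 25×25 binary dot grid of the kind a low-resolution prosthetic display could show. The mask source and threshold are assumptions.

```python
import cv2
import numpy as np

def to_dot_grid(ground_mask, grid=25, fill_threshold=0.5):
    """ground_mask: HxW uint8 mask (255 = traversable ground)."""
    # Average-pool the mask via area resize, then threshold each cell to a binary dot.
    small = cv2.resize(ground_mask.astype(np.float32) / 255.0,
                       (grid, grid), interpolation=cv2.INTER_AREA)
    return (small >= fill_threshold).astype(np.uint8)

# Hypothetical usage: a mask where the lower half of the image is free ground.
mask = np.zeros((480, 640), np.uint8)
mask[240:, :] = 255
dots = to_dot_grid(mask)    # 25x25 array of 0/1 "phosphene" dots
```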
{"title":"Transformative reality: Augmented reality for visual prostheses","authors":"W. Lui, D. Browne, L. Kleeman, T. Drummond, Wai Ho Li","doi":"10.1109/ISMAR.2011.6092402","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092402","url":null,"abstract":"Visual prostheses such as retinal implants provide bionic vision that is limited in spatial and intensity resolution. This limitation is a fundamental challenge of bionic vision as it severely truncates salient visual information. We propose to address this challenge by performing real time transformations of visual and non-visual sensor data into symbolic representations that are then rendered as low resolution vision; a concept we call Transformative Reality. For example, a depth camera allows the detection of empty ground in cluttered environments that is then visually rendered as bionic vision to enable indoor navigation. Such symbolic representations are similar to virtual content overlays used in Augmented Reality but are registered to the 3D world via the user's sense of touch. Preliminary user trials, where a head mounted display artificially constrains vision to a 25×25 grid of binary dots, suggest that Transformative Reality provides practical and significant improvements over traditional bionic vision in tasks such as indoor navigation, object localisation and people detection.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116867368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Urban canvas: Unfreezing street-view imagery with semantically compressed LIDAR pointclouds
Pub Date: 2011-10-01 | DOI: 10.1109/ISMAR.2011.6143897
Thommen Korah, Yun-Ta Tsai
Detailed 3D scans of urban environments are increasingly being collected with the goal of bringing more location-aware content to mobile users. This work converts large collections of LIDAR scans and street-view panoramas into a representation that extracts semantically meaningful components of the scene. Compressing this data by an order of magnitude or more enables rich user interactions with mobile applications that have detailed knowledge of the scene around them. These representations are suitable for integration into physics engines and for transmission over mobile networks, both key components of modern AR entertainment solutions.
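A crude illustration, not the authors' representation: plain voxel-grid downsampling of a LIDAR point cloud, keeping one centroid per occupied voxel, already shrinks dense scans substantially; the semantic compression described above goes well beyond this. The voxel size is an assumption.

```python
import numpy as np

def voxel_downsample(points, voxel=0.2):
    """points: Nx3 array in meters; returns one centroid per occupied voxel."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    centroids = np.zeros((inverse.max() + 1, 3))
    counts = np.bincount(inverse)
    np.add.at(centroids, inverse, points)   # sum the points falling into each voxel
    return centroids / counts[:, None]

points = np.random.uniform(-10.0, 10.0, (100000, 3))
compressed = voxel_downsample(points)       # far fewer points than the input scan
```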
{"title":"Urban canvas: Unfreezing street-view imagery with semantically compressed LIDAR pointclouds","authors":"Thommen Korah, Yun-Ta Tsai","doi":"10.1109/ISMAR.2011.6143897","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6143897","url":null,"abstract":"Detailed 3D scans of urban environments are increasingly being collected with the goal of bringing more location-aware content to mobile users. This work converts large collections of LIDAR scans and street-view panoramas into a representation that extracts semantically meaningful components of the scene. Compressing this data by an order of magnitude or more enables rich user interactions with mobile applications that have a very good knowledge of the scene around them. These representations are suitable for integrating into physics engines and transmission over mobile networks — key components of modern AR entertainment solutions.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129262146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An empiric evaluation of confirmation methods for optical see-through head-mounted display calibration
Pub Date: 2011-10-01 | DOI: 10.2312/EGVE/JVRC12/073-080
P. Maier, Arindam Dey, C. Waechter, C. Sandor, M. Tönnis, G. Klinker
The calibration of optical see-through head-mounted displays is an important foundation for correct object alignment in augmented reality. Any calibration process for OSTHMDs requires users to align 2D points in screen space with 3D points in the real world and to confirm each alignment. In this poster, we present the results of our empirical evaluation comparing four confirmation methods: Keyboard, Hand-held, Voice, and Waiting. The Waiting method, designed to reduce head motion during confirmation, showed significantly higher accuracy than all other methods. In addition, averaging the user input samples collected over a time window before the moment of confirmation improved the accuracy of all methods. A further expert study showed that the results obtained with a video see-through head-mounted display are also valid for optical see-through head-mounted display calibration.
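A minimal sketch of the averaging idea, not the study's apparatus: buffer the alignment samples from a short window and take their mean at the moment of confirmation, damping the head jitter introduced by pressing a key or speaking. The window length and sample representation are assumptions.

```python
from collections import deque

class AlignmentSampler:
    """Keeps recent alignment samples and averages them at confirmation time."""

    def __init__(self, window_s=0.5):
        self.window_s = window_s
        self.samples = deque()            # (timestamp, (x, y, z)) tuples

    def add(self, t, point3d):
        self.samples.append((t, point3d))
        # Drop samples that fall outside the averaging window.
        while self.samples and t - self.samples[0][0] > self.window_s:
            self.samples.popleft()

    def confirmed_value(self):
        """Mean of the samples inside the window at the moment of confirmation."""
        n = len(self.samples)
        pts = [p for _, p in self.samples]
        return tuple(sum(p[i] for p in pts) / n for i in range(3))

# Hypothetical usage: samples arrive at 20 Hz, confirmation happens after the loop.
sampler = AlignmentSampler()
for i in range(10):
    sampler.add(i * 0.05, (1.0 + 0.01 * i, 2.0, 3.0))
mean_point = sampler.confirmed_value()
```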
{"title":"An empiric evaluation of confirmation methods for optical see-through head-mounted display calibration","authors":"P. Maier, Arindam Dey, C. Waechter, C. Sandor, M. Tönnis, G. Klinker","doi":"10.2312/EGVE/JVRC12/073-080","DOIUrl":"https://doi.org/10.2312/EGVE/JVRC12/073-080","url":null,"abstract":"The calibration of optical see-through head-mounted displays is an important fundament for correct object alignment in augmented reality. Any calibration process for OSTHMDs requires users to align 2D points in screen space with 3D points in the real world and to confirm each alignment. In this poster, we present the results of our empiric evaluation where we compared four confirmation methods: Keyboard, Hand-held, Voice, and Waiting. The Waiting method, designed to reduce head motion during confirmation, showed a significantly higher accuracy than all other methods. Averaging over a time frame for sampling user input before the time of confirmation improved the accuracy of all methods in addition. We conducted a further expert study proving that the results achieved with a video see-through head-mounted display showed valid for optical see-through head-mounted display calibration, too.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115337676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmenting magnetic field lines for school experiments
Pub Date: 2011-10-01 | DOI: 10.1109/ISMAR.2011.6143893
F. Mannuß, J. Rubel, Clemens Wagner, F. Bingel, André Hinkenjann
We present a system for interactive magnetic field simulation in an AR setup. The aim of this work is to investigate how AR technology can help to develop a better understanding of the concept of fields and field lines and their relationship to the magnetic forces in typical school experiments. The haptic feedback is provided by real magnets that are optically tracked. In a stereo video see-through head-mounted display, the magnets are augmented with the dynamically computed field lines.
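For illustration only, not the exhibit's simulation code: the field lines of an ideal dipole can be traced by repeatedly stepping along the normalized field direction and rendering the resulting polyline over the tracked magnet. The dipole moment, seed point, and step size below are assumptions.

```python
import numpy as np

def dipole_field(r, m):
    """Field of an ideal point dipole with moment m at position r (physical constants dropped)."""
    d = np.linalg.norm(r)
    r_hat = r / d
    return (3.0 * r_hat * np.dot(m, r_hat) - m) / d**3

def trace_field_line(seed, m, step=0.01, n_steps=2000):
    """Euler-step along the normalized field to build a polyline for rendering."""
    points = [np.asarray(seed, dtype=float)]
    for _ in range(n_steps):
        b = dipole_field(points[-1], m)
        nxt = points[-1] + step * b / np.linalg.norm(b)
        if np.linalg.norm(nxt) < 1e-3:      # stop before the singular dipole centre
            break
        points.append(nxt)
    return np.array(points)

m = np.array([0.0, 0.0, 1.0])                  # dipole moment along +z
line = trace_field_line([0.05, 0.0, 0.0], m)   # polyline to overlay on the tracked magnet
```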
{"title":"Augmenting magnetic field lines for school experiments","authors":"F. Mannuß, J. Rubel, Clemens Wagner, F. Bingel, André Hinkenjann","doi":"10.1109/ISMAR.2011.6143893","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6143893","url":null,"abstract":"We present a system for interactive magnetic field simulation in an AR-setup. The aim of this work is to investigate how AR technology can help to develop a better understanding of the concept of fields and field lines and their relationship to the magnetic forces in typical school experiments. The haptic feedback is provided by real magnets that are optically tracked. In a stereo video see-through head-mounted display, the magnets are augmented with the dynamically computed field lines.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124748441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}