The Argon AR Web Browser and standards-based AR application environment
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092371
B. MacIntyre, A. Hill, Hafez Rouzati, Maribeth Gandy Coleman, Brian Davidson
A common vision of Augmented Reality (AR) is that of a person immersed in a diverse collection of virtual information, superimposed on their view of the world around them. If such a vision is to become reality, an ecosystem for AR must be created that satisfies at least these properties: multiple sources (or channels of interactive information) must be able to be simultaneously displayed and interacted with, channels must be isolated from each other (for security and stability), channel authors must have the flexibility to design the content and interactivity of their channel, and the application must fluidly integrate with the ever-growing cloud of systems and services that define our digital lives. In this paper, we present the design and implementation of the Argon AR Web Browser and describe our vision of an AR application environment that leverages the WWW ecosystem. We also describe KARML, our extension to KML (the spatial markup language for Google Earth and Maps), that supports the functionality required for mobile AR. We combine KARML with the full range of standard web technologies to create a standards-based web browser for mobile AR. KARML lets users develop 2D and 3D content using existing web technologies and facilitates easy deployment from standard web servers. We highlight a number of projects that have used Argon and point out the ways in which our web-based architecture has made previously impractical AR concepts possible.
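The abstract does not spell out KARML's syntax, but as a point of reference the sketch below shows what a geo-located placemark looks like in plain KML (the base language that KARML extends), built with Python's standard library. The coordinates and content are placeholders, and KARML-specific elements are deliberately omitted since they are not specified here.

```python
# A minimal, hedged sketch: plain KML (the base that KARML extends), built with
# Python's standard library. KARML-specific extension elements are not shown.
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)

kml = ET.Element(f"{{{KML_NS}}}kml")
placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
ET.SubElement(placemark, f"{{{KML_NS}}}name").text = "Example channel content"
# In a web-based AR channel, the description would typically carry HTML.
ET.SubElement(placemark, f"{{{KML_NS}}}description").text = "<p>Hello, AR</p>"
point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
# KML coordinates are longitude,latitude[,altitude].
ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = "-84.3963,33.7756,0"

print(ET.tostring(kml, encoding="unicode"))
```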
{"title":"The Argon AR Web Browser and standards-based AR application environment","authors":"B. MacIntyre, A. Hill, Hafez Rouzati, Maribeth Gandy Coleman, Brian Davidson","doi":"10.1109/ISMAR.2011.6092371","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092371","url":null,"abstract":"A common vision of Augmented Reality (AR) is that of a person immersed in a diverse collection of virtual information, superimposed on their view of the world around them. If such a vision is to become reality, an ecosystem for AR must be created that satisfies at least these properties: multiple sources (or channels of interactive information) must be able to be simultaneously displayed and interacted with, channels must be isolated from each other (for security and stability), channel authors must have the flexibility to design the content and interactivity of their channel, and the application must fluidly integrate with the ever-growing cloud of systems and services that define our digital lives. In this paper, we present the design and implementation of the Argon AR Web Browser and describe our vision of an AR application environment that leverages the WWW ecosystem. We also describe KARML, our extension to KML (the spatial markup language for Google Earth and Maps), that supports the functionality required for mobile AR. We combine KARML with the full range of standard web technologies to create a standards-based web browser for mobile AR. KARML lets users develop 2D and 3D content using existing web technologies and facilitates easy deployment from standard web servers. We highlight a number of projects that have used Argon and point out the ways in which our web-based architecture has made previously impractical AR concepts possible.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"93 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125974453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KinectFusion: Real-time dense surface mapping and tracking
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092378
Richard A. Newcombe, S. Izadi, Otmar Hilliges, D. Molyneaux, David Kim, A. Davison, Pushmeet Kohli, J. Shotton, Steve Hodges, A. Fitzgibbon
We present a system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware. We fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time. The current sensor pose is simultaneously obtained by tracking the live depth frame relative to the global model using a coarse-to-fine iterative closest point (ICP) algorithm, which uses all of the observed depth data available. We demonstrate the advantages of tracking against the growing full surface model compared with frame-to-frame tracking, obtaining tracking and mapping results in constant time within room-sized scenes with limited drift and high accuracy. We also show both qualitative and quantitative results relating to various aspects of our tracking and mapping system. Modelling of natural scenes, in real-time with only a commodity sensor and GPU hardware, promises an exciting step forward in augmented reality (AR); in particular, it allows dense surfaces to be reconstructed in real-time, with a level of detail and robustness beyond any solution yet presented using passive computer vision.
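The global implicit surface model described here is typically realized as a truncated signed distance function (TSDF) updated by a weighted running average as new depth frames arrive. The sketch below is a generic, hedged illustration of that fusion step; the variable names, truncation value, and uniform per-frame weight are assumptions, not the paper's exact formulation.

```python
import numpy as np

def fuse_tsdf(tsdf, weight, sdf_new, trunc=0.03, w_new=1.0):
    """Fuse one frame's signed distances into a global TSDF volume.

    tsdf, weight : current voxel values and accumulated weights (same shape)
    sdf_new      : signed distance of each voxel to the new depth surface
    trunc        : truncation band in metres (illustrative value)
    """
    d = np.clip(sdf_new / trunc, -1.0, 1.0)   # truncate to [-1, 1]
    valid = sdf_new > -trunc                  # skip voxels far behind the surface
    w_tot = weight + w_new
    # Weighted running average, applied only where the new frame observed the voxel.
    tsdf = np.where(valid, (weight * tsdf + w_new * d) / w_tot, tsdf)
    weight = np.where(valid, w_tot, weight)
    return tsdf, weight
```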
{"title":"KinectFusion: Real-time dense surface mapping and tracking","authors":"Richard A. Newcombe, S. Izadi, Otmar Hilliges, D. Molyneaux, David Kim, A. Davison, Pushmeet Kohli, J. Shotton, Steve Hodges, A. Fitzgibbon","doi":"10.1109/ISMAR.2011.6092378","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092378","url":null,"abstract":"We present a system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware. We fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real-time. The current sensor pose is simultaneously obtained by tracking the live depth frame relative to the global model using a coarse-to-fine iterative closest point (ICP) algorithm, which uses all of the observed depth data available. We demonstrate the advantages of tracking against the growing full surface model compared with frame-to-frame tracking, obtaining tracking and mapping results in constant time within room sized scenes with limited drift and high accuracy. We also show both qualitative and quantitative results relating to various aspects of our tracking and mapping system. Modelling of natural scenes, in real-time with only commodity sensor and GPU hardware, promises an exciting step forward in augmented reality (AR), in particular, it allows dense surfaces to be reconstructed in real-time, with a level of detail and robustness beyond any solution yet presented using passive computer vision.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"402 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122371108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
3D high dynamic range display system
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092393
Saeko Shimazu, D. Iwai, Kosuke Sato
This paper introduces a new high dynamic range (HDR) display system that generates a physical 3D HDR image without using stereoscopic methods. To boost contrast beyond that obtained using either a hardcopy or a projector, we employ a multiprojection system to superimpose images onto a textured solid hardcopy that is output by a 3D printer or a rapid prototyping machine. We introduce two basic techniques for our 3D HDR display. The first technique computes an optimal placement of projectors so that projected images cover the hardcopy's entire surface while maximizing image quality. The second technique allows a user to place the projectors near the computed optimal position by projecting from each projector images that act as visual guides. Through proof-of-concept experiments, we were able to modulate luminance and chrominance with a registration error of less than 3 mm. The physical contrast ratio obtained using our method was approximately 5,000:1, while it was 5:1 in the case of viewing the 3D printout under environmental light and 1,000:1 in the case of using the projectors to project the image on regular screens.
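The reported figures are consistent with the common approximation for superimposed displays, in which the contrast of the combined image is roughly the product of the projector contrast and the print contrast. The snippet below simply restates the numbers from the abstract under that assumption.

```python
# Rough sanity check, assuming combined contrast ~ product of the two contrasts.
projector_contrast = 1000   # projection onto a regular screen (from the abstract)
hardcopy_contrast = 5       # 3D printout under environmental light (from the abstract)
combined = projector_contrast * hardcopy_contrast
print(f"approx. combined contrast: {combined}:1")   # 5000:1, matching the reported value
```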
{"title":"3D high dynamic range display system","authors":"Saeko Shimazu, D. Iwai, Kosuke Sato","doi":"10.1109/ISMAR.2011.6092393","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092393","url":null,"abstract":"This paper introduces a new high dynamic range (HDR) display system that generates a physical 3D HDR image without using stereoscopic methods. To boost contrast beyond that obtained using either a hardcopy or a projector, we employ a multiprojection system to superimpose images onto a textured solid hardcopy that is output by a 3D printer or a rapid prototyping machine. We introduce two basic techniques for our 3D HDR display. The first technique computes an optimal placement of projectors so that projected images cover the hardcopy's entire surface while maximizing image quality. The second technique allows a user to place the projectors near the computed optimal position by projecting from each projector images that act as visual guides. Through proof-of-concept experiments, we were able to modulate luminance and chrominance with a registration error of less than 3 mm. The physical contrast ratio obtained using our method was approximately 5,000:1, while it was 5:1 in the case of viewing the 3D printout under environmental light and 1,000:1 in the case of using the projectors to project the image on regular screens.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114396577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph-cut-based 3D model segmentation for articulated object reconstruction
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092397
Inkyu Han, Hyoungnyoun Kim, Ji-Hyung Park
The three-dimensional (3D) reconstruction of objects has been well studied in the augmented reality (AR) literature [1, 2]. Most existing studies assume that the target object to be reconstructed is rigid, whereas objects in the real world can be dynamic or deformable. AR systems therefore need to handle non-rigid objects in order to adapt to environmental changes. In this paper, we address the problem of reconstructing articulated objects as a starting point for modeling deformable objects. An articulated object is composed of partially rigid components linked by joints. After building a mesh model of the object, the model is segmented into its components along their boundaries by the graph-cut-based approach that we propose.
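The abstract does not give the graph construction, but a generic graph-cut segmentation of a mesh can be sketched as a minimum s-t cut over the face-adjacency graph, with edge capacities that make it cheap to cut where the surface bends sharply (e.g. at a joint). The weighting, seed faces, and two-part split below are illustrative assumptions only, not the authors' method.

```python
import math
import networkx as nx

def split_mesh_two_parts(face_adjacency, dihedral, seed_a, seed_b, sigma=0.5):
    """Split a mesh into two rigid parts via a minimum s-t cut.

    face_adjacency : iterable of (face_i, face_j) pairs sharing an edge
    dihedral       : dict mapping (face_i, face_j) -> angular deviation from
                     planarity across that edge (larger at sharp joints)
    seed_a, seed_b : face ids known to lie on the two different components
    """
    G = nx.Graph()
    for (i, j) in face_adjacency:
        ang = dihedral[(i, j)]
        # Cheap to cut across sharply bent edges, expensive across flat regions.
        G.add_edge(i, j, capacity=math.exp(-ang / sigma))
    cut_value, (part_a, part_b) = nx.minimum_cut(G, seed_a, seed_b)
    return part_a, part_b
```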
{"title":"Graph-cut-based 3D model segmentation for articulated object reconstruction","authors":"Inkyu Han, Hyoungnyoun Kim, Ji-Hyung Park","doi":"10.1109/ISMAR.2011.6092397","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092397","url":null,"abstract":"The three-dimensional (3D) reconstruction of objects has been well studied in the literature of augmented reality (AR) [1, 2]. Most existing studies have assumed that the to-be-constructed target object is rigid, whereas objects in the real world can be dynamic or deformable. Therefore, AR systems are required to deal with non-rigid objects to be adaptive to environmental changes. In this paper, we address the problem of reconstructing articulated objects as a starting point for modeling deformable objects. An articulated object is composed of partially rigid components linked with joints. After building a mesh model of the object, the model is segmented into the components along their boundaries by a graph-cut-based approach that we propose.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"91 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126125920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust planar target tracking and pose estimation from a single concavity
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092365
M. Donoser, P. Kontschieder, H. Bischof
In this paper we introduce a novel real-time method to track weakly textured planar objects and to simultaneously estimate their 3D pose. The basic idea is to adapt the classic tracking-by-detection approach, which searches for the object to be tracked independently in each frame, for tracking non-textured objects. In order to robustly estimate the 3D pose of such objects in each frame, we have to tackle three demanding problems. First, we need to find a stable representation of the object which is discriminable against the background and highly repeatable. Second, we have to robustly relocate this representation in every frame, even during considerable viewpoint changes. Finally, we have to estimate the pose from a single, closed object contour. All of these demands must be met at low computational cost and in real time. To address these problems, we propose to exploit the properties of Maximally Stable Extremal Regions (MSERs) for detecting the required contours in an efficient manner and to apply random ferns as an efficient and robust classifier for tracking. To estimate the 3D pose, we construct a perspectively invariant frame on the closed contour which is intrinsically provided by the extracted MSER. In our experiments we obtain robust tracking results with accurate poses on various challenging image sequences under a single requirement: the MSER used for tracking must have at least one concavity that deviates sufficiently from its convex hull.
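The single stated requirement, an MSER with a concavity that deviates sufficiently from its convex hull, can be checked with standard OpenCV primitives. The sketch below detects MSERs, extracts each region's outer contour, and keeps only regions whose deepest convexity defect exceeds a threshold; the threshold and the OpenCV 4.x return signatures are assumptions, and this is an illustration rather than the authors' pipeline.

```python
import cv2
import numpy as np

def concave_msers(gray, min_defect_depth=8.0):
    """Return MSER contours that have at least one sufficiently deep concavity."""
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    keep = []
    for pts in regions:
        # Rasterise the region and extract its closed outer contour (OpenCV 4.x API).
        mask = np.zeros(gray.shape, np.uint8)
        mask[pts[:, 1], pts[:, 0]] = 255
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        if not contours:
            continue
        contour = max(contours, key=cv2.contourArea)
        hull_idx = cv2.convexHull(contour, returnPoints=False)
        if hull_idx is None or len(hull_idx) < 4:
            continue
        defects = cv2.convexityDefects(contour, hull_idx)
        if defects is None:
            continue
        deepest = defects[:, 0, 3].max() / 256.0   # depths stored in 1/256 px units
        if deepest >= min_defect_depth:
            keep.append(contour)
    return keep
```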
{"title":"Robust planar target tracking and pose estimation from a single concavity","authors":"M. Donoser, P. Kontschieder, H. Bischof","doi":"10.1109/ISMAR.2011.6092365","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092365","url":null,"abstract":"In this paper we introduce a novel real-time method to track weakly textured planar objects and to simultaneously estimate their 3D pose. The basic idea is to adapt the classic tracking-by-detection approach, which seeks for the object to be tracked independently in each frame, for tracking non-textured objects. In order to robustly estimate the 3D pose of such objects in each frame, we have to tackle three demanding problems. First, we need to find a stable representation of the object which is discriminable against the background and highly repetitive. Second, we have to robustly relocate this representation in every frame, also during considerable viewpoint changes. Finally, we have to estimate the pose from a single, closed object contour. Of course, all demands shall be accommodated at low computational costs and in real-time. To attack the above mentioned problems, we propose to exploit the properties of Maximally Stable Extremal Regions (MSERs) for detecting the required contours in an efficient manner and to apply random ferns as efficient and robust classifier for tracking. To estimate the 3D pose, we construct a perspectively invariant frame on the closed contour which is intrinsically provided by the extracted MSER. In our experiments we obtain robust tracking results with accurate poses on various challenging image sequences at a single requirement: One MSER used for tracking has to have at least one concavity that sufficiently deviates from its convex hull.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"12 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132953990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
User experiences with augmented reality aided navigation on phones
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092390
Alessandro Mulloni, H. Seichter, D. Schmalstieg
We investigate user experiences when using augmented reality (AR) as a new aid to navigation. We integrate AR with other more common interfaces into a handheld navigation system, and we conduct an exploratory study to see where and how people exploit AR. Based on previous work on augmented photographs, we hypothesize that AR is used more to support wayfinding at static locations when users approach a road intersection. In partial contrast to this hypothesis, our results from a user evaluation hint that users will expect to use the system while walking. Further, our results also show that AR is usually exploited shortly before and after road intersections, suggesting that tracking support will be mostly needed in proximity of road intersections.
{"title":"User experiences with augmented reality aided navigation on phones","authors":"Alessandro Mulloni, H. Seichter, D. Schmalstieg","doi":"10.1109/ISMAR.2011.6092390","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092390","url":null,"abstract":"We investigate user experiences when using augmented reality (AR) as a new aid to navigation. We integrate AR with other more common interfaces into a handheld navigation system, and we conduct an exploratory study to see where and how people exploit AR. Based on previous work on augmented photographs, we hypothesize that AR is used more to support wayfinding at static locations when users approach a road intersection. In partial contrast to this hypothesis, our results from a user evaluation hint that users will expect to use the system while walking. Further, our results also show that AR is usually exploited shortly before and after road intersections, suggesting that tracking support will be mostly needed in proximity of road intersections.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134122816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive substrate for enhanced spatial augmented reality contrast and resolution
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092401
Markus Broecker, Ross T. Smith, B. Thomas
This poster presents the concept of combining two display technologies to enhance graphics effects in spatial augmented reality (SAR) environments. This is achieved by using an ePaper surface as an adaptive substrate instead of a white painted surface, which allows the development of novel image techniques to improve image quality and object appearance in projector-based SAR environments.
{"title":"Adaptive substrate for enhanced spatial augmented reality contrast and resolution","authors":"Markus Broecker, Ross T. Smith, B. Thomas","doi":"10.1109/ISMAR.2011.6092401","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092401","url":null,"abstract":"This poster presents the concept of combining two display technologies to enhance graphics effects in spatial augmented reality (SAR) environments. This is achieved by using an ePaper surface as an adaptive substrate instead of a white painted surface allowing the development of novel image techniques to improve image quality and object appearance in projector-based SAR environments.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134189612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Indoor positioning and navigation for mobile AR
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR-AMH.2011.6093646
C. Perey
The researchers and developers of mobile AR platforms need a common platform for developing experiences regardless of the user's surroundings. In order to expand the use of AR both indoors and outdoors, with and without computer vision techniques, the breadth of options available for positioning users and points of interest needs to expand. Separately, the experts in indoor positioning and navigation are generally not as familiar with AR use scenarios as they are with other domains. Together, positioning and navigation experts and mobile AR experts will discuss:
— What indoor positioning and navigation systems are best suited for mobile AR?
— What studies are underway or need to be conducted in order to advance this field?
{"title":"Indoor positioning and navigation for mobile AR","authors":"C. Perey","doi":"10.1109/ISMAR-AMH.2011.6093646","DOIUrl":"https://doi.org/10.1109/ISMAR-AMH.2011.6093646","url":null,"abstract":"The researchers and developers of mobile AR platforms need to use a common platform for developing experiences regardless of the surroundings of the user. In order to expand the use of AR both indoor and outdoor with and without computer vision techniques, the breadth of options available for positioning users and points of interest needs to expand. Separately, the experts in indoor positioning and navigation are generally not as familiar with AR use scenarios as they are with other domains. Together, positioning and navigation experts, and mobile AR experts, will discuss: — What are the indoor positioning and navigation systems best suited for mobile AR? — What studies are underway or need to be conducted in order to advance this field?","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113935836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Augmenting 3D urban environment using mobile devices
Pub Date: 2011-10-26 | DOI: 10.1109/ISMAR.2011.6092396
Yi Wu, M. E. Choubassi, I. Kozintsev
We describe an augmented reality prototype for exploring a 3D urban environment on mobile devices. Our system utilizes the location and orientation sensors on the mobile platform as well as computer vision techniques to register the live view of the device with the 3D urban data. In particular, the system recognizes the buildings in the live video, tracks the camera pose, and augments the video with relevant information about the buildings in the correct perspective. The 3D urban data consist of 3D point clouds and corresponding geo-tagged RGB images of the urban environment. We also discuss the processing steps to make such 3D data scalable and usable by our system.
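Registering the live view with geo-referenced 3D data ultimately comes down to projecting the 3D point cloud into the camera using the estimated pose. The pinhole projection below is a generic sketch of that step; the intrinsics and pose are placeholders, not values from the paper.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project Nx3 world points into the image with pose (R, t) and intrinsics K.

    Returns Nx2 pixel coordinates and a mask of points in front of the camera.
    """
    cam = (R @ points_world.T + t.reshape(3, 1)).T   # world -> camera frame
    in_front = cam[:, 2] > 0
    pix_h = (K @ cam.T).T                            # camera -> homogeneous pixels
    pix = pix_h[:, :2] / pix_h[:, 2:3]
    return pix, in_front

# Illustrative intrinsics for a 640x480 mobile camera (placeholder values).
K = np.array([[525.0,   0.0, 320.0],
              [  0.0, 525.0, 240.0],
              [  0.0,   0.0,   1.0]])
```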
{"title":"Augmenting 3D urban environment using mobile devices","authors":"Yi Wu, M. E. Choubassi, I. Kozintsev","doi":"10.1109/ISMAR.2011.6092396","DOIUrl":"https://doi.org/10.1109/ISMAR.2011.6092396","url":null,"abstract":"We describe an augmented reality prototype for exploring a 3D urban environment on mobile devices. Our system utilizes the location and orientation sensors on the mobile platform as well as computer vision techniques to register the live view of the device with the 3D urban data. In particular, the system recognizes the buildings in the live video, tracks the camera pose, and augments the video with relevant information about the buildings in the correct perspective. The 3D urban data consist of 3D point clouds and corresponding geo-tagged RGB images of the urban environment. We also discuss the processing steps to make such 3D data scalable and usable by our system.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124742619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tracking-by-synthesis using point features and pyramidal blurring
Pub Date: 2011-10-26 | DOI: 10.1109/ismar.2011.6092373
Gilles Simon
Tracking-by-synthesis is a promising method for markerless vision-based camera tracking, particularly suitable for Augmented Reality applications. In particular, it is drift-free, viewpoint invariant and easy to combine with physical sensors such as GPS and inertial sensors. While edge features have been used successfully within the tracking-by-synthesis framework, point features have, to our knowledge, never been used. We believe that this is due to the fact that real-time corner detectors are generally weakly repeatable between a camera image and a rendered texture. In this paper, we compare the repeatability of commonly used FAST, Harris and SURF interest point detectors across view synthesis. We show that adding depth blur to the rendered texture can drastically improve the repeatability of FAST and Harris corner detectors (up to 100% in our experiments), which can be very helpful, e.g., for making tracking-by-synthesis run on mobile phones. We propose a method for simulating depth blur on the rendered images using a pre-calibrated depth response curve. In order to fulfil the performance requirements, a pyramidal approach is used based on the well-known MIP mapping technique. We also propose an original method for calibrating the depth response curve, which is suitable for any kind of focus lens and comes for free in terms of programming effort, once the tracking-by-synthesis algorithm has been implemented.
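A repeatability measure of the kind compared here can be sketched generically: detect corners in the camera image and in the rendered view, transfer the rendered keypoints into the camera image with the known mapping, and count detections that land within a small radius of each other. The detector choice, FAST threshold and radius below are illustrative assumptions, not the paper's settings.

```python
import cv2
import numpy as np

def repeatability(img_cam, img_rendered, H, radius=2.0):
    """Fraction of corners in the rendered view re-detected in the camera image.

    H maps rendered-view pixel coordinates into camera-image coordinates.
    """
    fast = cv2.FastFeatureDetector_create(threshold=20)
    kp_cam = np.float32([k.pt for k in fast.detect(img_cam, None)])
    kp_ren = np.float32([k.pt for k in fast.detect(img_rendered, None)])
    if len(kp_cam) == 0 or len(kp_ren) == 0:
        return 0.0
    # Transfer rendered-view keypoints into the camera image.
    kp_ren_h = cv2.perspectiveTransform(kp_ren.reshape(-1, 1, 2), H).reshape(-1, 2)
    # A rendered corner counts as repeated if a camera corner lies within `radius` px.
    d = np.linalg.norm(kp_ren_h[:, None, :] - kp_cam[None, :, :], axis=2)
    repeated = (d.min(axis=1) <= radius).sum()
    return repeated / float(len(kp_ren_h))
```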
{"title":"Tracking-by-synthesis using point features and pyramidal blurring","authors":"Gilles Simon","doi":"10.1109/ismar.2011.6092373","DOIUrl":"https://doi.org/10.1109/ismar.2011.6092373","url":null,"abstract":"Tracking-by-synthesis is a promising method for markerless vision-based camera tracking, particularly suitable for Augmented Reality applications. In particular, it is drift-free, viewpoint invariant and easy-to-combine with physical sensors such as GPS and inertial sensors. While edge features have been used succesfully within the tracking-by-synthesis framework, point features have, to our knowledge, still never been used. We believe that this is due to the fact that real-time corner detectors are generally weakly repeatable between a camera image and a rendered texture. In this paper, we compare the repeatability of commonly used FAST, Harris and SURF interest point detectors across view synthesis. We show that adding depth blur to the rendered texture can drastically improve the repeatability of FAST and Harris corner detectors (up to 100% in our experiments), which can be very helpful, e.g., to make tracking-by-synthesis running on mobile phones. We propose a method for simulating depth blur on the rendered images using a pre-calibrated depth response curve. In order to fulfil the performance requirements, a pyramidal approach is used based on the well-known MIP mapping technique. We also propose an original method for calibrating the depth response curve, which is suitable for any kind of focus lenses and comes for free in terms of programming effort, once the tracking-by-synthesis algorithm has been implemented.","PeriodicalId":298757,"journal":{"name":"2011 10th IEEE International Symposium on Mixed and Augmented Reality","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130171694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}