Michael J. Laielli, James Smith, Giscard Biamby, Trevor Darrell, B. Hartmann
{"title":"LabelAR","authors":"Michael J. Laielli, James Smith, Giscard Biamby, Trevor Darrell, B. Hartmann","doi":"10.1145/3332165.3347927","DOIUrl":null,"url":null,"abstract":"Computer vision is applied in an ever expanding range of applications, many of which require custom training data to perform well. We present a novel interface for rapid collection of labeled training images to improve CV-based object detectors. LabelAR leverages the spatial tracking capabilities of an AR-enabled camera, allowing users to place persistent bounding volumes that stay centered on real-world objects. The interface then guides the user to move the camera to cover a wide variety of viewpoints. We eliminate the need for post hoc labeling of images by automatically projecting 2D bounding boxes around objects in the images as they are captured from AR-marked viewpoints. In a user study with 12 participants, LabelAR significantly outperforms existing approaches in terms of the trade-off between detection performance and collection time.","PeriodicalId":431403,"journal":{"name":"Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology","volume":"89 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3332165.3347927","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
Computer vision is applied in an ever expanding range of applications, many of which require custom training data to perform well. We present a novel interface for rapid collection of labeled training images to improve CV-based object detectors. LabelAR leverages the spatial tracking capabilities of an AR-enabled camera, allowing users to place persistent bounding volumes that stay centered on real-world objects. The interface then guides the user to move the camera to cover a wide variety of viewpoints. We eliminate the need for post hoc labeling of images by automatically projecting 2D bounding boxes around objects in the images as they are captured from AR-marked viewpoints. In a user study with 12 participants, LabelAR significantly outperforms existing approaches in terms of the trade-off between detection performance and collection time.