使用空中和街道级图像编目公共对象-城市树木

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date : 2016-06-27 DOI:10.1109/CVPR.2016.647

J. D. Wegner, Steve Branson, David Hall, K. Schindler, P. Perona

{"title":"使用空中和街道级图像编目公共对象-城市树木","authors":"J. D. Wegner, Steve Branson, David Hall, K. Schindler, P. Perona","doi":"10.1109/CVPR.2016.647","DOIUrl":null,"url":null,"abstract":"Each corner of the inhabited world is imaged from multiple viewpoints with increasing frequency. Online map services like Google Maps or Here Maps provide direct access to huge amounts of densely sampled, georeferenced images from street view and aerial perspective. There is an opportunity to design computer vision systems that will help us search, catalog and monitor public infrastructure, buildings and artifacts. We explore the architecture and feasibility of such a system. The main technical challenge is combining test time information from multiple views of each geographic location (e.g., aerial and street views). We implement two modules: det2geo, which detects the set of locations of objects belonging to a given category, and geo2cat, which computes the fine-grained category of the object at a given location. We introduce a solution that adapts state-of the-art CNN-based object detectors and classifiers. We test our method on \"Pasadena Urban Trees\", a new dataset of 80,000 trees with geographic and species annotations, and show that combining multiple views significantly improves both tree detection and tree species classification, rivaling human performance.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"44 1","pages":"6014-6023"},"PeriodicalIF":0.0000,"publicationDate":"2016-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"148","resultStr":"{\"title\":\"Cataloging Public Objects Using Aerial and Street-Level Images — Urban Trees\",\"authors\":\"J. D. Wegner, Steve Branson, David Hall, K. Schindler, P. Perona\",\"doi\":\"10.1109/CVPR.2016.647\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Each corner of the inhabited world is imaged from multiple viewpoints with increasing frequency. Online map services like Google Maps or Here Maps provide direct access to huge amounts of densely sampled, georeferenced images from street view and aerial perspective. There is an opportunity to design computer vision systems that will help us search, catalog and monitor public infrastructure, buildings and artifacts. We explore the architecture and feasibility of such a system. The main technical challenge is combining test time information from multiple views of each geographic location (e.g., aerial and street views). We implement two modules: det2geo, which detects the set of locations of objects belonging to a given category, and geo2cat, which computes the fine-grained category of the object at a given location. We introduce a solution that adapts state-of the-art CNN-based object detectors and classifiers. We test our method on \\\"Pasadena Urban Trees\\\", a new dataset of 80,000 trees with geographic and species annotations, and show that combining multiple views significantly improves both tree detection and tree species classification, rivaling human performance.\",\"PeriodicalId\":6515,\"journal\":{\"name\":\"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"volume\":\"44 1\",\"pages\":\"6014-6023\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"148\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CVPR.2016.647\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR.2016.647","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 148

摘要

有人居住的世界的每个角落都从多个视点成像，频率越来越高。像谷歌Maps或Here Maps这样的在线地图服务提供了直接访问大量密集采样的街道视图和空中视角的地理参考图像。我们有机会设计计算机视觉系统，帮助我们搜索、编目和监控公共基础设施、建筑和文物。我们探讨了这样一个系统的架构和可行性。主要的技术挑战是结合来自每个地理位置的多个视图的测试时间信息(例如，空中和街道视图)。我们实现了两个模块:det2geo和geo2cat，前者检测属于给定类别的对象的位置集，后者计算给定位置上对象的细粒度类别。我们介绍了一种解决方案，该解决方案采用了最先进的基于cnn的对象检测器和分类器。我们在“帕萨迪纳城市树木”上测试了我们的方法，这是一个包含80,000棵树的新数据集，带有地理和物种注释，结果表明，结合多个视图显著提高了树木检测和树种分类，与人类的表现相媲美。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Cataloging Public Objects Using Aerial and Street-Level Images — Urban Trees

Each corner of the inhabited world is imaged from multiple viewpoints with increasing frequency. Online map services like Google Maps or Here Maps provide direct access to huge amounts of densely sampled, georeferenced images from street view and aerial perspective. There is an opportunity to design computer vision systems that will help us search, catalog and monitor public infrastructure, buildings and artifacts. We explore the architecture and feasibility of such a system. The main technical challenge is combining test time information from multiple views of each geographic location (e.g., aerial and street views). We implement two modules: det2geo, which detects the set of locations of objects belonging to a given category, and geo2cat, which computes the fine-grained category of the object at a given location. We introduce a solution that adapts state-of the-art CNN-based object detectors and classifiers. We test our method on "Pasadena Urban Trees", a new dataset of 80,000 trees with geographic and species annotations, and show that combining multiple views significantly improves both tree detection and tree species classification, rivaling human performance.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

自引率

0.00%

发文量

期刊最新文献

Sketch Me That Shoe Multivariate Regression on the Grassmannian for Predicting Novel Domains How Hard Can It Be? Estimating the Difficulty of Visual Search in an Image Discovering the Physical Parts of an Articulated Object Class from Multiple Videos Simultaneous Optical Flow and Intensity Estimation from an Event Camera