Investigating the use of deep learning models for land cover classification from street-level imagery
Narumasa Tsutsumida, Jing Zhao, Naho Shibuya, Kenlo Nasahara, Takeo Tadono
Ecological Research, vol. 39, no. 5, pp. 757-765. Published 2024-04-23. Journal Article.
DOI: 10.1111/1440-1703.12470
https://onlinelibrary.wiley.com/doi/10.1111/1440-1703.12470
Journal impact factor: 1.7 (JCR Q3, Ecology; category: Environmental Science and Ecology)
Citations: 0
Abstract
Land cover classification mapping is the process of assigning labels to different types of land surfaces based on overhead imagery. However, acquiring reference samples through fieldwork for ground truth can be costly and time-intensive. Additionally, annotating high-resolution satellite images poses challenges, as certain land cover types are difficult to discern solely from nadir images. To address these challenges, this study examined the feasibility of using street-level imagery to support the collection of reference samples and identify land cover. We utilized 18,022 images captured in Japan, with 14 different land cover classes. Our approach involved using convolutional neural networks based on Inception-v4 and DenseNet, as well as the Transformer-based Vision Transformer and Swin Transformer, each with and without pre-trained weights and fine-tuning techniques. Additionally, we explored explainability through Gradient-Weighted Class Activation Mapping (Grad-CAM). Our results indicate that using a Vision Transformer was the most effective method, achieving an overall accuracy of 86.12% and allowing for full explainability of land cover targets within an image. This paper proposes a promising solution for land cover classification from street-level imagery, which can be used for semi-automatic reference sample collection from geo-tagged street-level photos.
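The abstract reports performance as overall accuracy, which is the share of images whose predicted class matches the reference class. The sketch below shows how that figure can be derived from a confusion matrix. It is only an illustration: the class names, labels, and helper functions (`confusion_matrix`, `overall_accuracy`) are hypothetical and are not taken from the paper, whose 14 classes and evaluation code are not given here.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Count predictions per class pair.

    Rows are reference (true) classes, columns are predicted classes.
    """
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def overall_accuracy(matrix):
    """Overall accuracy = correctly classified samples / all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy data: three made-up classes and six images.
classes = ["forest", "paddy", "urban"]
y_true = ["forest", "forest", "paddy", "urban", "urban", "paddy"]
y_pred = ["forest", "paddy", "paddy", "urban", "urban", "paddy"]

cm = confusion_matrix(y_true, y_pred, classes)
oa = overall_accuracy(cm)
print(f"Overall accuracy: {oa:.2%}")
```

In this toy case five of six images are classified correctly. The same ratio, computed over a held-out test set, is what a figure such as 86.12% summarizes; because it pools all classes, it can hide weak performance on rare classes, so per-class rows of the matrix are worth inspecting alongside it.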
About the journal:
Ecological Research has been published in English by the Ecological Society of Japan since 1986. Ecological Research publishes original papers on all aspects of ecology, in both aquatic and terrestrial ecosystems.