Attention-Based Semantic Segmentation Networks for Forest Applications

Forests (IF 2.4; CAS Zone 2, Agricultural and Forestry Sciences; JCR Q1, FORESTRY) Pub Date: 2023-12-14 DOI: 10.3390/f14122437

See Ven Lim, M. A. Zulkifley, Azlan Saleh, A. H. Saputro, Siti Raihanah Abdani



Abstract

Deforestation remains a major concern worldwide, driven by commodity extraction, agricultural land expansion, and urbanization. Effective and efficient monitoring of national forests using remote sensing technology is important for the early detection and mitigation of deforestation activities. Deep learning techniques have been extensively researched and applied to various remote sensing tasks; fully convolutional neural networks in particular have been studied with various input band combinations for satellite imagery applications, but very little research has focused on deep networks with high-resolution representations, such as HRNet. In this study, an optimized semantic segmentation architecture based on high-resolution feature maps and an attention mechanism is proposed to label each pixel of the satellite imagery input for forest identification. The selected study areas are located in Malaysian rainforests, with imagery from 2016, 2018, and 2020 downloaded using Google Earth Pro. Only a two-class problem is considered: each pixel is classified as either forest or non-forest. HRNet is chosen as the baseline architecture; its hyperparameters are optimized before an attention mechanism is embedded to help the model focus on the more critical forest-related features. Several variants of the proposed method are validated on 6120 sliced images, with the best performance reaching 85.58% mean intersection over union and 92.24% accuracy. The benchmarking analysis also reveals that the attention-embedded high-resolution architecture outperforms U-Net, SegNet, and FC-DenseNet on both performance metrics. A qualitative comparison of the baseline and attention-based models also shows fewer false classifications and cleaner prediction outputs in identifying forest areas.




Source journal: Forests (FORESTRY)

CiteScore: 4.40
Self-citation rate: 17.20%
Articles published: 1823
Review time: 19.02 days

About the journal: Forests (ISSN 1999-4907) is an international and cross-disciplinary scholarly journal of forestry and forest ecology. It publishes research papers, short communications and review papers. There is no restriction on the length of the papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental and/or methodical details must be provided for research articles.