Assessing the potential of synthetic and ex situ airborne laser scanning and ground plot data to train forest biomass models

Forestry (IF 3; Zone 2, Agricultural and Forestry Sciences; JCR Q1, Forestry). Pub Date: 2023-12-05. DOI: 10.1093/forestry/cpad061
Jannika Schäfer, Lukas Winiwarter, Hannah Weiser, Jan Novotný, Bernhard Höfle, Sebastian Schmidtlein, Hans Henniger, Grzegorz Krok, Krzysztof Stereńczak, Fabian Ewald Fassnacht

Abstract

Airborne laser scanning data are increasingly used to predict forest biomass over large areas. Biomass information cannot be derived directly from airborne laser scanning data; therefore, field measurements of forest plots are required to build regression models. We tested whether simulated laser scanning data of virtual forest plots could be used to train biomass models and thereby reduce the amount of field measurements required. We compared the performance of models that were trained with (i) simulated data only, (ii) a combination of simulated and real data, (iii) real data collected from different study sites, and (iv) real data collected from the same study site to which the model was applied. We additionally investigated whether using a subset of the simulated data instead of all simulated data improved model performance. The best matching subset of the simulated data was sampled by selecting, for each real forest plot, the simulated forest plot with the highest correlation of the return height distribution profile. For comparison, a randomly selected subset was evaluated. Models were tested on four forest sites located in Poland, the Czech Republic, and Canada. Model performance was assessed by root mean squared error (RMSE), squared Pearson correlation coefficient ($r^{2}$), and mean error (ME) of observed and predicted biomass. We found that models trained solely with simulated data did not achieve the accuracy of models trained with real data (RMSE increase of 52–122%, $r^{2}$ decrease of 4–18%). However, model performance improved when only a subset of the simulated data was used (RMSE increase of 21–118%, $r^{2}$ decrease of 5–14% compared to the real data model), although the differences in model performance between the best matching subset and a randomly selected subset were small. Using simulated data for model training always resulted in a strong underprediction of biomass. Extending sparse real training datasets with simulated data decreased RMSE and increased $r^{2}$, as long as no more than 12–346 real training samples were available, depending on the study site. For three of the four study sites, models trained with real data collected from other sites outperformed models trained with simulated data, and their RMSE and $r^{2}$ were similar to those of models trained with data from the respective sites. Our results indicate that simulated data cannot yet replace real data, but they can be helpful at some sites to extend training datasets when only a limited amount of real data is available.
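The subset selection and the three evaluation metrics named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function names, the use of NumPy, the representation of a return height distribution profile as a fixed-length vector of per-bin return fractions, and the sign convention for ME (predicted minus observed, so negative values mean underprediction) are all assumptions made here for illustration.

```python
import numpy as np


def best_matching_subset(real_profiles, sim_profiles):
    """For each real plot, return the index of the simulated plot whose
    height profile has the highest Pearson correlation with it.

    Assumption: each profile is a vector of return fractions per height
    bin, and all profiles share the same bins.
    """
    real = np.asarray(real_profiles, dtype=float)
    sim = np.asarray(sim_profiles, dtype=float)

    # Center each profile so the dot product below is a covariance term.
    real_c = real - real.mean(axis=1, keepdims=True)
    sim_c = sim - sim.mean(axis=1, keepdims=True)

    # Row norms; a constant profile has norm 0, so guard the division
    # (its correlation with everything is then reported as 0).
    real_n = np.linalg.norm(real_c, axis=1, keepdims=True)
    sim_n = np.linalg.norm(sim_c, axis=1, keepdims=True)
    real_n[real_n == 0] = 1.0
    sim_n[sim_n == 0] = 1.0

    # Pearson correlation matrix, shape (n_real, n_sim).
    corr = (real_c / real_n) @ (sim_c / sim_n).T
    return corr.argmax(axis=1)


def evaluate(observed, predicted):
    """RMSE, squared Pearson correlation and mean error of predictions.

    Assumption: ME = mean(predicted - observed).
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    rmse = float(np.sqrt(np.mean((pred - obs) ** 2)))
    r2 = float(np.corrcoef(obs, pred)[0, 1] ** 2)
    me = float(np.mean(pred - obs))
    return {"rmse": rmse, "r2": r2, "me": me}


# Toy data (hypothetical, three height bins per profile).
real = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
sim = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]
matches = best_matching_subset(real, sim)

# Toy biomass values with a consistent 10% underprediction.
scores = evaluate([100.0, 200.0, 300.0], [90.0, 180.0, 270.0])
```

In the toy evaluation the predictions are perfectly correlated with the observations, so $r^{2}$ equals 1 even though every plot is underpredicted. This is why reporting ME alongside RMSE and $r^{2}$ matters for the underprediction described above. Note also that one simulated plot may be selected for several real plots; whether duplicates were kept in the study is not stated in the abstract.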
Source journal: Forestry (Agricultural and Forestry Sciences, Forestry)
CiteScore: 6.70
Self-citation rate: 7.10%
Articles published: 47
Review time: 12-24 weeks
Journal description: The journal is inclusive of all subjects, geographical zones and study locations, including trees in urban environments, plantations and natural forests. We welcome papers that consider economic, environmental and social factors and, in particular, studies that take an integrated approach to sustainable management. In considering suitability for publication, attention is given to the originality of contributions and their likely impact on policy and practice, as well as their contribution to the development of knowledge. Special Issues: each year one edition of Forestry will be a Special Issue and will focus on one subject in detail; this will usually be by publication of the proceedings of an international meeting.