Machine learning feature importance selection for predicting aboveground biomass in African savannah with Landsat 8 and ALOS PALSAR data

Sa'ad Ibrahim, Heiko Balzter, Kevin Tansey
Journal: Machine learning with applications, Volume 16, Article 100561
Published: 2024-05-16
DOI: 10.1016/j.mlwa.2024.100561
Full text: https://www.sciencedirect.com/science/article/pii/S2666827024000379
Citations: 0

Machine learning feature importance selection for predicting aboveground biomass in African savannah with landsat 8 and ALOS PALSAR data

In remote sensing, multiple input bands are derived from various sensors covering different regions of the electromagnetic spectrum. Each spectral band plays a unique role in land use/land cover characterization. For example, while integrating multiple sensors for predicting aboveground biomass (AGB) is important for achieving high accuracy, reducing the dataset size by eliminating redundant and irrelevant spectral features is essential for enhancing the performance of machine learning algorithms. This accelerates the learning process and yields simpler, more efficient models. Our results indicate that, compared with the individual sensor datasets, the random forest (RF) classification approach using recursive feature elimination (RFE) increased the accuracy based on F score by 82.86% and 26.19%, respectively. The mutual information regression (MIR) method shows a slight increase in accuracy when individual sensor datasets are considered, but its accuracy decreases when all features are taken into account for all models. Overall, the combination of features from Landsat 8, ALOS PALSAR backscatter, and elevation data selected using RFE provided the best AGB estimation for the RF and XGBoost models. In contrast, for k-nearest neighbors (KNN) and support vector machines (SVM), no significant improvement in AGB estimation was detected even when RFE and MIR were used. The effect of parameter optimization was found to be more significant for RF than for any of the other methods. The AGB maps show patterns of AGB estimates consistent with those of the reference dataset. This study shows how prediction errors can be minimized through feature selection using different ML classifiers.
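The two feature selection strategies named in the abstract, RFE wrapped around a random forest and ranking by mutual information regression, can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline: the data are synthetic, the ten columns merely stand in for hypothetical optical, radar, and elevation predictors, and the number of retained features and the forest size are arbitrary assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, mutual_info_regression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a stacked predictor table (assumption, not real data):
# 10 candidate features, of which only columns 0, 1 and 2 drive the target.
rng = np.random.default_rng(0)
n_samples, n_features = 300, 10
X = rng.normal(size=(n_samples, n_features))
y = (3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 - 1.5 * X[:, 2]
     + rng.normal(scale=0.1, size=n_samples))


def select_with_rfe(X, y, n_keep):
    """Recursive feature elimination: refit the forest and drop the least
    important feature one at a time until n_keep features remain."""
    estimator = RandomForestRegressor(n_estimators=50, random_state=0)
    selector = RFE(estimator, n_features_to_select=n_keep, step=1)
    selector.fit(X, y)
    return selector.support_  # boolean mask over the columns of X


def select_with_mir(X, y, n_keep):
    """Filter method: keep the n_keep features with the highest estimated
    mutual information with the target."""
    scores = mutual_info_regression(X, y, random_state=0)
    mask = np.zeros(X.shape[1], dtype=bool)
    mask[np.argsort(scores)[-n_keep:]] = True
    return mask


def cv_r2(X, y, mask):
    """Mean 5-fold cross-validated R^2 of a forest trained on the masked columns."""
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    return cross_val_score(model, X[:, mask], y, cv=5, scoring="r2").mean()


rfe_mask = select_with_rfe(X, y, n_keep=3)
mir_mask = select_with_mir(X, y, n_keep=3)
all_mask = np.ones(n_features, dtype=bool)

for name, mask in [("all", all_mask), ("RFE", rfe_mask), ("MIR", mir_mask)]:
    print(name, np.flatnonzero(mask), round(cv_r2(X, y, mask), 3))
```

RFE is a wrapper method, so its choice depends on the model it wraps, whereas the mutual information ranking is model-agnostic. That difference is one plausible reason the two can behave differently across learners, though the sketch itself does not test that.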
