Analysis of the hyperparameter optimisation of four machine learning satellite imagery classification methods

Computational Geosciences · IF 2.1 · Zone 3 (Earth Sciences) · JCR Q3 (Computer Science, Interdisciplinary Applications) · Pub Date: 2024-04-05 · DOI: 10.1007/s10596-024-10285-y
Francisco Alonso-Sarría, Carmen Valdivieso-Ros, Francisco Gomariz-Castillo
Citations: 0

Abstract

The classification of land use and land cover (LULC) from remotely sensed imagery in semi-arid Mediterranean areas is a challenging task due to the fragmentation of the landscape and the diversity of spatial patterns. Recently, the use of deep learning (DL) for image analysis has increased compared to commonly used machine learning (ML) methods. This paper compares the performance of four algorithms, Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP) and Convolutional Network (CNN), using multi-source data and applying an exhaustive optimisation of the hyperparameters. The usual approach when optimising a LULC classification model is to keep the best model in terms of accuracy without analysing the rest of the results. In this study, we have analysed those results and found noteworthy patterns in a space defined by the mean and standard deviation of the validation accuracy estimated in a 10-fold cross-validation (CV). The point distributions in this space do not appear to be completely random; they show clusters of points that help identify hyperparameter values that tend to increase the mean accuracy and decrease its standard deviation. RF is not the most accurate model, but it is the least sensitive to changes in hyperparameters. Neural networks tend to increase the commission and omission errors of the less represented classes because their optimisation leads the model to learn the most frequent classes better. On the other hand, the RF and MLP prediction layers are the most accurate from a general qualitative point of view.
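The analysis described in the abstract, placing every hyperparameter combination in a space defined by the mean and standard deviation of its 10-fold CV accuracy, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, a small k-nearest-neighbour classifier stands in for RF, SVM, MLP and CNN, and every name here (`make_data`, `knn_predict`, `cv_mean_std`, the grid values) is hypothetical. The standard deviation is the sample standard deviation across folds, which is also an assumption.

```python
import random
import statistics
from itertools import product


def make_data(n_per_class=60, seed=0):
    """Synthetic 2-feature, 3-class data (hypothetical stand-in for LULC samples)."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (2.0, 2.0), (0.0, 3.0)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centres):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, point, k, weighted):
    """Classify one point with k nearest neighbours (stand-in model, not from the paper)."""
    nearest = sorted(
        ((px - point[0]) ** 2 + (py - point[1]) ** 2, label)
        for (px, py), label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for d2, label in nearest:
        # Optional inverse-squared-distance weighting of each neighbour's vote.
        votes[label] = votes.get(label, 0.0) + (1.0 / (d2 + 1e-9) if weighted else 1.0)
    return max(votes, key=votes.get)


def kfold_indices(n, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cv_mean_std(X, y, k, weighted, n_folds=10):
    """Return (mean, sample std) of validation accuracy over the folds."""
    accuracies = []
    for fold in kfold_indices(len(X), n_folds):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        hits = sum(knn_predict(train_X, train_y, X[i], k, weighted) == y[i] for i in fold)
        accuracies.append(hits / len(fold))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


X, y = make_data()
grid = {"k": [1, 3, 7, 15], "weighted": [False, True]}  # illustrative grid only

# Keep every combination, not just the best one, so the whole
# (mean, std) point cloud can be inspected afterwards.
results = []
for k, weighted in product(grid["k"], grid["weighted"]):
    mean_acc, std_acc = cv_mean_std(X, y, k, weighted)
    results.append({"k": k, "weighted": weighted, "mean": mean_acc, "std": std_acc})

# Higher mean first, lower std as tie-breaker.
for row in sorted(results, key=lambda r: (-r["mean"], r["std"])):
    print(f"k={row['k']:>2} weighted={row['weighted']!s:<5} "
          f"mean={row['mean']:.3f} std={row['std']:.3f}")
```

Plotting `mean` against `std` for all rows in `results`, coloured by one hyperparameter at a time, is one way to look for the kind of clusters the abstract mentions; the exact visualisation used in the paper is not stated here.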

Source journal: Computational Geosciences
Subject category: Earth Sciences - Geosciences, Multidisciplinary
CiteScore: 6.10
Self-citation rate: 4.00%
Articles published: 63
Review time: 6-12 weeks
Journal description: Computational Geosciences publishes high-quality papers on mathematical modeling, simulation, numerical analysis, and other computational aspects of the geosciences. In particular, the journal is focused on advanced numerical methods for the simulation of subsurface flow and transport, and associated aspects such as discretization, gridding, upscaling, optimization, data assimilation, uncertainty assessment, and high performance parallel and grid computing. Papers treating similar topics but with applications to other fields in the geosciences, such as geomechanics, geophysics, oceanography, or meteorology, will also be considered. The journal provides a platform for interaction and multidisciplinary collaboration among diverse scientific groups, from both academia and industry, which share an interest in developing mathematical models and efficient algorithms for solving them, such as mathematicians, engineers, chemists, physicists, and geoscientists.