S. Gruszczyński. Geomatics and Environmental Engineering. Published 2021-02-04. https://doi.org/10.7494/GEOM.2021.15.1.63
An Evaluation of Some Machine Learning Algorithms as Tools for Predicting Soil Characteristics Based on Their Spectral Response in the Vis‑NIR Range
Using the Land Use and Coverage Area frame Survey (LUCAS) database of properties of the surface soil layer in Europe, statistical and machine learning models were compared for predicting several key soil characteristics (clay content, pH in CaCl2, concentrations of organic carbon, calcium carbonate and nitrogen, and cation exchange capacity) from spectral responses in the visible (Vis) and near‑infrared (NIR) ranges. Three standard methods of relationship modeling were used: stepwise regression, partial least squares regression, and linear regression on inputs obtained from principal component analysis. Various machine learning algorithms were then trained on the inputs extracted by these statistical algorithms. The usefulness of the models was assessed by comparing their coefficients of determination, root mean square errors, and distributions of residuals. For the stacked model combining the multilayer perceptron and the distributed random forest, the root mean square errors of estimation in cross‑validation were as follows: clay content, ca. 4.5%; pH, ca. 0.35; soil organic carbon (SOC), ca. 7.5 g/kg (0.75% by weight); CaCO3 content, ca. 19 g/kg; N content, ca. 0.50 g/kg; and cation exchange capacity (CEC), ca. 3.5 cmol(+)/kg.
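The evaluation criteria named in the abstract (coefficient of determination, root mean square error, and residuals from cross‑validation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than taken from LUCAS, a one‑variable least squares line stands in for the paper's regression and machine learning models, and the function names (`rmse`, `r_squared`, `fit_line`, `cross_val_predict`) are hypothetical.

```python
import math
from statistics import mean


def rmse(observed, predicted):
    """Root mean square error, in the same units as the target."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    o_bar = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - o_bar) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in model only)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def cross_val_predict(x, y, k=5):
    """Return out-of-fold predictions from k-fold cross-validation."""
    n = len(x)
    predictions = [0.0] * n
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample, offset by fold
        x_train = [x[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        a, b = fit_line(x_train, y_train)
        for i in test_idx:
            predictions[i] = a + b * x[i]
    return predictions


# Synthetic data (not LUCAS): one made-up spectral feature vs. SOC in g/kg.
feature = [0.10 * i for i in range(20)]
soc_g_per_kg = [
    5.0 + 12.0 * f + (1.5 if i % 2 else -1.5) for i, f in enumerate(feature)
]

predicted = cross_val_predict(feature, soc_g_per_kg, k=5)
residuals = [o - p for o, p in zip(soc_g_per_kg, predicted)]

print(f"RMSE: {rmse(soc_g_per_kg, predicted):.2f} g/kg")
print(f"R^2:  {r_squared(soc_g_per_kg, predicted):.3f}")
print(f"largest |residual|: {max(abs(r) for r in residuals):.2f} g/kg")
```

Because every prediction comes from a model that never saw the sample it predicts, the resulting RMSE is expressed in the units of the soil property itself, which is why errors such as 7.5 g/kg for SOC can be read directly against measured concentrations.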