Effect of training sample size, sampling design and prediction model on soil mapping with proximal sensing data for precision liming

IF 6.6 | CAS Zone 2 (Agricultural and Forestry Sciences) | JCR Q1 (AGRICULTURE, MULTIDISCIPLINARY) | Precision Agriculture | Pub Date: 2024-02-24 | DOI: 10.1007/s11119-024-10122-3


Citations: 0

Abstract

Site-specific estimation of lime requirement requires high-resolution maps of soil organic carbon (SOC), clay and pH. These maps can be generated with digital soil mapping models fitted on covariates observed by proximal soil sensors. However, the quality of the derived maps depends on the applied methodology. We assessed the effects of (i) training sample size (5–100); (ii) sampling design (simple random sampling (SRS), conditioned Latin hypercube sampling (cLHS) and k-means sampling (KM)); and (iii) prediction model (multiple linear regression (MLR) and random forest (RF)) on the prediction performance for the three above-mentioned soil properties. The case study is based on conditional geostatistical simulations using 250 soil samples from a 51 ha field in Eastern Germany. Lin’s concordance correlation coefficient (CCC) and root-mean-square error (RMSE) were used to evaluate model performance. Results show that with increasing training sample size, the relative improvements in RMSE and CCC decreased exponentially. We found the lowest median RMSE values with 100 training observations, i.e., 1.73%, 0.21% and 0.3 for clay, SOC and pH, respectively. However, even with a sample size of 10, models of moderate quality (CCC > 0.65) were obtained for all three soil properties. cLHS and KM performed significantly better than SRS. MLR showed lower median RMSE values than RF for SOC and pH at smaller sample sizes, but RF outperformed MLR if at least 25–30 or 75–100 soil samples were used for SOC or pH, respectively. For clay, the median RMSE was lower with RF, regardless of sample size.
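The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: RMSE as the square root of the mean squared difference, and Lin's CCC as twice the covariance divided by the sum of both variances plus the squared difference of the means, with 1/n moments. The function names and the pH values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def lins_ccc(observed, predicted):
    """Lin's concordance correlation coefficient, using 1/n moments.

    Unlike Pearson's r, the denominator penalizes a shift between the
    two means, so biased predictions score below 1.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


# Hypothetical pH values, invented purely for illustration.
observed = [5.8, 6.1, 6.4, 6.9, 7.2]
predicted = [6.0, 6.0, 6.5, 6.7, 7.3]

print(f"RMSE: {rmse(observed, predicted):.3f}")
print(f"CCC:  {lins_ccc(observed, predicted):.3f}")
```

RMSE is expressed in the units of the soil property, which is why the abstract reports it in percent for clay and SOC and as a plain number for pH, whereas CCC is unitless and can be compared across properties.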




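Source journal

Precision Agriculture (Agricultural and Forestry Sciences: Agriculture, Multidisciplinary)

CiteScore: 12.30

Self-citation rate: 8.10%

Articles published: 103

Review time: >24 weeks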

Journal description: Precision Agriculture promotes the most innovative results coming from research in the field of precision agriculture. It provides an effective forum for disseminating original and fundamental research and experience in the rapidly advancing area of precision farming. There are many topics in the field of precision agriculture; the topics addressed include, but are not limited to:

Natural Resources Variability: Soil and landscape variability, digital elevation models, soil mapping, geostatistics, geographic information systems, microclimate, weather forecasting, remote sensing, management units, scale, etc.

Managing Variability: Sampling techniques, site-specific nutrient and crop protection chemical recommendation, crop quality, tillage, seed density, seed variety, yield mapping, remote sensing, record keeping systems, data interpretation and use, crops (corn, wheat, sugar beets, potatoes, peanut, cotton, vegetables, etc.), management scale, etc.

Engineering Technology: Computers, positioning systems, DGPS, machinery, tillage, planting, nutrient and crop protection implements, manure, irrigation, fertigation, yield monitor and mapping, soil physical and chemical characteristic sensors, weed/pest mapping, etc.

Profitability: MEY, net returns, BMPs, optimum recommendations, crop quality, technology cost, sustainability, social impacts, marketing, cooperatives, farm scale, crop type, etc.

Environment: Nutrient, crop protection chemicals, sediments, leaching, runoff, practices, field, watershed, on/off farm, artificial drainage, ground water, surface water, etc.

Technology Transfer: Skill needs, education, training, outreach, methods, surveys, agri-business, producers, distance education, Internet, simulation models, decision support systems, expert systems, on-farm experimentation, partnerships, quality of rural life, etc.