Developing the Swiss mid-infrared soil spectral library for local estimation and monitoring
Philipp Baumann, Anatol Helfenstein, A. Gubler, A. Keller, R. Meuli, Daniel Wächter, Juhwan Lee, R. V. Viscarra Rossel, J. Six
DOI: 10.5194/soil-7-525-2021 (https://doi.org/10.5194/soil-7-525-2021)
Published: 2021-08-18
Citations: 11
Abstract
Information on soil composition and on the physical, chemical and biological properties of soils is paramount to elucidating agroecosystem functioning in space and over time. For this purpose, we developed a national Swiss soil spectral library (SSL; n = 4374) in the mid-infrared (mid-IR), calibrating 16 properties from legacy measurements on soils from the Swiss Biodiversity Monitoring program (BDM; n = 3778; 1094 sites) and the Swiss long-term Soil Monitoring Network (NABO; n = 596; 71 sites). General models were trained with the interpretable rule-based learner CUBIST, testing combinations of {5, 10, 20, 50 and 100} ensembles of rules (committees) and {2, 5, 7 and 9} nearest neighbors used for local averaging, with repeated 10-fold cross-validation grouped by location. To evaluate how well the information in spectra can facilitate long-term soil monitoring at the plot level, we conducted 71 model transfers for the NABO sites to induce locally relevant information from the SSL, using the data-driven sample selection method RS-LOCAL. In total, 10 soil properties were estimated with discrimination capacity suitable for screening (R² ≥ 0.72; ratio of performance to interquartile distance (RPIQ) ≥ 2.0), of which total carbon (C), organic C (OC), total nitrogen (N), pH and clay showed accuracy eligible for accurate diagnostics (R² > 0.8; RPIQ ≥ 3.0). CUBIST and the spectra estimated total C accurately, with a root mean square error (RMSE) of 8.4 g kg⁻¹ and an RPIQ of 4.3 (measured range 1–583 g kg⁻¹), and OC with an RMSE of 9.3 g kg⁻¹ and an RPIQ of 3.4 (measured range 0–583 g kg⁻¹). Compared to the general statistical learning approach, the local transfer approach, using two respective training samples, on average reduced the RMSE of total C per site fourfold. We found that the selected SSL subsets were highly dissimilar to the validation samples, in terms of both their spectral input space and their measured values. This suggests that data-driven selection with RS-LOCAL leverages chemical diversity in composition rather than similarity. Our results suggest that mid-IR soil estimates were sufficiently accurate to support many soil applications that require a large volume of input data, such as precision agriculture, soil C accounting and monitoring, and digital soil mapping. This SSL can be updated continuously, for example with samples from deeper profiles and organic soils, so that the measurement of key soil properties becomes even more accurate and efficient in the near future.
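The evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the quartile convention and the round-robin fold assignment are assumptions made for illustration, and the sketch does not fit CUBIST itself. It shows the RMSE, the RPIQ (interquartile range of the measured values divided by the RMSE), a fold assignment that keeps all samples from one location in the same fold, and the 5 × 4 tuning grid of committees and nearest neighbors.

```python
import itertools
import math
import random
import statistics


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpiq(measured, predicted):
    """Interquartile range of the measured values divided by the RMSE."""
    # Assumption: quartiles use the "inclusive" method; other quantile
    # conventions shift the result slightly.
    q1, _, q3 = statistics.quantiles(measured, n=4, method="inclusive")
    return (q3 - q1) / rmse(measured, predicted)


def grouped_folds(locations, k, seed=0):
    """Give each sample a fold index so that no location spans two folds."""
    unique = sorted(set(locations))
    random.Random(seed).shuffle(unique)
    # Hypothetical scheme: shuffled locations are dealt round-robin to folds.
    fold_of = {loc: i % k for i, loc in enumerate(unique)}
    return [fold_of[loc] for loc in locations]


# Tuning grid from the abstract: (committees, nearest neighbors) pairs.
GRID = list(itertools.product((5, 10, 20, 50, 100), (2, 5, 7, 9)))
```

In a grouped scheme like this, each of the 20 grid entries would be scored on folds whose locations were never seen in training, which avoids overly optimistic errors when several samples come from the same site.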
About the journal:
Cessation. Soil Science satisfies the professional needs of all scientists and laboratory personnel involved in soil and plant research by publishing primary research reports and critical reviews of basic and applied soil science, especially as it relates to soil and plant studies and general environmental soil science.
Each month, Soil Science presents authoritative research articles from an impressive array of disciplines: soil chemistry and biochemistry, physics, fertility and nutrition, soil genesis and morphology, soil microbiology and mineralogy. Of immediate relevance to soil scientists, both industrial and academic, this unique publication also has long-range value for agronomists and environmental scientists.