Spectral data augmentation for leaf nutrient uptake quantification
R.C. Martins, C. Queirós, F.M. Silva, F. Santos, T.G. Barroso, R. Tosin, M. Cunha, M. Leão, M. Damásio, P. Martins, J. Silvestre
Biosystems Engineering, volume 246, pages 82-95, published 2024-07-27. DOI: 10.1016/j.biosystemseng.2024.07.001
https://www.sciencedirect.com/science/article/pii/S153751102400148X
Abstract
Data scarcity is a hurdle for physiology-based precision agriculture. Measuring nutrient uptake by visible-near infrared spectroscopy requires collecting spectral and compositional data with low-throughput methods, such as inductively coupled plasma optical emission spectroscopy. This paper introduces data augmentation in spectroscopy by hybridisation, which expands real-world data into synthetic datasets statistically representative of the real data, allowing the quantification of macronutrients (N, P, K, Ca, Mg, and S) and micronutrients (Fe, Mn, Zn, Cu, and B). Partial least squares (PLS), local partial least squares (LocPLS), and self-learning artificial intelligence (SLAI) were used to determine the capacity to expand the knowledge base. PLS using only real-world data (RWD) cannot quantify some nutrients (N and Cu in grapevine leaves and K, Ca, Mg, S, and Cu in apple tree leaves). The synthetic dataset of the study allowed predicting the real-world leaf composition of macronutrients (N, P, K, Ca, Mg, and S) (Pearson correlation coefficient (R) ∼ 0.61–0.94 and standard error (SE) ∼ 0.04–0.05%) and micronutrients (Fe, Mn, Zn, Cu, and B) (R ∼ 0.66–0.91 and SE ∼ 0.88–3.98 ppm) in grapevine leaves using LocPLS and SLAI. The synthetic dataset loses significance if the real-world counterpart has low representativity, resulting in poor quantification of macronutrients (R ∼ 0.51–0.72 and SE ∼ 0.02–0.13%) and micronutrients (R ∼ 0.53–0.76 and SE ∼ 8.89–37.89 ppm), and not allowing S quantification (R = 0.37, SE = 0.01) in apple tree leaves. Representative real-world sampling makes data augmentation in spectroscopy very efficient at expanding the knowledge base and nutrient quantification.
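The abstract reports model quality as a Pearson correlation coefficient (R) and a standard error (SE) between measured and predicted leaf composition. The sketch below shows one way such figures could be computed. It is illustrative only: the abstract does not define SE, so the root-mean-square prediction residual used here is an assumption, the function names `pearson_r` and `standard_error` are hypothetical, and the sample values are made up rather than taken from the study.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def standard_error(measured, predicted):
    """Root-mean-square prediction residual.

    ASSUMPTION: the abstract does not say how SE is defined; this is one
    common convention, not necessarily the one used in the paper.
    """
    n = len(measured)
    if n != len(predicted) or n < 1:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


if __name__ == "__main__":
    # Hypothetical nitrogen contents (% dry weight), NOT data from the study.
    measured = [2.0, 2.5, 3.0, 3.5]
    predicted = [2.1, 2.4, 3.2, 3.4]
    print("R  =", round(pearson_r(measured, predicted), 3))
    print("SE =", round(standard_error(measured, predicted), 3), "%")
```

SE carries the unit of the quantity being predicted, which is why the abstract gives it in % for macronutrients and in ppm for micronutrients, while R is dimensionless.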
About the journal:
Biosystems Engineering publishes research in engineering and the physical sciences that represents advances in the understanding or modelling of the performance of biological systems for sustainable developments in land use and the environment, agriculture and amenity, bioproduction processes and the food chain. The subject matter of the journal reflects the wide range and interdisciplinary nature of research in engineering for biological systems.