Robust filling of extra-long gaps in eddy covariance CO₂ flux measurements from a temperate deciduous forest using eXtreme Gradient Boosting
Yujie Liu, Benjamin Lucas, Darby D. Bergl, Andrew D. Richardson
Agricultural and Forest Meteorology, volume 364, Article 110438. Published 2025-02-16.
DOI: 10.1016/j.agrformet.2025.110438
URL: https://www.sciencedirect.com/science/article/pii/S0168192325000589
Journal impact factor: 5.6 (JCR Q1, Agronomy)
Abstract
Eddy covariance measurements are often subject to missing values, or gaps, in the data record. Methods to fill short gaps are well established, but robustly filling gaps longer than a few weeks remains a challenge. Marginal Distribution Sampling (MDS) is a standard gap-filling method, but its effectiveness for long gaps (> 30 days) is limited. We compared the performance of a machine learning algorithm, eXtreme Gradient Boosting (XGB), against MDS using various artificial scenarios of gap length and location. We gap-filled half-hourly CO₂ flux from a temperate deciduous forest, Bartlett Experimental Forest, from 2010 to 2022. Whereas the standard implementation of MDS uses a narrowly prescribed set of predictor variables, with XGB we were able to include additional variables. The Green Chromatic Coordinate (GCC), derived from PhenoCam imagery, and diffuse photosynthetic photon flux density emerged as two of the three most important predictor variables. Compared to MDS, the root mean square error (RMSE) of XGB was 9.5 % lower and the R² 2.7 % higher in a randomized 10-fold cross-validation test. XGB outperformed MDS for both daytime and nighttime across different seasons. However, annual integrals of net ecosystem exchange (NEE) varied across methods, with weaker annual net carbon uptake for XGB, by -110 ± 74 g C m⁻² yr⁻¹, compared to MDS (214 ± 11 g C m⁻² yr⁻¹). In artificial gap experiments, when trained using the 13-year data record, XGB reliably filled gaps, showing little change in RMSE for gaps up to 240 days. In contrast, the performance of MDS steadily decreased as gap length increased, and MDS was unable to fill gaps longer than 2 months. In summary, XGB demonstrates excellent performance as an alternative to MDS, providing reliable predictions of temperate deciduous forest carbon fluxes under different gap length and location scenarios. Implementation of XGB is facilitated by easy-to-use packages.
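The artificial-gap evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It runs on a synthetic half-hourly series rather than the Bartlett record, and the filler is a deliberately simple stand-in (the mean of the remaining values at the same half-hour slot), not MDS or XGB. All function names (`make_gap`, `fill_by_slot_mean`, `score_gap`, `synthetic_flux`) are hypothetical. The sketch shows only the scoring logic: withhold a contiguous block of known values, predict them, and compare predictions with the withheld observations using RMSE and R².

```python
import math
import random

SLOTS_PER_DAY = 48  # half-hourly records, as in the abstract


def rmse(obs, pred):
    """Root mean square error between paired observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def make_gap(n_records, start_day, length_days):
    """Indices of one contiguous artificial gap, clipped to the record end."""
    start = start_day * SLOTS_PER_DAY
    stop = min(start + length_days * SLOTS_PER_DAY, n_records)
    return set(range(start, stop))


def fill_by_slot_mean(flux, gap):
    """Stand-in filler (assumption, not MDS or XGB): mean of the non-gap
    values observed at the same half-hour slot of the day."""
    sums = [0.0] * SLOTS_PER_DAY
    counts = [0] * SLOTS_PER_DAY
    for i, value in enumerate(flux):
        if i not in gap:
            slot = i % SLOTS_PER_DAY
            sums[slot] += value
            counts[slot] += 1
    return {i: sums[i % SLOTS_PER_DAY] / counts[i % SLOTS_PER_DAY]
            for i in sorted(gap)}


def synthetic_flux(n_days, seed=0):
    """Synthetic flux: constant respiration, daytime uptake, Gaussian noise."""
    rng = random.Random(seed)
    series = []
    for i in range(n_days * SLOTS_PER_DAY):
        slot = i % SLOTS_PER_DAY
        daylight = max(0.0, math.sin(math.pi * (slot - 12) / 24))
        series.append(2.0 - 10.0 * daylight + rng.gauss(0.0, 1.0))
    return series


def score_gap(flux, start_day, length_days):
    """Withhold one artificial gap, fill it, and score against the truth."""
    gap = make_gap(len(flux), start_day, length_days)
    filled = fill_by_slot_mean(flux, gap)
    idx = sorted(gap)
    obs = [flux[i] for i in idx]
    pred = [filled[i] for i in idx]
    return rmse(obs, pred), r_squared(obs, pred)


flux = synthetic_flux(n_days=120)
results = {length: score_gap(flux, start_day=30, length_days=length)
           for length in (7, 30, 60)}
for length, (err, r2) in results.items():
    print(f"gap of {length:>2} days: RMSE = {err:.2f}, R2 = {r2:.2f}")
```

Because the synthetic series has no seasonal trend, the stand-in filler degrades little with gap length here. With real flux data, the filler function would be replaced by the method under test, and the gap positions would be varied across seasons as the abstract describes.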
About the journal:
Agricultural and Forest Meteorology is an international journal for the publication of original articles and reviews on the inter-relationship between meteorology, agriculture, forestry, and natural ecosystems. Emphasis is on basic and applied scientific research relevant to practical problems in the field of plant and soil sciences, ecology and biogeochemistry as affected by weather as well as climate variability and change. Theoretical models should be tested against experimental data. Articles must appeal to an international audience. Special issues devoted to single topics are also published.
Typical topics include canopy micrometeorology (e.g., canopy radiation transfer, turbulence near the ground, evapotranspiration, energy balance, fluxes of trace gases), micrometeorological instrumentation (e.g., sensors for trace gases, flux measurement instruments, radiation measurement techniques), aerobiology (e.g., the dispersion of pollen, spores, insects and pesticides), biometeorology (e.g., the effect of weather and climate on plant distribution, crop yield, water-use efficiency, and plant phenology), forest-fire/weather interactions, and feedbacks from vegetation to weather and the climate system.