Abstract. Understanding long-term terrestrial water storage (TWS) variations is vital for investigating hydrological extreme events, managing water resources, and assessing climate change impacts. However, the short record of the Gravity Recovery and Climate Experiment (GRACE) and its follow-on mission (GRACE-FO) poses challenges for comprehensive long-term analysis. In this study, we reconstruct TWS anomalies (TWSA) for the period January 1960 to December 2022, thereby filling the data gap between the GRACE and GRACE-FO missions and generating a complete dataset for the pre-GRACE era. The workflow involves identifying optimal predictors from land surface model (LSM) outputs, meteorological variables, and climatic indices using a novel Bayesian Network (BN) technique for grid-based TWSA simulations. Climate indices, such as the Oceanic Niño Index and the Dipole Mode Index, are selected as optimal predictors for a large number of grids globally, along with TWSA from LSM outputs. Convolutional Neural Network (CNN), Support Vector Regression (SVR), Extra Trees Regressor (ETR), and Stacking Ensemble Regression (SER) models are evaluated at each grid location to identify the most effective machine learning (ML) algorithm and achieve optimal reproducibility. Globally, ETR performs best for most grids, a pattern also seen at the river-basin scale, particularly for the Ganga-Brahmaputra-Meghana, Godavari, Krishna, Limpopo, and Nile river basins. The simulated TWSA (BNML_TWSA) outperformed the TWSA from LSM outputs when evaluated against GRACE datasets. Improvements are particularly notable in river basins such as the Godavari, Krishna, Danube, and Amazon; for all grids in the Godavari basin, India, the median correlation coefficient, Nash-Sutcliffe efficiency, and RMSE are 0.927, 0.839, and 63.7 mm, respectively.
A comparison with TWSA reconstructed in recent studies indicates that the proposed BNML_TWSA outperforms them both globally and in all 11 major river basins examined. The dataset is published at https://doi.org/10.6084/m9.figshare.25376695 (Mandal et al., 2024), and updates will be published when needed.
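The three evaluation metrics named in the abstract (correlation coefficient, Nash-Sutcliffe efficiency, and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the six monthly TWSA values are hypothetical and serve only to show how the metrics are defined.

```python
import math

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def rmse(obs, sim):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

# Hypothetical monthly TWSA values in mm for one grid cell (illustrative only).
grace_twsa = [-40.0, -10.0, 25.0, 60.0, 30.0, -20.0]
simulated_twsa = [-35.0, -15.0, 20.0, 55.0, 35.0, -25.0]

print(pearson_r(grace_twsa, simulated_twsa))
print(nse(grace_twsa, simulated_twsa))
print(rmse(grace_twsa, simulated_twsa))
```

In a grid-based evaluation, these functions would be applied per grid cell and the medians taken over all cells in a basin, which is how the Godavari figures above are summarized.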
Optimal feature selection for improved ML based reconstruction of Global Terrestrial Water Storage Anomalies
Nehar Mandal, Prabal Das, Kironmala Chanda
Earth System Science Data. Pub Date: 2024-05-17, DOI: 10.5194/essd-2024-109
Abstract. Developing air quality management systems to control the impacts of air pollution requires reliable data. However, current initiatives do not provide datasets with large spatial and temporal resolutions for developing air pollution policies in Brazil. Here, we introduce the Brazilian Atmospheric Inventories (BRAIN), the first comprehensive database of air quality and its drivers in Brazil. BRAIN encompasses hourly datasets of meteorology, emissions, and air quality. The emissions dataset includes vehicular emissions derived from the Brazilian Vehicular Emissions Inventory Software (BRAVES), industrial emissions produced with local data from the Brazilian environmental agencies, biomass burning emissions from FINN – Fire INventory from the National Center for Atmospheric Research (NCAR), and biogenic emissions from the Model of Emissions of Gases and Aerosols from Nature (MEGAN) (https://doi.org/10.57760/sciencedb.09858, Hoinaski et al., 2023a; https://doi.org/10.57760/sciencedb.09886, Hoinaski et al., 2023b). The meteorology dataset has been derived from the Weather Research and Forecasting Model (WRF) (https://doi.org/10.57760/sciencedb.09857, Hoinaski and Will, 2023a; https://doi.org/10.57760/sciencedb.09885, Hoinaski and Will, 2023c). The air quality dataset contains the surface concentration of 216 air pollutants produced from coupling meteorological and emissions datasets with the Community Multiscale Air Quality Modeling System (CMAQ) (https://doi.org/10.57760/sciencedb.09859, Hoinaski and Will, 2023b; https://doi.org/10.57760/sciencedb.09884, Hoinaski and Will, 2023d). We provide gridded data in two domains, one covering the Brazilian territory with 20×20 km spatial resolution and another covering southern Brazil with 4×4 km spatial resolution. This paper describes how the datasets were produced, their limitations, and their spatiotemporal features. 
To evaluate the quality of the database, we compare the air quality dataset with 244 air quality monitoring stations, providing the model's performance for each pollutant measured by the monitoring stations. We present a sample of the spatial variability of emissions, meteorology, and air quality in Brazil from 2019, revealing the hotspots of emissions and air pollution issues. By making BRAIN publicly available, we aim to provide the required data for developing air quality policies on municipal and state scales, especially for under-developed and data-scarce municipalities. We also envision that BRAIN has the potential to create new insights into and opportunities for air pollution research in Brazil.
Brazilian Atmospheric Inventories – BRAIN: a comprehensive database of air quality in Brazil
Leonardo Hoinaski, Robson Will, Camilo Bastos Ribeiro
Earth System Science Data. Pub Date: 2024-05-16, DOI: 10.5194/essd-16-2385-2024
Pub Date: 2024-05-16, DOI: 10.5194/essd-16-2367-2024
Songchao Chen, Zhongxing Chen, Xianglin Zhang, Zhongkui Luo, Calogero Schillaci, Dominique Arrouays, Anne Christine Richer-de-Forges, Zhou Shi
Abstract. Soil bulk density (BD) serves as a fundamental indicator of soil health and quality, exerting a significant influence on critical factors such as plant growth, nutrient availability, and water retention. Because BD is often missing from soil databases, pedotransfer functions (PTFs) have emerged as a potent tool for predicting BD from other easily measurable soil properties; however, the impact of PTF performance on soil organic carbon (SOC) stock calculation has rarely been explored. In this study, we proposed an innovative local modeling approach for predicting BD of fine earth (BDfine) across Europe using the recently released BDfine data from the LUCAS Soil (Land Use and Coverage Area Frame Survey Soil) 2018 (0–20 cm) and relevant predictors. Our approach involved a combination of neighbor sample search, forward recursive feature selection (FRFS), and random forest (RF) models (local-RFFRFS). The results showed that local-RFFRFS performed well in predicting BDfine (R² of 0.58, root mean square error (RMSE) of 0.19 g cm⁻³, relative error (RE) of 16.27 %), surpassing the earlier-published PTFs (R² of 0.40–0.45, RMSE of 0.22 g cm⁻³, RE of 19.11 %–21.18 %) and global PTFs using RF models with and without FRFS (R² of 0.56–0.57, RMSE of 0.19 g cm⁻³, RE of 16.47 %–16.74 %). Interestingly, we found that the best earlier-published PTF (R² = 0.84, RMSE = 1.39 kg m⁻², RE of 17.57 %) performed close to the local-RFFRFS (R² = 0.85, RMSE = 1.32 kg m⁻², RE of 15.01 %) in SOC stock calculation using BDfine predictions. However, the local-RFFRFS still performed better (ΔR² > 0.2) for soil samples with low SOC stocks (< 3 kg m⁻²). Therefore, we suggest that the local-RFFRFS is a promising method for BDfine prediction, while earlier-published PTFs would be more efficient when BDfine is subsequently used for calculating SOC stock.
Finally, we produced two topsoil BDfine and SOC stock datasets (18 945 and 15 389 soil samples) at 0–20 cm for LUCAS Soil 2018 using the best earlier-published PTF and local-RFFRFS, respectively. The datasets are archived on the Zenodo platform at https://doi.org/10.5281/zenodo.10211884 (S. Chen et al., 2023). The outcomes of this study represent a meaningful advance in the predictive accuracy of BDfine, and the resultant BDfine and SOC stock datasets for topsoil across Europe enable more precise soil hydrological and biological modeling.
European topsoil bulk density and organic carbon stock database (0–20 cm) using machine-learning-based pedotransfer functions
Earth System Science Data. Pub Date: 2024-05-16, DOI: 10.5194/essd-16-2367-2024
Nele Reyniers, Qianyu Zha, Nans Addor, Timothy J. Osborn, Nicole Forstenhäusler, Yi He
Abstract. The United Kingdom Climate Projections 2018 (UKCP18) regional climate model (RCM) 12 km regional perturbed physics ensemble (UKCP18-RCM-PPE) is one of the three strands of the latest set of UK national climate projections produced by the UK Met Office. It has been widely adopted in climate impact assessment. In this study, we report biases in the raw UKCP18-RCM simulations that are significant and are likely to deteriorate impact assessments if they are not adjusted. Two methods were used to bias-correct UKCP18-RCM: non-parametric quantile mapping using empirical quantiles and a variant developed for the third phase of the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) designed to preserve the climate change signal. Specifically, daily temperature and precipitation simulations for 1981 to 2080 were adjusted for the 12 ensemble members. Potential evapotranspiration was also estimated over the same period using the Penman-Monteith formulation and then bias-corrected using the latter method. Both methods successfully corrected biases in a range of daily temperature, precipitation and potential evapotranspiration metrics, and reduced biases in multi-day precipitation metrics to a lesser degree. An exploratory analysis of the projected future changes confirms the expectation of wetter, warmer winters and hotter, drier summers, and shows uneven changes in different parts of the distributions of both temperature and precipitation. Both bias-correction methods preserved the climate change signal almost equally well, as well as the spread among the projected changes. The change factor method was used as a benchmark for precipitation, and we show that it fails to capture changes in a range of variables, making it inadequate for most impact assessments. 
By comparing the differences between the two bias-correction methods and within the 12 ensemble members, we show that the uncertainty in future precipitation and temperature changes stemming from the climate model parameterisation far outweighs the uncertainty introduced by selecting one of these two bias-correction methods. We conclude by providing guidance on the use of the bias-corrected datasets. The datasets bias-adjusted with ISIMIP3BA are publicly available in the following repositories: https://doi.org/10.5281/zenodo.6337381 for precipitation and temperature (Reyniers et al., 2022a) and https://doi.org/10.5281/zenodo.6320707 for potential evapotranspiration (Reyniers et al., 2022b). The datasets bias-corrected using the quantile mapping method are available at https://doi.org/10.5281/zenodo.8223024 (Zha et al., 2023).
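To make the first of the two methods concrete, the following is a minimal sketch of non-parametric quantile mapping with empirical quantiles. It is neither the authors' implementation nor ISIMIP3BA: the function names, the linear interpolation between ranks, the clamping of values outside the calibration range, and the sample values are all assumptions chosen for illustration.

```python
from bisect import bisect_left

def empirical_cdf(sorted_vals, x):
    """Probability in [0, 1] of x within a sorted sample, interpolating between ranks."""
    n = len(sorted_vals)
    if x <= sorted_vals[0]:
        return 0.0
    if x >= sorted_vals[-1]:
        return 1.0
    i = bisect_left(sorted_vals, x)  # sorted_vals[i - 1] < x <= sorted_vals[i]
    left, right = sorted_vals[i - 1], sorted_vals[i]
    frac = (x - left) / (right - left)
    return (i - 1 + frac) / (n - 1)

def empirical_quantile(sorted_vals, p):
    """Linearly interpolated quantile of a sorted sample for probability p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def quantile_map(x, model_hist, obs_hist):
    """Map a model value onto the observed distribution via matching quantiles.

    Values outside the historical model range are clamped to the observed
    minimum or maximum here; operational methods usually extrapolate instead.
    """
    p = empirical_cdf(sorted(model_hist), x)
    return empirical_quantile(sorted(obs_hist), p)

# Hypothetical calibration samples (illustrative only): the model is biased low.
model_hist = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_hist = [2.0, 4.0, 6.0, 8.0, 10.0]

print(quantile_map(3.0, model_hist, obs_hist))
print(quantile_map(2.5, model_hist, obs_hist))
```

Because the mapping is fitted on the historical period only, applying it unchanged to future simulations can distort the projected change, which is the motivation for trend-preserving variants such as the ISIMIP method mentioned in the abstract.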
Two sets of bias-corrected regional UK Climate Projections 2018 (UKCP18) of temperature, precipitation and potential evapotranspiration for Great Britain
Earth System Science Data. Pub Date: 2024-05-16, DOI: 10.5194/essd-2024-132
Pub Date: 2024-05-15, DOI: 10.5194/essd-16-2351-2024
Pierre-Antoine Versini, Leydy Alejandra Castellanos-Diaz, David Ramier, Ioulia Tchiguirinskaia
Abstract. Nature-based solutions have appeared as relevant solutions to mitigate urban heat islands. To improve our knowledge of the assessment of this ecosystem service and the related physical processes (evapotranspiration), monitoring campaigns are required. This was the objective of several experiments carried out on the Blue Green Wave, a large green roof located in Champs-sur-Marne (France). Three different protocols were implemented and tested to assess the evapotranspiration flux at different scales: the first one was based on the surface energy balance (large scale); the second one was carried out using an evapotranspiration chamber (small scale); and the third one was based on the water balance evaluated during dry periods (point scale). In addition to these evapotranspiration estimates, several hydrometeorological variables (especially temperature) were measured. Related data and Python programs providing preliminary elements of the analysis and graphical representation have been made available. They illustrate the space–time variability in the studied processes regarding their observation scale. The dataset is available at https://doi.org/10.5281/zenodo.8064053 (Versini et al., 2023).
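The third protocol, a water balance over dry periods, can be sketched as follows. This is not the authors' procedure: it assumes no rainfall, irrigation, or drainage during the period, so that any loss of stored water is attributed to evapotranspiration, and the function name, water content readings, and substrate depth are hypothetical.

```python
def dry_period_et(water_content, substrate_depth_mm):
    """Daily evapotranspiration (mm/day) from consecutive daily readings of
    volumetric water content (m3/m3), assuming storage only decreases through ET."""
    storage_mm = [theta * substrate_depth_mm for theta in water_content]
    return [storage_mm[i] - storage_mm[i + 1] for i in range(len(storage_mm) - 1)]

# Hypothetical readings on three consecutive dry days for a 200 mm substrate.
theta_daily = [0.30, 0.28, 0.27]
et = dry_period_et(theta_daily, substrate_depth_mm=200.0)
print(et)
```

A point-scale estimate of this kind depends on a single sensor location, which is one reason the abstract contrasts it with the chamber and surface energy balance estimates at larger scales.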
Evapotranspiration evaluation using three different protocols on a large green roof in the greater Paris area
Earth System Science Data. Pub Date: 2024-05-15, DOI: 10.5194/essd-16-2351-2024
Yichen Jiang, Su Shi, Xinyue Li, Chang Xu, Haidong Kan, Bo Hu, Xia Meng
Abstract. Ultraviolet (UV) radiation is closely related to health, but limited measurements have hindered investigation of its health effects in China. Machine learning algorithms have been widely used to predict environmental factors with high accuracy, but few studies have applied them to UV radiation. This study aimed to develop a UV radiation prediction model based on the random forest method and to predict UV radiation at the daily level and 10 km resolution in mainland China for 2005–2020. A random forest model was employed to predict UV radiation by integrating ground UV radiation measurements from monitoring stations with multiple predictors, such as satellite-based UV radiation data. Missing satellite-based UV radiation data were filled using a three-day moving average. The model's performance was evaluated through multiple cross-validation (CV) methods. At the daily level, the overall R² (root mean square error, RMSE) between measured and predicted UV radiation was 0.97 (15.64 W m⁻²) for model development and 0.83 (37.44 W m⁻²) for 10-fold CV. The model with OMI EDD achieved higher prediction accuracy than the one without it. Based on predictions of UV radiation at the daily level and 10 km spatial resolution with nearly 100 % spatiotemporal coverage, we found that UV radiation increased by 4.20 % while PM2.5 levels decreased by 48.51 % and O3 levels rose by 22.70 % in 2013–2020, suggesting a potential correlation among these environmental factors. The uneven spatial distribution of UV radiation was found to be associated with latitude, elevation, meteorological factors, and season. The eastern areas of China posed a higher risk, with both high population density and high UV radiation intensity.
Using a machine learning algorithm, this study generated a gridded UV radiation dataset characterized by relatively high precision and extensive spatiotemporal coverage, which demonstrates the spatiotemporal variability of UV radiation levels in China and can facilitate health-related research in the future. This dataset is currently freely available at https://doi.org/10.5281/zenodo.10884591 (Jiang et al., 2024).
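The gap-filling step for satellite data can be illustrated with a minimal sketch of a three-day moving average. The abstract does not say how the window is positioned, so this example assumes a centered window (previous day, same day, next day) that averages whatever valid values are present; the function name and sample values are hypothetical.

```python
def fill_three_day_mean(series):
    """Fill missing values (None) with the mean of the valid values in a
    centered three-day window. Gaps with no valid neighbor stay missing."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            # The window is read from the original series, so filled values
            # never propagate into neighboring gaps.
            window = series[max(0, i - 1): i + 2]
            valid = [w for w in window if w is not None]
            if valid:
                filled[i] = sum(valid) / len(valid)
    return filled

# Hypothetical daily satellite UV values with gaps (illustrative only).
satellite_uv = [10.0, None, 14.0, None, None, 20.0]
print(fill_three_day_mean(satellite_uv))
```

Reading from the original series rather than the partially filled one keeps each filled value tied to real observations, at the cost of leaving gaps longer than two days partly unfilled.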
{"title":"A 10 km daily-level ultraviolet radiation predicting dataset based on machine learning models in China from 2005 to 2020","authors":"Yichen Jiang, Su Shi, Xinyue Li, Chang Xu, Haidong Kan, Bo Hu, Xia Meng","doi":"10.5194/essd-2024-111","DOIUrl":"https://doi.org/10.5194/essd-2024-111","url":null,"abstract":"<strong>Abstract.</strong> Ultraviolet (UV) radiation is closely related to health, but limited measurements hindered further investigation of its health effects in China. Machine learning algorithm has been widely used in predicting environmental factors with high accuracy, but limited studies have done for UV radiation. This study aimed to develop UV radiation prediction model based on random forest method, and predict UV radiation at daily level and 10 km resolution in mainland China in 2005–2020. A random forest model was employed to predict UV radiation by integrating ground UV radiation measurements from monitoring stations and multiple predictors, such as UV radiation data from satellite. Missing data of satellite-based UV radiation was filled by three-day moving average method. The model's performance was evaluated through multiple cross-validation (CV) methods. The overall R<sup>2</sup> (root mean square error, RMSE) between measured and predicted UV radiation from model development and model 10-fold CV was 0.97 (15.64 W m<sup>-2</sup>) and 0.83 (37.44 W m<sup>-2</sup>) at daily level, respectively. The model with OMI EDD performed higher predicting accuracy than the one without it. Based on predictions of UV radiation at daily level and 10 km spatial resolution and nearly 100 % spatiotemporal coverage, we found UV radiation increased by 4.20 % while PM<sub>2.5</sub> levels decreased by 48.51 % and O<sub>3</sub> levels rose by 22.70 % in 2013–2020, suggesting a potential correlation among these environmental factors. 
Uneven spatial distribution of UV radiation was found to be associated with factors such as latitude, elevation, meteorological factors and seasons. The eastern areas of China posed higher risk with both high population density and UV radiation intensity. Based on machine learning algorithm, this study generated a gridded dataset characterized by relatively high precision and extensive spatiotemporal coverage of UV radiation, which demonstrates the spatiotemporal variability of UV radiation levels in China and can facilitate health-related research in the future. This dataset is currently freely available at https://doi.org/10.5281/zenodo.10884591 (Jiang et al., 2024).","PeriodicalId":48747,"journal":{"name":"Earth System Science Data","volume":"32 1","pages":""},"PeriodicalIF":11.4,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140942713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
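The gap-filling step mentioned in the UV radiation abstract can be illustrated with a minimal sketch. It assumes a centered three-day window (previous, current, and next day) averaged over whichever values are present; the abstract only says "three-day moving average" and does not state the window alignment. The function name `fill_three_day_mean` and the sample values are hypothetical.

```python
from typing import List, Optional


def fill_three_day_mean(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill missing daily values (None) with the mean of the values
    available in a centered three-day window.

    Assumption: the window is centered on the missing day. Only original
    observations are averaged, never values filled earlier in the loop.
    """
    filled: List[Optional[float]] = []
    for i, value in enumerate(series):
        if value is not None:
            filled.append(value)
            continue
        window = series[max(0, i - 1): i + 2]
        present = [v for v in window if v is not None]
        # Leave the gap open when the whole window is missing.
        filled.append(sum(present) / len(present) if present else None)
    return filled


# Hypothetical daily satellite UV values with gaps.
daily_uv = [100.0, None, 120.0, None, None, 90.0]
print(fill_three_day_mean(daily_uv))
```

A gap longer than the window stays unfilled in this sketch, which keeps the filled values tied to real neighbouring observations.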
Adrià Descals, David L. A. Gaveau, Serge Wich, Zoltan Szantoi, Erik Meijaard
Abstract. Oil palm is a controversial crop, primarily because it is associated with negative environmental impacts such as tropical deforestation. Mapping the crop and its characteristics, such as age, is crucial for informing public and policy discussions regarding these impacts. Oil palm has received substantial mapping efforts, but up-to-date, accurate maps of both extent and age remain essential for monitoring impacts and informing the accompanying debate. Here, we present a 10-meter resolution global map of industrial and smallholder oil palm, developed using Sentinel-1 data for the years 2016–2021 and a deep learning model based on convolutional neural networks. In addition, we used Landsat-5, -7, and -8 to estimate the planting year from 1990 to 2021 at a 30-meter spatial resolution. The planting year indicates the year of establishment of an oil palm plantation as of 2021, whether the oil palm was newly planted or replanted in an existing plantation. We validated the oil palm extent layer using 17,812 randomly distributed reference points. The accuracy of the planting year layer was assessed using field data collected from 5,831 industrial parcels and 1,012 smallholder plantations distributed throughout the oil palm growing area. We found oil palm plantations covering a total mapped area of 23.98 Mha, and our area estimates are 16.66 ± 0.25 Mha of industrial and 7.59 ± 0.29 Mha of smallholder oil palm worldwide. The producer's and user's accuracies are 91.9 ± 3.4 % and 91.8 ± 1.0 % for industrial plantations and 72.7 ± 1.3 % and 75.7 ± 2.5 % for smallholders, which improves upon a previous global oil palm dataset, particularly in terms of omission of oil palm. The overall mean error between estimated planting year and field data was -0.24 years and the root-mean-square error was 2.65 years, but the agreement was lower for smallholders. Mapping the extent and planting year of smallholder plantations remains challenging, particularly for wild and sparsely planted oil palm, and future mapping efforts should focus on these types of plantations. The average oil palm plantation age was 14.1 years, and the area of oil palm older than 20 years was 6.28 Mha. Given that oil palm plantations are typically replanted after 25 years, our findings indicate that this area will require replanting within the coming decade, starting from 2021. Our dataset provides valuable input for optimal land use planning to meet the growing global demand for vegetable oils. The global oil palm extent layer for the year 2021 and the planting year layer from 1990 to 2021 can be found at https://doi.org/10.5281/zenodo.11034131 (Descals, 2024).
Global mapping of oil palm planting year from 1990 to 2021. Adrià Descals, David L. A. Gaveau, Serge Wich, Zoltan Szantoi, Erik Meijaard. Earth System Science Data, 2024-05-15. https://doi.org/10.5194/essd-2024-157
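The planting-year statistics quoted in the oil palm abstract (mean error, root-mean-square error, plantation age) can be sketched as follows. This is only an illustration under stated assumptions: age is taken as 2021 minus the planting year, the mean error is taken as estimated minus field value, and the planting years below are hypothetical, not drawn from the dataset.

```python
import math
from typing import Sequence

REFERENCE_YEAR = 2021  # the planting-year layer is referenced to 2021


def plantation_age(planting_year: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Age in years, assuming age = reference year - planting year."""
    return reference_year - planting_year


def mean_error(estimated: Sequence[float], field: Sequence[float]) -> float:
    """Mean of (estimated - field); the sign convention is an assumption."""
    return sum(e - f for e, f in zip(estimated, field)) / len(estimated)


def rmse(estimated: Sequence[float], field: Sequence[float]) -> float:
    """Root-mean-square difference between estimated and field values."""
    squared = sum((e - f) ** 2 for e, f in zip(estimated, field))
    return math.sqrt(squared / len(estimated))


# Hypothetical planting years for four parcels.
estimated_years = [2000, 2005, 2010, 2015]
field_years = [2001, 2005, 2012, 2014]

print(mean_error(estimated_years, field_years))
print(rmse(estimated_years, field_years))
print([plantation_age(year) for year in estimated_years])
```

Under this sign convention, a negative mean error means the estimated planting years are on average earlier than the field records.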
Guorong Zhong, Xuegang Li, Jinming Song, Baoxiao Qu, Fan Wang, Yanjun Wang, Bin Zhang, Lijing Cheng, Jun Ma, Huamao Yuan, Liqin Duan, Ning Li, Qidong Wang, Jianwei Xing, Jiajia Dai
Abstract. The continuous uptake of anthropogenic CO2 by the ocean leads to ocean acidification, an ongoing threat to the marine ecosystem. The ocean acidification rate has been documented globally for the surface ocean, but documentation below the surface is limited. Here, we present a monthly four-dimensional 1°×1° gridded product of global seawater pH, derived from a machine learning algorithm trained on pH observations on the total scale and at in-situ temperature from the Global Ocean Data Analysis Project (GLODAP). The constructed pH product covers the years 1992–2020 and depths from the surface to 2 km on 41 levels. Three types of machine learning algorithms were used to construct the pH product: self-organizing map neural networks for region division, a stepwise algorithm for predictor selection, and feed-forward neural networks (FFNN) for non-linear regression. The performance of the machine learning algorithm was validated against real observations by cross-validation, in which four iterations were carried out, each holding out a different 25 % of the observations for evaluation and using the remaining 75 % for training. The constructed pH product is evaluated through comparisons with time series observations and the GLODAP pH climatology. The overall root mean square error between the FFNN-constructed pH and the GLODAP measurements is 0.028, ranging from 0.044 at the surface to 0.013 at 2000 m. The pH product is distributed through the data repository of the Marine Science Data Center of the Chinese Academy of Sciences at http://dx.doi.org/10.12157/IOCAS.20230720.001 (Zhong et al., 2023).
A global monthly field of seawater pH over 3 decades: a machine learning approach. Guorong Zhong, Xuegang Li, Jinming Song, Baoxiao Qu, Fan Wang, Yanjun Wang, Bin Zhang, Lijing Cheng, Jun Ma, Huamao Yuan, Liqin Duan, Ning Li, Qidong Wang, Jianwei Xing, Jiajia Dai. Earth System Science Data, 2024-05-15. https://doi.org/10.5194/essd-2024-151
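The validation scheme in the seawater pH abstract (four iterations, 25 % of observations for evaluation and 75 % for training) can be sketched as a four-fold split scored by RMSE. This is a minimal sketch, not the authors' procedure: it assumes the four evaluation subsets are disjoint and randomly drawn, the helper name `four_fold_splits` is hypothetical, and a trivial training-mean predictor stands in for the FFNN, which is not reproduced here.

```python
import math
import random
from typing import List, Sequence, Tuple


def four_fold_splits(n: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return four (train, test) index pairs.

    Assumption: each test fold is a different, non-overlapping quarter
    of the shuffled indices, so every observation is evaluated once.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::4] for k in range(4)]
    splits = []
    for k in range(4):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train, folds[k]))
    return splits


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observations and predictions."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical pH values; the "model" predicts the training mean only.
ph = [8.05 + 0.001 * (i % 20) for i in range(100)]
for train_idx, test_idx in four_fold_splits(len(ph)):
    train_mean = sum(ph[i] for i in train_idx) / len(train_idx)
    fold_rmse = rmse([ph[i] for i in test_idx], [train_mean] * len(test_idx))
    print(len(train_idx), len(test_idx), round(fold_rmse, 4))
```

Replacing the training-mean predictor with a fitted regression model leaves the split and scoring logic unchanged.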
Abstract. Ground-level PM2.5 data derived from satellites with machine learning are crucial for health and climate assessments; however, uncertainties persist because spatially complete observations are lacking. To address this, we propose a novel testbed that uses non-traditional numerical simulations to evaluate PM2.5 estimation across the entire spatial domain. The testbed emulates the general machine-learning approach by training the model on grid cells corresponding to ground monitoring sites and then testing its predictive accuracy at other locations. For the first time, this approach enables a comprehensive evaluation of how various machine-learning methods perform in estimating PM2.5 across the spatial domain. Applying the testbed to China yields unexpected results: across all benchmark models, larger PM2.5 biases are found in densely populated regions with abundant ground observations, which challenges conventional expectations and has not been explored in the recent literature. The main reason is the imbalance in training samples, most of which come from urban areas with high emissions; the lack of monitors in downwind areas, where PM2.5 is transported from urban areas with varying vertical profiles, leads to significant overestimation. The proposed testbed also provides an efficient strategy for optimizing model structure or training samples to enhance the performance of satellite-retrieval models. Integrating spatiotemporal features, especially with CNN-based deep-learning approaches such as the ResNet model, mitigates PM2.5 overestimation (by 5–30 µg m⁻³) and the corresponding exposure (by 3 million people · µg m⁻³) in the downwind area over the nine years 2013–2021 compared with the traditional approach. Furthermore, incorporating 600 strategically positioned ground-measurement sites identified through the testbed is essential for a more balanced distribution of training samples, which in turn supports precise PM2.5 estimation and the assessment of associated impacts in China. In addition to presenting the retrieved surface PM2.5 concentrations in China from 2013 to 2021, this study provides a testbed dataset derived from physical modeling simulations, which the community can use to evaluate the performance of data-driven methods, such as machine learning, in estimating spatial PM2.5 concentrations.
Retrieving Ground-Level PM2.5 Concentrations in China (2013–2021) with a Numerical Model-Informed Testbed to Mitigate Sample Imbalance-Induced Biases. Siwei Li, Yu Ding, Jia Xing, Joshua S. Fu. Earth System Science Data, 2024-05-15. https://doi.org/10.5194/essd-2024-170
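The testbed idea in the PM2.5 abstract (train only on grid cells that have monitors, then score every other cell against simulated values) can be illustrated with a toy example. Everything here is an assumption for illustration: the single predictor (aerosol optical depth, AOD), the linear model, and the hand-made cell values, in which downwind cells have less surface PM2.5 per unit AOD than urban cells. It is not the authors' benchmark setup.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical simulated cells: (AOD, surface PM2.5 in ug m-3, has_monitor).
cells = [
    (0.4, 40.0, True), (0.6, 60.0, True), (0.8, 80.0, True),      # urban, monitored
    (0.4, 20.0, False), (0.6, 30.0, False), (0.8, 40.0, False),   # downwind, unmonitored
]

# Train only where monitors exist, as a satellite-retrieval model would.
train = [(aod, pm) for aod, pm, monitored in cells if monitored]
slope, intercept = fit_line([a for a, _ in train], [p for _, p in train])

# Evaluate at unmonitored cells, which only the simulation makes possible.
test = [(aod, pm) for aod, pm, monitored in cells if not monitored]
downwind_bias = sum(slope * aod + intercept - pm for aod, pm in test) / len(test)
print(round(downwind_bias, 2))  # positive value: overestimation downwind
```

Because the training cells are all urban, the fitted relationship carries the urban AOD-to-PM2.5 ratio into the downwind cells, which is the kind of sample-imbalance bias the abstract describes.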
Abstract. During the Water Vapor Lidar Network Assimilation (WaLiNeAs) campaign, eight lidars specifically designed to measure water vapor mixing ratio (WVMR) profiles were deployed on the western Mediterranean coast. The main objectives were to investigate the water vapor content during case studies of heavy precipitation events in the coastal western Mediterranean and to assess the impact of high spatio-temporal resolution WVMR data on numerical weather prediction forecasts by means of state-of-the-art assimilation techniques. Given the increasing occurrence of extreme events due to climate change, WaLiNeAs is the first program in Europe to provide network-like, simultaneous, and continuous water vapor profile measurements. This paper focuses on the WVMR profiling datasets obtained from three of the lidars managed by the French component of the WaLiNeAs team. These lidars were deployed in the towns of Coursan, Grau du Roi, and Cannes. This measurement setup enabled monitoring of the water vapor content in the lower troposphere over a period of three months in autumn–winter 2022 and four months in summer 2023. The lidars measured WVMR profiles from the surface up to approximately 6–10 km at night and 1–2 km during daytime, with a vertical resolution of 100 m and a time sampling of 15–30 min, selected to meet the needs of weather forecasting with an uncertainty lower than 0.4 g kg⁻¹. The paper presents details about the instruments, the experimental strategy, and the datasets, which are provided in NetCDF format. The final dataset is divided into two datasets: the first has a time resolution of 15 min and contains a total of 26 423 WVMR vertical profiles, and the second has a time resolution of 30 min to improve the signal-to-noise ratio and the altitude range of the signal.
Water vapor Raman-lidar observations from multiple sites in the framework of WaLiNeAs. Frédéric Laly, Patrick Chazette, Julien Totems, Jérémy Lagarrigue, Laurent Forges, Cyrille Flamant. Earth System Science Data, 2024-05-15. https://doi.org/10.5194/essd-2024-73