Abstract. Marine biogeochemical time-series products, which include total alkalinity, inorganic carbon, nitrate, phosphate, silicate, and pH, are a foundation for the ongoing surveillance of oceanic biogeochemical change. These products play a critical role in research on the dynamic monitoring of marine ecosystems and in fostering sustainable ocean development. However, existing monitoring methodologies are hampered by inherent limitations, notably the paucity of observational products that offer high spatial and high temporal resolution simultaneously. Furthermore, the interpolation methods typically employed in these contexts are often ineffective at large scales, yielding data whose extensive temporal and spatial expanses make them difficult to use in applications aimed at monitoring large-scale ocean dynamics. To address these challenges, a novel integration of the CANYON-B and Random Forest regression methods was explored for reconstructing key marine biogeochemical parameters. This work reconstructs the sea-surface concentrations of these biogeochemical parameters within Australia's Exclusive Economic Zone over the period from 2000 to 2022 on a 1-kilometre scale. The approach combines multi-source in-situ ocean chemistry time-series observations with MODIS Terra ocean reflectance imagery and ocean water colour product distributions. This research highlights the substantial capabilities of machine learning for the large-scale reconstruction of ocean chemistry data and introduces a viable new method for using in-situ measurements and optical imagery to reconstruct marine biogeochemical variables, thereby significantly enhancing our ability to monitor large-scale ocean dynamics. The datasets generated and analysed in this study are available on Science Data Bank (https://doi.org/10.57760/sciencedb.09331) (Zhang et al., 2024).
Xiaohan Zhang, Lizhe Wang, Jining Yan, Sheng Wang. High-spatiotemporal reconstruction of biogeochemical dynamics in Australia integrating satellites products and in-situ observations (2000–2022). Earth System Science Data, published 2024-07-02. DOI: 10.5194/essd-2024-219 (https://doi.org/10.5194/essd-2024-219)
Defu Zou, Lin Zhao, Guojie Hu, Erji Du, Guangyue Liu, Chong Wang, Wangping Li
Abstract. The ground temperature at a fixed depth is a crucial boundary condition for understanding the properties of deep permafrost. However, the commonly used mean annual ground temperature at the depth of zero annual amplitude (MAGTdzaa) has limited applicability because the observed depths are highly heterogeneous in space. In this study, we used 231 borehole records of mean annual ground temperature at a depth of 15 meters (MAGT15m) from 2010 to 2019 and employed support vector regression (SVR) to predict gridded MAGT15m data at a spatial resolution of nearly 1 km across the Qinghai-Tibet Plateau (QTP). The SVR predictions had an R² value of 0.48 with a negligible negative bias (-0.01 °C). The average MAGT15m of the QTP permafrost was -1.85 °C (±1.58 °C), with 90% of values ranging from -5.1 °C to -0.1 °C and 51.2% exceeding -1.5 °C. Freezing degree days (FDD) was the most significant predictor (p<0.001) of MAGT15m, followed by thawing degree days (TDD), mean annual precipitation (MAP), and soil bulk density (BD) (p<0.01). Overall, MAGT15m increased from northwest to southeast and decreased with elevation. Lower MAGT15m values prevail in high mountainous areas with steep slopes. MAGT15m was lowest in the basins of the Amu Darya, Indus, and Tarim rivers (-2.7 to -2.9 °C) and highest in the Yangtze and Yellow River basins (-0.8 to -0.9 °C). The baseline dataset of MAGT15m during 2010–2019 for the QTP permafrost will facilitate simulations of deep permafrost characteristics and provide fundamental data for permafrost model validation and improvement.
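The abstract reports two evaluation figures for the SVR predictions, an R² value and a mean bias. The following is a minimal sketch of how such figures are commonly computed from paired borehole observations and predictions. The abstract does not describe the authors' evaluation code, so the function names and the sample temperatures are hypothetical and serve only as illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observations are identical")
    return 1.0 - ss_res / ss_tot


def mean_bias(observed, predicted):
    """Mean of (predicted - observed); a negative value means the
    predictions are lower than the observations on average."""
    return mean(p - o for o, p in zip(observed, predicted))


# Hypothetical MAGT15m values in °C, invented for illustration only.
observed = [-2.0, -1.0, -0.5, -3.0]
predicted = [-1.8, -1.2, -0.7, -2.9]

print("R2:", r_squared(observed, predicted))
print("bias (°C):", mean_bias(observed, predicted))
```

With this sign convention, a bias of -0.01 °C would mean the predictions are on average very slightly colder than the borehole observations.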
Defu Zou, Lin Zhao, Guojie Hu, Erji Du, Guangyue Liu, Chong Wang, Wangping Li. Permafrost temperature baseline at 15 meters depth in the Qinghai-Tibet Plateau (2010–2019). Earth System Science Data, published 2024-07-01. DOI: 10.5194/essd-2024-114 (https://doi.org/10.5194/essd-2024-114)
Pub Date: 2024-07-01. DOI: 10.5194/essd-16-3061-2024
Kang He, Xinyi Shen, Emmanouil N. Anagnostou
Abstract. Forest fires, while destructive and dangerous, are important to the functioning and renewal of ecosystems. Over the past 2 decades, large-scale, severe forest fires have become more frequent globally, and the risk is expected to increase as fire weather and drought conditions intensify. To improve quantification of the intensity and extent of forest fire damage, we have developed a 30 m resolution global forest burn severity (GFBS) dataset of the degree of biomass consumed by fires from 2003 to 2016. To develop this dataset, we used the Global Fire Atlas product to determine when and where forest fires occurred during that period and then we overlaid the available Landsat surface reflectance products to obtain pre-fire and post-fire normalized burn ratios (NBRs) for each burned pixel, designating the difference between them as dNBR and the relative difference as RdNBR. We compared the GFBS dataset against the Canada Landsat Burned Severity (CanLaBS) product, showing better agreement than the existing Moderate Resolution Imaging Spectroradiometer (MODIS)-based global burn severity dataset (MOdis burn SEVerity, MOSEV) in representing the distribution of forest burn severity over Canada. Using the in situ burn severity category data available for the 2013 wildfires in southeastern Australia, we demonstrated that GFBS could provide burn severity estimation with clearer differentiation between the high-severity and moderate-/low-severity classes, while such differentiation among the in situ burn severity classes is not captured in the MOSEV product. Using the CONUS-wide composite burn index (CBI) as a ground truth, we showed that dNBR from GFBS was more strongly correlated with CBI (r=0.63) than dNBR from MOSEV (r=0.28). RdNBR from GFBS also exhibited better agreement with CBI (r=0.56) than RdNBR from MOSEV (r=0.20). 
On a global scale, while the dNBR and RdNBR spatial patterns extracted by GFBS are similar to those of MOSEV, MOSEV tends to provide higher burn severity levels than GFBS. We attribute this difference to variations in reflectance values and the different spatial resolutions of the two satellites. The GFBS dataset provides a more precise and reliable assessment of burn severity than existing datasets. These enhancements are crucial for understanding the ecological impacts of forest fires and for informing management and recovery efforts in affected regions worldwide. The GFBS dataset is freely accessible at https://doi.org/10.5281/zenodo.10037629 (He et al., 2023).
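The abstract names three per-pixel quantities (NBR, dNBR, and RdNBR) without giving their formulas. The sketch below uses the definitions that are widely used in the burn severity literature: NBR as the normalized difference of near-infrared and shortwave-infrared reflectance, dNBR as pre-fire minus post-fire NBR, and RdNBR as dNBR divided by the square root of the absolute pre-fire NBR. Whether GFBS applies exactly these forms, scalings, or offsets is an assumption, and the reflectance values are invented for illustration.

```python
import math


def nbr(nir, swir2):
    """Normalized burn ratio from NIR and SWIR2 surface reflectance."""
    return (nir - swir2) / (nir + swir2)


def dnbr(pre_nbr, post_nbr):
    """Differenced NBR: pre-fire minus post-fire (higher = more severe)."""
    return pre_nbr - post_nbr


def rdnbr(pre_nbr, post_nbr):
    """Relative dNBR: dNBR scaled by sqrt(|pre-fire NBR|), which reduces
    the dependence on how much vegetation was present before the fire."""
    if pre_nbr == 0:
        raise ValueError("RdNBR is undefined when pre-fire NBR is zero")
    return dnbr(pre_nbr, post_nbr) / math.sqrt(abs(pre_nbr))


# Hypothetical reflectances for one burned pixel (illustration only).
pre = nbr(nir=0.45, swir2=0.15)    # healthy vegetation: high NIR, low SWIR2
post = nbr(nir=0.20, swir2=0.30)   # burned surface: NIR drops, SWIR2 rises

print("dNBR:", dnbr(pre, post))
print("RdNBR:", rdnbr(pre, post))
```

Many products store NBR multiplied by 1000 as an integer; in that case the pre-fire term inside the square root is usually divided by 1000 first so that the result matches the unscaled form shown here.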
Kang He, Xinyi Shen, Emmanouil N. Anagnostou. A global forest burn severity dataset from Landsat imagery (2003–2016). Earth System Science Data, published 2024-07-01. DOI: 10.5194/essd-16-3061-2024 (https://doi.org/10.5194/essd-16-3061-2024)
Abstract. The ocean surface exhibits a variety of oceanic and atmospheric phenomena. Automatically detecting and identifying these phenomena is crucial for understanding oceanic dynamics and ocean-atmosphere interactions. In this study, we select 2,383 Sentinel-1 WV mode images and 2,628 IW mode sub-images to construct a semantic segmentation dataset that includes 12 typical oceanic and atmospheric phenomena. Each phenomenon is represented by approximately 400 sub-images, resulting in a total of 5,011 images. The images in this dataset have a resolution of 100 meters and dimensions of 256 × 256 pixels. We propose a modified Segformer model to semantically segment these multiple categories of oceanic and atmospheric phenomena. Experimental results show that the modified Segformer model achieves an average Dice coefficient of 80.98 %, an average IoU of 70.32 %, and an overall accuracy of 87.13 %, demonstrating robust segmentation performance for typical oceanic and atmospheric phenomena in SAR images.
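The three reported metrics (Dice coefficient, IoU, and overall accuracy) have standard definitions for label maps. The sketch below computes them for flattened label arrays using only the standard library. It is an illustration of the metric definitions, not the authors' evaluation code; the function names, the tiny label arrays, and the choice to skip classes absent from both maps are assumptions.

```python
def dice_iou(pred, truth, label):
    """Return (Dice, IoU) for one class, or None if the class is absent
    from both the prediction and the ground truth."""
    inter = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    n_pred = sum(1 for p in pred if p == label)
    n_truth = sum(1 for t in truth if t == label)
    union = n_pred + n_truth - inter
    if union == 0:
        return None
    return 2 * inter / (n_pred + n_truth), inter / union


def overall_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)


def mean_scores(pred, truth, labels):
    """Average Dice and IoU over the classes that occur in either map."""
    scores = [s for s in (dice_iou(pred, truth, c) for c in labels) if s]
    mean_dice = sum(d for d, _ in scores) / len(scores)
    mean_iou = sum(i for _, i in scores) / len(scores)
    return mean_dice, mean_iou


# Hypothetical flattened label maps with three classes (illustration only).
pred = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]

print(mean_scores(pred, truth, labels=[0, 1, 2]))
print(overall_accuracy(pred, truth))
```

For a single class, Dice is never smaller than IoU, which is consistent with the average Dice in the abstract exceeding the average IoU.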
Quankun Li, Xue Bai, Lizhen Hu, Liangsheng Li, Yaohui Bao, Xupu Geng, Xiao-Hai Yan. SAR Image Semantic Segmentation of Typical Oceanic and Atmospheric Phenomena. Earth System Science Data, published 2024-07-01. DOI: 10.5194/essd-2024-222 (https://doi.org/10.5194/essd-2024-222)
Pub Date: 2024-06-28. DOI: 10.5194/essd-16-3045-2024
Yavar Pourmohamad, John T. Abatzoglou, Erin J. Belval, Erica Fleishman, Karen Short, Matthew C. Reeves, Nicholas Nauslar, Philip E. Higuera, Eric Henderson, Sawyer Ball, Amir AghaKouchak, Jeffrey P. Prestemon, Julia Olszewski, Mojtaba Sadegh
Abstract. Wildfires are increasingly impacting social and environmental systems in the United States (US). The ability to mitigate the adverse effects of wildfires increases with understanding of the social, physical, and biological conditions that co-occurred with or caused the wildfire ignitions and contributed to the wildfire impacts. To this end, we developed the FPA FOD-Attributes dataset, which augments the sixth version of the Fire Program Analysis Fire-Occurrence Database (FPA FOD v6) with nearly 270 attributes that coincide with the date and location of each wildfire ignition in the US. FPA FOD v6 contains information on location, jurisdiction, discovery time, cause, and final size of >2.3×10⁶ wildfires in the US between 1992 and 2020. For each wildfire, we added physical (e.g., weather, climate, topography, and infrastructure), biological (e.g., land cover and normalized difference vegetation index), social (e.g., population density and social vulnerability index), and administrative (e.g., national and regional preparedness level and jurisdiction) attributes. This publicly available dataset can be used to answer numerous questions about the covariates associated with human- and lightning-caused wildfires. Furthermore, the FPA FOD-Attributes dataset can support descriptive, diagnostic, predictive, and prescriptive wildfire analytics, including the development of machine learning models. The FPA FOD-Attributes dataset is available at https://doi.org/10.5281/zenodo.8381129 (Pourmohamad et al., 2023).
Yavar Pourmohamad, John T. Abatzoglou, Erin J. Belval, Erica Fleishman, Karen Short, Matthew C. Reeves, Nicholas Nauslar, Philip E. Higuera, Eric Henderson, Sawyer Ball, Amir AghaKouchak, Jeffrey P. Prestemon, Julia Olszewski, Mojtaba Sadegh. Physical, social, and biological attributes for improved understanding and prediction of wildfires: FPA FOD-Attributes dataset. Earth System Science Data, published 2024-06-28. DOI: 10.5194/essd-16-3045-2024 (https://doi.org/10.5194/essd-16-3045-2024)
Pub Date: 2024-06-28. DOI: 10.5194/essd-16-3017-2024
Yaoming Ma, Zhipeng Xie, Yingying Chen, Shaomin Liu, Tao Che, Ziwei Xu, Lunyu Shang, Xiaobo He, Xianhong Meng, Weiqiang Ma, Baiqing Xu, Huabiao Zhao, Junbo Wang, Guangjian Wu, Xin Li
Abstract. The climate of the Tibetan Plateau (TP) has experienced substantial changes in recent decades as a result of the location's susceptibility to global climate change. The changes observed across the TP are closely associated with regional land–atmosphere interactions. Current models and satellites struggle to accurately depict the interactions; therefore, critical field observations on land–atmosphere interactions outlined here provide necessary independent validation data and fine-scale process insights for constraining reanalysis products, remote sensing retrievals, and land surface model parameterizations. Scientific data sharing is crucial for the TP since in situ observations are rarely available under these harsh conditions. However, field observations are currently dispersed among individuals or groups and have not yet been integrated for comprehensive analysis. This has prevented a better understanding of the interactions, the unprecedented changes they generate, and the substantial ecological and environmental consequences they bring about. In this study, we collaborated with different agencies and organizations to present a comprehensive dataset for hourly measurements of surface energy balance components, soil hydrothermal properties, and near-surface micrometeorological conditions spanning up to 17 years (2005–2021). This dataset, derived from 12 field stations covering a variety of typical TP landscapes, provides the most extensive in situ observation data available for studying land–atmosphere interactions on the TP to date in terms of both spatial coverage and duration. Three categories of observations are provided in this dataset: meteorological gradient data (met), soil hydrothermal data (soil), and turbulent flux data (flux). To assure data quality, a set of rigorous data-processing and quality control procedures is implemented for all observation elements (e.g., wind speed and direction at different heights) in this dataset. 
The operational workflow and procedures are individually tailored to the varied types of elements at each station, including automated error screening, manual inspection, diagnostic checking, adjustments, and quality flagging. The hourly raw data series; the quality-assured data; and supplementary information, including data integrity and the percentage of correct data on a monthly scale, are provided via the National Tibetan Plateau Data Center (https://doi.org/10.11888/Atmos.tpdc.300977, Ma et al., 2023a). With the greatest number of stations covered, the fullest collection of meteorological elements, and the longest duration of observations and recordings to date, this dataset is the most extensive hourly land–atmosphere interaction observation dataset for the TP. It will serve as the benchmark for evaluating and refining land surface models, reanalysis products, and remote sensing retrievals, as well as for characterizing fine-scale land–atmosphere interaction processes of the TP and underlying influence mechanisms.
Yaoming Ma, Zhipeng Xie, Yingying Chen, Shaomin Liu, Tao Che, Ziwei Xu, Lunyu Shang, Xiaobo He, Xianhong Meng, Weiqiang Ma, Baiqing Xu, Huabiao Zhao, Junbo Wang, Guangjian Wu, Xin Li. Dataset of spatially extensive long-term quality-assured land–atmosphere interactions over the Tibetan Plateau. Earth System Science Data, published 2024-06-28. DOI: 10.5194/essd-16-3017-2024 (https://doi.org/10.5194/essd-16-3017-2024)
Marco Massa, Andrea Luca Rizzo, Davide Scafidi, Elisa Ferrari, Sara Lovati, Lucia Luzi, the MUDA working group
Abstract. This paper presents the new dynamic geophysical and geochemical MUltiparametric DAtabase (MUDA). MUDA is a new infrastructure of the National Institute of Geophysics and Volcanology (INGV), published online in December 2023, with the aim of archiving and disseminating multiparametric data collected by multidisciplinary monitoring networks. MUDA is a MySQL relational database with a web interface developed in PHP, aimed at investigating in quasi real time possible correlations between seismic phenomena and variations in endogenous and environmental parameters. At present, MUDA collects data from different types of sensors, such as hydrogeochemical probes for physical-chemical parameters in waters, meteorological stations, detectors of air radon concentration, detectors of the diffusive flux of carbon dioxide (CO2), and seismometers belonging both to the National Seismic Network of INGV and to temporary networks installed in the framework of multidisciplinary research projects. Each day, MUDA publishes data updated through the previous day and lets users view and download multiparametric time series for selected time periods. The resulting dataset offers broad perspectives for future high-frequency, continuous multiparametric monitoring as a starting point for identifying possible seismic precursors for short-term earthquake forecasting. MUDA can now be cited with the Digital Object Identifier https://doi.org/10.13127/muda (Massa et al., 2023).
MUDA: dynamic geophysical and geochemical MUltiparametric DAtabase. Earth System Science Data. Pub Date: 2024-06-27 DOI: 10.5194/essd-2024-185
Pub Date: 2024-06-27 DOI: 10.5194/essd-16-3001-2024
Arndt Kaps, Axel Lauer, Rémi Kazeroni, Martin Stengel, Veronika Eyring
Abstract. We present the new Cloud Class Climatology (CCClim) dataset, which quantifies the global distribution of established morphological cloud types over 35 years. CCClim combines active and passive sensor data with machine learning (ML) and provides a new opportunity to improve the understanding of clouds and their related processes. CCClim is based on cloud property retrievals from the European Space Agency's (ESA) Cloud_cci dataset and adds, at 1° resolution, the relative occurrences of eight major cloud types designed to be similar to those defined by the World Meteorological Organization (WMO). The ML framework used to obtain the cloud types is trained on data from multiple satellites in the afternoon constellation (A-Train). Using multiple spaceborne sensors reduces the impact of single-sensor problems, such as the difficulty passive sensors have in detecting thin cirrus or the small footprint of active sensors. We leverage this to generate sufficient labeled data to train supervised ML models. CCClim's almost gapless global coverage from 1982 to 2016 allows process-oriented analyses of clouds on a climatological timescale. Similarly, its moderate spatial and temporal resolutions make it a lightweight dataset while enabling straightforward comparison to climate models. CCClim creates multiple opportunities to study clouds, of which we sketch out a few examples. Along with the cloud-type frequencies, CCClim contains the cloud properties used as inputs to the ML framework, so that all cloud types can be associated with relevant physical quantities. CCClim can also be combined with other datasets, such as reanalysis data, to assess the dynamical regime favoring the occurrence of a specific cloud type in association with its properties. Additionally, we show an example of how to evaluate a global climate model by comparing CCClim with cloud types obtained by applying the same ML method used to create CCClim to output from the icosahedral nonhydrostatic atmosphere model (ICON-A). CCClim can be accessed via the following digital object identifier: https://doi.org/10.5281/zenodo.8369202 (Kaps et al., 2023b).
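When relative occurrences on a 1° grid are compared with climate model output, a common first step is a global mean in which each cell is weighted by the cosine of its latitude, because cells on a regular latitude-longitude grid shrink toward the poles. The following is a minimal sketch under the assumption of such a regular grid; the function name, the toy values, and the use of `None` for missing cells are hypothetical and do not describe the actual CCClim file layout.

```python
from math import cos, radians


def area_weighted_mean(grid, lat_centers):
    """Cosine-latitude weighted mean of a latitude x longitude grid.

    grid[i][j] is the value in latitude row i and longitude column j;
    None marks a missing cell and is skipped.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(grid, lat_centers):
        weight = cos(radians(lat))  # proportional to the cell area
        for value in row:
            if value is not None:
                weighted_sum += weight * value
                weight_total += weight
    if weight_total == 0.0:
        raise ValueError("grid contains no valid cells")
    return weighted_sum / weight_total


# Toy grid (hypothetical values): 3 latitude rows x 2 longitude columns
# holding the relative frequency of occurrence of one cloud type.
lats = [-60.0, 0.0, 60.0]
rfo = [
    [0.2, 0.2],
    [0.4, 0.4],
    [0.2, None],  # one missing cell
]

mean_rfo = area_weighted_mean(rfo, lats)
print(round(mean_rfo, 4))
```

Here the equatorial row carries twice the weight of the rows at 60° latitude, so the weighted mean is higher than the plain average of the five valid cells.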
Characterizing clouds with the CCClim dataset, a machine learning cloud class climatology. Earth System Science Data. Pub Date: 2024-06-27 DOI: 10.5194/essd-16-3001-2024
Abstract. The rapid development of remote sensing technology has led to exponential growth in the number of satellite images, yet their inherent complexity often makes them difficult for non-expert users to understand. Natural language, as a carrier of human knowledge, can bridge the gap between ordinary users and complex satellite imagery. Additionally, when paired with visual data, natural language can be used to train large vision-language foundation models, significantly improving performance in various tasks. Despite these advancements, the remote sensing community still faces a challenge due to the lack of large-scale, high-quality vision-language datasets for satellite images. To address this challenge, we introduce a new image-text dataset that provides high-quality natural language descriptions for global-scale satellite data. Specifically, we use Sentinel-2 data as the foundational image source for its global coverage, and employ semantic segmentation labels from the European Space Agency’s WorldCover project to enrich the descriptions of land covers. By conducting in-depth semantic analysis, we formulate detailed prompts to elicit rich descriptions from ChatGPT. We then apply a manual verification step, consisting of manual inspection and correction, to further enhance the dataset’s quality. Finally, we offer the community ChatEarthNet, a large-scale image-text dataset characterized by global coverage, high quality, wide-ranging diversity, and detailed descriptions. ChatEarthNet consists of 163,488 image-text pairs with captions generated by ChatGPT-3.5 and an additional 10,000 image-text pairs with captions generated by ChatGPT-4V(ision). This dataset has significant potential for both training and evaluating vision-language geo-foundation models for remote sensing. The code is publicly available at https://doi.org/10.5281/zenodo.11004358 (Yuan et al., 2024b), and the ChatEarthNet dataset is at https://doi.org/10.5281/zenodo.11003436 (Yuan et al., 2024c).
ChatEarthNet: A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models. Zhenghang Yuan, Zhitong Xiong, Lichao Mou, Xiao Xiang Zhu. Earth System Science Data. Pub Date: 2024-06-27 DOI: 10.5194/essd-2024-140
Fan Mei, Jennifer M. Comstock, Mikhail S. Pekour, Jerome D. Fast, Beat Schmid, Krista L. Gaustad, Shuaiqi Tang, Damao Zhang, John E. Shilling, Jason Tomlinson, Adam C. Varble, Jian Wang, L. Ruby Leung, Lawrence Kleinman, Scot Martin, Sebastien C. Biraud, Brian D. Ermold, Kenneth W. Burk
Abstract. Airborne measurements are pivotal for providing detailed, spatiotemporally resolved information about atmospheric parameters, and aerosol and cloud properties, thereby enhancing our understanding of dynamic atmospheric processes. For 30 years, the U.S. Department of Energy (DOE) Office of Science supported an instrumented Gulfstream-1 (G-1) aircraft for atmospheric field campaigns. Data from the final decade of G-1 operations were archived by the Atmospheric Radiation Measurement (ARM) user facility Data Center and made publicly available at no cost to all registered users. To ensure a consistent data format and to improve the accessibility of the ARM airborne data, an integrated dataset was recently developed covering the final six years of G-1 operations (2013 to 2018). The integrated dataset includes data collected from 236 flights (766.4 hours), which covered the Arctic, the U.S. Southern Great Plains (SGP), the U.S. West Coast, the Eastern North Atlantic (ENA), the Amazon Basin in Brazil, and the Sierras de Córdoba range in Argentina. These comprehensive data streams provide much-needed insight into spatiotemporal variability of thermodynamic quantities, aerosol and cloud states and properties for addressing essential science questions in Earth system process studies. This manuscript describes the DOE ARM merged G-1 datasets, including information on the acquisition, collection, and quality control processes. It further illustrates the usage of this merged dataset to evaluate the Energy Exascale Earth System Model (E3SM) with the Earth System Model Aerosol-Cloud Diagnostics (ESMAC Diags) package.
Atmospheric Radiation Measurement (ARM) airborne field campaign data products between 2013 and 2018. Earth System Science Data. Pub Date: 2024-06-27 DOI: 10.5194/essd-2024-97