Zhilong Li, Ziti Jiao, Ge Gao, Jing Guo, Chenxia Wang, Sizhe Chen, Zheyou Tan
Data in Brief, vol. 58, Article 111298. Published 2025-02-01 (Epub 2025-01-10). DOI: 10.1016/j.dib.2025.111298
Article page: https://www.sciencedirect.com/science/article/pii/S2352340925000307
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11786690/pdf/
A global gross primary productivity of sunlit and shaded canopies dataset from 2002 to 2020 via embedding random forest into two-leaf light use efficiency model
Gross primary productivity (GPP) is crucial for understanding the carbon cycle and maintaining ecosystem balance under climate change. We attempt to generate a long-term global dataset of GPP for sunlit (GPPsu) and shaded (GPPsh) leaves using a hybrid model that combines a random forest (RF) submodule with the two-leaf light use efficiency (TL-LUE) model. First, the TL-LUE model was optimized by accounting for seasonal differences in the clumping index at the global scale (TL-CLUE). Then, we used the RF technique to integrate various environmental stress factors, including meteorological factors, hydrological variables, soil properties, and elevation, drawn from the NASA MERRA-2 dataset, ISRIC SoilGrids, and the USGS data center. The RF submodule was then embedded into the TL-CLUE model to construct the hybrid model (TL-CRF), which was trained and evaluated on global eddy covariance (EC) site data from the AmeriFlux and FLUXNET2015 datasets. Using the TL-CRF model driven by the LP DAAC leaf area index and land cover, NASA MERRA-2 incoming shortwave solar radiation, and the above environmental variables, we produced a global GPP, GPPsu, and GPPsh dataset at a spatial resolution of 0.05° × 0.05° for 2002–2020. This GPP product provides a data basis for improving our understanding of global vegetation productivity dynamics and their interactions with changing environmental conditions.
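The abstract does not give the model equations, so the following is only a minimal sketch of the general two-leaf idea, not the authors' TL-CRF implementation. It assumes the commonly used sunlit/shaded leaf area partition that depends on the clumping index and solar zenith angle, and it treats the RF submodule as a black box that returns a single stress scalar between 0 and 1. All function names, parameter names, and the example values are hypothetical.

```python
import math


def partition_lai(lai, clumping_index, solar_zenith_deg):
    """Split total LAI into (sunlit, shaded) parts.

    Assumed form (widely used in two-leaf models):
        LAI_su = 2 cos(theta) * (1 - exp(-0.5 * Omega * LAI / cos(theta)))
        LAI_sh = LAI - LAI_su
    where Omega is the clumping index and theta the solar zenith angle.
    """
    cos_theta = math.cos(math.radians(solar_zenith_deg))
    if lai <= 0.0 or cos_theta <= 0.0:
        # No canopy, or sun at/below the horizon: nothing is sunlit.
        return 0.0, max(lai, 0.0)
    lai_sunlit = 2.0 * cos_theta * (
        1.0 - math.exp(-0.5 * clumping_index * lai / cos_theta)
    )
    lai_sunlit = min(lai_sunlit, lai)  # guard against Omega slightly above 1
    return lai_sunlit, lai - lai_sunlit


def two_leaf_gpp(apar_sunlit, apar_shaded, eps_sunlit, eps_shaded, stress):
    """Return (GPP_su, GPP_sh, GPP_total) for one pixel and time step.

    apar_*  : absorbed PAR of each leaf group (computed elsewhere)
    eps_*   : maximum light use efficiency of each leaf group
    stress  : scalar in [0, 1]; a stand-in for the RF submodule output
              (hypothetical interface, not taken from the paper)
    """
    if not 0.0 <= stress <= 1.0:
        raise ValueError("stress must lie in [0, 1]")
    gpp_sunlit = eps_sunlit * apar_sunlit * stress
    gpp_shaded = eps_shaded * apar_shaded * stress
    return gpp_sunlit, gpp_shaded, gpp_sunlit + gpp_shaded


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the dataset.
    lai_su, lai_sh = partition_lai(lai=3.0, clumping_index=0.7, solar_zenith_deg=30.0)
    print(f"sunlit LAI = {lai_su:.3f}, shaded LAI = {lai_sh:.3f}")
    print(two_leaf_gpp(10.0, 20.0, eps_sunlit=0.5, eps_shaded=2.0, stress=0.5))
```

In this sketch a lower clumping index (more clumped foliage) yields less sunlit leaf area for the same total LAI, which is why seasonal variation in the clumping index changes the sunlit/shaded split.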
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.