Predicting the distribution of Coilia nasus abundance in the Yangtze River estuary: From interpolation to extrapolation
Authors: Yichuan Wang, Jianhui Wu, Xuefang Wang
Journal: Estuarine, Coastal and Shelf Science, volume 308, Article 108935 (Journal Article)
Published: 2024-08-24
DOI: 10.1016/j.ecss.2024.108935
URL: https://www.sciencedirect.com/science/article/pii/S0272771424003238
Impact factor: 2.6; JCR: Q1 (Marine & Freshwater Biology); open access: no
Citations: 0
Abstract
Coilia nasus was once an economically important fish in the Yangtze River estuary, but overfishing and other anthropogenic factors have severely depleted its population. Conserving and restoring C. nasus urgently requires a precise picture of its spatiotemporal distribution. However, as a typical anadromous species, C. nasus uses estuarine habitats only seasonally, which produces a very high proportion of null records in some seasons and makes abundance difficult to predict. This study compared three commonly used tree-based methods, gradient boosting machine (GBM), random forest (RF), and conditional random forest (CRF), for predicting the abundance of C. nasus in the Yangtze River estuary from trawl resource monitoring survey data collected from 2013 to 2018. Sixteen explanatory variables from the survey data, including temperature, salinity, pH, and chemical oxygen demand, were used as predictors. The coefficient of determination (R²), root mean square error (RMSE), and root mean square logarithmic error (RMSLE) were used to evaluate the three methods and to assess how their performance differed between interpolation and extrapolation, both when modeling each season separately and when combining seasons. The results showed that (1) compared with combined modeling, seasonal modeling accurately identified the high- and low-abundance regions in interpolation and greatly improved prediction accuracy in extrapolation. (2) Almost all metrics indicated that RF performed best in interpolation, whereas CRF and GBM were significantly worse on some metrics; RF was also more robust and could be applied to abundance in all seasons. (3) Model performance in extrapolation was significantly lower than in interpolation; RF was again the best method and could still identify high-abundance areas when far less data were available than for interpolation. These findings can be generalized to species distribution modeling of other migratory species in the Yangtze River estuary or in estuarine ecosystems of the Northwest Pacific Ocean.
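The three evaluation metrics named in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the abstract does not give exact formulas, so it assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, and RMSLE computed on log(1 + x)), and it assumes null records are stored as zero abundance. The function names and the sample values are hypothetical.

```python
import math


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        # All observations identical (e.g. a season of only zeros):
        # R^2 is undefined under this definition.
        return float("nan")
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rmsle(observed, predicted):
    """Root mean square logarithmic error on log(1 + x), so zeros are allowed."""
    n = len(observed)
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(o)) ** 2 for o, p in zip(observed, predicted)) / n
    )


# Hypothetical zero-heavy abundance values, for illustration only.
observed = [0, 0, 3, 10]
predicted = [0, 1, 2, 8]
print(r2(observed, predicted), rmse(observed, predicted), rmsle(observed, predicted))
```

Because RMSLE works on a logarithmic scale, it penalizes relative rather than absolute errors, which is one reason it is often reported alongside RMSE when a few large catches sit among many zeros.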
About the journal:
Estuarine, Coastal and Shelf Science is an international multidisciplinary journal devoted to the analysis of saline water phenomena ranging from the outer edge of the continental shelf to the upper limits of the tidal zone. The journal provides a unique forum, unifying the multidisciplinary approaches to the study of the oceanography of estuaries, coastal zones, and continental shelf seas. It features original research papers, review papers, and short communications treating such disciplines as zoology, botany, geology, sedimentology, and physical oceanography.