The Performance and Potential of Deep Learning for Predicting Species Distributions
Benjamin A Kellenberger, Kevin Winner, Walter Jetz
bioRxiv - Ecology, published 2024-08-09
DOI: https://doi.org/10.1101/2024.08.09.607358
Abstract
Species distribution models (SDMs) address the whereabouts of species and are central to ecology. Deep learning (DL) is poised to further elevate the already significant role of SDMs in ecology and conservation, but the potential and limitations of this transformation are still largely unassessed. We evaluate DL SDMs for 2,299 terrestrial vertebrate and invertebrate species at continental scale and 1 km resolution in a like-for-like comparison with the latest implementations of classic SDMs. We compare two DL methods, a multi-layer perceptron (MLP) on point covariates and a convolutional neural network (CNN) on geospatial patches, against existing SDMs (Maxent and Random Forest). On average, DL models match, but do not surpass, the performance of existing methods. DL performance is substantially weaker for species with narrow geographic ranges, for species with fewer data points, and for those assessed as threatened and hence often of greatest conservation concern. Furthermore, information leakage across dataset splits substantially inflates performance metrics, especially for CNNs. We find that current DL SDMs do not provide significant gains and instead require careful experimental design to avoid biases. However, future advances in DL-supported use of ancillary ecological information have the potential to make DL a viable instrument in the larger SDM toolbox. Realising this opportunity will require close collaboration between the ecology and machine learning disciplines.
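
The abstract does not say how the dataset splits were constructed. As an illustration only, one common way to limit leakage between splits is to assign whole spatial blocks, rather than individual records, to either the training or the test set. The sketch below assumes occurrence records given as (longitude, latitude) pairs; the function name `spatial_block_split` and the parameters `block_size_deg`, `test_fraction`, and `seed` are hypothetical and are not taken from the paper.

```python
import math
import random


def spatial_block_split(points, block_size_deg=1.0, test_fraction=0.2, seed=0):
    """Split record indices so that no grid block contributes to both sets.

    points: sequence of (lon, lat) pairs in decimal degrees.
    Returns (train_indices, test_indices), each sorted ascending.
    Illustrative sketch only; not the splitting scheme used in the paper.
    """
    # Group record indices by the grid block that contains them.
    blocks = {}
    for i, (lon, lat) in enumerate(points):
        key = (math.floor(lon / block_size_deg), math.floor(lat / block_size_deg))
        blocks.setdefault(key, []).append(i)

    # Shuffle the blocks (not the records) reproducibly.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)

    # Hold out a fraction of the blocks, at least one, for testing.
    n_test = max(1, round(len(keys) * test_fraction))
    test_idx = sorted(i for k in keys[:n_test] for i in blocks[k])
    train_idx = sorted(i for k in keys[n_test:] for i in blocks[k])
    return train_idx, test_idx


# Usage with synthetic coordinates (made-up values, for demonstration only).
records = [(-100 + 0.1 * i, 30 + 0.05 * i) for i in range(200)]
train, test = spatial_block_split(records, test_fraction=0.25, seed=1)
print(len(train), len(test))
```

Blocking by grid cell keeps nearby records on the same side of the split. For patch-based models such as CNNs, an input patch centred near a block edge can still cover pixels from a neighbouring block, so a buffer between training and test blocks may also be needed.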