The Performance and Potential of Deep Learning for Predicting Species Distributions

bioRxiv - Ecology | Pub Date: 2024-08-09 | DOI: 10.1101/2024.08.09.607358

Benjamin A Kellenberger, Kevin Winner, Walter Jetz


Citation count: 0

Abstract

Species distribution models (SDMs) address the whereabouts of species and are central to ecology. Deep learning (DL) is poised to further elevate the already significant role of SDMs in ecology and conservation, but the potential and limitations of this transformation are still largely unassessed. We evaluate DL SDMs for 2,299 terrestrial vertebrate and invertebrate species at continental scale and 1 km resolution in a like-for-like comparison with the latest implementations of classic SDMs. We compare two DL methods (a multi-layer perceptron (MLP) on point covariates and a convolutional neural network (CNN) on geospatial patches) against existing SDMs (Maxent and Random Forest). On average, DL models match, but do not surpass, the performance of existing methods. DL performance is substantially weaker for species with narrow geographic ranges, for species with fewer data points, and for those assessed as threatened and hence often of greatest conservation concern. Furthermore, information leakage across dataset splits substantially inflates performance metrics, especially those of CNNs. We find that current DL SDMs do not provide significant gains and instead require careful experimental design to avoid biases. However, future advances in DL-supported use of ancillary ecological information have the potential to make DL a viable instrument in the larger SDM toolbox. Realising this opportunity will require close collaboration between the ecology and machine learning disciplines.
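The abstract does not describe how the dataset splits were constructed, so the following is only a minimal sketch of one common way to limit information leakage between training and test data: assigning whole spatial grid blocks, rather than individual records, to the test set. The function names, the block size, and the synthetic occurrence coordinates are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def block_id(lon, lat, block_deg):
    """Map a coordinate to the (column, row) index of its square grid block."""
    return (math.floor(lon / block_deg), math.floor(lat / block_deg))


def spatial_block_split(points, block_deg=1.0, test_fraction=0.2, seed=0):
    """Split (lon, lat) points so that no grid block contributes to both sets.

    Whole blocks are drawn for the test set, so all points that share a
    block always end up on the same side of the split.
    """
    # Collect the occupied blocks in a deterministic order, then shuffle them.
    blocks = sorted({block_id(lon, lat, block_deg) for lon, lat in points})
    rng = random.Random(seed)
    rng.shuffle(blocks)

    # Reserve a fraction of the occupied blocks (at least one) for testing.
    n_test = max(1, round(test_fraction * len(blocks)))
    test_blocks = set(blocks[:n_test])

    train, test = [], []
    for i, (lon, lat) in enumerate(points):
        target = test if block_id(lon, lat, block_deg) in test_blocks else train
        target.append(i)
    return train, test


# Hypothetical occurrence records (longitude, latitude) in degrees.
_rng = random.Random(42)
occurrences = [(_rng.uniform(-10, 10), _rng.uniform(40, 50)) for _ in range(500)]
train_idx, test_idx = spatial_block_split(occurrences, block_deg=2.0)
```

Blocking of this kind only separates points by grid cell. Records that sit on either side of a block border can still be close neighbours, and image patches centred on them can overlap, so a patch-based model such as a CNN would likely also need a buffer zone between training and test blocks.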



