Random forest regression models in ecology: Accounting for messy biological data and producing predictions with uncertainty

Fisheries Research (IF 2.2; Zone 2, Agricultural and Forestry Sciences; JCR Q2, Fisheries). Pub Date: 2024-09-06. DOI: 10.1016/j.fishres.2024.107161
Caitlin I. Allen Akselrud
Volume 280, Article 107161. Full text: https://www.sciencedirect.com/science/article/pii/S016578362400225X

Abstract

Machine learning methods such as random forest regression models are useful tools in ecology when applied correctly, but features inherent to ecological data sets can lead to over-fitting or uncertain predictions. Here, a set of methods is outlined for making random forest predictions that account for temporal autocorrelation and for sparse, short, or missing data. Methods are also provided for estimating the prediction uncertainty that arises from the combination of inherent randomness in the random forest algorithm and sparse input data. This suite of methods was used to generate pre-season predictions of total catch, with uncertainty, for California market squid (Doryteuthis opalescens), which supports the most valuable fishery in California (by ex-vessel value). The methodology presented in this analysis is robust, incorporating key cross-validation and hyperparameter tuning techniques from across disciplines, and flexible, making it applicable to a range of ecological and fisheries datasets beyond market squid.
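
The abstract does not spell out the individual techniques, so the following is only a minimal sketch of two ideas it names, under stated assumptions. It assumes that temporal autocorrelation is handled with rolling-origin (forward-chaining) cross-validation, in which training years always precede the test year, and that uncertainty from algorithmic randomness is summarised by refitting under many random seeds and taking quantiles of the resulting predictions. The function names, the toy bagged-mean learner standing in for a random forest, and the catch values are all hypothetical and are not taken from the paper.

```python
import random
import statistics


def rolling_origin_splits(n_obs, min_train, horizon=1):
    """Yield (train_idx, test_idx) pairs in which training always precedes testing."""
    for end in range(min_train, n_obs - horizon + 1):
        yield list(range(end)), list(range(end, end + horizon))


def bagged_mean_forecast(train_values, n_members, rng):
    """Toy stand-in for a random forest: average of bootstrap-sample means."""
    member_preds = []
    for _ in range(n_members):
        # Each ensemble member sees a bootstrap resample of the training data.
        sample = rng.choices(train_values, k=len(train_values))
        member_preds.append(statistics.fmean(sample))
    return statistics.fmean(member_preds)


def seed_interval(train_values, seeds, n_members=25, lower=0.05, upper=0.95):
    """Refit under many seeds and summarise the spread of the predictions."""
    preds = sorted(
        bagged_mean_forecast(train_values, n_members, random.Random(seed))
        for seed in seeds
    )
    lo = preds[int(lower * (len(preds) - 1))]
    hi = preds[int(upper * (len(preds) - 1))]
    return lo, statistics.median(preds), hi


# Made-up annual series for illustration only (not data from the paper).
catches = [42.0, 55.0, 38.0, 61.0, 47.0, 70.0, 52.0, 66.0]

splits = list(rolling_origin_splits(len(catches), min_train=4))
for train_idx, test_idx in splits:
    train = [catches[i] for i in train_idx]
    lo, mid, hi = seed_interval(train, seeds=range(50))
    print(
        f"train 0..{train_idx[-1]} -> predict index {test_idx[0]}: "
        f"{mid:.1f} ({lo:.1f} to {hi:.1f}), observed {catches[test_idx[0]]:.1f}"
    )
```

In a real analysis the toy learner would be replaced by an actual random forest implementation, and the same split generator could drive hyperparameter tuning by scoring each candidate setting only on the held-out later years. The seed-based interval reflects algorithmic randomness alone; it does not by itself capture uncertainty from sparse input data.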
