occTest: An integrated approach for quality control of species occurrence data

IF 6.3 | Zone 1 (Environmental Science and Ecology) | Q1 Ecology | Global Ecology and Biogeography | Pub Date: 2024-04-12 | DOI: 10.1111/geb.13847
Josep M. Serra-Diaz, Jeremy Borderieux, Brian Maitner, Coline C. F. Boonman, Daniel Park, Wen-Yong Guo, Arnaud Callebaut, Brian J. Enquist, Jens-C. Svenning, Cory Merow
Global Ecology and Biogeography, volume 33, issue 7.

Abstract

Aim

Species occurrence data are used to estimate geographical distributions, characterize niches and their evolution, and guide spatial conservation planning. The volume of such data is growing rapidly owing to digitization and aggregation efforts and citizen science initiatives. However, persistent quality issues in occurrence data can compromise the accuracy of scientific findings, which makes it important to filter erroneous occurrence records in biodiversity analyses.

Innovation

We introduce an R package, occTest, that synthesizes a growing open-source ecosystem of biodiversity cleaning workflows to prepare occurrence data for different modelling applications. It offers a structured set of algorithms that identify potential problems with species occurrence records through a hierarchy of tests. The workflow is organized in testPhases (i.e. cleaning vs. testing) that encompass different testBlocks, which group different testTypes (e.g. environmental outlier detection), each of which may use different testMethods (e.g. Rosner test, jackknife, etc.). Four testBlocks characterize potential problems in the geographic, environmental, human influence and temporal dimensions. Filtering and plotting functions are included to facilitate the interpretation of tests. We provide examples with different data sources, using both default and user-defined parameters. Compared to other available tools and workflows, occTest offers a comprehensive suite of integrated tests and allows multiple methods to be applied to each test, so that users can explore consensus among data cleaning methods. It uniquely incorporates both coordinate accuracy analysis and environmental analysis of occurrence records. Furthermore, its hierarchical structure can accommodate tests that are yet to be developed.
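The hierarchy and the idea of consensus among methods can be illustrated with a small sketch. The example below is written in Python and is not occTest's API: the function names, the nested dictionary, and the consensus score are all hypothetical, and a z-score rule and Tukey fences stand in for methods such as the Rosner test or the jackknife named above. It only shows how several testMethods under one testType might be combined into a per-record share of methods that flag the record.

```python
from statistics import mean, stdev, quantiles

def zscore_flags(values, threshold=2.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]

def iqr_flags(values, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return [v < q1 - k * spread or v > q3 + k * spread for v in values]

# Hypothetical hierarchy: testPhase -> testBlock -> testType -> testMethods.
# The keys are illustrative labels, not identifiers taken from occTest.
HIERARCHY = {
    "testing": {
        "environmental": {
            "outlier_detection": {"zscore": zscore_flags, "iqr": iqr_flags},
        },
    },
}

def consensus_scores(values, hierarchy=HIERARCHY):
    """Return, per testType, the share of methods that flag each record."""
    scores = {}
    for phase, blocks in hierarchy.items():
        for block, types in blocks.items():
            for test_type, methods in types.items():
                flags = [method(values) for method in methods.values()]
                scores[(phase, block, test_type)] = [
                    sum(record_flags) / len(methods)
                    for record_flags in zip(*flags)
                ]
    return scores

# Illustrative temperatures (degrees C) at seven records; the last looks suspicious.
temperatures = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 40.0]
scores = consensus_scores(temperatures)
```

A score of 1.0 means every method under that testType flagged the record, and 0.0 means none did. Adding a new method or a new testType in this sketch only requires a new dictionary entry, which mirrors the extensibility the abstract describes.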

Main conclusions

occTest will help users understand the quality and quantity of data available before the start of data analysis, while also enabling them to filter data using either predefined or custom-built rules. As a result, occTest can better assess each record's appropriateness for its intended application.
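Filtering with predefined versus custom-built rules can be sketched as follows. This Python example is an assumption-laden illustration, not the package's interface: the record layout, the `DEFAULT_THRESHOLDS` values, and the `filter_records` function are hypothetical. Each record carries one score per testBlock (the share of tests it failed), and a record is removed when any score exceeds that block's threshold.

```python
# Hypothetical predefined rule: tolerate up to half of the tests failing per block.
DEFAULT_THRESHOLDS = {
    "geographic": 0.5,
    "environmental": 0.5,
    "human_influence": 0.5,
    "temporal": 0.5,
}

def filter_records(records, thresholds=None):
    """Split record ids into (kept, removed) given per-block score thresholds.

    `thresholds` overrides individual entries of DEFAULT_THRESHOLDS, so a
    custom rule only needs to name the blocks it changes.
    """
    limits = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
    kept, removed = [], []
    for record in records:
        failed = [
            block for block, limit in limits.items()
            if record["scores"].get(block, 0.0) > limit
        ]
        (removed if failed else kept).append(record["id"])
    return kept, removed

# Illustrative records with made-up scores.
records = [
    {"id": "a", "scores": {"geographic": 0.0, "environmental": 0.0}},
    {"id": "b", "scores": {"environmental": 0.75}},
    {"id": "c", "scores": {"human_influence": 0.375}},
]

default_result = filter_records(records)
strict_result = filter_records(records, {"human_influence": 0.25})
```

Under the default thresholds only record `b` is removed; tightening the human influence threshold also removes record `c`. Keeping the rule as data rather than code makes it easy to compare how many records survive under different rules before committing to one.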

Source journal: Global Ecology and Biogeography (Environmental Science, Ecology)
CiteScore: 12.10
Self-citation rate: 3.10%
Articles published: 170
Review time: 3 months
Journal description: Global Ecology and Biogeography (GEB) welcomes papers that investigate broad-scale (in space, time and/or taxonomy), general patterns in the organization of ecological systems and assemblages, and the processes that underlie them. In particular, GEB welcomes studies that use macroecological methods, comparative analyses, meta-analyses, reviews, spatial analyses and modelling to arrive at general, conceptual conclusions. Studies in GEB need not be global in spatial extent, but the conclusions and implications of the study must be relevant to ecologists and biogeographers globally, rather than being limited to local areas or specific taxa. Similarly, GEB is not limited to spatial studies; we are equally interested in the general patterns of nature through time, among taxa (e.g., body sizes, dispersal abilities), through the course of evolution, etc. Further, GEB welcomes papers that investigate general impacts of human activities on ecological systems in accordance with the above criteria.