Promoting High-Quality Data in OBIS: Insights from the OBIS Data Quality Assessment and Enhancement Project Team

Yi-Ming Gan, Ruben Perez Perez, Pieter Provoost, A. Benson, A. C. Peralta Brichtova, Elizabeth R. Lawrence, John Nicholls, Johnny Konjarla, Georgia Sarafidou, H. Saeedi, Dan Lear, Anke Penzlin, N. Wambiji, W. Appeltans
Biodiversity Information Science and Standards. Published 2023-09-05. DOI: 10.3897/biss.7.112018 (https://doi.org/10.3897/biss.7.112018)

Abstract

The Ocean Biodiversity Information System (OBIS) (Klein et al. 2019) is a global database of marine biodiversity and associated environmental data, which provides critical information to researchers and policymakers worldwide. Ensuring the accuracy and consistency of the data in OBIS is essential for its usefulness and value, not only to the scientific community but also to the science-policy interface. The OBIS Data Quality Assessment and Enhancement Project Team (QCPT), formed in 2019 by the OBIS steering group, aims to assess and enhance data quality. It has been working on three categories of activities for this purpose:

Data quality enhancement and management

The OBIS QCPT organized data laundry events to identify and address data quality issues in published OBIS datasets. Furthermore, individual OBIS nodes were invited to present their data-processing workflows at the monthly meetings to foster knowledge sharing and collaborative problem-solving focused on data quality. Data quality issues and solutions highlighted in the presentations and data laundry events were documented as GitHub issues in a dedicated GitHub repository. The solutions, together with marine-specific pre-publication quality control tools designed to identify these issues, were provided as feedback to the OBIS Capacity Development Task Team. These inputs were used to create training resources (see the OBIS manual and the upcoming OBIS training course hosted on OceanTeacher Global Academy) aimed at preventing these issues.

Standardization of OBIS data processing pipeline

As OBIS uses the Darwin Core standard (Wieczorek et al. 2012), the use of standardized tests and assertions in the data processing pipeline is encouraged. To achieve this, the OBIS QCPT aligned OBIS quality checks with a subset of the core tests and assertions developed by the Biodiversity Information Standards (TDWG) Biodiversity Data Quality (BDQ) Task Group 2 (TG2) (Chapman et al. 2020), as tracked in a GitHub issue. Not all default parameters of the core tests and assertions are optimal for marine biodiversity data, so the OBIS QCPT met monthly to determine suitable parameters for customizing the tests. The pipeline produces a data quality report for each dataset, with quality flags that indicate potential data quality issues, enabling node managers and data providers to review the flagged records.

Community engagement

The OBIS QCPT led a survey among data users to gather insights into OBIS data quality issues and to bridge the gap between the current implementation and user expectations. The survey findings enabled OBIS to prioritize the issues to be addressed, as summarized in Section 2.2.2 of the 11th OBIS Steering Group meeting report. In addition to engaging with data users, the OBIS QCPT also served as a platform for discussing questions from the nodes about the use of Darwin Core and provided feedback for the term discussions.

In summary, the OBIS QCPT improves the reliability and usability of marine species data through transparent and participatory approaches, fostering continuous improvement. Collaborative efforts, standardized procedures, and knowledge sharing advance OBIS' mission of providing high-quality biodiversity data for research, conservation, and ocean management.
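To make the pipeline description concrete, the following is a minimal sketch of how range tests with adjustable parameters might attach quality flags to records and summarise them per dataset. It is not the OBIS implementation: the flag names, the `DepthParams` class, both function names, and the default depth limits are assumptions chosen for illustration. Only the field names (`decimalLatitude`, `decimalLongitude`, `minimumDepthInMeters`) are Darwin Core terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepthParams:
    """Hypothetical test parameters; the defaults are illustrative, not OBIS values."""
    min_depth_m: float = 0.0
    max_depth_m: float = 11000.0


def flag_record(record: dict, params: DepthParams = DepthParams()) -> list[str]:
    """Return illustrative quality flags for one Darwin Core occurrence record."""
    flags = []
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    depth = record.get("minimumDepthInMeters")
    # Coordinates are treated as required: missing or out-of-range values are flagged.
    if lat is None or not -90 <= lat <= 90:
        flags.append("LAT_INVALID")
    if lon is None or not -180 <= lon <= 180:
        flags.append("LON_INVALID")
    # Depth is optional here; it is only tested when present.
    if depth is not None and not params.min_depth_m <= depth <= params.max_depth_m:
        flags.append("DEPTH_OUT_OF_RANGE")
    return flags


def quality_report(records: list[dict], params: DepthParams = DepthParams()) -> dict:
    """Summarise flags for a dataset so that flagged records can be reviewed."""
    flagged = {}
    for index, record in enumerate(records):
        flags = flag_record(record, params)
        if flags:
            flagged[index] = flags
    return {"total": len(records), "flagged": flagged}
```

Passing a different `DepthParams` instance changes the accepted depth range without touching the test logic, which mirrors the idea of customizing default parameters for marine data.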