
Biodiversity Information Science and Standards: Latest Articles

The Role of the OLS Program in the Development of echinopscis (an Extensible Notebook for Open Science on Specimens)
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112318
Nicky Nicolson, Eve Lucas
Starting in early 2022, biodiversity informatics researchers at Kew have been developing echinopscis: an "extensible notebook for open science on specimens". This aims to build on the early experiments that our community conducted with "e-taxonomy": the development of tools and techniques to enable taxonomic research to be conducted online. Early e-taxonomic tools (e.g., Scratchpads; Smith et al. 2011) had to perform a wide range of functions, but in the past decade or so the move towards open science has built better support for generic functionality, such as reference management (Zotero) and document production (pandoc), skills development in automation and revision control to support reproducible science, as documented by the Turing Way (The Turing Way Community 2022), and an awareness of the importance of community building. We have developed echinopscis at Kew via a cross-departmental collaboration between researchers in biodiversity informatics and accelerated taxonomy. We have also benefitted from valuable input and advice from our many colleagues in associated projects and organisations around the world. OLS (originally Open Life Sciences) is a training and mentoring program for Open Science leaders with a focus on community building. The name was recently (2023) made more generic ("Open Seeds") whilst retaining the well-known acronym "OLS". OLS is a 16-week cohort-based mentoring program. Participants apply to join a cohort with a project that is developed through the 16 weeks. Each week of the syllabus alternates between time with a dedicated Open Science mentor and cohort calls, which are used to develop skills in project design, community building, open development & licencing, and inclusivity. Over 500 practitioners, experts and learners have participated across the seven completed cohorts of OLS' Open Seeds training and mentoring. 
Through this program, over 300 researchers and open leaders from across six continents have designed, launched and supported 200 projects from different disciplines worldwide. The next cohort will run between September 2023 and January 2024, and will be the eighth iteration of the program. This talk will briefly outline the work that we have done to set up and experiment with echinopscis, but will focus on the impact that the OLS program has had on its development. We will also cover the use of techniques learned through OLS in other biodiversity informatics projects. OLS acknowledges that their program receives relatively few applications from project leads in biodiversity, and we hope that this talk will be informative for Biodiversity Information Standards (TDWG) participants and can be used to build productive links between these communities.
Citations: 0
GBIF-Compliant Data Pipeline for the Management and Publication of a Global Taxonomic Reference List of Pests in Natural History Collections
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112391
Carla Novoa Sepúlveda, Stephan Biebl, Nadja Pöllath, S. Seifert, Markus Weiss, Tanja Weibulat, Dagmar Triebel
There is a growing demand for monitoring pests in natural history collections (NHCs) and establishing integrated pest management (IPM) solutions (Crossman and Ryde 2022). In this context, up-to-date taxonomic reference lists and controlled vocabularies following standard schemes are crucial and facilitate recording organisms detected in collections. The data pipeline described here results in the publication of a taxon reference list based on information from online resources and standard IPM literature. Most of the over 140 pest taxa at species level and above are insects; the rest belong to other animal groups and fungi. The complete taxon names, synonyms, English and German common names, and the hierarchical classification (parent-child relationships) are organised in a client-server installation of DiversityTaxonNames (DTN) at the Bavarian Natural History Collections (SNSB). DTN is a Microsoft Structured Query Language (MS SQL) database tool of the Diversity Workbench (DWB) framework with a published Entity Relation (ER) diagram (Hagedorn et al. 2019). The list is managed using the Global Biodiversity Information Facility (GBIF) backbone taxonomy as the external name resource, with linkage to the respective Wikidata Q item ID as an external persistent identifier (PID). Moreover, information on pest occurrence in NHCs is given, distinguishing the Consortium of European Taxonomic Facilities (CETAF) major NHC collection types affected (i.e., heritage sciences, life sciences and earth sciences) and the object categories, e.g., natural objects/specimens damaged. Data management in DTN enables long-term curation by list curators. The generic data pipeline for the management and publication of a Global Taxonomic Reference List of Pests in NHCs is based on the DTN taxon lists concept and architecture and described under About "Taxon list of pest organisms for IPM at natural history collections compiled at the SNSB". 
It includes four steps (A–D) with significant results for best practices of data processing (Fig. 1).
A. The data is managed and processed for publication by list curators in the database DiversityTaxonNames (DTN). As a result, the list can be kept up-to-date and is ready, without transformation, to be used for IPM solutions at any NHC with a DiversityCollection installation and as part of the DWB cloud services.
B. The up-to-date data is publicly available via the DTN REST Webservice for Taxon Lists with a machine-readable Application Programming Interface (API). As a result, the dynamic list publication service can be used as a reference backbone for establishing IPM solutions for pest monitoring at any NHC.
C. The data is provided via the GBIF checklist data publication pipeline of the SNSB through GBIF validation tools and as a Darwin Core Archive (DwC-A, zip format) for GBIF. As a result, the checklist information becomes part of the GBIF network with GBIF ChecklistBank and GBIF Global Taxonomy. This ensures future compliance of the data with the Findability, Accessibility, Interoperability, and Reuse (FAIR) guiding principles.
D. The DTN REST Webservice for Taxon Lists (currently 60 lists) is registered with, and accessed through, the German Federation for Biological Data (GFBio) Terminology Service. As a result, the lists, with external PIDs and further information, are available as a service (see the DTN list overview). In the research data commons of the upcoming German National Research Data Infrastructure (NFDI) programme (Diepenbroek et al. 2021), it will become part of a standardised API layer with agreed interface schemes for improved accessibility.
The tools, API and data provided are part of the upcoming NFDI4Biodiversity service portfolio. Future scenarios include using the list items and properties as a classification for diagnostic purposes with DiversityNaviKey (Triebel et al. 2021), including the publication of images for pest identification.
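The parent-child classification and the Darwin Core export mentioned in this abstract can be illustrated with a small sketch. It is not the DTN data model or the SNSB pipeline: the `Taxon` class, the `t1`…`t4` identifiers, and the helper names are hypothetical, the four records are illustrative rather than copied from the published list, and external identifiers (GBIF keys, Wikidata Q IDs) are left out. Only the column headers `taxonID`, `parentNameUsageID`, `scientificName`, and `taxonRank` are real Darwin Core term names.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Taxon:
    taxon_id: str                    # local identifier (hypothetical scheme)
    scientific_name: str
    rank: str
    parent_id: Optional[str] = None  # parent-child link; None for the root


# Illustrative records only; not taken from the published pest list.
TAXA = [
    Taxon("t1", "Coleoptera", "order"),
    Taxon("t2", "Dermestidae", "family", parent_id="t1"),
    Taxon("t3", "Anthrenus", "genus", parent_id="t2"),
    Taxon("t4", "Anthrenus verbasci", "species", parent_id="t3"),
]
INDEX: Dict[str, Taxon] = {t.taxon_id: t for t in TAXA}


def lineage(taxon_id: str) -> List[str]:
    """Walk parent links upwards and return names from the root to the taxon."""
    names: List[str] = []
    current: Optional[str] = taxon_id
    while current is not None:
        taxon = INDEX[current]
        names.append(taxon.scientific_name)
        current = taxon.parent_id
    return list(reversed(names))


def to_taxon_csv(taxa: List[Taxon]) -> str:
    """Serialise records using Darwin Core term names as column headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["taxonID", "parentNameUsageID", "scientificName", "taxonRank"])
    for t in taxa:
        writer.writerow([t.taxon_id, t.parent_id or "", t.scientific_name, t.rank])
    return buffer.getvalue()


if __name__ == "__main__":
    print(" > ".join(lineage("t4")))
    print(to_taxon_csv(TAXA))
```

Storing only a parent pointer per record keeps the list easy to curate, since moving a genus to another family changes one field; the full lineage is derived on demand rather than stored.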
Citations: 0
Connecting the Dots: Aligning human capacity through networks toward a globally interoperable Digital Extended Specimen (DES) infrastructure
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112390
Elizabeth R. Ellwood, Wouter Addink, John Bates, Andrew Bentley, Jutta Buschbom, Alina Freire-Fierro, Jose Fortes, David Jennings, Kerstin Lehnert, Bertram Ludäscher, Keping Ma, James Macklin, Austin Mast, Joe Miller, Gil Nelson, Nicky Nicolson, Jyotsna Pandey, Deborah Paul, Sinlan Poo, Richard Rabeler, Pamela S. Soltis, Elycia Wallis, Michael Webster, Andrew Young, Breda Zimkus
Thanks to substantial support for biodiversity data mobilization in recent decades, billions of occurrence records are openly available, documenting life on Earth and enabling timely research, awareness raising, and policy-making. Initiatives across local to global scales have been separately funded to serve different, yet often overlapping audiences of data users, and have developed a variety of platforms and infrastructures to meet the needs of these audiences. The independent progress of biodiversity data providers has led to innovations as well as challenges for the community at large as we move towards connecting and linking a diversity of information from disparate sources as Digital Extended Specimens (DES). Recognizing a need for deeper and more frequent opportunities for communication and collaboration across the globe, an ad-hoc group of representatives of various international, national, and regional organizations have been meeting virtually since 2020 to provide a forum for updates, announcements, and shared progress. This group is provisionally named International Partners for the Digital Extended Specimen (IPDES), and is guided by these four concepts: Biodiversity, Connection, Knowledge and Agency. Participants in IPDES include representatives of the Global Biodiversity Information Facility (GBIF), Integrated Digitized Biocollections (iDigBio), American Institute of Biological Sciences (AIBS), Biodiversity Collections Network (BCoN), Natural Science Collections Alliance (NSCA), Distributed System of Scientific Collections (DiSSCo), Atlas of Living Australia (ALA), Biodiversity Information Standards (TDWG), Society for the Preservation of Natural History Collections (SPNHC), National Specimen Information Infrastructure of China (NSII), and South African National Biodiversity Institute (SANBI), as well as individuals involved with biodiversity informatics initiatives, natural science collections, museums, herbaria, and universities. 
Our global partners group strives to increase representation from around the globe as we aim to enable research that contributes to novel discoveries and addresses the societal challenges leading to the biodiversity crisis. Our overarching mission is to expand on the community-driven successes to connect biodiversity data and knowledge through coordination of a globally integrated network of stakeholders to enable an extensible technical and social infrastructure of data, tools, and working practices in support of our vision. The main work of our group thus far includes publishing a paper on the Digital Extended Specimen (Hardisty et al. 2022), organizing and hosting an array of activities at conferences, and asynchronous online work and forum-based exchanges. We aim to advance discussion on topics of broad interest to our community such as social and technical capacity building, broadening participation, expanding social and data networks, improving data models and building a backbone for the DES, and identifying international funding solutions. This presentation will highlight some of these activities and detail progress towards a roadmap for developing the human network and technical infrastructure needed to support the DES. It offers stakeholder communities such as TDWG, and other initiatives focused on data standards and biodiversity informatics, an opportunity for feedback and engagement as we consolidate our plans for the future in support of integrated and interconnected biodiversity data and recognise those who carry out this work.
Citations: 0
Implementing CARE Principles to Link Noongar Language and Knowledge to Western Science through the Atlas of Living Australia
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112349
N. Raisbeck‐Brown, Denise Smith-Ali
The Atlas of Living Australia (ALA), Australia's national online biodiversity database, is partnering with the Noongar Boodjar Language Centre (NBALC) to promote Indigenous language and knowledge by including Noongar names for plants and animals in the ALA. Names are included in the ALA species page for each plant and animal and knowledge is built into the Noongar Plant and Animal online Encyclopedia, hosted in the ALA. We demonstrate the use of CARE principles (Collective Benefit, Authority to Control, Responsibility, and Ethics (Carroll et al. 2020)) to engage, support, and deliver the project and outcomes to the Noongar people and communities working with us. The ALA addresses the FAIR principles (Wilkinson et al. 2016) for data management and stewardship, ensuring data are findable, accessible, interoperable, and reusable. The ALA is partnering with NBALC in Perth to ensure all sharing of Noongar data is on Noongar terms. NBALC and ALA have been working with Noongar-Wadjari, a southern clan from the Fitzgerald River area in Western Australia, to collect, protect and share their language and traditional knowledge for local species. The Noongar Encyclopedia project exhibits Collective Benefit because it is a co-innovation project that was co-designed by NBALC and ALA. The project’s activities were designed by the Community-endorsed representatives, the Knowledge Holders. The aims and aspirations of the Community were included in the project design to ensure equitable outcomes. NBALC’s more than 25-year relationship with the Community, and as Noongar people themselves, meant they had a good understanding of what the Community might want from the project. These assumptions were tested and refined during the first Community consultation, before the project plan was finalised. The Community are keen for their traditional knowledge to be shared and freely available to their Community. The ALA only shares knowledge that has passed through strict consent processes. 
It is seen as a safe and stable digital environment for now and the future, where the traditional knowledge can be accessed freely and easily. The link to western science knowledge is secondary to knowledge sharing for most of the Aboriginal and Torres Strait Islander Communities that the ALA is working with, although the benefit of scientists having access to both knowledge systems is seen as a positive step in care for Country into the future. The Noongar Encyclopedia project ensures Noongar Authority to Control these data because NBALC, as an Aboriginal organisation led by Noongar people, understands the rights and interests of the Communities we are working with. Protection of these rights and inclusion of Community interests are written into the project methodology as part of the project co-design. It is important to ensure the project is working with the right people within the Community. NBALC facilitates this by finding people who hold traditional knowledge and can trace the origins of their stories. Because all collected data are stored and managed by NBALC, appropriate governance of the data is ensured. The project design includes rolling consent from the Knowledge Holders, who review all data collected, add to or edit it as needed, and consent to or refuse public sharing of the knowledge through the ALA. The Noongar Encyclopedia project is designed to ensure that we understand our Responsibility (the CARE "R") for the collection, protection, management, and sharing of Indigenous data. Through its partnership with the ALA, NBALC is expanding its capacity and capability in digital data collection and management. The Community is building its capacity to work with linguists and scientists. Including Noongar language and traditional knowledge in the ALA shows the ALA's non-Indigenous users another way of naming, seeing, talking about, and recording knowledge of species. This view differs from that of western science. Noongar people see all things as connected, and group things according to their use and connectivity. Western science tends to classify species according to their physical attributes. Language is key to this alternative worldview. The ALA now publishes scientific names, English names, and Noongar words. The ALA links to this alternative view of species through the Noongar Encyclopedia and two other ecological knowledge encyclopedias (Kamilaroi and South East Arnhem Land). The Noongar Encyclopedia project is continually subject to Ethics assessment by the Community, and has passed rigorous western ethics assessment and review. Community ethics assessment includes a series of assessments made before the project begins. Projects are co-designed with NBALC to ensure they meet protocols and Community expectations. The ALA is then introduced to the Community. The Community decides whether they are interested in the project, whether it meets their aspirations, and whether they are willing to work with the ALA and, potentially, other scientists. Contributing scientists or academics are introduced to the Community by NBALC. The Community has the right to refuse to work with any scientist or academic who is introduced. All contributors are informed of this protocol before being introduced to the Community. The Noongar Plant and Animal Encyclopedia was published in September 2021 (NBALC 2021).
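The rolling-consent step described in this abstract, in which Knowledge Holders review each record and approve or refuse public sharing, might be modelled as a gate in front of publication. The sketch below is only an illustration under stated assumptions: the `Consent` states, the `NameRecord` fields, and the `publishable` helper are hypothetical names, the records hold placeholder text rather than real species or Noongar words, and nothing here reflects how NBALC or the ALA actually store or release data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Consent(Enum):
    PENDING = "pending"   # collected, not yet reviewed by Knowledge Holders
    GRANTED = "granted"   # reviewed and approved for public sharing
    REFUSED = "refused"   # reviewed and withheld from public sharing


@dataclass
class NameRecord:
    scientific_name: str  # placeholder text in this sketch
    english_name: str     # placeholder text in this sketch
    language_name: str    # placeholder text, not a real Noongar word
    consent: Consent = Consent.PENDING


def publishable(records: List[NameRecord]) -> List[NameRecord]:
    """Return only the records whose sharing has been explicitly approved."""
    return [r for r in records if r.consent is Consent.GRANTED]


if __name__ == "__main__":
    records = [
        NameRecord("Species A", "common name A", "name A", Consent.GRANTED),
        NameRecord("Species B", "common name B", "name B", Consent.REFUSED),
        NameRecord("Species C", "common name C", "name C"),  # still pending
    ]
    print([r.scientific_name for r in publishable(records)])

    # Consent is rolling: a later review can change an earlier decision.
    records[0].consent = Consent.REFUSED
    print([r.scientific_name for r in publishable(records)])
```

Defaulting new records to `PENDING` and filtering at publication time means nothing is shared unless it has been approved, and a decision that changes on review takes effect the next time the list is published.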
Citations: 0
Bidirectional Linking: Benefits, challenges, pitfalls, and solutions
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112344
Guido Sautter, D. Agosti
Taxonomy, and biodiversity science in general, mainly revolve around four types of entities, which are available digitally in ever-increasing numbers from different services: (1) Physical specimens (kept in museums and other collections around the world) and observations are available digitally via the Global Biodiversity Information Facility (GBIF). (2) DNA sequences (often derived from preserved specimens) are available from the European Nucleotide Archive (ENA) and the National Center for Biotechnology Information (NCBI), with accession numbers as their primary means of citation. (3) Taxa, identified by taxon names, are increasingly registered in nomenclatural reference databases (ZooBank, International Plant Names Index (IPNI)) and aggregated in the Catalogue of Life (CoL). (4) Taxonomic treatments combine the former three; they define taxa, express scientific opinions about existing taxa based upon specimens as well as DNA sequences derived from them, and coin respective names; they are available from TreatmentBank (as well as Zenodo/Biodiversity Literature Repository (BLR), Swiss Institute of Bioinformatics Literature Services (SIBiLS), and GBIF). Traditionally, treatments cite specimens, taxa, and other treatments in mainly human-centric ways, describing where to find the cited object, but these citations are not immediately actionable in a digital sense. Specimen citations use institution and collection codes and catalog numbers (often combined with geographical and environmental data). Taxon names are a type of self-citing entity, especially when given in combination with their (bibliographic) authorship, as they represent a historical approach to human-readable taxon identifiers. Citations of treatments are very similar to those of taxon names, adding (bibliographic) information of subsequent name usages as needed. Accession numbers for DNA sequences are the closest to modern digital identifiers.
However, none of these means of citation, as usually found in literature, are readily machine-actionable, which makes them hard to process at scale and analyze programmatically. Identifiers coined by the various data providers, in combination with APIs to resolve them, alleviate this problem and enable computational navigation of such links. However, this alone only defers the problem, as actionable identifiers (e.g., HTTP URIs) at some point still need to be inferred from the information given in the traditional means of citation where the latter occur in data. Recent projects, like BiCIKL, aim to add machine-navigable links to the various entities (or respective data records) at scale, in pursuit of (ideally) fully intermeshed records, connecting (1) treatments to subject taxon names and concepts, cited specimens and DNA sequences, as well as cited treatments (with explicit nomenclatural implications, e.g., taxon name synonymies or rebuttals thereof), (2) (digital) specimens to assigned taxon names, citing treatments, and any derived DNA sequences,
(3) DNA sequences to source specimens (or their digital counterparts), assigned taxon names and citing treatments, where applicable, and (4) taxon names to defining and synonymizing treatments, associated (digital) specimens, and any derived DNA sequences. This removes the potential problems of transitive dependencies in chains of links, where intermediate records act as points of failure. All major data providers have been doing this to varying degrees for some time, which provides a good starting point, but some challenges and pitfalls remain. For valid technical reasons, the systems of individual data providers are (and need to be) self-contained, at the cost of a certain amount of duplication (e.g., the GBIF and ENA/NCBI backbone taxonomies). This is not a problem in itself, but it slows the propagation of updates and can lead to some discrepancies. In addition, traditional human-readable identifiers can be somewhat ambiguous: (1) some institution and collection codes are not unique, or authors use them in non-standard ways (e.g., some codes in the Global Registry of Scientific Collections (GrSciColl) point to six different institutions); (2) some catalog numbers of museum specimens are also valid (resolvable) accession numbers, with their actual semantics only emerging from context; (3) the absence of such context makes the semantics of data in tables especially hard to infer; (4) no provider has complete data coverage, so at any given point linking is not even technically possible in all cases, and some links can only be added over time as coverage and the overlap between datasets increase (e.g., a newly published name may not yet be in CoL when its defining treatment is digitized); (5) occasional complete recomputation or reprocessing is impractical and wasteful. In this presentation, we discuss various approaches to overcoming these challenges and avoiding these pitfalls, and offer related recommendations for APIs to better support the respective mechanisms.
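The ambiguity of institution codes described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the colon-delimited "institution:collection:catalog" citation form, the `REGISTRY` contents, and all function names are assumptions made purely for illustration. The point it shows is that a link should only be created automatically when a code narrows down to exactly one institution.

```python
from dataclasses import dataclass

# Hypothetical registry: institution code -> institutions using that code.
# The entries are invented for illustration; a real lookup would query a
# collections registry instead of a hard-coded dictionary.
REGISTRY = {
    "XYZ": ["Example Museum of Natural History"],
    "ABC": ["Example University Herbarium", "Example City Museum"],
}


@dataclass(frozen=True)
class SpecimenCitation:
    institution_code: str
    collection_code: str
    catalog_number: str


def parse_citation(text: str) -> SpecimenCitation:
    """Split an 'INST:COLL:CATALOG' citation into its three parts."""
    parts = [p.strip() for p in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected INST:COLL:CATALOG, got {text!r}")
    return SpecimenCitation(*parts)


def candidate_institutions(citation: SpecimenCitation) -> list[str]:
    """Return every institution the code could refer to (possibly none)."""
    return REGISTRY.get(citation.institution_code.upper(), [])


def is_unambiguous(citation: SpecimenCitation) -> bool:
    """A link is safe to create only if exactly one candidate remains."""
    return len(candidate_institutions(citation)) == 1
```

In this sketch an ambiguous or unknown code is left unlinked rather than guessed, so that context (for example the collection code or the citing treatment) can be used to disambiguate it later.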
Citations: 0
The Global Biodata Coalition: Towards a sustainable biodata infrastructure
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112303
Chuck Cook, Guy Cochrane
Progress in life and biomedical sciences depends absolutely on biodata resources: databases comprising biological data and services around those databases. Supporting scientists in data operations that span the management, analysis and publication of newly generated data and access to pre-existing reference data, these biodata resources together comprise a critical infrastructure for life science and biomedical research. Familiar scientific infrastructures, for example the Conseil Européen pour la Recherche Nucléaire (CERN) or the Square Kilometer Array, are distinct, constructed, physical entities that are centrally funded and managed at one or more identifiable locations. By contrast, the primary infrastructure of the life sciences, which comprises databases and other biological data resources, is globally distributed, virtually connected, funded from multiple sources, and not managed as a coordinated entity. While this configuration supports innovation, it lends itself poorly to the long-term sustainability of individual biodata resources and of the infrastructure as a whole. The Global Biodata Coalition (GBC) brings together life science research funding organisations that recognise these challenges and acknowledge the threat that the lack of sustainability poses. They agree to work together to find ways to improve sustainability. In the presentation, we will provide an overview of the global biodata resource infrastructure, focusing in particular on challenges to providing sustained long-term funding to the resources that comprise the infrastructure. This will provide a global context to other presentations in the session, which focus on biodata resources in Australia.
Covering some of the work that GBC has carried out to understand and classify biodata resources and the entire biodata resource infrastructure, we will outline the Global Core Biodata Resource programme and Inventory project and also introduce the stakeholder consultation processes around approaches to sustainability and open data. Finally, we will lay out the path GBC is taking to engage researchers, informaticians, funding organisations and other stakeholders in moving towards greater sustainability for these critical resources.
Citations: 0
Celebrating BHL Australia through the Eye of the (Tasmanian) Tiger
Pub Date : 2023-09-08 DOI: 10.3897/biss.7.112352
Nicole Kearney
BHL Australia, the Australian branch of the Biodiversity Heritage Library (BHL), was launched in 2010 and began operation with a single organisation, Museums Victoria in Melbourne. Since then, it has grown considerably. Funded by the Atlas of Living Australia, BHL Australia now digitises biodiversity literature on behalf of 42 organisations across the country. These organisations include museums, herbaria, state libraries, royal societies, government agencies, field naturalist clubs and natural history publishers, many of whom lack the resources to do this work themselves. BHL Australia’s national consortium model, which makes biodiversity literature accessible on behalf of so many organisations, is unique amongst the BHL global community. Most BHL operations digitise material on behalf of a single organisation. BHL Australia has now made over 530,000 pages of Australia’s biodiversity knowledge freely accessible online. The BHL Australia Collection includes both published works (books and journals) and unpublished material (collection registers, field diaries and correspondence). The pages of these works are filled with species descriptions, references to historically significant people and, most importantly, scientific data that is critical to ongoing research and conservation efforts. Providing access to materials published as far back as the 1600s and as recently as the current year, the collection chronicles the scientific discovery and understanding of Australia’s biodiversity. BHL Australia also leads the global initiative to bring the world's historic biodiversity and taxonomic literature into the modern linked network of scholarly research by incorporating article data into BHL and retrospectively assigning DOIs (Digital Object Identifiers) (Kearney et al. 2021). BHL has now assigned more than 162,000 DOIs to historic publications, making them persistently citable and trackable, both within BHL and beyond. 
This paper will celebrate the achievements of BHL Australia by journeying through the (now accessible, discoverable and DOI'd) Tasmanian Tiger literature. It will showcase: (1) previously elusive descriptions (and beautiful illustrations) of Thylacines, including those by Gerhard Krefft (1871) https://doi.org/10.5962/p.314741 and John Gould (1863) https://doi.org/10.5962/p.312790; (2) the invaluable creation of links to open access versions from paywalled publications that should be in the public domain, such as the first description of the Thylacine (Harris 1808), which is open access on BHL but paywalled by Oxford Academic; (3) the many citations of historic taxonomic descriptions that are now appearing as clickable DOI links in modern scholarly articles, taxonomic databases, social media, and Wikipedia (Kearney and Page 2022); and (4) the efforts being made to encourage more authors to cite the authoritative source of taxonomic names (Benichou 2022).
The extinction of the Thylacine is a stark reminder of the irreversible consequences of a lack of understanding and appreciation of the natural world. Likewise, a lack of access to biodiversity knowledge, or an inability to find it, impedes our ability to learn from the past and hinders scientific progress and conservation efforts. The Biodiversity Heritage Library was created to address a major obstacle to scientific research: the lack of access to natural history literature (BHL 2019). BHL Australia has contributed significantly to this global mission and has played an important role in BHL's transition to a fully searchable, persistently linkable component of the biodiversity knowledge graph (Kearney 2020, Page 2016).
Citations: 0
A Novel Part in the Swiss Army Knife for Linking Biodiversity Data: The digital specimen identifier service
Pub Date : 2023-09-07 DOI: 10.3897/biss.7.112283
W. Addink, Soulaine Theocharides, Sharif Islam
Digital specimens are new information objects on the internet, which act as digital surrogates of the physical objects they represent. They are designed to be extended with data derived from the specimen, such as genetic, morphological and chemical data, and with data that puts the specimen in the context of its gathering event and the environment it was derived from. This requires linking the digital specimens and their related entities to information about agents, locations, publications, taxa and environmental information. To establish reliable links and (re-)connect data to specimens, a new framework is needed, which creates persistent identifiers (PIDs) for the digital specimen and its related entities. These PIDs should be actionable by machines but can also be used by humans for data citation and communication purposes. The framework that enables this is a new PID infrastructure, produced by the European Commission-funded BiCIKL project (Biodiversity Community Integrated Knowledge Library), which creates persistent and actionable identifiers. It is a generic PID infrastructure that will be used by the Distributed System of Scientific Collections research infrastructure (DiSSCo), but it can also be used by other infrastructures and institutions. PIDs minted by DiSSCo will be linked to the digital specimens and samples provided through DiSSCo. The new PIDs are a key element in enabling the concept of Digital Extended Specimens (Webster et al. 2021) and provide unique and resolvable references to enable bidirectional linking. DiSSCo has done extensive work to select the most appropriate PID scheme (Hardisty et al. 2021) and to design a PID infrastructure for the pan-European specimens.
The draft design has been discussed with technical specialists in the joint DiSSCo and Consortium of European Taxonomic Facilities (CETAF) community and with international stakeholders like the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio), and was also discussed at the 2022 conference of the Society for the Preservation of Natural History Collections (SPNHC). A first implementation was demonstrated at the Biodiversity Information Standards (TDWG) annual conference in 2022 and illustrated key elements of the design. To be able to provide digital specimen identifiers as DOIs (Digital Object Identifiers), a pilot project was started in 2023 with DataCite to investigate whether Digital Specimen DOIs in the new PID infrastructure can be created using the DataCite service. The pilot aimed to create metadata crosswalks to the DataCite schema in consultation with the DataCite Metadata Working Group, to evaluate synergies with the IGSN (International Generic Sample Number) metadata schema, to develop and test PID kernel metadata registration, and to evaluate performance and the impact of using DataCite services. There are around two billion specimens, and creating PIDs for them as DOIs requires creating DOIs at an unprecedented scale. Also,
PID kernel metadata registration is a new feature for DOIs. The specimen metadata included will complement existing biodiversity information standards such as Darwin Core and will support the new MIDS (Minimum Information about a Digital Specimen) standard, which is under development. The design, development and testing of the new PID infrastructure is part of the BiCIKL project, which aims to foster collaboration between infrastructures and to develop bidirectional linking (Penev et al. 2022). At the conference, we will present the results of developing the PID infrastructure as part of the BiCIKL toolbox for linking biodiversity data, and discuss progress in creating Digital Specimen DOIs.
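As a small illustration of what "actionable by machines but also usable by humans" can mean for a DOI, the following Python sketch normalizes the ways a DOI is commonly written into a single resolvable HTTPS URL. It is an assumption-laden sketch, not part of the DiSSCo or DataCite infrastructure: the function name `doi_to_url` and the simplified `DOI_PATTERN` check are hypothetical, and suffixes with characters that need URL encoding are not handled.

```python
import re

# Simplified shape check for "10.<registrant>/<suffix>"; real DOIs allow
# more variation than this illustrative pattern covers.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common human-written prefixes that should be stripped before resolving.
KNOWN_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI string."""
    bare = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if bare.lower().startswith(prefix):
            bare = bare[len(prefix):]
            break
    if not DOI_PATTERN.match(bare):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + bare
```

The same identifier can then be printed in a citation for human readers and dereferenced by software, which is the property the abstract asks of specimen PIDs.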
Citations: 0
Want to Describe and Share Biodiversity Inventory and Monitoring Data? The Humboldt Extension for Ecological Inventories Can Help!
Pub Date : 2023-09-07 DOI: 10.3897/biss.7.112229
Y. Sica, Wesley Hochachka, Yi-Ming Gan, Kate Ingenloff, Dmitry Schigel, Robert Stevenson, Steven Baskauf, Peter Brenton, Anahita J. N. Kazem, John Wieczorek
Access to high-quality ecological data is critical to assessing and modeling biodiversity and its changes through space and time. The Darwin Core standard has proven to be immensely helpful in sharing species occurrence data (see Wieczorek et al. 2012, Global Biodiversity Information Facility, GBIF) and promoting biodiversity research following the FAIR principles of findability, accessibility, interoperability and reusability (Wilkinson et al. 2016). However, it is limited in its ability to fully accommodate inventory data (i.e., linked records of multiple taxa at a specific place and time). Information about the inventory processes is often either unreported or described in an unstructured manner, limiting its potential re-use for larger-scale analyses. Two key aspects that are not yet captured in a structured manner are: i) information about the species that were not detected during an inventory, and ii) ancillary information about sampling effort and completeness.
Non-detections (i.e., reported counts of zero) potentially enable more accurate and precise estimates of distribution, abundance, and changes in abundance. This becomes possible when variation in effort is used to estimate the likelihood that a non-detection represents a true absence of that taxon during the inventory. Currently, ecological inventory data, when shared at all, are typically discoverable through dataset catalogs (e.g., governmental data repositories) and supplementary materials to publications. With few exceptions, indexing of such data with the detail and structure needed has not been attempted at broad temporal and spatial scales, despite the potentially high value resulting from making inventory data more readily accessible.
To address these limitations in documenting inventory data using the Darwin Core, Guralnick et al. (2018) proposed the Humboldt Core. Subsequent discussions within the biodiversity standards community made it clear that greater integration could be achieved by creating an extension of the Darwin Core, rather than developing a new standard in isolation. Extension design work began in 2021 and progress has been reported by Brenton (2021) and Sica et al. (2022).
Over the last year the Humboldt Extension Task Group has sought advice from data providers and aggregators and updated its vocabulary terms. A challenging aspect has been creating terminology for the parent-child relationships (see Properties of Hierarchical Events) needed to describe surveys that may be as simple as a collection of checklists (one level of hierarchy) or as complex as species records from traps within plots along transects across habitats over multiple years (at least four levels of hierarchy). The Task Group has committed to completing a User Guide for the Humboldt Extension. Group members who contributed to the Darwin Core (Darwin Core Task Group 2009) and the Vocabulary Maintenance Specification (Vocabulary Maintenance Specification Task Group 2017) have provided valuable expertise in term refinement and process. By ratifying the Humboldt Extension as a Darwin Core Event extension, we hope to provide the community with a usable solution, linked to well-established data publishing mechanisms, for sharing and using inventory data.
This work is expected to overcome a key bottleneck in the sharing of vital ecological data, enhancing data discoverability, interoperability and re-use while reducing the reporting burden and the heterogeneity of data and metadata. Global data aggregation initiatives such as GBIF will benefit from this development as they evolve their own data models and the range of standards and extensions they support. We anticipate that the Humboldt Extension will appeal to data publishers and data users by enabling data to be represented and indexed in richer, more meaningful ways. Despite the data-intensive nature of basic ecological research and of applied monitoring for management and policy, ecological data remain one of the FAIR data frontiers. We expect the Humboldt Extension to address most data exchange needs across these communities of practice.
Citations: 0
Biodata Infrastructure within Australia and Beyond: Landscapes and horizons
Pub Date : 2023-09-07 DOI: 10.3897/biss.7.112274
Jeff Christiansen, Kathryn Hall
In current life science practice, digital data are associated with all parts of the research lifecycle. Data generation and management are planned during project conception; data are then collected from numerous instruments or existing sources, prepared for analysis and analysed to generate new knowledge and information, and (hopefully) preserved so that they may be found, shared and re-used by others when appropriate.
This session will begin with a scan of the biodata and biodata infrastructure landscape within Australia. We will explore which organisations fund biodata generation, where data are processed and stored, and how data are made available for reuse by others. Important global and complementary data resources that are hosted offshore will also be discussed. To guarantee reproducibility and integrity for life sciences research, it is critical that each of these infrastructures (whether hosted on- or off-shore) is maintained for the long term.
As an example of a resource that utilises a mixture of existing on- and off-shore data infrastructures to underpin a critical research need, the Australian Reference Genome Atlas (ARGA) will be discussed. ARGA is solving the problem of genomics data obscurity for Australian-relevant species by creating an online platform where life sciences researchers can comprehensively and confidently search for genomic data for taxa relevant to Australian research. Publicly available genomics (and genetics) data are aggregated and indexed from multiple sources (both on- and off-shore), and then integrated with occurrence records and the taxonomic frameworks of the Global Biodiversity Information Facility (GBIF) and the Atlas of Living Australia (ALA) to enrich the genomic data and make them searchable using taxonomy, location, ecological characteristics and selected phenotypic data. The presentation sets the scene for a subsequent talk by members of the Global Biodata Coalition (GBC), who will outline the challenges in sustaining the types of disseminated infrastructure discussed and the GBC’s work with the funders who support many of these resources to ensure long-term funding for existing infrastructure, while also channelling support to underpin future growth in data volumes and new technologies.
Citations: 0