A Novel Part in the Swiss Army Knife for Linking Biodiversity Data: The digital specimen identifier service

W. Addink, Soulaine Theocharides, Sharif Islam
Biodiversity Information Science and Standards, published 2023-09-07
DOI: 10.3897/biss.7.112283 (https://doi.org/10.3897/biss.7.112283)

Abstract

Digital specimens are new information objects on the internet that act as digital surrogates of the physical objects they represent. They are designed to be extended with data derived from the specimen, such as genetic, morphological and chemical data, and with data that place the specimen in the context of its gathering event and the environment it came from. This requires linking digital specimens and their related entities to information about agents, locations, publications, taxa and the environment. To establish reliable links and (re-)connect data to specimens, a new framework is needed that creates persistent identifiers (PIDs) for the digital specimen and its related entities. These PIDs should be actionable by machines, but also usable by humans for data citation and communication.

The framework that enables this is a new PID infrastructure, produced by the European Commission-funded BiCIKL project (Biodiversity Community Integrated Knowledge Library), which creates persistent and actionable identifiers. It is a generic PID infrastructure that will be used by the Distributed System of Scientific Collections research infrastructure (DiSSCo), but it can also be used by other infrastructures and institutions. PIDs minted by DiSSCo will be linked to the digital specimens and samples provided through DiSSCo. The new PIDs are a key element in enabling the concept of Digital Extended Specimens (Webster et al. 2021) and provide unique, resolvable references that enable bidirectional linking.

DiSSCo has done extensive work to select the most appropriate PID scheme (Hardisty et al. 2021) and to design a PID infrastructure for pan-European specimens. The draft design has been discussed with technical specialists in the joint DiSSCo and Consortium of European Taxonomic Facilities (CETAF) community and with international stakeholders such as the Global Biodiversity Information Facility (GBIF) and Integrated Digitized Biocollections (iDigBio), and it was also discussed at the 2022 conference of the Society for the Preservation of Natural History Collections (SPNHC). A first implementation, demonstrated at the Biodiversity Information Standards (TDWG) annual conference in 2022, illustrated key elements of the design. To be able to provide digital specimen identifiers as DOIs (Digital Object Identifiers), a pilot project was started in 2023 with DataCite to investigate whether Digital Specimen DOIs in the new PID infrastructure can be created using the DataCite service. The pilot aimed to create metadata crosswalks to the DataCite schema in consultation with the DataCite Metadata Working Group, to evaluate synergies with the IGSN (International Generic Sample Number) metadata schema, to develop and test PID kernel metadata registration, and to evaluate the performance and impact of using DataCite services. There are around two billion specimens, so creating PIDs for them as DOIs means creating DOIs at an unprecedented scale. In addition, PID kernel metadata registration is new for DOIs. The metadata included for specimens will complement existing Biodiversity Information Standards such as Darwin Core and support the new MIDS (Minimum Information about a Digital Specimen) standard, which is under development.

The design, development and testing of the new PID infrastructure are being carried out as part of the BiCIKL project, which aims to foster collaboration between infrastructures and develop bidirectional connections (Penev et al. 2022). In the session, we will demonstrate the results of developing the PID infrastructure as part of the BiCIKL toolbox for linking biodiversity data, and discuss progress with creating digital specimen DOIs.
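As an illustration of what a metadata crosswalk of the kind mentioned above might look like, the following minimal Python sketch maps a flat Darwin Core-style record to a DataCite-style metadata dictionary. It is not the crosswalk developed in the pilot: the function name `crosswalk_to_datacite`, the choice of which fields map to which, the title convention, the `PhysicalObject` resource type and the placeholder DOI are all assumptions made for this example, and the output is not validated against the DataCite schema.

```python
from typing import Any


def crosswalk_to_datacite(record: dict[str, str], doi: str) -> dict[str, Any]:
    """Map a flat Darwin Core-style record to a DataCite-style metadata dict.

    Hypothetical mapping for illustration only; not the pilot's crosswalk.
    """
    # Title convention is an assumption: scientific name plus catalogue number.
    name = record.get("scientificName", "Unidentified specimen")
    catalog = record.get("catalogNumber", "")
    title = f"{name} ({catalog})" if catalog else name

    metadata: dict[str, Any] = {
        "doi": doi,
        "titles": [{"title": title}],
        # Using the holding institution as publisher is an assumption.
        "publisher": record.get("institutionCode", "Unknown institution"),
        # The resource type chosen here is an assumption as well.
        "types": {
            "resourceTypeGeneral": "PhysicalObject",
            "resourceType": "Digital specimen",
        },
    }

    # Darwin Core recommends " | " as the separator for multiple values.
    collectors = [
        c.strip() for c in record.get("recordedBy", "").split("|") if c.strip()
    ]
    if collectors:
        metadata["creators"] = [{"name": c} for c in collectors]

    # Carry the gathering date over as a "Collected" date, if present.
    if record.get("eventDate"):
        metadata["dates"] = [
            {"date": record["eventDate"], "dateType": "Collected"}
        ]
    return metadata


if __name__ == "__main__":
    # All values below are made up; the DOI is a placeholder, not a real one.
    example = {
        "scientificName": "Puma concolor",
        "catalogNumber": "ABC-0001",
        "institutionCode": "EXAMPLE-MUSEUM",
        "recordedBy": "A. Collector | B. Collector",
        "eventDate": "1901-05-17",
    }
    print(crosswalk_to_datacite(example, "10.XXXX/EXAMPLE-0001"))
```

Keeping the mapping in one small, pure function makes it easy to test each field decision in isolation, which matters when the same rules would have to be applied to a very large number of records.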