Rapid development since the 1980s of technologies for analysing texts has led not only to widespread employment of text 'mining', but also to now-pervasive large language model artificial intelligence (AI) applications. However, building new, concise data resources from historic as well as contemporary scientific literature, which can be employed efficiently at scale by automation and which have long-term value for the research community, has proved more elusive. Efforts at codifying analyses, such as the Text Encoding Initiative (TEI), date from the early 1990s and were initially driven by the social sciences and humanities (SSH) and linguistics communities, and were extended with multiple XML-based tagging schemes, including in biodiversity (Miller et al. 2012). In 2010, the Bio-Ontologies Special Interest Group (of the International Society for Computational Biology) presented its Annotation Ontology (AO), incorporating JavaScript Object Notation (JSON) and broadening previous XML-based approaches (Ciccarese et al. 2011). From 2011, the Open Annotation Data Model (OADM) (Sanderson et al. 2013) focused on cross-domain standards with utility for Web 3.0, leading to the W3C Web Annotation Data Model (WADM) Recommendation in February 2017 and the potential for unifying the multiplicity of already-in-use tagging approaches. This continual evolution has made preserving the investment in annotation methods, and in particular the connections between annotations and their context in source literature, particularly challenging.
Infrastructure that entered service during the intervening years does not yet support WADM, and has only recently started to address the parallel emergence of page imagery-based standards such as the International Image Interoperability Framework (IIIF). Notably, IIIF instruments such as Mirador-2, which has been employed widely for manual creation and editing of annotations in SSH, continue to employ the now-deprecated OADM. Although multiple efforts now address combining IIIF and TEI text coordinate systems, the two remain fundamentally incompatible. However, emerging repository technologies enable preservation of annotation investment to be accomplished comprehensively for the first time. Native IIIF support enables interactive previewing of annotations within repository graphical user interfaces, and dynamic serialisation technologies provide compatibility with existing XML-based infrastructures. Repository access controls can permit experts to trace annotation sources in original texts even if the literature is not publicly accessible, e.g., due to copyright restriction. This is of paramount importance, not only because surrounding context can be crucial to qualify formal terms that have been annotated, such as collecting country, but also because contemporary automated text mining, essential for operation at the scale of known biodiversity literature, is not 100% accurate, and manual checking of uncertainties is currently essential.
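To make the WADM model concrete, the sketch below builds a minimal W3C Web Annotation as JSON-LD, targeting a rectangular region of a IIIF canvas through a media-fragment selector. The canvas URI and tag value are invented for illustration; the `@context`, `FragmentSelector`, and `TextualBody` structures follow the published Recommendation.

```python
import json

def make_annotation(canvas_uri, xywh, body_value):
    """Build a minimal W3C Web Annotation (WADM) targeting a region
    of a IIIF canvas via a media-fragment selector."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "body": {"type": "TextualBody",
                 "value": body_value,
                 "format": "text/plain"},
        "target": {
            "source": canvas_uri,
            "selector": {"type": "FragmentSelector",
                         "conformsTo": "http://www.w3.org/TR/media-frags/",
                         "value": f"xywh={xywh}"},
        },
    }

# Illustrative URI and tag; real annotations would point at a repository canvas.
anno = make_annotation("https://example.org/iiif/canvas/p12",
                       "120,80,640,40",
                       "collectingCountry: Brazil")
print(json.dumps(anno, indent=2))
```

Because the annotation is plain JSON-LD, the same structure can be stored in a repository, rendered by a IIIF viewer, or re-serialised to XML for legacy infrastructure.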
"Progress with Repository-based Annotation Infrastructure for Biodiversity Applications", Peter Cornwell. Biodiversity Information Science and Standards, 2023-09-14. https://doi.org/10.3897/biss.7.112707
Richardson Ciguene, Aurélien Miralles, Francis Clément
The species inventory of global biodiversity is constantly revised and refined by taxonomic research, through the addition of newly discovered species and the reclassification of known species. This almost three-century-old project provides essential knowledge for humankind. In particular, knowledge of biodiversity establishes a foundation for developing appropriate conservation strategies. An accurate global inventory of species relies on the study of millions of specimens housed all around the world in natural history collections. For the last two decades, biological taxonomy has generated an increasing amount of data every year, and notably through the digitization of collection specimens, has gradually been transformed into a big data science. In recognition of this trend, the French National Museum of Natural History has embarked on a major research and engineering challenge within its information system: the adoption of cyber-taxonomic practices that require easy access to data on specimens housed in natural history collections all over the world. To this end, an important step is to automatically complete and reconcile the heterogeneous classification data usually associated with specimens managed in different collection databases. We describe here a new fuzzy approach to reconciling the classifications in multiple databases, enabling more accurate taxonomic retrieval of specimen data across databases.
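The abstract does not specify which fuzzy measure the authors use, but the idea of tolerant matching across heterogeneous classification data can be sketched with a simple string-similarity score. The threshold and the `difflib` ratio below are illustrative choices, not the Museum's method.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalised similarity between two taxon name strings (0..1)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def reconcile(name, candidates, threshold=0.85):
    """Return candidate names scoring above the threshold, best first.
    The 0.85 cut-off is an arbitrary illustrative value."""
    scored = [(similarity(name, c), c) for c in candidates]
    return sorted([(s, c) for s, c in scored if s >= threshold], reverse=True)

# A misspelt binomial still matches its accepted form across databases.
matches = reconcile("Apis melifera",
                    ["Apis mellifera", "Apis cerana", "Bombus terrestris"])
```

In practice a fuzzy reconciliation would also weigh higher ranks (family, order) so that two databases placing the same species under different synonymised genera can still be aligned.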
"Application of Fuzzy Measures to Move Towards Cyber-Taxonomy", Richardson Ciguene, Aurélien Miralles, Francis Clément. Biodiversity Information Science and Standards, 2023-09-14. https://doi.org/10.3897/biss.7.112677
The Catalogue of Afrotropical Bees provides a comprehensive checklist of species of bees known from Sub-Saharan Africa and the western Indian Ocean islands, excluding the honey bee ( Apis mellifera Linnaeus) (Eardley and Urban 2010). The checklist has a detailed bibliography of the group, distribution records, and biological associations (visited flowers, host plants, plants used as nests, as well as parasitoids). The database, which was originally built in Microsoft Access, and later managed using Specify Software, was recently migrated to TaxonWorks. TaxonWorks is an integrated, web-based platform designed specifically for the needs of practicing taxonomists and biodiversity scientists, and maintained by the SpeciesFile Group. TaxonWorks has a variety of tools that were designed to help import, manage, validate, and package data for future exports (e.g., in the Darwin Core Archive (DwC-A; GBIF 2021) or Catalogue of Life's COL-DP formats). Although TaxonWorks has batch upload functionality (e.g., in Darwin Core Archive, BibTeX format), the complexity of the original dataset (Fig. 1) required special handling, and a custom migration was built to transfer the data from the original format. TaxonWorks can now be used to produce a paper-style catalogue or share the data via the TaxonWorks public interface, TaxonPages.
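The DwC-A export format mentioned above is simply a zip archive containing delimited data files plus a `meta.xml` descriptor that maps columns to Darwin Core terms. The sketch below writes a minimal taxon-core archive; it is a toy illustration of the format, not the TaxonWorks exporter, and the file name and example row are invented.

```python
import csv, io, zipfile

# meta.xml maps each CSV column to a Darwin Core term URI.
META = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon" fieldsTerminatedBy=","
        linesTerminatedBy="\\n" ignoreHeaderLines="1" encoding="UTF-8">
    <files><location>taxon.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/scientificNameAuthorship"/>
  </core>
</archive>
"""

def build_archive(path, rows):
    """Write a minimal Darwin Core Archive: a taxon core CSV plus meta.xml."""
    buf = io.StringIO()
    w = csv.writer(buf, lineterminator="\n")
    w.writerow(["taxonID", "scientificName", "scientificNameAuthorship"])
    w.writerows(rows)
    with zipfile.ZipFile(path, "w") as z:
        z.writestr("taxon.csv", buf.getvalue())
        z.writestr("meta.xml", META)

build_archive("bees.zip", [["1", "Apis mellifera", "Linnaeus, 1758"]])
```

A real checklist export would add further cores and extensions (distribution, references), but the archive-plus-descriptor shape stays the same.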
"Migration of the Catalogue of Afrotropical Bees into TaxonWorks", Dmitry Dmitriev, Connal Eardley, Willem Coetzer. Biodiversity Information Science and Standards, 2023-09-14. https://doi.org/10.3897/biss.7.112702
Over the past 20 years, the biodiversity informatics community has pursued components of the digital annotation landscape with varying degrees of success. We will provide an historical overview of the theory, the advancements made through a few key projects, and will identify some of the ongoing challenges and opportunities. The fundamental principles remain unchanged since annotations were first proposed. Someone (or something): (1) has an enhancement to make elsewhere from the source where original data or information are generated or transcribed; (2) wishes to broadcast these statements to the originator and to others who may benefit; and (3) expects persistence, discoverability, and attribution for their contributions alongside the source. The Filtered Push project (Morris et al. 2013) considered several use cases and pioneered development of services based on the technology of the day. The exchange of data between parties in a universally consistent way necessitated the development of a novel draft standard for data annotations via an extension of the World Wide Web Consortium’s Web Annotation Working Group standard (Sanderson et al. 2013) to be sufficiently informative for a data curator to confidently make a decision. Figure 2 from Morris et al. (2013), reproduced here as Fig. 1, outlines the composition of an annotation data package for a taxonomic identification. The package contains the data object(s) associated with an occurrence, an expression of the motivation(s) for updating, some evidence for an assertion, and a stated expectation for how the receiving entity should take action. The Filtered Push and Annosys (Tschöpe et al. 2013) projects also considered implementation strategies involving collection management systems (e.g., Symbiota) and portals (e.g., European Distributed Institute of Taxonomy, EDIT). 
However, there remain technological barriers for these systems to operate at scale, not the least of which is the absence of globally unique, persistent, resolvable identifiers for shared objects and concepts. Major aggregation infrastructures like the Global Biodiversity Information Facility (GBIF) and the Distributed System of Scientific Collections (DiSSCo) rely on data enhancement to improve the quality of their resources and have annotation services in their work plans. More recently, the Digital Extended Specimen (DES) concept (Hardisty et al. 2022) has made annotation services key components of its proposed infrastructure. Recent work on annotation services more generally has considered various new forms of packaging and delivery such as Frictionless Data (Fowler et al. 2018), Journal Article Tag Suite XML (Agosti et al. 2022), or nanopublications (Kuhn et al. 2018). There is a risk of fragmentation of this landscape, and of disenfranchisement of both biological collections and the wider research community, if we fail to align the purpose, content, and structure of these packages, or if they fail to remain aligned with FAIR principles. Institutional collection management systems currently represent the canonical data stores supplying data to researchers and data aggregators. It is critical that information and feedback about the data they publish be returned to them for consideration. However, the volume of annotations produced by human and machine curation processes would overwhelm local data managers and the systems that support them. One solution is to create a central annotation store, with write and discovery services, that best supports the needs of all data curators. This would require an international consortium of parties with a governance and technical model that ensures its sustainability.
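The annotation data package described above (data object, motivation, evidence, and an expectation for the receiver) can be sketched as a plain structure. The field names below are illustrative stand-ins, not the terms of the draft standard from Morris et al. (2013), and the identifiers are invented.

```python
def identification_annotation(occurrence_id, new_name, evidence, expectation):
    """Sketch of an annotation data package for a taxonomic
    re-identification, after the components described by Morris et al. (2013):
    target data object, assertion, motivation, evidence, expectation."""
    return {
        "target": {"occurrenceID": occurrence_id},
        "assertion": {"dwc:scientificName": new_name},
        "motivation": "updating identification",
        "evidence": evidence,
        "expectation": expectation,
    }

# Hypothetical example: a specialist corrects a determination.
pkg = identification_annotation(
    "urn:catalog:NHM:12345",
    "Apis mellifera Linnaeus, 1758",
    evidence="Wing venation matches; determined by specialist",
    expectation="insert as new identification; retain identification history",
)
```

The key design point is that the package carries enough context for a curator at the receiving collection management system to accept or reject the change without consulting the annotator.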
"I Know Something You Don’t Know: The annotation saga continues…", James Macklin, David Shorthouse, Falko Glöckler. Biodiversity Information Science and Standards, 2023-09-14. https://doi.org/10.3897/biss.7.112715
Environmental DNA (eDNA) is a fast-growing biomonitoring approach to detect species and map their distributions, with the number of eDNA publications exponentially increasing in the past decade. While millions of DNA sequences are often generated and assigned to taxa in each publication, these records are stored in numerous locations (e.g., supplementary materials at journals’ servers, open data publishing platforms such as Dryad) and in various formats, which makes it difficult to find, access, re-use and integrate datasets. Making eDNA data FAIR (findable, accessible, interoperable, re-usable) has vast potential to improve how the biological environment is measured and how change is detected and understood. For instance, it would allow biomonitoring and species distribution modelling studies across extended space and time scales, which is logistically difficult or impossible for individual projects. It would also shed light on “dark” (unassigned) DNA sequences by facilitating their storage and re-analyses with updated ever-growing DNA reference databases. Several challenges are associated with making eDNA FAIR, including how to standardise data formats and bioinformatics workflows, and simplifying the process of post-publication data archiving so that it is acceptable for eDNA practitioners to adopt. Over the next three years, we plan to work closely with biodiversity data platforms such as the Global Biodiversity Information Facility (GBIF) and Atlas of Living Australia (ALA), eDNA science journals, and eDNA practitioners, to solve these challenges and enable eDNA to achieve its revolutionary potential as a unified source of information that supports environmental management.
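One concrete face of the standardisation challenge is mapping a project's detection table onto shared terms so aggregators can ingest it. The sketch below maps one hypothetical eDNA detection onto Darwin Core-style occurrence fields; the record, the field selection, and the `DNA_sequence` field (after GBIF's DNA-derived data extension) are illustrative assumptions, not a published mapping profile.

```python
def to_dwc_occurrence(detection):
    """Map one eDNA detection record onto Darwin Core-style terms.
    Field choices are illustrative, not a standardised eDNA profile."""
    return {
        "scientificName": detection["taxon"],
        "eventDate": detection["date"],
        "decimalLatitude": detection["lat"],
        "decimalLongitude": detection["lon"],
        "basisOfRecord": "MaterialSample",   # eDNA records derive from samples
        "DNA_sequence": detection["asv"],    # the supporting sequence variant
    }

# Hypothetical detection from a water sample.
rec = to_dwc_occurrence({"taxon": "Salmo trutta", "date": "2023-03-01",
                         "lat": -33.86, "lon": 151.21, "asv": "ACGT..."})
```

Keeping the supporting sequence alongside the occurrence is what lets "dark" assignments be revisited as reference databases grow.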
"Amplifying the Power of eDNA by Making it FAIR", Miwa Takahashi, Oliver Berry. Biodiversity Information Science and Standards, 2023-09-13. https://doi.org/10.3897/biss.7.112553
The Living Atlases project, facilitated by the Global Biodiversity Information Facility (GBIF) and the Atlas of Living Australia (ALA), has successfully operated for more than a decade, establishing collaborations with over 30 countries that have implemented ALA components within their respective environments. Over this period, technological advancements and the prevalence of cloud platforms have transformed the landscape of infrastructure management. In this presentation, we will explore innovative approaches to streamline the installation process of ALA, capitalizing on the benefits offered by cloud platforms and cutting-edge technologies. Furthermore, ALA has maintained a strong collaborative partnership with GBIF over the past four years, focusing on data ingestion pipelines and, more recently, engaging in shared user interface (UI) development. These improvements aim to enhance the maintainability of ALA modules, enabling organizations to leverage the advantages provided by cloud-based solutions and novel technologies.
"Next Steps Towards Better Living Atlas Deployments and Maintenance", David Martin, Vicente Ruiz Jurado. Biodiversity Information Science and Standards, 2023-09-12. https://doi.org/10.3897/biss.7.112560
Olaf Bánki, Markus Döring, Thomas Jeppesen, Donald Hobern
ChecklistBank is a publishing platform and open data repository focused on taxonomic and nomenclatural datasets. It was launched at the end of 2020, and is a joint development by the Catalogue of Life (COL) and the Global Biodiversity Information Facility (GBIF). Close to 50,000 datasets, mostly originating from published literature mediated through Plazi's TreatmentBank, Pensoft Publishers and the European Journal of Taxonomy, are openly accessible through ChecklistBank. Datasets also include sources with (Molecular) Operational Taxonomic Units, such as from UNITE / PlutoF, the National Center for Biotechnology Information Taxonomy / European Nucleotide Archive, and the International Barcode of Life / BOLD. Next to various taxonomic datasets (also from regional and national levels, e.g., shared through GBIF) and nomenclatural datasets (e.g., ZooBank, International Plant Names Index), ChecklistBank also links out to the websites of the various original initiatives (e.g., World Register of Marine Species, Integrated Taxonomic Information System, COL China, Species Files). ChecklistBank also holds all the tooling needed to assemble the COL Checklist, the authoritative global species list of all described organisms. The COL Checklist 2023 (Bánki et al. 2023), containing more than 2.1 million accepted species, is assembled from 164 global taxonomic data sources mediated through ChecklistBank. The COL Checklist contains name usage identifiers, and each checklist version and its underpinning data sources are issued with digital object identifiers. After the launch of ChecklistBank, the EU-funded Biodiversity Community Integrated Knowledge Library (BiCIKL) project contributed additional improvements to ChecklistBank. These added functionalities include, amongst others, name usage search, name matching, and taxonomic data comparison. The tooling used to assemble the COL Checklist has been generalised through a ChecklistBank 'project functionality' supporting the assembly of a species list. During the demonstration, several of the functionalities developed in the context of the EU BiCIKL project will be highlighted.
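The taxonomic data comparison mentioned above can be illustrated, in miniature, as a diff of accepted names between two checklist versions. This is a toy stand-in for ChecklistBank's comparison tooling, which additionally tracks name usage identifiers, synonymy, and classification placement; the example names are invented.

```python
def compare_checklists(old, new):
    """Diff accepted names between two checklist versions.
    A toy sketch of a taxonomic data comparison, not ChecklistBank's API."""
    old_set, new_set = set(old), set(new)
    return {
        "added": sorted(new_set - old_set),      # names new to this version
        "removed": sorted(old_set - new_set),    # names dropped or synonymised
        "unchanged": sorted(old_set & new_set),  # names present in both
    }

diff = compare_checklists(
    ["Apis mellifera", "Apis cerana"],
    ["Apis mellifera", "Apis laboriosa"],
)
```

Stable name usage identifiers are what make such comparisons meaningful across versions: a name can change spelling or placement while its identifier persists.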
Bánki O, Döring M, Jeppesen T, Hobern D (2023). Demonstration of Taxonomic Name Data Services through ChecklistBank. Biodiversity Information Science and Standards 7: e112544. https://doi.org/10.3897/biss.7.112544 (published 2023-09-12).
Pierre Bonnet, Antoine Affouard, Jean-Christophe Lombardo, Mathias Chouet, Hugo Gresse, Vanessa Hequet, Remi Palard, Maxime Fromholtz, Vincent Espitalier, Hervé Goëau, Benjamin Deneu, Christophe Botella, Joaquim Estopinan, César Leblanc, Maximilien Servajean, François Munoz, Alexis Joly
Human activities have a growing impact on global biodiversity. While our understanding of biodiversity worldwide is not yet comprehensive, it is crucial to explore effective means of characterizing it in order to mitigate these impacts. Advances in data storage and exchange capabilities, together with the increasing availability of extensive taxonomic, ecological, and environmental databases, make it possible to implement new approaches that address knowledge gaps regarding species and habitats. This enhanced knowledge will, in turn, facilitate improved management practices and enable better local governance of territories. Meeting these requirements necessitates innovative tools and methods. Citizen science platforms have emerged as valuable resources for generating large amounts of biodiversity data, thanks to their visibility and attractiveness to individuals involved in territorial management and education. These platforms present new opportunities to train deep learning models for automated species recognition, leveraging the substantial volumes of multimedia data they accumulate. However, effectively managing, curating, and disseminating the data and services generated by these platforms remains a significant challenge that hinders the achievement of their objectives. In line with this, the GUARDEN and MAMBO European projects aim to utilize the Pl@ntNet participatory science platform (Affouard et al. 2021) to develop and implement novel computational services that enable the widespread creation of floristic inventories. In pursuit of this goal, various standards and reference datasets have been employed, such as the POWO (Plants of the World Online) world checklist and the WGSRPD (World Geographical Scheme for Recording Plant Distributions) standard, to establish a foundation for a global service that aids plant identification through visual analysis. 
This service relies on the NoSQL (Not Only Structured Query Language) data management system ArangoDB (Arango Database), utilizes state-of-the-art automated visual classification models (vision transformers), and operates on a distributed IT (Information Technology) infrastructure that leverages the capabilities of collaborative stakeholders interested in supporting this initiative. Global-scale automated workflows have been established specifically for the collection, analysis, and dissemination of illustrated occurrences of plant species. These workflows now enable the development of new IT tools that facilitate the description and monitoring of species and habitat conservation statuses. A presentation of the significant advancements achieved will share the lessons learned during development and support the widespread adoption of this service within the scientific community.
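The identification service described above is driven by image uploads annotated with the plant organ shown. The sketch below only assembles the URL and multipart field names such a call would use; the endpoint shape is an assumption loosely modelled on the public Pl@ntNet API, and the project name and API key are placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint shape (illustrative); check the Pl@ntNet API docs for the
# real routes, supported projects, and authentication scheme.
PLANTNET_API = "https://my-api.plantnet.org/v2/identify"


def identify_request(project: str, organs: list[str], api_key: str):
    """Assemble the URL and form fields for a visual identification call.

    Each uploaded image is paired with an 'organs' value (e.g., 'leaf',
    'flower', 'fruit', 'bark') that guides the visual classifier.
    """
    url = f"{PLANTNET_API}/{project}?{urlencode({'api-key': api_key})}"
    fields = [("organs", organ) for organ in organs]
    return url, fields


url, fields = identify_request("all", ["leaf", "flower"], "YOUR-KEY")
```

A real client would attach one image file per `organs` entry and POST the multipart form, then rank the candidate species returned in the JSON response.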
Bonnet P, Affouard A, Lombardo J-C, Chouet M, Gresse H, Hequet V, Palard R, Fromholtz M, Espitalier V, Goëau H, Deneu B, Botella C, Estopinan J, Leblanc C, Servajean M, Munoz F, Joly A (2023). Synergizing Digital, Biological, and Participatory Sciences for Global Plant Species Identification: Enabling access to a worldwide identification service. Biodiversity Information Science and Standards 7: e112545. https://doi.org/10.3897/biss.7.112545 (published 2023-09-12).
Maui Hudson, Jane Anderson, Riley Taitingfong, Andrew Martinez, Stephanie Carroll
The advent of data-driven technologies and the increasing demand for data have brought about unique challenges for Indigenous data governance. The CARE principles emphasize Collective Benefit, Authority, Responsibility, and Ethics as essential pillars for ensuring that Indigenous data rights are upheld, Indigenous knowledge is protected, and Indigenous Peoples are active participants in data governance processes (Carroll et al. 2020, Carroll et al. 2021). Identifying tangible activities and providing guidance that centres Indigenous perspectives offers a comprehensive approach to the complexities of Indigenous data governance in a rapidly evolving data landscape (Gupta et al. 2023, Jennings et al. 2023, Sterner and Elliott 2023). Biodiversity research has increasingly recognized the intertwined relationship between biological diversity and cultural practices, leading to discussions about how research can strengthen the evidence base, build trust, enhance legitimacy for decision making (Alexander et al. 2021) and explore requirements for Indigenous metadata (Jennings et al. 2023). An Indigenous Metadata Bundle Communiqué, produced following an Indigenous Metadata Symposium, recommended the initial categories: Governance, Provenance, Lands & Waters, Protocols, and Local Contexts Notices & Labels. Traditional Knowledge (TK) and Biocultural (BC) Labels have emerged as essential tools for recognising and maintaining Indigenous provenance, protocols and permissions in records for both natural ecosystems and cultural heritage (Anderson et al. 2020, Liggins et al. 2021), emphasizing the importance of Indigenous Peoples and local knowledge systems in research and digital management. Biocultural labels acknowledge the intricate links between biodiversity and cultural diversity, emphasizing the role of Indigenous communities in preserving biodiversity through their traditional practices (Hudson et al. 2021). 
By recognizing the intrinsic value of these relationships, TK and BC Labels not only contribute to a more holistic understanding of biodiversity but also promote ethical considerations and mutual respect between researchers and local communities, fostering collaborative partnerships for research and conservation initiatives (McCartney et al. 2023). Addressing the CARE Principles for Indigenous Data Governance in biodiversity research introduces several challenges and opportunities. Ethical concerns regarding recognition of Indigenous rights and interests in data (Hudson et al. 2023), intellectual property rights, cultural appropriation, and equitable benefit sharing must be navigated sensitively (Carroll et al. 2022b, Golan et al. 2022). Moreover, fostering effective communication between researchers and communities is paramount for ensuring the accuracy and authenticity of Indigenous metadata and protocols for appropriate use (Carroll et al. 2022a). However, these challenges are offset by the potential for enriching scientific knowledge, strengthening policy frameworks, and reinforcing community-based conservation efforts.
Hudson M, Anderson J, Taitingfong R, Martinez A, Carroll S (2023). Recognising Indigenous Provenance in Biodiversity Records. Biodiversity Information Science and Standards 7: e112610. https://doi.org/10.3897/biss.7.112610 (published 2023-09-12).
Indigenous data governance is a critical aspect of upholding Indigenous rights and fostering equitable partnerships in biodiversity research and data management. An estimated 80% of the planet’s biodiversity exists on Indigenous lands (Sobrevila 2008), yet the majority of Indigenous data derived from specimens taken from Indigenous lands are held by non-Indigenous entities and institutions. The CARE Principles (Collective benefit, Authority to control, Responsibility, and Ethics) are designed to guide the inclusion of Indigenous Peoples in data governance and increase their access to and benefit from data (Carroll et al. 2020). This talk will share emerging tools and resources that can be leveraged to implement the CARE Principles within repositories and institutions that hold Indigenous data. It highlights two primary tools for promoting Indigenous data governance in repositories: a phased framework to guide third-party holders of Indigenous data through foundational learning and concrete steps for applying the CARE principles in their respective settings, and the CARE criteria, an assessment tool by which researchers and institutions can evaluate the maturity of CARE implementation, identify areas for improvement, and allow other entities such as funders and publishers to evaluate CARE compliance.
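In repository terms, carrying the Indigenous Metadata Bundle categories and Local Contexts Labels alongside a specimen record might look like the sketch below. The field names, authority, territory, and project URL pattern are all hypothetical illustrations, not a fixed standard; actual Labels and Notices are defined by communities through the Local Contexts Hub.

```python
# Hypothetical record sketch: a specimen occurrence carrying Indigenous
# provenance metadata grouped by the Bundle categories named above
# (Governance, Provenance, Protocols, Local Contexts Notices & Labels).
record = {
    "occurrenceID": "urn:example:specimen:001",  # placeholder identifier
    "indigenousMetadata": {
        "governance": {"authority": "Example Community Trust"},  # hypothetical
        "provenance": {"landsAndWaters": "Example territory"},   # hypothetical
        "protocols": ["consult the community before commercial use"],
        "localContexts": {
            # Labels/Notices would normally be resolved from a community's
            # Local Contexts Hub project; this URL pattern is illustrative.
            "projectUrl": "https://localcontextshub.org/projects/<project-id>/",
            "labels": ["TK Attribution (TK A)", "BC Provenance (BC P)"],
        },
    },
}
```

An institution could score records against the CARE criteria by checking, for example, whether the governance and Local Contexts fields are populated before publication.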
Taitingfong R, Carroll S (2023). Implementing the CARE Principles for Indigenous Data Governance in Biodiversity Data Management. Biodiversity Information Science and Standards 7: e112615. https://doi.org/10.3897/biss.7.112615 (published 2023-09-12).