Five years ago, the value of biodiversity open data was scarcely recognized in Taiwan. This posed a significant challenge to the Taiwan Biodiversity Information Facility (TaiBIF), our national node of the Global Biodiversity Information Facility (GBIF), in its sustained efforts to enhance data publishing capacities. Notably, non-academic entities, both governmental and industrial, were reluctant to invest resources in data management and publication, questioning the benefits beyond purely research-oriented returns.

At the time, Taiwan had fewer than a million published records domestically, while GBIF held around 3 million occurrence records for Taiwan, largely unused by local users. We speculated that this discrepancy in data usage stemmed from three factors: (1) lack of species names in the local language within the occurrence data, (2) missing locally important species attributes, such as conservation status and national red list categories, and (3) absence of a culturally relatable local portal promoting biodiversity data usage.

To address these issues, we launched the Taiwan Biodiversity Network (TBN) website in 2018, localizing global data from GBIF and integrating missing information from local data sources. Collaborating with wildlife illustrators, we designed a user-friendly data interface to lower the system's technical and academic barriers. As a result, website visitors and data download requests doubled annually, and in recent years biodiversity open data has become a vital component of environmental impact assessments. This upward trend heightened recognition of the value of biodiversity open data, prompting organizations, including initially data-conservative government agencies and private-sector organizations with no data-sharing obligations, to invest in data management and mobilization. This advancement also catalyzed the formation of the Taiwan Biodiversity Information Alliance (TBIA), which actively promotes cross-organizational collaboration on data integration.

Today, Taiwan offers more than 19 million globally accessible occurrence records and data for more than 28,000 species. While the surge in data volume can certainly be credited to the active local citizen science community, we believe the expanded coverage of species and data types is a result of a growing community supportive of biodiversity open data. This was made possible by the establishment of a local portal that effectively bridged the gap between global data and local needs. We hope our experience will motivate other Asian countries to create analogous local portals using global open data sources like GBIF, illustrating the value of biodiversity open data to decision-makers and overcoming resource limitations that impede investments in biodiversity informatics.
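The localization step described in this abstract, attaching local-language names and national red list categories to global occurrence records, can be pictured as a simple lookup join. The sketch below is only an illustration under stated assumptions: the `LocalTaxonInfo` class, the `localize` function, the added field names and all sample values are hypothetical placeholders and do not describe how TBN is actually implemented. Only `scientificName` and `occurrenceID` are real Darwin Core terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalTaxonInfo:
    vernacular_name: str              # species name in the local language
    red_list_category: Optional[str]  # national red list category, if assessed


def localize(occurrences, local_taxa):
    """Return copies of the occurrence records with local attributes attached.

    Records whose scientific name has no local entry keep None in the added
    fields, so gaps stay visible instead of being silently dropped.
    """
    enriched = []
    for occ in occurrences:
        info = local_taxa.get(occ["scientificName"])
        enriched.append({
            **occ,
            "localName": info.vernacular_name if info else None,
            "nationalRedListCategory": info.red_list_category if info else None,
        })
    return enriched


# Placeholder data for illustration only; not real species or assessments.
local_taxa = {
    "Genus alpha": LocalTaxonInfo("local name A", "category X"),
}
occurrences = [
    {"occurrenceID": "occ-1", "scientificName": "Genus alpha"},
    {"occurrenceID": "occ-2", "scientificName": "Genus beta"},
]

for record in localize(occurrences, local_taxa):
    print(record)
```

Keeping unmatched records with empty local fields, rather than filtering them out, is one possible design choice: it makes it easy to see which global records still lack local information.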
Will a Local Portal using Global Data Encourage the Mainstreaming of Biodiversity Informatics in Asia? In Taiwan, We Say Yes. Jerome Chie-Jen Ko, Huiling Chang, Yihong Chang, You-Cheng Yu, Min-Hsuan Ni, Jun-Yi Wu, You Zhen Chen. Biodiversity Information Science and Standards, 2023-09-06. https://doi.org/10.3897/biss.7.112176
Peter Brenton, Peggy Eby, Robert Stevenson, Elizabeth R. Ellwood
Habitat decline and fragmentation are major factors in biodiversity loss across the globe and can be difficult to measure, particularly at landscape scale (Brooks et al. 2002, Fahrig 2003, Ritchie and Roser 2019). In Australia, rural, coastal and urban communities have been undertaking habitat restoration activities since the mid-1980s to protect and restore ecological balance on private land and in local shared and natural spaces. Much of the restoration effort has centered on hands-on activities as a mechanism for building community with environmental benefits. Over that time span, thousands of locations throughout the country have been transformed from degraded and highly disturbed landscapes into areas that more or less resemble natural ones.

However, collecting and analysing data for these activities was given little attention until quite recently, when governments, philanthropists and other investors became increasingly interested in measuring the value and outcomes of investment. To measure the effectiveness of the restoration effort, it is essential to benchmark the environmental state and species composition before the restoration begins, but surprisingly or unsurprisingly, this is rarely done (Hale et al. 2019).

Responding to this call for better documentation of restoration outcomes, over 30 groups have been using the Atlas of Living Australia’s BioCollect platform to capture complex information about current and past restoration work. The BioCollect platform enables each type of monitoring, establishment, and follow-up activity to have its own data collection schema and associated metadata, structured around a hierarchy of sampling events based on the Event class in the Darwin Core standard, which allows relationships between types of event records to be specified. When event records are created through an activity-based template, each occurrence of a species is also parsed and configured as a Darwin Core occurrence record. Standard templates have been created for a range of activities, such as benchmarking assessments, site establishment, follow-up interventions and monitoring over time, which are being used by many different groups over large areas of the landscape. This allows each group to operate independently, yet collect standardised data that can be easily aggregated at larger temporal and spatial scales, quantifying change over time. The relationships between occurrences and the event context in which they were collected are also preserved and navigable.

Here we present how Darwin Core and Event Core have been implemented in the BioCollect platform to enable this important data to be collected and stored in its full richness and resolution.
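The event hierarchy described above can be sketched with a few Darwin Core terms. The following is a minimal illustration, not BioCollect's actual schema: the class layout, the `event_type` labels, the helper functions and the sample records are assumptions made for the example. The Darwin Core terms named in the comments (`eventID`, `parentEventID`, `eventDate`, `occurrenceID`, `scientificName`) are real.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    event_id: str                          # dwc:eventID
    event_type: str                        # illustrative activity label
    event_date: str                        # dwc:eventDate (ISO 8601)
    parent_event_id: Optional[str] = None  # dwc:parentEventID


@dataclass(frozen=True)
class Occurrence:
    occurrence_id: str    # dwc:occurrenceID
    event_id: str         # dwc:eventID of the event that recorded it
    scientific_name: str  # dwc:scientificName


def descendants(events, root_id):
    """Return the IDs of root_id and every event nested beneath it."""
    ids = {root_id}
    changed = True
    while changed:
        changed = False
        for event in events:
            if event.parent_event_id in ids and event.event_id not in ids:
                ids.add(event.event_id)
                changed = True
    return ids


def species_by_date(events, occurrences, root_id):
    """Group the species recorded under one site event by event date."""
    ids = descendants(events, root_id)
    date_of = {event.event_id: event.event_date for event in events}
    grouped = {}
    for occ in occurrences:
        if occ.event_id in ids:
            grouped.setdefault(date_of[occ.event_id], set()).add(occ.scientific_name)
    return grouped


# Illustrative records only; not taken from any real BioCollect project.
events = [
    Event("site-1", "site establishment", "2020-03-01"),
    Event("visit-1", "benchmarking assessment", "2020-03-01", "site-1"),
    Event("visit-2", "monitoring", "2022-03-01", "site-1"),
    Event("site-2", "monitoring", "2022-03-01"),
]
occurrences = [
    Occurrence("occ-1", "visit-1", "Lantana camara"),
    Occurrence("occ-2", "visit-2", "Lantana camara"),
    Occurrence("occ-3", "visit-2", "Eucalyptus tereticornis"),
    Occurrence("occ-4", "site-2", "Acacia mearnsii"),
]

print(species_by_date(events, occurrences, "site-1"))
```

Because each occurrence carries only its `eventID`, the site-level context is recovered by walking `parentEventID` links. That is one way the relationship between occurrences and their event context can stay navigable while records from independent groups are aggregated.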
Measuring Habitat Restoration using the Darwin and "Event" Cores: Australian examples powered by BioCollect. Biodiversity Information Science and Standards, 2023-09-06. https://doi.org/10.3897/biss.7.112083
Filipi Miranda Soares, Luís Ferreira Pires, Maria Carolina Garcia, A. D. de Carvalho, S. Koffler, N. Ghilardi-Lopes, Rubens Silva, Benildes Maculan, Ana Maria Bertolini, Gabriela Rigote, L. Coradin, U. Montedo, Debora P. Drucker, Raquel Santiago, Maria Clara de Carvalho, Ana Carolina da Silva Lima, Karoline Reis de Almeida, Stephanie Gabriele Mendonça de França, Hillary Dandara Elias Gabriel, Bárbara Junqueira dos Santos, A. Saraiva
The "Pomar Urbano" (Urban Orchard) project focuses on the collaborative monitoring of fruit-bearing plant species in urban areas throughout Brazil.

The project compiled a list of 411 fruit-bearing plant species (Soares et al. 2023), both native and exotic, found in Brazil. This list was drawn from two main sources: the book Brazilian Fruits and Cultivated Exotics (Lorenzi et al. 2006) and the book series Plants for the Future, which includes volumes specifically dedicated to species of economic value in different regions of Brazil, namely the South (Coradin et al. 2011), Midwest (Vieira et al. 2016), Northeast (Coradin et al. 2018) and North (Coradin et al. 2022). To ensure broad geographic coverage, the project spans all 27 state capitals of Brazil. The data collection process relies on iNaturalist Umbrella and Collection projects. Each state capital has a single collection project, which combines the fruit-bearing plant species list with a locality restriction to that specific city. For example, the collection project Pomar Paulistano gathers data from the city of São Paulo. The Umbrella Project Urban Orchard was set up to track data from all 27 collection projects.

We firmly believe that these fruit-bearing plant species possess multifaceted value that extends beyond mere consumption. As such, we have assembled a dynamic and multidisciplinary team comprising professionals from various institutions across Brazil in a collaborative effort that encompasses different dimensions of biodiversity value exploration and monitoring, especially phenological data.

One part of our team is focused on creating products inspired by the diverse array of Brazilian fruit-bearing plants. Their work spans sectors of the creative industry, including fashion, painting, and graphic design, to infuse these natural elements into innovative and sustainable designs (Fig. 1 and Fig. 2).

A group of nutrition and health scientists, in conjunction with communication and marketing professionals, is working to produce engaging media content centered around food recipes that incorporate Brazilian fruits (Fig. 3). These recipes primarily feature the fruit-bearing plants most frequently observed on iNaturalist in the city of São Paulo, allowing us to showcase the local biodiversity while promoting culinary diversity. Some of these recipes are based on the book Brazilian Biodiversity: Flavors and Aromas (Santiago and Coradin 2018). This book is an extensive compendium of food recipes that use fruits derived from native Brazilian species.
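The arrangement of one collection project per capital, each combining the species list with a place restriction, and one umbrella project over all of them, can be modelled as a pair of filters. This is a conceptual sketch only and does not call the iNaturalist API: the classes, the second project name ("Example capital project"), the place "Example City" and the sample observations are hypothetical. Eugenia uniflora is used purely as an illustrative taxon and is not confirmed here to be on the project's list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    taxon: str  # scientific name of the observed plant
    place: str  # city where the observation was made


@dataclass(frozen=True)
class CollectionProject:
    name: str
    place: str       # locality restriction for this project
    taxa: frozenset  # shared fruit-bearing species list

    def includes(self, obs):
        """An observation qualifies if it matches both the place and the list."""
        return obs.place == self.place and obs.taxon in self.taxa


@dataclass(frozen=True)
class UmbrellaProject:
    name: str
    members: tuple  # the collection projects being tracked

    def observations(self, all_observations):
        """Collect every observation that qualifies for any member project."""
        return [
            obs for obs in all_observations
            if any(project.includes(obs) for project in self.members)
        ]


# Illustrative species list and observations; not project data.
species_list = frozenset({"Eugenia uniflora"})
pomar_paulistano = CollectionProject("Pomar Paulistano", "São Paulo", species_list)
example_project = CollectionProject("Example capital project", "Example City", species_list)
umbrella = UmbrellaProject("Urban Orchard", (pomar_paulistano, example_project))

all_observations = [
    Observation("Eugenia uniflora", "São Paulo"),
    Observation("Eugenia uniflora", "Example City"),
    Observation("Eugenia uniflora", "Somewhere else"),
    Observation("Unlisted taxon", "São Paulo"),
]

for obs in umbrella.observations(all_observations):
    print(obs)
```

Sharing a single species list across all member projects, as in this sketch, keeps the inclusion rule identical in every city, so that counts from different capitals remain comparable.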
Optimizing the Monitoring of Urban Fruit-Bearing Flora with Citizen Science: An Overview of the Pomar Urbano Initiative. Biodiversity Information Science and Standards, 2023-09-05. https://doi.org/10.3897/biss.7.112009
Yi-Ming Gan, Ruben Perez Perez, Pieter Provoost, A. Benson, A. C. Peralta Brichtova, Elizabeth R. Lawrence, John Nicholls, Johnny Konjarla, Georgia Sarafidou, H. Saeedi, Dan Lear, Anke Penzlin, N. Wambiji, W. Appeltans
The Ocean Biodiversity Information System (OBIS) (Klein et al. 2019) is a global database of marine biodiversity and associated environmental data, which provides critical information to researchers and policymakers worldwide. Ensuring the accuracy and consistency of the data in OBIS is essential for its usefulness and value, not only to the scientific community but also to the science-policy interface. The OBIS Data Quality Assessment and Enhancement Project Team (QCPT), formed in 2019 by the OBIS steering group, aims to assess and enhance data quality. It has been working on three categories of activities for this purpose:

Data quality enhancement and management

The OBIS QCPT organized data laundry events to identify and address data quality issues in published OBIS datasets. Furthermore, individual OBIS nodes were invited to present their data processing at the monthly meetings to foster knowledge sharing and collaborative problem-solving focused on data quality. Data quality issues and solutions highlighted in the presentations and data laundry events were documented as GitHub issues in a dedicated GitHub repository. The solutions for data quality issues, together with marine-specific pre-publication quality control tools designed to identify those issues, were provided as feedback to the OBIS Capacity Development Task Team. These inputs were used to create training resources (see OBIS manual, upcoming OBIS training course hosted on OceanTeacher Global Academy) aimed at preventing these issues.

Standardization of OBIS data processing pipeline

As OBIS uses the Darwin Core standard (Wieczorek et al. 2012), the use of standardized tests and assertions in the data processing pipeline is encouraged. To achieve this, the OBIS QCPT aligned OBIS quality checks with a subset of the core tests and assertions developed by the Biodiversity Information Standards (TDWG) Biodiversity Data Quality (BDQ) Task Group 2 (TG2) (Chapman et al. 2020), as tracked in a GitHub issue. Not all default parameters of the core tests and assertions are optimal for marine biodiversity data. The OBIS QCPT met monthly to determine suitable parameters for customizing the tests. The pipeline produces a data quality report for each dataset with quality flags that indicate potential data quality issues, enabling node managers and data providers to review the flagged records.

Community engagement

The OBIS QCPT led a survey among data users to gather insights into OBIS data quality issues and bridge the gap between the current implementation and user expectations. The survey findings enabled OBIS to prioritize issues to be addressed, as summarized in Section 2.2.2 of the 11th OBIS Steering Group meeting report. In addition to engaging with data users, the OBIS QCPT also served as a platform to discuss questions from the nodes related to the use of Darwin Core and provided feedback for the term discussions.

In summary, the OBIS QCPT has improved the reliability and usability of marine species data through a transparent and participatory approach that fosters continuous improvement. Collaborative effort, standardized procedures, and knowledge sharing advance OBIS's mission of providing high-quality biodiversity data for research, conservation, and ocean management.
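The idea of a pipeline that runs parameterized tests and emits per-record quality flags can be sketched as follows. This is a hedged illustration, not the OBIS pipeline or the TDWG BDQ test suite: the check functions, the flag names, and the 11,000 m depth limit are assumptions chosen for the example. The record fields (`occurrenceID`, `decimalLatitude`, `decimalLongitude`, `maximumDepthInMeters`) are real Darwin Core terms.

```python
from typing import Callable, Optional

Record = dict
Check = Callable[[Record], Optional[str]]  # returns a flag name, or None if OK


def coordinates_check(record: Record) -> Optional[str]:
    """Flag records with missing or out-of-range coordinates."""
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if lat is None or lon is None:
        return "COORDINATES_MISSING"       # hypothetical flag name
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return "COORDINATES_OUT_OF_RANGE"  # hypothetical flag name
    return None


def make_depth_check(max_depth_m: float) -> Check:
    """Build a depth check whose limit is a tunable parameter."""
    def depth_check(record: Record) -> Optional[str]:
        depth = record.get("maximumDepthInMeters")
        if depth is not None and not (0 <= depth <= max_depth_m):
            return "DEPTH_OUT_OF_RANGE"    # hypothetical flag name
        return None
    return depth_check


def quality_report(records, checks):
    """Map each flagged record's occurrenceID to the list of flags it raised."""
    report = {}
    for record in records:
        flags = [flag for flag in (check(record) for check in checks) if flag]
        if flags:
            report[record["occurrenceID"]] = flags
    return report


# Assumed marine-specific parameter: a limit just below the deepest known ocean depth.
checks = [coordinates_check, make_depth_check(max_depth_m=11000)]

# Illustrative records only.
records = [
    {"occurrenceID": "a", "decimalLatitude": 10.0, "decimalLongitude": 20.0,
     "maximumDepthInMeters": 50},
    {"occurrenceID": "b", "decimalLatitude": 95.0, "decimalLongitude": 20.0,
     "maximumDepthInMeters": 12000},
    {"occurrenceID": "c", "decimalLongitude": 20.0},
]

print(quality_report(records, checks))
```

Building each check from a factory such as `make_depth_check` is one way to keep the test logic fixed while letting a community agree on domain-specific parameter values separately.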
Promoting High-Quality Data in OBIS: Insights from the OBIS Data Quality Assessment and Enhancement Project Team. Biodiversity Information Science and Standards, 2023-09-05. https://doi.org/10.3897/biss.7.112018
Katie Pearson, E. Gilbert, K. S. Orellana, Greg Post, Lindsay Walker, Jenn Yost, Nico Franz
Symbiota is empowering biodiversity collections communities across the globe to efficiently manage and mobilize their data. Beginning with only a handful of collections in two major portals in the early 2010s (Gries et al. 2014), Symbiota now acts as the primary content management system for over 1,000 collections in more than 50 portals. Over 1,800 collections share data through Symbiota portals, constituting over 90 million records and 42 million images. The iDigBio Symbiota Support Hub, a team and cyberinfrastructure based at Arizona State University and supported by the United States (U.S.) National Science Foundation, hosts 52 Symbiota portals and provides daily help and resources to all Symbiota user communities. The Symbiota codebase is being actively developed in collaboration with several funded projects, including the U.S. National Ecological Observatory Network (NEON), to support new data types and connections, such as those between Symbiota portals and other collections management systems, and to other resources (e.g., Index Fungorum, Global Registry of Scientific Collections, Bionomia, Environmental Data Initiative). Because the Symbiota codebase is open source and shared among portals, new developments in any portal empower the entire network. Here we describe recent expansions of the Symbiota network, including new portals, collaborations, functionalities, and sustainability actions. We look forward to building further collaborations with diverse, international collections data communities.
{"title":"Growth and Evolution of the Symbiota Portal Network","authors":"Katie Pearson, E. Gilbert, K. S. Orellana, Greg Post, Lindsay Walker, Jenn Yost, Nico Franz","doi":"10.3897/biss.7.112028","DOIUrl":"https://doi.org/10.3897/biss.7.112028","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-05","publicationTypes":"Journal Article"}
The Species File Group (SFG) endeavors to build tools and community structures that empower researchers and collections staff in their long-term collective efforts to gather, share, and learn from biodiversity data. One such tool is TaxonWorks, now in its 10th year of development. TaxonWorks provides a collaborative workbench where scientists, collection managers, students, and volunteers capture and build on the key data and concepts we use to Describe Life (TaxonWorks motto). It provides a growing number of ways to share descriptions, from Darwin Core Archives, to NeXML-formatted observations and keys, to checklists, and bibliographies. What’s New? We have expanded the data model of TaxonWorks, added new tools and functions, and some Companion software, that is, new stand-alone code-bases. Two major additions, Unified Filters and Cached Maps, provide developers and users (and users who are developers) the ability to run complex queries across TaxonWorks' rich data model and to display quickly computed maps for datasets of notable size, 100K or more specimen and literature-based records. For example, Cached Maps can superimpose the asserted distribution and georeferenced literature and specimen records to create interactive searchable maps (Fig. 1). In TaxonWorks, we aim to empower those working with the data with tools that help them visualize and curate information. To be able to model taxon concept relationships over time to reflect different taxonomic opinions, we added RCC-5 (Region Connection Calculus; Thau et al. 2008), which will make it possible to visualize these relationships. Similarly, we built a new visual editor (Fig. 2) for displaying, editing, and citing biological associations as recorded among specimens or taxa (or both). Querying and enhancing data in a given database can be complex. We have worked on harmonizing the look-feel-function of the data filtering interfaces. 
With our Unified Filters, one can pass the results of one search to another filter (e.g., query for specimens for a given taxonomic group and then ask for the distinct collecting events for those specimens). Then, once you filter to a given dataset, you can use our new Stepwise tasks to enhance and edit that information en masse. Companion code-bases extend what one can do with the data in TaxonWorks, but are also available for use with other software. For example, using our new TaxonPages code, our users can produce their own web pages for taxa (Fig. 1). TaxonPages will be used by SFG groups to make available well over 100K pages this year. They include basic Bioschema integration, links to JSON-formatted data behind every panel, and the option to download any occurrence data present, expressed as Darwin Core attributes, formatted as a CSV file. TaxonPages can be set up in minutes and served on resources like GitHub pages, and our user community can customize their content. Finally, the TaxonWorks external API has added a huge number of new parameters across multiple new conceptual endpoints. What’s Next? With a decade of development behind us, we see mature functionality around core TaxonWorks concepts such as observations (e.g., traits, phylogenetic data), biological associations (e.g., host-parasite relationships), images, sources (citation management), specimens, collecting events, and collection management. Currently, we are focusing on integration with other external services. We have produced multiple new API wrappers, notably Colrapi (which wraps the Catalogue of Life's ChecklistBank API) and BellPepper, which wraps the new Biodiversity Enhanced Location Services (BELS) georeferencing API. These wrappers, along with continued integration with the Global Names framework, give our users the ability to improve data quality, for example by linking to external vocabularies, finding and updating outdated nomenclature, and using our GBIF difference tool to visualize what TaxonWorks collection object data look like in the context of an external aggregator such as the Global Biodiversity Information Facility (GBIF). The TaxonWorks community continues to grow, and so does the diversity of projects using it. Some of that diversity reflects project stage: new projects need to create and remove new records quickly, mid-stage projects need to find and add varied data from a wide range of external resources, and mature projects need tools to identify and resolve outliers. For these data-continuum scenarios, we foresee Stepwise tasks tailored to managing the differences between these stages of data maturity. Imagine capturing verbatim specimen determination data for a mid-sized digitization project and then resolving it to links to People, Times, and Taxa 10, 100, or 1,000 units at a time. Some of the growing diversity behind the TaxonWorks community results from similar tools reaching the end of their lifespan. For example, the SFG has been asked to look into moving data from Scratchpads to TaxonWorks.
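The RCC-5 relations mentioned above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each taxon concept is represented by the set of member identifiers it circumscribes; the abstract does not describe how TaxonWorks stores or computes these relations, and the enum, function name, and concept contents below are all hypothetical.

```python
from enum import Enum


class RCC5(Enum):
    """The five base relations of RCC-5, with commonly used symbols."""
    EQUAL = "=="        # both concepts circumscribe exactly the same members
    INCLUDES = ">"      # the first concept properly includes the second
    INCLUDED_IN = "<"   # the first concept is properly included in the second
    OVERLAPS = "><"     # some members are shared, and each has members of its own
    DISJOINT = "!"      # no members in common


def rcc5_relation(a: frozenset, b: frozenset) -> RCC5:
    """Classify two non-empty member sets into exactly one RCC-5 relation."""
    if not a or not b:
        raise ValueError("this sketch only handles non-empty member sets")
    if a == b:
        return RCC5.EQUAL
    if a > b:  # proper superset
        return RCC5.INCLUDES
    if a < b:  # proper subset
        return RCC5.INCLUDED_IN
    if a & b:  # shared members, but neither set contains the other
        return RCC5.OVERLAPS
    return RCC5.DISJOINT


# Hypothetical concepts of the same genus name under two taxonomic opinions.
aus_sec_1990 = frozenset({"sp1", "sp2", "sp3"})
aus_sec_2020 = frozenset({"sp1", "sp2"})
bus_sec_2020 = frozenset({"sp3", "sp4"})

print(rcc5_relation(aus_sec_1990, aus_sec_2020).value)  # >
print(rcc5_relation(aus_sec_1990, bus_sec_2020).value)  # ><
```

Because the five relations are mutually exclusive and jointly exhaustive for non-empty sets, every pair of concepts receives exactly one label, which is what makes them suitable for drawing concept relationships across taxonomic opinions.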
{"title":"TaxonWorks in its 10th Year: What’s new, what’s next?","authors":"Deborah Paul, Matthew Yoder","doi":"10.3897/biss.7.112040","DOIUrl":"https://doi.org/10.3897/biss.7.112040","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-05","publicationTypes":"Journal Article"}
The Integrated Taxonomic Information System (ITIS) and Species 2000 have worked together for two decades after signing an agreement to form the Catalogue of Life (COL), striving to provide current and complete Global Species Databases (GSDs) from various sources. ITIS has provided many such GSDs to the COL. In this presentation, we will demonstrate in detail the nine aspects of ITIS’ approach to quality assurance/quality control: People, Process, Rules, Standards, Automation, Control, Assurance, Publication, and Feedback. All of these aspects are important and deserve consideration in the creation and maintenance of ITIS’ high-quality GSDs. ITIS has also developed a new web-based ‘Taxonomic Workbench’ (TWB) that allows new levels of cooperative effort beyond what ITIS has been able to attain with a desktop version of the software, which has been used for twenty-five years. Some key aspects of these tools and what they will allow are discussed in the last half of the presentation.
{"title":"Quality Control/Quality Assurance within the Integrated Taxonomic Information System (ITIS)","authors":"Thomas M. Orrell","doi":"10.3897/biss.7.112043","DOIUrl":"https://doi.org/10.3897/biss.7.112043","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-05","publicationTypes":"Journal Article"}
One thing the field of biodiversity informatics has been very good at is creating databases. However, this success in creation has not been matched by equivalent success in creating deep links between records in those databases. Instead, we create an ever-growing number of silos. An obvious route to “silo-breaking” is the shared use of the same persistent identifiers for the same entities across those databases. For example, we have minted millions of Life Science Identifiers (LSIDs) for taxonomic names (which can be resolved at lsid.io), and a growing number of taxonomic papers have Digital Object Identifiers (DOIs), but we lack connections between these two identifiers. In this talk I describe work over the last decade to make these connections between LSIDs and DOIs across three large taxonomic databases: Index Fungorum, International Plant Names Index (IPNI), and the Index to Organism Names (ION) (Page 2023). Over a million names have been matched to DOIs or other persistent identifiers for taxonomic publications (Fig. 1 shows the coverage of publications for animal names). This represents approximately 36% of animal, plant or fungal names for which publication data is available. The mappings between LSIDs and publication persistent identifiers (PIDs), such as DOIs and Wikidata item identifiers, are made available through ChecklistBank (datasets 129659, 164203, 128415), and also archived in Zenodo. By combining these LSID and DOI links with Open Researcher and Contributor IDs (ORCIDs) for taxonomists, we can potentially gain insight into who is doing taxonomic research, where they work, and how they are funded. Possible applications of this data are discussed, including a tool to discover the citation for a species name (Species Cite, Fig. 2), using DOI-to-ORCID links to discover who is doing taxonomic research, and creating a linked data version of the Catalogue of Life.
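The idea of combining name-to-publication links with publication-to-author links can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the LSIDs, DOIs, and ORCIDs below are made-up placeholders, and the function name and the in-memory data layout are hypothetical rather than the format of the ChecklistBank or Zenodo datasets.

```python
from collections import defaultdict

# Hypothetical mapping rows: (taxonomic name LSID, publication DOI).
name_to_doi = [
    ("urn:lsid:example.org:names:1", "10.9999/example.1"),
    ("urn:lsid:example.org:names:2", "10.9999/example.1"),
    ("urn:lsid:example.org:names:3", "10.9999/example.2"),
]

# Hypothetical lookup: publication DOI -> ORCIDs of its authors.
doi_to_orcids = {
    "10.9999/example.1": ["0000-0000-0000-0001"],
    "10.9999/example.2": ["0000-0000-0000-0001", "0000-0000-0000-0002"],
}


def names_per_taxonomist(name_links, author_links):
    """Join LSID->DOI links with DOI->ORCID links into ORCID -> set of LSIDs."""
    result = defaultdict(set)
    for lsid, doi in name_links:
        # Publications without known ORCIDs simply contribute nothing.
        for orcid in author_links.get(doi, []):
            result[orcid].add(lsid)
    return dict(result)


by_person = names_per_taxonomist(name_to_doi, doi_to_orcids)
for orcid, lsids in sorted(by_person.items()):
    print(orcid, len(lsids))
```

The shared DOI acts as the join key, which is why the same persistent identifier must be used on both sides for the link to be discoverable.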
{"title":"Ten Years and a Million Links: Building a global taxonomic library connecting persistent identifiers for names (LSIDs), publications (DOIs), and people (ORCIDs)","authors":"Roderic Page","doi":"10.3897/biss.7.112053","DOIUrl":"https://doi.org/10.3897/biss.7.112053","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-05","publicationTypes":"Journal Article"}
The term ‘taxonomic backbone’ is often used to indicate the compromise taxonomies that form the taxonomic backbone of systems like the Global Biodiversity Information Facility (GBIF) and the Atlas of Living Australia (ALA). However, the term can also be seen in the broader sense as the entire expansive and continually evolving body of taxonomic work that underpins all biodiversity data and the linkage of all the different concepts that are used in various parts of the world and by various groups of people. The Taxon Concept Schema (TCS; Hyam and Kennedy 2006), which was ratified as a TDWG standard in 2005, came forth from the need of providers of taxonomic information for a mechanism to exchange data with other providers and users. Additionally, there was the knowledge that taxon names make poor identifiers for taxa and that more than names are needed for effective sharing and linking of biodiversity data. The same name can be associated with multiple taxon concepts or definitions, especially when a name has been around for a long time or is used in a heavily revised group. In order for others to know what a name means, people who use a name should also indicate which concept of that name is being used. Traditionally, the Latin ‘sensu’ or `sec.` have been used for this purpose; in TCS, an ‘according to’ property is used. The taxon concept, along with a language to relate different concepts, which is also in TCS, was later introduced to a systematic audience in an article by Franz and Peet (2009). Unfortunately, TCS has never enjoyed wide adoption and since Darwin Core (Wieczorek et al. 2012) was ratified in 2009, sharing of taxonomic information has mostly been done with the Darwin Core Taxon class. However, various issues with the Darwin Core Taxon class (e.g., Darwin Core and RDF/OWL Task Groups 2015) have made us look at TCS again and in 2020 the Taxonomic Names and Concepts Interest Group was formally renamed the TCS Maintenance Group. 
In 2021, a TCS 2 Task Group was established with the goal of updating TCS to a Vocabulary Standard (like Darwin Core) that can be maintained under the TDWG Vocabulary Maintenance Specification (Vocabulary Maintenance Specification Task Group 2017). As it currently stands, TCS 2 (TCS 2 Task Group 2023) has two classes for dealing with taxonomy, the Taxon Concept and Taxon Relationship classes, and two classes for dealing with nomenclature, the Taxon Name and Nomenclatural Type classes. TCS 2 describes objects that are present and known in the domain and uses terms that are used in the domain (e.g., Greuter et al. 2011, Hawksworth 2010), so it is easily understood by practitioners in the domain and other users of taxonomic information, as well as data specialists and developers. At the same time, it is in accordance with the OpenBiodiv Ontology (Senderov et al. 2018) and the Simple Knowledge Organization System (SKOS; Miles and Bechhofer 2009). TCS 2 can be used to mark up taxon concepts of any type, including taxonomic treatments, checklists, field guides, and systems like the Catalogue of Life and AviBase. Once marked up as TCS, concepts of all types look the same, so all taxonomic information can be shared and linked using a small standard of fewer than 40 terms, and linked to other types of biodiversity data, such as occurrence data or descriptive data.
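The distinction between a name and a concept of that name can be sketched as a small data structure. The Python below is a hedged illustration only: the class names echo the TCS 2 classes named above, but every field name, the `sec.` label format, the relation string, and the taxon and source names are hypothetical and are not the actual TCS 2 terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonName:
    """A name as a nomenclatural object, independent of any circumscription."""
    name_string: str


@dataclass(frozen=True)
class TaxonConcept:
    """A name used in the sense of a particular source ('according to')."""
    name: TaxonName
    according_to: str

    def label(self) -> str:
        return f"{self.name.name_string} sec. {self.according_to}"


@dataclass(frozen=True)
class TaxonRelationship:
    """A directed assertion relating one concept to another."""
    subject: TaxonConcept
    relation: str  # illustrative free-text value, not a TCS 2 vocabulary term
    related: TaxonConcept


# One name, two concepts: the name alone cannot tell them apart.
name = TaxonName("Aus bus")
older = TaxonConcept(name, "Smith 1990")
newer = TaxonConcept(name, "Jones 2020")
link = TaxonRelationship(newer, "is included in", older)

print(older.label())
print(link.subject.label(), link.relation, link.related.label())
```

Keeping the name and the source together in the concept is what lets two datasets that use the same name string state whether they actually mean the same taxon.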
{"title":"Improved Sharing and Linkage of Taxonomic Data with the Taxon Concept Standard (TCS)","authors":"Niels Klazenga","doi":"10.3897/biss.7.112045","DOIUrl":"https://doi.org/10.3897/biss.7.112045","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-05","publicationTypes":"Journal Article"}
Biodiversity research has a strong focus on the links between environment and functional traits, e.g., to assess how anthropogenic drivers of change impact ecological systems (Díaz et al. 2013). Interoperable exchange and integration of such data is enabled through the use of ontologies that provide “meaning” to data and enable downstream processing involving learning and inference over graph-structured models of these data (Kulmanov et al. 2020). However, the development of thematically similar semantic artifacts, e.g., the Environmental Ontology (ENVO, Buttigieg et al. 2016) and the Semantic Web for Earth and Environment Technology Ontology (SWEET, DiGiuseppe et al. 2014), in biodiversity-related disciplines (e.g., environmental genomics and earth observation) can introduce substantial conceptual overlaps, and highlights the need for bridging technologies to facilitate reuse of biodiversity data across those knowledge fields (Karam et al. 2020). A recent design study, funded by the European Open Science Cloud (EOSC), proposes a framework to create, document and publish mappings and crosswalks linking different semantic artifacts within a particular scientific community and across scientific domains under the label of “Flexible Semantic Mapping Framework” (SEMAF, Broeder et al. 2021). SEMAF puts a strong emphasis on so-called pragmatic mappings, i.e., mappings that are driven by specific interoperability goals such as translations between specific observation measurements (e.g., sensor configurations) and metadata descriptions. Within the Horizon Europe Project “Biodiversity Digital Twin for Advanced Modelling, Simulation and Prediction Capabilities” (BioDT), a mapping tool leveraging SEMAF is currently under development: Mapping.bio provides a lightweight web service to read semantic artifacts, visualize them, add mappings as graphical connections and store the mappings as FAIR (Findable, Accessible, Interoperable, Reusable) Digital Objects (FDOs, De Smedt et al. 2020) in a repository. To foster reusability, sustainability and long-term availability of digital objects, mapping.bio features mappings compliant with the Simple Standard for Sharing Ontological Mappings (SSSOM, Matentzoglu et al. 2022), a machine-interpretable and extensible vocabulary enabling the self-contained exploration and processing of annotated mappings by machines (machine actionability, Jacobsen et al. 2020).
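To make the idea of an SSSOM-compliant mapping concrete, the following Python sketch serializes one mapping as a tab-separated row using the four core SSSOM columns. It is not mapping.bio code, and the abstract does not describe how the tool serializes its mappings: the `EXA:` and `EXB:` identifiers are hypothetical placeholders rather than real ENVO or SWEET terms, the function name is invented, and the commented metadata header that a complete SSSOM file carries is omitted.

```python
import csv
import io

# Core SSSOM columns for a single mapping.
COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]

# One illustrative mapping between two hypothetical ontology terms.
mappings = [
    {
        "subject_id": "EXA:0000001",
        "predicate_id": "skos:exactMatch",
        "object_id": "EXB:Lake",
        "mapping_justification": "semapv:ManualMappingCuration",
    },
]


def to_sssom_tsv(rows):
    """Serialize mapping dictionaries as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = to_sssom_tsv(mappings)
print(tsv_text)
```

Recording the justification alongside subject, predicate, and object is what allows a machine to judge how much trust to place in a mapping without consulting its author.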
{"title":"Mapping.bio: Piloting FAIR semantic mappings for biodiversity digital twins","authors":"Alexander Wolodkin, Claus Weiland, Jonas Grieb","doi":"10.3897/biss.7.111979","DOIUrl":"https://doi.org/10.3897/biss.7.111979","journal":{"name":"Biodiversity Information Science and Standards"},"publicationDate":"2023-09-01","publicationTypes":"Journal Article"}