Pytaxon: A Python software for resolving and correcting taxonomic names in biodiversity data
Marco A Proença Neto, Marcos P A De Sousa
Biodiversity Data Journal, volume 13, e138257. Published 2025-01-08 (eCollection).
DOI: 10.3897/BDJ.13.e138257 (https://doi.org/10.3897/BDJ.13.e138257)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11736304/pdf/
Impact factor: 1.0; JCR quartile: Q3 (Biodiversity Conservation)
Citations: 0
Abstract
Background: The standardisation and correction of taxonomic names in large biodiversity databases remain persistent challenges for researchers, as errors in species names can compromise ecological analyses, land-use planning and conservation efforts, particularly when inaccurate data are shared on global biodiversity portals.
New information: We present pytaxon, a Python tool that resolves and corrects taxonomic names in biodiversity data by querying the Global Names Verifier (GNV) API and applying fuzzy matching to suggest corrections for discrepancies and nomenclatural inconsistencies. It offers both a Command Line Interface (CLI) and a Graphical User Interface (GUI), making it accessible to users with different levels of computing expertise. Tests on spreadsheets derived from datasets published in the Global Biodiversity Information Facility (GBIF) demonstrated its effectiveness in identifying and resolving taxonomic errors. By limiting the propagation of inaccuracies from researchers' datasets to global biodiversity databases, pytaxon supports more reliable conservation decisions and more robust scientific investigations. Its contributions enhance data integrity and promote informed biodiversity management in a rapidly evolving global environment.
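The fuzzy-matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not pytaxon's implementation and does not call the GNV API: it assumes a small local list of accepted names (the hypothetical `REFERENCE_NAMES`) in place of the remote service, and uses Python's standard `difflib` module for similarity scoring. The function names and the 0.8 cutoff are illustrative choices only.

```python
from difflib import get_close_matches

# Hypothetical reference list standing in for a remote names service.
REFERENCE_NAMES = [
    "Panthera onca",
    "Puma concolor",
    "Tapirus terrestris",
    "Leopardus pardalis",
]


def suggest_correction(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return (status, suggestion) for a single scientific name.

    status is "exact", "fuzzy" or "unmatched"; suggestion is the
    matched reference name, or None when nothing is close enough.
    """
    # Collapse stray whitespace, a common spreadsheet artefact.
    cleaned = " ".join(name.split())
    if cleaned in reference:
        return ("exact", cleaned)
    # difflib ranks candidates by similarity ratio (0.0 to 1.0).
    matches = get_close_matches(cleaned, reference, n=1, cutoff=cutoff)
    if matches:
        return ("fuzzy", matches[0])
    return ("unmatched", None)


def check_column(names):
    """Check every name in a column; return (input, status, suggestion) rows."""
    return [(n, *suggest_correction(n)) for n in names]


if __name__ == "__main__":
    for row in check_column(["Pantera onca", "Puma concolor", "Homo sapiens"]):
        print(row)
```

In this sketch a misspelling such as "Pantera onca" is flagged as a fuzzy match with a suggested correction, while names with no close candidate are reported as unmatched so that a person can review them rather than having them silently replaced.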
Biodiversity Data Journal
Agricultural and Biological Sciences: Ecology, Evolution, Behavior and Systematics
CiteScore: 2.20
Self-citation rate: 7.70%
Articles published: 283
Review time: 6 weeks
About the journal:
Biodiversity Data Journal (BDJ) is a community peer-reviewed, open-access, comprehensive online platform, designed to accelerate publishing, dissemination and sharing of biodiversity-related data of any kind. All structural elements of the articles – text, morphological descriptions, occurrences, data tables, etc. – will be treated and stored as DATA, in accordance with the Data Publishing Policies and Guidelines of Pensoft Publishers.
The journal will publish papers in biodiversity science containing taxonomic, floristic/faunistic, morphological, genomic, phylogenetic, ecological or environmental data on any taxon of any geological age from any part of the world with no lower or upper limit to manuscript size.