Rapid technological advancements and the biodiversity crisis have motivated efforts to document species before their extinction. However, taxonomic coverage gaps, where certain species are underrepresented in biodiversity databases, can distort our understanding of ecosystems. Here, we quantified how many of the plant species found in a biodiversity hotspot are invisible, that is, excluded from studies because they have insufficient occurrence data. We also identified factors influencing the invisibility of species.
Atlantic Forest hotspot, Brazil.
We downloaded and filtered occurrence data for 15,010 plant species from online biodiversity databases. We applied multiple thresholds, each representing a minimum required number of records, and classified a species as “invisible” if its record count fell below a given threshold. We fitted logistic models to estimate how factors such as life form, presence of a vernacular name, geographical distribution, endemism, and year of taxonomic publication influence the odds of species exclusion.
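The threshold rule described in the methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `invisible_proportions`, the species labels, and the record counts are all hypothetical, and only the thresholds 3 and 60 are taken from the text.

```python
def invisible_proportions(records_per_species, thresholds):
    """Share of species whose record count falls below each threshold.

    records_per_species: dict mapping species name -> number of filtered records
    thresholds: iterable of minimum record counts required by an analysis
    """
    n_species = len(records_per_species)
    if n_species == 0:
        raise ValueError("no species supplied")
    return {
        t: sum(1 for n in records_per_species.values() if n < t) / n_species
        for t in thresholds
    }


# Hypothetical record counts, for illustration only (not data from the study).
toy_counts = {
    "sp_a": 1, "sp_b": 2, "sp_c": 3, "sp_d": 5,
    "sp_e": 40, "sp_f": 59, "sp_g": 75, "sp_h": 120,
}

# A species is "invisible" at threshold t when it has fewer than t records.
shares = invisible_proportions(toy_counts, thresholds=(3, 60))
for t, share in shares.items():
    print(f"threshold {t}: {share:.0%} of species invisible")
```

Because the same species list is evaluated at every threshold, the share of invisible species can only grow as the threshold rises, which is why stricter tools exclude more of the flora.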
The proportion of invisible species ranged from 14% for simple tools requiring just three records to 64% for more demanding tools requiring at least 60 records. Species with certain characteristics are more prone to invisibility: non-tree species, species without vernacular names, species with restricted distributions within the Atlantic Forest, endemic species, and species with names published more recently. A substantial portion of these invisible species are distributed along the coastline. In contrast, the continental portion of the biome exhibits fewer taxonomic coverage gaps among known species, most likely due to lower rates of new species descriptions.
Coverage gaps are shaped by the interaction of biological traits, societal preferences, limited technical support, and human activities. Studies relying on distributional data must balance the rigour of filters and thresholds to achieve both geographical reliability and taxonomic coverage, adjusting them to align with each study's specific data and goals.