D. C. Cugler, C. B. Medeiros, S. Shekhar, L. F. Toledo
A Geographical Approach for Metadata Quality Improvement in Biological Observation Databases
2013 IEEE 9th International Conference on e-Science, published 2013-10-22
DOI: 10.1109/eScience.2013.14 (https://doi.org/10.1109/eScience.2013.14)
Citations: 5
Abstract
This paper addresses the problem of improving the quality of metadata in biological observation databases, in particular those associated with observations of living beings, which are often used as a starting point for biodiversity analyses. Poor-quality metadata lead to incorrect scientific conclusions and can mislead experts. Thus, it is important to design and develop methods to detect and correct metadata quality problems. This is challenging because of the variety of issues affecting such metadata, e.g., misnaming of species, and uncertainty and imprecision about where observations were recorded. Related work is limited because it does not adequately model such issues. We propose a geographic approach based on expert-led classification of place and/or range mismatch anomalies detected by our algorithms. Our approach enables detection of anomalies both in species' reported geographic distributions and in species' identification. Our main contribution is a geographic algorithm that deals with uncertain/imprecise locations. Our work is tested using a case study with the Fonoteca Neotropical Jacques Vielliard, one of the 10 largest animal sound collections in the world.
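The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of what a range-mismatch check that tolerates imprecise locations might look like; it is not the authors' method. It assumes each observation carries an uncertainty radius, that a species' known distribution is available as a list of reference points, and that a fixed tolerance separates "plausible" from "anomalous". The names `Observation`, `haversine_km`, `is_range_mismatch`, and `tolerance_km` are hypothetical, and the coordinates in the usage lines are invented for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Observation:
    species: str
    lat: float
    lon: float
    # Assumption: the true location lies within this radius of (lat, lon).
    uncertainty_km: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_range_mismatch(obs, known_points, tolerance_km=100.0):
    """Flag an observation as a candidate anomaly for expert review.

    The observation is flagged only if even the most favourable position
    inside its uncertainty disk is farther than tolerance_km from every
    known occurrence point of the species.
    """
    if not known_points:
        return False  # nothing to compare against, so nothing to flag
    nearest = min(haversine_km(obs.lat, obs.lon, lat, lon) for lat, lon in known_points)
    return nearest - obs.uncertainty_km > tolerance_km


# Invented reference points for a hypothetical species.
known = [(-22.9, -47.06), (-23.5, -46.6)]
near = Observation("Species A", -22.9, -47.06, uncertainty_km=5.0)
far = Observation("Species A", 0.0, -47.06, uncertainty_km=50.0)
vague = Observation("Species A", 0.0, -47.06, uncertainty_km=3000.0)

for obs in (near, far, vague):
    print(obs.lat, obs.uncertainty_km, is_range_mismatch(obs, known))
```

In this sketch a large uncertainty radius makes the check more conservative: a record whose location is known only to within thousands of kilometres is not flagged, because its true position could still fall inside the known range. Flagged records are candidates only; consistent with the expert-led classification the abstract mentions, a specialist would decide whether each one reflects a misidentified species, a wrong location, or a genuine range extension.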