The BioWhere Project: Unlocking the Potential of Biological Collections Data

GI_Forum (JCR Q3, Social Sciences); Publication date: 2023-01-01; DOI: 10.1553/giscience2023_01_s3
Kristin Stock, K. Wijegunarathna, C. B. Jones, H. Morris, Pragyan Das, D. Medyckyj-Scott, Brandon Whitehead

Abstract

Vast numbers of biological specimens (e.g. flora, fauna, soils) are stored in collections globally. Many of these have only a natural-language location description, such as ‘200ft above and south of main highway, 1.1 miles west of Porters Pass’, and numerical coordinates are unknown. The BioWhere project is pioneering methods to automatically determine the geographic coordinates (georeferences) of complex location descriptions. Particular challenges are posed by the variable accuracy of recent and historical data that might be used to train models to predict geographic coordinates from the natural-language descriptions; by the presence of historical place names in the descriptions that are not stored in existing gazetteers; and by the vague and context-sensitive nature of the spatial terms in the descriptions (e.g. above, on, south of). We are addressing these challenges by extending the latest transformer-based deep learning models to parse locality descriptions, and to build models for specific spatial terms that incorporate geographic context and data quality to more accurately predict georeferences. We also describe a gazetteer that contains enriched cultural content to support georeferencing of historical records, and to serve as a store of New Zealand Māori cultural knowledge for future generations.
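
To make the georeferencing task concrete, the sketch below shows the simplest deterministic reading of a phrase such as ‘1.1 miles west of Porters Pass’: start from a known reference point and move a fixed distance along a fixed bearing on a spherical Earth. This is not the BioWhere method, which the abstract describes as based on transformer models that account for context and data quality; it only illustrates the naive baseline that such models aim to improve on. The function name `offset_point` and the anchor coordinates are hypothetical placeholders, not real gazetteer values for Porters Pass.

```python
import math

EARTH_RADIUS_M = 6371008.8   # mean Earth radius in metres (spherical approximation)
METRES_PER_MILE = 1609.344   # international mile

def offset_point(lat_deg, lon_deg, bearing_deg, distance_m):
    """Return (lat, lon) reached by travelling distance_m from the start
    point along an initial bearing (degrees clockwise from north),
    using the standard great-circle destination formula."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians

    lat2 = math.asin(
        math.sin(lat1) * math.cos(delta)
        + math.cos(lat1) * math.sin(delta) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(delta) * math.cos(lat1),
        math.cos(delta) - math.sin(lat1) * math.sin(lat2),
    )
    # Normalise longitude to the range [-180, 180).
    lon2_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
    return math.degrees(lat2), lon2_deg

# Hypothetical anchor point (placeholder, NOT the real location of Porters Pass).
anchor_lat, anchor_lon = -43.0, 171.0

# "1.1 miles west of <anchor>": bearing 270 degrees, distance 1.1 miles.
lat, lon = offset_point(anchor_lat, anchor_lon, 270.0, 1.1 * METRES_PER_MILE)
print(f"estimated point: {lat:.5f}, {lon:.5f}")
```

A fixed offset like this treats ‘west of’ as an exact bearing and the distance as exact, and it has no way to handle ‘200ft above and south of main highway’ without extra data on the road and terrain. Those limitations are the vagueness and context sensitivity that the abstract identifies as challenges.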