Improving Domain Repository Connectivity

IF 1.3 3区 计算机科学 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Data Intelligence Pub Date : 2023-03-01 DOI:10.1162/dint_a_00120
T. Habermann
{"title":"Improving Domain Repository Connectivity","authors":"T. Habermann","doi":"10.1162/dint_a_00120","DOIUrl":null,"url":null,"abstract":"ABSTRACT Domain repositories, i.e. repositories that store, manage, and persist data pertaining to a specific scientific domain, are common and growing in the research landscape. Many of these repositories develop close, long-term communities made up of individuals and organizations that collect, analyze, and publish results based on the data in the repositories. Connections between these datasets, papers, people, and organizations are an important part of the knowledge infrastructure surrounding the repository. All these research objects, people, and organizations can now be identified using various unique and persistent identifiers (PIDs) and it is possible for domain repositories to build on their existing communities to facilitate and accelerate the identifier adoption process. As community members contribute to multiple datasets and articles, identifiers for them, once found, can be used multiple times. We explore this idea by defining a connectivity metric and applying it to datasets collected and papers published by members of the UNAVCO community. Finding identifiers in DataCite and Crossref metadata and spreading those identifiers through the UNAVCO DataCite metadata can increase connectivity from less than 10% to close to 50% for people and organizations.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"5 1","pages":"6-26"},"PeriodicalIF":1.3000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/dint_a_00120","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 2

Abstract

ABSTRACT Domain repositories, i.e. repositories that store, manage, and persist data pertaining to a specific scientific domain, are common and growing in the research landscape. Many of these repositories develop close, long-term communities made up of individuals and organizations that collect, analyze, and publish results based on the data in the repositories. Connections between these datasets, papers, people, and organizations are an important part of the knowledge infrastructure surrounding the repository. All these research objects, people, and organizations can now be identified using various unique and persistent identifiers (PIDs) and it is possible for domain repositories to build on their existing communities to facilitate and accelerate the identifier adoption process. As community members contribute to multiple datasets and articles, identifiers for them, once found, can be used multiple times. We explore this idea by defining a connectivity metric and applying it to datasets collected and papers published by members of the UNAVCO community. Finding identifiers in DataCite and Crossref metadata and spreading those identifiers through the UNAVCO DataCite metadata can increase connectivity from less than 10% to close to 50% for people and organizations.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
改进域存储库连通性
领域存储库,即存储、管理和持久化与特定科学领域有关的数据的存储库,在研究领域中很常见并且正在增长。这些存储库中有许多开发了紧密的、长期的社区,社区由个人和组织组成,这些个人和组织根据存储库中的数据收集、分析和发布结果。这些数据集、论文、人员和组织之间的连接是围绕存储库的知识基础设施的重要组成部分。所有这些研究对象、人员和组织现在都可以使用各种唯一和持久标识符(pid)来标识,并且域存储库可以在其现有社区的基础上构建,以促进和加速标识符采用过程。由于社区成员贡献了多个数据集和文章,他们的标识符一旦被发现,就可以多次使用。我们通过定义连接度量并将其应用于联合国维和部队社区成员收集的数据集和发表的论文来探索这一想法。在DataCite和Crossref元数据中查找标识符,并通过联阿维和部队DataCite元数据传播这些标识符,可将个人和组织的连通性从不到10%提高到接近50%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Data Intelligence
Data Intelligence COMPUTER SCIENCE, INFORMATION SYSTEMS-
CiteScore
6.50
自引率
15.40%
发文量
40
审稿时长
8 weeks
期刊最新文献
The Limitations and Ethical Considerations of ChatGPT Rule Mining Trends from 1987 to 2022: A Bibliometric Analysis and Visualization Classification and quantification of timestamp data quality issues and its impact on data quality outcome BIKAS: Bio-Inspired Knowledge Acquisition and Simulacrum—A Knowledge Database to Support Multifunctional Design Concept Generation Exploring Attentive Siamese LSTM for Low-Resource Text Plagiarism Detection
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1