Improving Domain Repository Connectivity

IF 1.3 | Zone 3, Computer Science | Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS | Data Intelligence | Pub Date: 2023-03-01 | DOI: 10.1162/dint_a_00120

T. Habermann

Data Intelligence, vol. 5, no. 1, pp. 6-26. Journal Article. https://doi.org/10.1162/dint_a_00120

Citations: 2

Abstract

Domain repositories, i.e., repositories that store, manage, and persist data pertaining to a specific scientific domain, are common and growing in the research landscape. Many of these repositories develop close, long-term communities made up of individuals and organizations that collect, analyze, and publish results based on the data in the repositories. Connections between these datasets, papers, people, and organizations are an important part of the knowledge infrastructure surrounding the repository. All these research objects, people, and organizations can now be identified using various unique and persistent identifiers (PIDs), and domain repositories can build on their existing communities to facilitate and accelerate the identifier adoption process. Because community members contribute to multiple datasets and articles, an identifier found once for a member can be reused many times. We explore this idea by defining a connectivity metric and applying it to datasets collected and papers published by members of the UNAVCO community. Finding identifiers in DataCite and Crossref metadata and spreading those identifiers through the UNAVCO DataCite metadata can increase connectivity from less than 10% to close to 50% for people and organizations.
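
The abstract does not spell out how the connectivity metric is defined, so the following is only a minimal sketch under a loudly stated assumption: connectivity is taken here to be the share of creator entries that carry an identifier, and "spreading" is taken to mean reusing an identifier found for a name on every other entry with the same name. The record shape, the function names `connectivity` and `spread_identifiers`, and the placeholder identifiers are all hypothetical and are not taken from the paper.

```python
from typing import Optional

# Hypothetical record shape: one list per dataset, holding (name, identifier)
# pairs. The identifier is None when no PID is recorded for that entry.
Record = list[tuple[str, Optional[str]]]


def connectivity(records: list[Record]) -> float:
    """Share of creator entries that carry an identifier (assumed definition)."""
    total = sum(len(record) for record in records)
    if total == 0:
        return 0.0
    identified = sum(
        1 for record in records for _, pid in record if pid is not None
    )
    return identified / total


def spread_identifiers(records: list[Record]) -> list[Record]:
    """Copy each known identifier to every entry that has the same name."""
    known: dict[str, str] = {}
    for record in records:
        for name, pid in record:
            if pid is not None:
                known.setdefault(name, pid)  # keep the first identifier seen
    return [
        [(name, pid if pid is not None else known.get(name)) for name, pid in record]
        for record in records
    ]


# Placeholder data: "PID-A" stands in for a real persistent identifier.
records = [
    [("A. Smith", "PID-A"), ("B. Jones", None)],
    [("A. Smith", None), ("B. Jones", None)],
    [("A. Smith", None)],
]

before = connectivity(records)
after = connectivity(spread_identifiers(records))
print(f"before: {before:.0%}, after: {after:.0%}")
```

Matching on exact name strings is a deliberate simplification; real metadata would need name disambiguation before an identifier could safely be reused across records.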


