Improving Domain Repository Connectivity
T. Habermann
Data Intelligence, vol. 5, no. 1, pp. 6-26 (Journal Article). Published 2023-03-01.
DOI: https://doi.org/10.1162/dint_a_00120
Citations: 2
Abstract
Domain repositories, i.e., repositories that store, manage, and persist data pertaining to a specific scientific domain, are common and growing in number across the research landscape. Many of these repositories develop close, long-term communities of individuals and organizations that collect, analyze, and publish results based on the data in the repositories. Connections between these datasets, papers, people, and organizations are an important part of the knowledge infrastructure surrounding the repository. All of these research objects, people, and organizations can now be identified using unique, persistent identifiers (PIDs), and domain repositories can build on their existing communities to facilitate and accelerate identifier adoption. Because community members contribute to multiple datasets and articles, an identifier, once found, can be reused many times. We explore this idea by defining a connectivity metric and applying it to datasets collected and papers published by members of the UNAVCO community. Finding identifiers in DataCite and Crossref metadata and propagating them through the UNAVCO DataCite metadata can increase connectivity from less than 10% to close to 50% for people and organizations.