Keeping Track of Samples in Multidisciplinary Fieldwork
P. Ellingsen, L. Ferrighi, Ø. Godøy, T. Gabrielsen
Data Science Journal, 2021-01-01. DOI: 10.5334/dsj-2021-034
Interconnecting Systems Using Machine-Actionable Data Management Plans – Hackathon Report
João Cardoso, L. J. Castro, Tomasz Miksa
Data Science Journal, 2021-01-01. DOI: 10.5334/dsj-2021-035

The common standard for machine-actionable Data Management Plans (DMPs) allows for the automatic exchange, integration, and validation of the information provided in DMPs. In this paper, we report on the hackathon organised by the Research Data Alliance, in which a group of 89 participants from 21 countries worked collaboratively on use cases exploring the utility of the standard in different settings. The work included the integration of tools and services, the mapping of funder templates, and the development of new serialisations. This paper summarises the results achieved during the hackathon and provides pointers to further resources.
Call to Action for Global Access to and Harmonization of Quality Information of Individual Earth Science Datasets
G. Peng, R. Downs, C. Lacagnina, H. Ramapriyan, I. Ivánová, D. Moroni, Yaxing Wei, Larnicol Gilles, L. Wyborn, Mitchell Goldberg, J. Schulz, I. Bastrakova, A. Ganske, L. Bastin, S. Khalsa, Mingfang Wu, C. Shie, N. Ritchey, Dave Jones, T. Habermann, C. Lief, Iolanda Maggio, M. Albani, S. Stall, Lihang Zhou, M. Drévillon, Sarah M. Champion, C. Hou, F. Doblas-Reyes, K. Lehnert, E. Robinson, K. Bugbee
Data Science Journal, 2020-12-15. DOI: 10.31219/osf.io/nwe5p

Knowledge about the quality of data and metadata is important to support informed decisions on the (re)use of individual datasets and is an essential part of the ecosystem that supports open science. Quality assessments reflect the reliability and usability of data and need to be consistently curated, fully traceable, and adequately documented, as these are crucial for sound decision- and policy-making efforts that rely on data. Quality assessments also need to be consistently represented and readily integrated across systems and tools to allow for improved sharing of quality information at the dataset level for individual quality attributes or dimensions. Although the need for assessing the quality of data and associated information is well recognized, methodologies for an evaluation framework and for presenting the resulting quality information to end users may not have been comprehensively addressed within and across disciplines. Global interdisciplinary domain experts have come together to systematically explore the needs, challenges, and impacts of consistently curating and representing quality information through the entire lifecycle of a dataset. This paper describes the findings, calls for community action to develop practical guidelines, and outlines community recommendations for developing such guidelines. Community practical guidelines will allow for global access and harmonization of quality information at the level of individual Earth science datasets and support open science.
Synthetic Reproduction and Augmentation of COVID-19 Case Reporting Data by Agent-Based Simulation
N. Popper, M. Zechmeister, D. Brunmeir, C. Rippinger, N. Weibrecht, C. Urach, M. Bicher, G. Schneckenreither, A. Rauber
Data Science Journal, 2020-11-10. DOI: 10.1101/2020.11.07.20227462

We generate synthetic data documenting COVID-19 cases in Austria by means of an agent-based simulation model. The model simulates the transmission of the SARS-CoV-2 virus in a statistical replica of the population and reproduces typical patient pathways on an individual basis, while simultaneously integrating historical data on the implementation and expiration of population-wide countermeasures. The resulting data align semantically and statistically with an official epidemiological case reporting data set and provide an easily accessible, consistent, and augmented alternative. Our synthetic data set provides additional insight into the spread of the epidemic by synthesizing information that cannot be recorded in reality.
Building Open Access to Research (OAR) Data Infrastructure at NIST
Gretchen Greene, R. Plante, R. Hanisch
Data Science Journal, 2019-07-08. DOI: 10.5334/dsj-2019-030

At the USA's National Metrology Institute (NMI), the National Institute of Standards and Technology (NIST), scientists, engineers, and technology experts conduct research across a full spectrum of physical science domains. NIST is a non-regulatory agency within the U.S. Department of Commerce with a mission to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life. NIST research results in the production and distribution of standard reference materials, calibration services, and datasets. These are generated from a wide range of complex laboratory instrumentation, expert analyses, and calibration processes. In response to a government open data policy, and in collaboration with the broader research community, NIST has developed a federated Open Access to Research (OAR) scientific data infrastructure aligned with the FAIR (Findable, Accessible, Interoperable, Reusable) data principles. Through the OAR initiatives, NIST's Material Measurement Laboratory Office of Data and Informatics (ODI) recently released a new scientific data discovery portal and public data repository. These science-oriented applications provide dissemination and public access for data from across the broad spectrum of NIST research disciplines, including chemistry, biology, materials science (such as crystallography and nanomaterials), physics, disaster resilience, cyberinfrastructure, communications, forensics, and others. NIST's public data consist of carefully curated Standard Reference Data, legacy high-value data, and new research data publications. The repository is thus evolving both in content and features as the nature of research progresses. Implementation of the OAR infrastructure is key to NIST's role in sharing high-integrity, reproducible research for measurement science in a rapidly changing world.
{"title":"Developing a Model Guidelines Addressing Legal Impediments to Open Access to Publicly Funded Research Data in Malaysia","authors":"Haswira Nor Mohamad Hashim","doi":"10.5334/dsj-2019-027","DOIUrl":"https://doi.org/10.5334/dsj-2019-027","url":null,"abstract":"","PeriodicalId":35375,"journal":{"name":"Data Science Journal","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71067873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Indigenous Data Governance: Strategies from United States Native Nations
Stephanie Russo Carroll, Desi Rodriguez-Lonebear, Andrew Martinez
Data Science Journal, 2019-01-01. DOI: 10.5334/dsj-2019-031
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8580324/pdf/

Data have become the new global currency, and a powerful force in making decisions and wielding power. As the world engages with open data, big data reuse, and data linkage, what do data-driven futures look like for communities plagued by data inequities? Indigenous data stakeholders and non-Indigenous allies have explored this question over the last three years in a series of meetings through the Research Data Alliance (RDA). Drawing on RDA and other gatherings, and a systematic scan of literature and practice, we consider possible answers to this question in the context of Indigenous peoples vis-à-vis two emerging concepts: Indigenous data sovereignty and Indigenous data governance. Specifically, we focus on the data challenges facing Native nations and the intersection of data, tribal sovereignty, and power. Indigenous data sovereignty is the right of each Native nation to govern the collection, ownership, and application of the tribe's data. Native nations exercise Indigenous data sovereignty through the interrelated processes of Indigenous data governance and decolonizing data. This paper explores the implications of Indigenous data sovereignty and Indigenous data governance for Native nations and others. We argue for the repositioning of authority over Indigenous data back to Indigenous peoples. At the same time, we recognize that there are significant obstacles to rebuilding effective Indigenous data systems, and that the process will require resources, time, and partnerships among Native nations, other governments, and data agents.
What Do We Know About the Stewardship Gap
Jeremy York, Myron Gutmann, Francine Berman
Data Science Journal, 2018-01-01 (epub 2018-08-17). DOI: 10.5334/dsj-2018-019
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450659/pdf/nihms-1010966.pdf

In the 21st century, digital data drive innovation and decision-making in nearly every field. However, little is known about the total size, characteristics, and sustainability of these data. In the scholarly sphere, it is widely suspected that there is a gap between the amount of valuable digital data that is produced and the amount that is effectively stewarded and made accessible. The Stewardship Gap Project (http://bit.ly/stewardshipgap) investigates characteristics of, and measures, the stewardship gap for sponsored scholarly activity in the United States. This paper presents a preliminary definition of the stewardship gap based on a review of relevant literature and investigates areas of the stewardship gap for which metrics have been developed and measurements made, and where work to measure the stewardship gap is yet to be done. The main findings presented are 1) there is not one stewardship gap but rather multiple "gaps" that contribute to whether data is responsibly stewarded; 2) there are relationships between the gaps that can be used to guide strategies for addressing the various stewardship gaps; and 3) there are imbalances in the types and depths of studies that have been conducted to measure the stewardship gap.
A Conceptual Enterprise Framework for Managing Scientific Data Stewardship
Ge Peng, Jeffrey L Privette, Curt Tilmes, Sky Bristol, Tom Maycock, John J Bates, Scott Hausman, Otis Brown, Edward J Kearns
Data Science Journal, 2018-01-01 (epub 2018-06-28). DOI: 10.5334/dsj-2018-015
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7580807/pdf/

Scientific data stewardship is an important part of long-term preservation and the use/reuse of digital research data. It is critical for ensuring the trustworthiness of data, products, and services, which is important for decision-making. Recent U.S. federal government directives and scientific organization guidelines have levied specific requirements, increasing the need for a more formal approach to ensuring that stewardship activities support compliance verification and reporting. However, many science data centers lack an integrated, systematic, and holistic framework to support such efforts. The current business- and process-oriented stewardship frameworks are too costly and lengthy for most data centers to implement. They often do not explicitly address the federal stewardship requirements and/or the uniqueness of geospatial data. This work proposes a data-centric conceptual enterprise framework for managing stewardship activities, based on the philosophy behind the Plan-Do-Check-Act (PDCA) cycle, a proven industrial concept. This framework, which includes the application of maturity assessment models, allows for quantitative evaluation of how organizations manage their stewardship activities and supports informed decision-making for continual improvement towards full compliance with federal, agency, and user requirements.
NASA EOSDIS Data Identifiers: Approach and System
L. Wanchoo, N. James, H. Ramapriyan
Data Science Journal, 2017-04-04. DOI: 10.5334/dsj-2017-015

NASA's Earth Science Data and Information System (ESDIS) Project began investigating the use of Digital Object Identifiers (DOIs) in 2010, with the goal of assigning DOIs to various data products. These Earth science research data products, produced using Earth observations and models, are archived and distributed by twelve Distributed Active Archive Centers (DAACs) located across the United States. Each data center serves a different Earth science discipline user community and, accordingly, has a unique approach and process for generating and archiving a variety of data products. These varied approaches present a challenge for developing a DOI solution. To address this challenge, the ESDIS Project has developed processes, guidelines, and several models for creating and assigning DOIs. The DOI assignment and registration process started as a prototype but is now fully operational. In February 2012, the ESDIS Project started using the California Digital Library (CDL) EZID service for registering DOIs. DOI assignments were initially labor-intensive; the system is now automated, and the assignments are progressing rapidly. As of February 28, 2017, over 50% of the data products at the DAACs had been assigned DOIs. Citations using the DOIs increased from about 100 to over 370 between 2015 and 2016.
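Registering a DOI through EZID, as the abstract describes, means sending metadata in EZID's ANVL format: percent-encoded "key: value" lines. The sketch below only prepares such a request body; the element names, endpoint, and escaping rules are recalled from EZID's public API documentation and should be verified against the current docs, and the target URL and DOI are hypothetical.

```python
# Sketch of preparing a DOI registration request in the style of the
# CDL EZID HTTP API (metadata sent as percent-encoded ANVL lines).
# Element names and conventions are recalled from EZID's docs -- verify
# before relying on them. The landing page and DOI below are invented.
def anvl_escape(text):
    """Percent-encode the characters that are structural in ANVL."""
    return (text.replace("%", "%25")
                .replace("\n", "%0A")
                .replace("\r", "%0D")
                .replace(":", "%3A"))

def to_anvl(metadata):
    """Render a metadata dict as an ANVL request body."""
    return "\n".join(f"{anvl_escape(k)}: {anvl_escape(v)}"
                     for k, v in metadata.items())

body = to_anvl({
    "_target": "https://example.nasa.gov/dataset/landing-page",  # hypothetical
    "datacite.title": "Example EOSDIS Data Product",
    "datacite.creator": "Example DAAC",
    "datacite.publisher": "NASA EOSDIS",
    "datacite.publicationyear": "2017",
})
# Per EZID's docs, this body would be sent with an HTTP PUT to
# https://ezid.cdlib.org/id/doi:10.5067/EXAMPLE (HTTP Basic auth,
# Content-Type text/plain) to create the identifier.
print(body)
```

Automating exactly this kind of request construction is what turned the initially labor-intensive assignment process into the operational pipeline the abstract describes.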