
Data Science Journal: Latest Articles

Keeping Track of Samples in Multidisciplinary Fieldwork
Q2 Computer Science Pub Date : 2021-01-01 DOI: 10.5334/dsj-2021-034
P. Ellingsen, L. Ferrighi, Ø. Godøy, T. Gabrielsen
Citations: 1
Interconnecting Systems Using Machine-Actionable Data Management Plans – Hackathon Report
Q2 Computer Science Pub Date : 2021-01-01 DOI: 10.5334/dsj-2021-035
João Cardoso, L. J. Castro, Tomasz Miksa
The common standard for machine-actionable Data Management Plans (DMPs) allows for automatic exchange, integration, and validation of information provided in DMPs. In this paper, we report on the hackathon organised by the Research Data Alliance in which a group of 89 participants from 21 countries worked collaboratively on use cases exploring the utility of the standard in different settings. The work included integration of tools and services, funder templates mapping, and development of new serialisations. This paper summarises the results achieved during the hackathon and provides pointers to further resources.
Citations: 2
Call to Action for Global Access to and Harmonization of Quality Information of Individual Earth Science Datasets
Q2 Computer Science Pub Date : 2020-12-15 DOI: 10.31219/osf.io/nwe5p
G. Peng, R. Downs, C. Lacagnina, H. Ramapriyan, I. Ivánová, D. Moroni, Yaxing Wei, Larnicol Gilles, L. Wyborn, Mitchell Goldberg, J. Schulz, I. Bastrakova, A. Ganske, L. Bastin, S. Khalsa, Mingfang Wu, C. Shie, N. Ritchey, Dave Jones, T. Habermann, C. Lief, Iolanda Maggio, M. Albani, S. Stall, Lihang Zhou, M. Drévillon, Sarah M. Champion, C. Hou, F. Doblas-Reyes, K. Lehnert, E. Robinson, K. Bugbee
Knowledge about the quality of data and metadata is important to support informed decisions on the (re)use of individual datasets and is an essential part of the ecosystem that supports open science. Quality assessments reflect the reliability and usability of data and need to be consistently curated, fully traceable, and adequately documented, as these are crucial for sound decision- and policy-making efforts that rely on data. Quality assessments also need to be consistently represented and readily integrated across systems and tools to allow for improved sharing of information on quality at the dataset level for individual quality attribute or dimension. Although the need for assessing the quality of data and associated information is well recognized, methodologies for an evaluation framework and presentation of resultant quality information to end users may not have been comprehensively addressed within and across disciplines. Global interdisciplinary domain experts have come together to systematically explore needs, challenges and impacts of consistently curating and representing quality information through the entire lifecycle of a dataset. This paper describes the findings, calls for community action to develop practical guidelines, and outlines community recommendations for developing such guidelines. Community practical guidelines will allow for global access and harmonization of quality information at the level of individual Earth science datasets and support open science.
Citations: 6
Synthetic Reproduction and Augmentation of COVID-19 Case Reporting Data by Agent-Based Simulation
Q2 Computer Science Pub Date : 2020-11-10 DOI: 10.1101/2020.11.07.20227462
N. Popper, M. Zechmeister, D. Brunmeir, C. Rippinger, N. Weibrecht, C. Urach, M. Bicher, G. Schneckenreither, A. Rauber
We generate synthetic data documenting COVID-19 cases in Austria by the means of an agent-based simulation model. The model simulates the transmission of the SARS-CoV-2 virus in a statistical replica of the population and reproduces typical patient pathways on an individual basis while simultaneously integrating historical data on the implementation and expiration of population-wide countermeasures. The resulting data semantically and statistically aligns with an official epidemiological case reporting data set and provides an easily accessible, consistent and augmented alternative. Our synthetic data set provides additional insight into the spread of the epidemic by synthesizing information that cannot be recorded in reality.
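As a rough illustration of the idea in this abstract, the following is a minimal toy sketch of an agent-based simulation that emits synthetic case records. It is not the authors' model. The function name `simulate`, all parameter names and default values, the random-mixing contact rule, and the `lockdown_from` switch (a stand-in for a population-wide countermeasure) are assumptions made for this example only. The `infected_by` field shows the kind of information a simulation can record that real case reporting cannot.

```python
import random


def simulate(n_agents=200, n_days=30, contacts_per_day=5, p_transmit=0.05,
             days_infectious=7, lockdown_from=None, seed=0):
    """Toy agent-based epidemic run; returns a list of synthetic case records.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    state = ["S"] * n_agents      # "S" susceptible, "I" infectious, "R" recovered
    days_left = [0] * n_agents    # remaining infectious days per agent

    # Seed the outbreak with a single index case on day 0.
    state[0] = "I"
    days_left[0] = days_infectious
    records = [{"agent": 0, "day": 0, "infected_by": None}]

    for day in range(1, n_days + 1):
        # Hypothetical countermeasure: halve daily contacts from lockdown_from on.
        contacts = contacts_per_day
        if lockdown_from is not None and day >= lockdown_from:
            contacts = contacts_per_day // 2

        infectious = [a for a in range(n_agents) if state[a] == "I"]
        newly_infected = {}       # maps new case -> infecting agent

        # Each infectious agent meets randomly chosen agents.
        for a in infectious:
            for _ in range(contacts):
                b = rng.randrange(n_agents)
                if (state[b] == "S" and b not in newly_infected
                        and rng.random() < p_transmit):
                    newly_infected[b] = a

        # Advance the disease course of agents who were infectious today.
        for a in infectious:
            days_left[a] -= 1
            if days_left[a] == 0:
                state[a] = "R"

        # New cases become infectious from the next day and are logged.
        for b, a in newly_infected.items():
            state[b] = "I"
            days_left[b] = days_infectious
            records.append({"agent": b, "day": day, "infected_by": a})

    return records


if __name__ == "__main__":
    cases = simulate(lockdown_from=15, seed=42)
    print(len(cases), "synthetic case records")
```

Fixing `seed` makes a run reproducible, so the same synthetic data set can be regenerated on demand. A realistic model would replace random mixing with a structured population and calibrate the parameters against reported case data.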
Citations: 12
Building Open Access to Research (OAR) Data Infrastructure at NIST
Q2 Computer Science Pub Date : 2019-07-08 DOI: 10.5334/dsj-2019-030
Gretchen Greene, R. Plante, R. Hanisch
As a National Metrology Institute (NMI), the USA National Institute of Standards and Technology (NIST) scientists, engineers and technology experts conduct research across a full spectrum of physical science domains. NIST is a non-regulatory agency within the U.S. Department of Commerce with a mission to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life. NIST research results in the production and distribution of standard reference materials, calibration services, and datasets. These are generated from a wide range of complex laboratory instrumentation, expert analyses, and calibration processes. In response to a government open data policy, and in collaboration with the broader research community, NIST has developed a federated Open Access to Research (OAR) scientific data infrastructure aligned with FAIR (Findable, Accessible, Interoperable, Reusable) data principles. Through the OAR initiatives, NIST's Material Measurement Laboratory Office of Data and Informatics (ODI) recently released a new scientific data discovery portal and public data repository. These science-oriented applications provide dissemination and public access for data from across the broad spectrum of NIST research disciplines, including chemistry, biology, materials science (such as crystallography, nanomaterials, etc.), physics, disaster resilience, cyberinfrastructure, communications, forensics, and others. NIST's public data consist of carefully curated Standard Reference Data, legacy high valued data, and new research data publications. The repository is thus evolving both in content and features as the nature of research progresses. Implementation of the OAR infrastructure is key to NIST's role in sharing high integrity reproducible research for measurement science in a rapidly changing world.
Citations: 3
Developing a Model Guidelines Addressing Legal Impediments to Open Access to Publicly Funded Research Data in Malaysia
Q2 Computer Science Pub Date : 2019-01-01 DOI: 10.5334/dsj-2019-027
Haswira Nor Mohamad Hashim
Citations: 1
Indigenous Data Governance: Strategies from United States Native Nations.
Q2 Computer Science Pub Date : 2019-01-01 DOI: 10.5334/dsj-2019-031
Stephanie Russo Carroll, Desi Rodriguez-Lonebear, Andrew Martinez

Data have become the new global currency, and a powerful force in making decisions and wielding power. As the world engages with open data, big data reuse, and data linkage, what do data-driven futures look like for communities plagued by data inequities? Indigenous data stakeholders and non-Indigenous allies have explored this question over the last three years in a series of meetings through the Research Data Alliance (RDA). Drawing on RDA and other gatherings, and a systematic scan of literature and practice, we consider possible answers to this question in the context of Indigenous peoples vis-à-vis two emerging concepts: Indigenous data sovereignty and Indigenous data governance. Specifically, we focus on the data challenges facing Native nations and the intersection of data, tribal sovereignty, and power. Indigenous data sovereignty is the right of each Native nation to govern the collection, ownership, and application of the tribe's data. Native nations exercise Indigenous data sovereignty through the interrelated processes of Indigenous data governance and decolonizing data. This paper explores the implications of Indigenous data sovereignty and Indigenous data governance for Native nations and others. We argue for the repositioning of authority over Indigenous data back to Indigenous peoples. At the same time, we recognize that there are significant obstacles to rebuilding effective Indigenous data systems and the process will require resources, time, and partnerships among Native nations, other governments, and data agents.

Citations: 86
What Do We Know About the Stewardship Gap.
Q2 Computer Science Pub Date : 2018-01-01 Epub Date: 2018-08-17 DOI: 10.5334/dsj-2018-019
Jeremy York, Myron Gutmann, Francine Berman

In the 21st century, digital data drive innovation and decision-making in nearly every field. However, little is known about the total size, characteristics, and sustainability of these data. In the scholarly sphere, it is widely suspected that there is a gap between the amount of valuable digital data that is produced and the amount that is effectively stewarded and made accessible. The Stewardship Gap Project (http://bit.ly/stewardshipgap) investigates characteristics of, and measures, the stewardship gap for sponsored scholarly activity in the United States. This paper presents a preliminary definition of the stewardship gap based on a review of relevant literature and investigates areas of the stewardship gap for which metrics have been developed and measurements made, and where work to measure the stewardship gap is yet to be done. The main findings presented are 1) there is not one stewardship gap but rather multiple "gaps" that contribute to whether data is responsibly stewarded; 2) there are relationships between the gaps that can be used to guide strategies for addressing the various stewardship gaps; and 3) there are imbalances in the types and depths of studies that have been conducted to measure the stewardship gap.

Citations: 0
A Conceptual Enterprise Framework for Managing Scientific Data Stewardship.
Q2 Computer Science Pub Date : 2018-01-01 Epub Date: 2018-06-28 DOI: 10.5334/dsj-2018-015
Ge Peng, Jeffrey L Privette, Curt Tilmes, Sky Bristol, Tom Maycock, John J Bates, Scott Hausman, Otis Brown, Edward J Kearns

Scientific data stewardship is an important part of long-term preservation and the use/reuse of digital research data. It is critical for ensuring trustworthiness of data, products, and services, which is important for decision-making. Recent U.S. federal government directives and scientific organization guidelines have levied specific requirements, increasing the need for a more formal approach to ensuring that stewardship activities support compliance verification and reporting. However, many science data centers lack an integrated, systematic, and holistic framework to support such efforts. The current business- and process-oriented stewardship frameworks are too costly and lengthy for most data centers to implement. They often do not explicitly address the federal stewardship requirements and/or the uniqueness of geospatial data. This work proposes a data-centric conceptual enterprise framework for managing stewardship activities, based on the philosophy behind the Plan-Do-Check-Act (PDCA) cycle, a proven industrial concept. This framework, which includes the application of maturity assessment models, allows for quantitative evaluation of how organizations manage their stewardship activities and supports informed decision-making for continual improvement towards full compliance with federal, agency, and user requirements.

Citations: 15
NASA EOSDIS Data Identifiers: Approach and System
Q2 Computer Science Pub Date : 2017-04-04 DOI: 10.5334/dsj-2017-015
L. Wanchoo, N. James, H. Ramapriyan
NASA's Earth Science Data and Information System (ESDIS) Project began investigating the use of Digital Object Identifiers (DOIs) in 2010 with the goal of assigning DOIs to various data products. These Earth science research data products produced using Earth observations and models are archived and distributed by twelve Distributed Active Archive Centers (DAACs) located across the United States. Each data center serves a different Earth science discipline user community and, accordingly, has a unique approach and process for generating and archiving a variety of data products. These varied approaches present a challenge for developing a DOI solution. To address this challenge, the ESDIS Project has developed processes, guidelines, and several models for creating and assigning DOIs. Initially the DOI assignment and registration process was started as a prototype but now it is fully operational. In February 2012, the ESDIS Project started using the California Digital Library (CDL) EZID for registering DOIs. The DOI assignments were initially labor-intensive. The system is now automated, and the assignments are progressing rapidly. As of February 28, 2017, over 50% of the data products at the DAACs had been assigned DOIs. Citations using the DOIs increased from about 100 to over 370 between 2015 and 2016.
Citations: 3