Journal of escience librarianship: Latest Articles

Computational Reproducibility: A Practical Framework for Data Curators

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1206

Sandra L. Sawchuk, Shahira Khair

Introduction: This paper presents concrete and actionable steps to guide researchers, data curators, and data managers in improving their understanding and practice of computational reproducibility. Objectives: Focusing on incremental progress rather than prescriptive rules, researchers and curators can build their knowledge and skills as the need arises. This paper presents a framework of incremental curation for reproducibility to support open science objectives. Methods: A computational reproducibility framework developed for the Canadian Data Curation Forum serves as the model for this approach. This framework combines learning about reproducibility with recommended steps to improving reproducibility. Conclusion: Computational reproducibility leads to more transparent and accurate research. The authors warn that fear of a crisis and focus on perfection should not prevent curation that may be ‘good enough.’

Citations: 1

Responding to Reality: Evolving Curation Practices and Infrastructure at the University of Illinois at Urbana-Champaign

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1202

Hoa Q. Luong, Colleen Fallaw, Genevieve Schmitt, S. Braxton, Heidi J. Imker

Objective: The Illinois Data Bank provides Illinois researchers with the infrastructure to publish research data publicly. During a five-year review of the Research Data Service at the University of Illinois at Urbana-Champaign, it was recognized as the most useful service offering in the unit. Internal metrics are captured and used to monitor the growth, document curation workflows, and surface technical challenges faced as we assist our researchers. Here we present examples of these curation challenges and the solutions chosen to address them. Methods: Some Illinois Data Bank metrics are collected internally within the system, but most of the curation metrics reported here are tracked separately in a Google spreadsheet. The curator logs required information after curation is complete for each dataset. While the data are sometimes ambiguous (e.g., depending on researcher uptake of suggested actions), our curation data provide a general understanding about our data repository and have been useful in assessing our workflows and services. These metrics also help prioritize development needs for the Illinois Data Bank. Results and Conclusions: The curatorial services polish and improve the datasets, which contributes to the spirit of data reuse. Although we continue to see challenges in our processes, curation makes a positive impact on datasets. Continued development and adaptation of the technical infrastructure allows for an ever-better experience for the curators and users. These improvements have helped our repository more effectively support the data sharing process by successfully fostering depositor engagement with curators to improve datasets and facilitating easy transfer of very large files.

Citations: 2

Data Curation in Practice: Extract Tabular Data from PDF Files Using a Data Analytics Tool

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1209

A. J. Choi, Xuying Xin

Data curation is the process of managing data to make it available for reuse and preservation and to allow FAIR (findable, accessible, interoperable, reusable) uses. It is an important part of the research lifecycle, as researchers are often either required by funders or generally encouraged to preserve their datasets and make them discoverable and reusable. This has become especially important as Open Access (OA) policies are implemented at many institutions across the nation. An efficient data repository and its data curation play key roles in facilitating research data discovery and easing reuse. In this article, we briefly discuss the local institutional repository at Penn State University and the general data curation practices we adopt for deposited files and datasets; we then focus on a data analytics tool that has recently been applied to extract tabular data from PDF files. This enhances existing data curation practices by adding tabular data to deposits containing PDF files, where tables are often embedded and not easily reused.

Citations: 0

Plain Text & Character Encoding: A Primer for Data Curators

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1211

S. Erickson

Plain text data consists of a sequence of encoded characters or “code points” from a given standard such as the Unicode Standard. Some of the most common file formats for digital data used in eScience (CSV, XML, and JSON, for example) are built atop plain text standards. Plain text representations of digital data are often preferred because plain text formats are relatively stable, and they facilitate reuse and interoperability. Despite its ubiquity, plain text is not as plain as it may seem. The set of standards used in modern text encoding (principally, the Unicode Character Set and the related encoding format, UTF-8) has a complex architecture when compared to historical standards like ASCII. Further, while the Unicode standard has gained in prominence, text encoding problems are not uncommon in research data curation. This primer provides conceptual foundations for modern text encoding and guidance for common curation and preservation actions related to textual data.
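
The distinction the abstract draws between code points and their encoded form can be illustrated with a minimal Python sketch. It is not taken from the primer itself: the helper name `describe` and the sample string are hypothetical, chosen only to show that one character maps to one Unicode code point but to a variable number of UTF-8 bytes, and that decoding those bytes with the wrong encoding yields the garbled text curators often encounter.

```python
def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, UTF-8 bytes as hex) for each character.

    Hypothetical helper for illustration; not part of the primer.
    """
    rows = []
    for ch in text:
        code_point = f"U+{ord(ch):04X}"        # abstract Unicode code point
        utf8_hex = ch.encode("utf-8").hex(" ")  # concrete bytes stored on disk
        rows.append((ch, code_point, utf8_hex))
    return rows


# Sample characters (an assumption): ASCII letter, accented letter, euro sign.
sample = "A\u00e9\u20ac"
for ch, code_point, utf8_hex in describe(sample):
    print(ch, code_point, utf8_hex)

# Decoding UTF-8 bytes as Latin-1 is one common source of garbled text:
# the two bytes of U+00E9 are read as two separate Latin-1 characters.
mojibake = "\u00e9".encode("utf-8").decode("latin-1")
print(repr(mojibake))
```

In this sketch the ASCII letter occupies one byte, the accented letter two, and the euro sign three, which is the kind of variable-width behavior that separates UTF-8 from a fixed single-byte standard such as ASCII.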

Citations: 2

Introduction to the Special JeSLIB Issue on Data Curation in Practice

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1222

Cynthia Hudson Vitale, J. Carlson, H. Hadley, L. Johnston

Research data curation is a set of scientific communication processes and activities that support the ethical reuse of research data and uphold research integrity. Data curators act as key collaborators with researchers to enrich the scholarly value and potential impact of their data through preparing it to be shared with others and preserved for the long term. This special issue focuses on practical data curation workflows and tools that have been developed and implemented within data repositories, scholarly societies, research projects, and academic institutions.

Citations: 0

Active Curation of Large Longitudinal Surveys: A Case Study

Journal of escience librarianship

Pub Date : 2021-08-11 DOI: 10.7191/jeslib.2021.1210

Inna Kouper, Karen L. Tucker, Kevin Tharp, Mary Ellen van Booven, Ashley Clark

In this paper we take an in-depth look at the curation of a large longitudinal survey and activities and procedures involved in moving the data from its generation to the state that is needed to conduct scientific analysis. Using a case study approach, we describe how large surveys generate a range of data assets that require many decisions well before the data is considered for analysis and publication. We use the notion of active curation to describe activities and decisions about the data objects that are “live,” i.e., when they are still being collected and processed for the later stages of the data lifecycle. Our efforts illustrate a gap in the existing discussions on curation. On one hand, there is an acknowledged need for active or upstream curation as an engagement of curators close to the point of data creation. On the other hand, the recommendations on how to do that are scattered across multiple domain-oriented data efforts. In describing the complexities of active curation of survey data and providing general recommendations we aim to draw attention to the practices of active curation, stimulate the development of interoperable tools, standards, and techniques needed at the initial stages of research projects, and encourage collaborations between libraries and other academic units.

Citations: 1

Touring a Data Curation Network Primer: A Focus on Neuroimaging Data

Journal of escience librarianship

Pub Date : 2021-08-01 DOI: 10.7191/jeslib.2021.1204

S. Samuel, Michael Moore, Helenmary Sheridan, Chris Sorensen, Brandon Patterson

This video article provides an introduction to a data primer which leads data curators through the process of preparing a neuroimaging dataset for submission into a repository. A team of health sciences librarians and informationists created the primer which is focused on data from functional magnetic resonance images that are saved in either DICOM or NIfTI formats. The video walks through a flowchart discussing the process of preparing data sets to be deposited into a repository, key curatorial questions to ask for data that is highly sensitive, and how to suggest edits to this and other primers. The primer grew out of a data curation workshop hosted by the Data Curation Network.

Citations: 1

Not Forgetting – 80s Style

Journal of escience librarianship

Pub Date : 2021-07-30 DOI: 10.7191/jeslib.2021.1223

R. Raboin

Keeping in mind the work done by data librarians is key to understanding the importance of providing open and free access to data. Standards such as persistent identifiers (PIDs) were created to provide long-lasting access to all types of digital materials and resources. Providing new ways to inform and instruct researchers and other users on the importance of making data available for sharing, reproducibility, and re-use helps in driving good and effective social policy for researchers.

Citations: 0

(Hyper)active Data Curation: A Video Case Study from Behavioral Science.

Journal of escience librarianship

Pub Date : 2021-05-26 DOI: 10.31234/OSF.IO/89RCB

Kasey C. Soska, Melody Xu, Sandy L. Gonzalez, Orit Hertzberg, Catherine S Tamis-LeMonda, R. Gilmore, K. Adolph

Video data are uniquely suited for research reuse and for documenting research methods and findings. However, curation of video data is a serious hurdle for researchers in the social and behavioral sciences, where behavioral video data are obtained session by session and data sharing is not the norm. To eliminate the onerous burden of post hoc curation at the time of publication (or later), we describe best practices in active data curation, where data are curated and uploaded immediately after each data collection to allow instantaneous sharing with one button press at any time. Indeed, we recommend that researchers adopt "hyperactive" data curation where they openly share every step of their research process. The necessary infrastructure and tools are provided by Databrary, a secure, web-based data library designed for active curation and sharing of personally identifiable video data and associated metadata. We provide a case study of hyperactive curation of video data from the Play and Learning Across a Year (PLAY) project, where dozens of researchers developed a common protocol to collect, annotate, and actively curate video data of infants and mothers during natural activity in their homes at research sites across North America. PLAY relies on scalable standardized workflows to facilitate collaborative research, assure data quality, and prepare the corpus for sharing and reuse throughout the entire research process.

Citations: 4

Introducing Reproducibility to Citation Analysis: a Case Study in the Earth Sciences

Journal of escience librarianship

Pub Date : 2021-05-13 DOI: 10.7191/JESLIB.2021.1194

S. Teplitzky, Wynn Tranfield, Mea Warren, Philip White

Objectives: Replicate methods from a 2019 study of Earth Science researcher citation practices. Calculate programmatically whether researchers in Earth Science rely on a smaller subset of literature than estimated by the 80/20 rule. Determine whether these reproducible citation analysis methods can be used to analyze open access uptake. Methods: Replicated methods of a prior citation study provide an updated transparent, reproducible citation analysis protocol that can be replicated with Jupyter Notebooks. Results: This study replicated the prior citation study’s conclusions, and also adapted the author’s methods to analyze the citation practices of Earth Scientists at four institutions. We found that 80% of the citations could be accounted for by only 7.88% of journals, a key metric to help identify a core collection of titles in this discipline. We then demonstrated programmatically that 36% of these cited references were available as open access. Conclusions: Jupyter Notebooks are a viable platform for disseminating replicable processes for citation analysis. A completely open methodology is emerging and we consider this a step forward. Adherence to the 80/20 rule aligned with institutional research output, but citation preferences are evident. Reproducible citation analysis methods may be used to analyze open access uptake; however, results are inconclusive. It is difficult to determine whether an article was open access at the time of citation, or became open access after an embargo.
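
The kind of concentration figure reported in the Results (the share of journals needed to account for 80% of citations) can be sketched in a few lines of Python. This is an illustrative assumption about how such a metric might be computed, not the authors' notebook code: the function name `share_of_journals_for_coverage` and the toy journal labels are hypothetical, and no data from the study is reproduced.

```python
from collections import Counter


def share_of_journals_for_coverage(cited_journals, coverage=0.8):
    """Return the fraction of distinct journals needed to reach `coverage`
    of all citations, counting the most-cited journals first.

    Hypothetical helper for illustration; not the study's own method.
    """
    counts = Counter(cited_journals)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no citations supplied")

    running = 0  # citations covered so far
    needed = 0   # journals used so far
    for _, n in counts.most_common():
        running += n
        needed += 1
        if running >= coverage * total:
            break
    return needed / len(counts)


# Toy input (made up): one entry per cited reference, labeled by journal.
toy_citations = ["A"] * 6 + ["B"] * 3 + ["C"]
print(share_of_journals_for_coverage(toy_citations))
```

With the toy input, two of the three journals are needed to cover 80% of the ten citations. A result well below 0.2 on real data would indicate heavier concentration than the 80/20 rule predicts.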

Citations: 2
