Using R to develop a corpus of full-text journal articles

Journal of Information Science | Pub Date: 2023-07-14 | DOI: 10.1177/01655515231171362 | IF 1.8 | Zone 4 (Management) | JCR Q3 (Computer Science, Information Systems)
Billie Anderson, M. Bani-Yaghoub, Vagmi Kantheti, Scott Curtis

Abstract

Over the past two decades, databases and simple tools for accessing them have become increasingly available, allowing historical and modern-day topics to be studied together. During the recent COVID-19 pandemic, for example, many researchers have reflected on whether lessons learned from the Spanish flu pandemic of 1918 could have been helpful in the present pandemic. Studies that use text-mining applications rarely use full-text journal articles. This article presents a methodology for developing a full-text journal article corpus with the R fulltext package. Using the proposed methodology, 2743 full-text journal articles were obtained. The aim of this article is to provide a methodology and supplementary code that researchers can use to curate a full-text journal corpus with the R fulltext package.
Source journal: Journal of Information Science (Engineering and Technology - Computer Science: Information Systems)
CiteScore: 6.80 | Self-citation rate: 8.30% | Articles published: 121 | Review time: 4 months
About the journal: The Journal of Information Science is a peer-reviewed international journal of high repute covering topics of interest to all those researching and working in the sciences of information and knowledge management. The Editors welcome material on any aspect of information science theory, policy, application or practice that will advance thinking in the field.