Prior Steps into Knowledge Mapping: Text Mining Application and Comparison

Faizhal Arif Santosa
Issues in Science and Technology Librarianship. Journal Article. Published 2023-04-03. DOI: 10.29173/istl2736 (https://doi.org/10.29173/istl2736)
Citation count: 1

Abstract

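Bibliometrics is increasingly used by the knowledge community and librarians to analyze patterns in knowledge. In practice, data from databases that provide bibliometric information are not always clean, so pre-processing is required. Several previous studies show that bibliometric analysis begins with a simple pre-processing step. The goal of this research is to use text mining during pre-processing to find the base terms of the keywords that appear, essentially constructing a controlled vocabulary for a bibliographic dataset. The method is to clean keywords by stemming, using RapidMiner software. Bibliometrix was used to compare the results. A total of 85 keywords were merged into base terms. With the process built here, the study finds that the network built from raw data differs from the one built from pre-processed data, which leads to differences in the resulting analysis. The process can also be reused in a variety of real-world situations.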
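The keyword-cleaning idea in the abstract can be illustrated with a small sketch. This is not the RapidMiner process used in the study: the functions `naive_stem`, `base_term`, and `group_keywords` are hypothetical names, and the suffix rules are a deliberately crude stand-in for a real stemmer. It only shows how keyword variants might collapse onto one base term before a network is built.

```python
from collections import defaultdict


def naive_stem(word: str) -> str:
    """Strip a few plural endings (an assumption, not a real stemming algorithm)."""
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"
    if len(word) > 3 and word.endswith("s") and not word.endswith(("ss", "is", "us")):
        return word[:-1]
    return word


def base_term(keyword: str) -> str:
    """Lowercase a keyword and stem each of its words."""
    return " ".join(naive_stem(w) for w in keyword.lower().split())


def group_keywords(keywords):
    """Map each base term to the sorted list of raw variants that reduce to it."""
    groups = defaultdict(set)
    for kw in keywords:
        groups[base_term(kw)].add(kw)
    return {base: sorted(variants) for base, variants in groups.items()}


if __name__ == "__main__":
    raw = ["Libraries", "library", "Text Mining", "text mining", "Bibliometrics"]
    for base, variants in group_keywords(raw).items():
        print(base, "<-", variants)
```

Each base term then acts as one entry of a controlled vocabulary, so a co-occurrence network built on base terms has fewer, more heavily connected nodes than one built on the raw keywords.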
Source journal: Issues in Science and Technology Librarianship (Social Sciences, Library and Information Sciences)
CiteScore: 1.00
Self-citation rate: 0.00%
Articles published: 19