An Intelligent Prediction of the Next Highly Cited Paper Using Machine Learning

Pub Date: 2023-04-17 DOI: 10.5530/jscires.12.1.008
Galal M. Bin Makhashen, Hamdi A. Al-Jamimi
Citations: 0

Abstract

Highly cited articles draw the attention of leading contributors in the research community, who treat them as an opportunity to improve knowledge, as a source of ideas or solutions, and as a way to advance their own research. Typically, these articles are authored by a large number of scientists working in international collaboration. However, this cannot be the only reason an article becomes highly cited; several other characteristics may make an article more attractive to researchers and readers. In other words, some additional characteristics help articles appear more readily than others in search engines or grab readers' attention. In this study, we trained several machine-learning methods on a set of article and journal characteristics, including author count, title characteristics, abstract length, international collaboration, number of keywords, and funding information, among others. We extracted 20 characteristics and developed multiple machine-learning models to automatically distinguish highly cited papers (HCPs) from regular papers. In experiments with an ensemble machine-learning algorithm, 97% recognition accuracy was achieved. Other algorithms, including a deep-learning method using LSTMs, also achieved high recognition accuracy. Such high performance could support a promising HCP auto-detection system in the future.
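As a rough illustration of the feature-extraction step described in the abstract, the sketch below turns article metadata into a numeric feature vector. The abstract does not define the 20 characteristics precisely, so the `Article` fields, the feature names, and the example record are all hypothetical; the sketch covers only a handful of the listed characteristics and does not reproduce the authors' models.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Article:
    """Hypothetical metadata record; field names are assumptions."""
    title: str
    abstract: str
    authors: List[str]
    countries: List[str]  # assumed: one affiliation country per author
    keywords: List[str] = field(default_factory=list)
    funded: bool = False


def extract_features(article: Article) -> Dict[str, float]:
    """Map article metadata to numeric features of the kind the abstract lists."""
    return {
        "authors_count": float(len(article.authors)),
        "title_word_count": float(len(article.title.split())),
        "title_has_colon": float(":" in article.title),
        "abstract_word_count": float(len(article.abstract.split())),
        # Treated as international when more than one distinct country appears.
        "international_collaboration": float(len(set(article.countries)) > 1),
        "keywords_count": float(len(article.keywords)),
        "has_funding": float(article.funded),
    }


# Made-up record used purely for illustration.
example = Article(
    title="A Toy Title: For Illustration",
    abstract="Short placeholder abstract text.",
    authors=["A. Author", "B. Author", "C. Author"],
    countries=["X", "Y", "X"],
    keywords=["citations", "prediction"],
    funded=True,
)

features = extract_features(example)
for name, value in features.items():
    print(f"{name}: {value}")
```

Vectors of this kind could then be passed to any classifier, such as an ensemble method, to separate highly cited papers from regular ones.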