Google Scholar's ranking algorithm: The impact of citation counts (An empirical study)
Authors: J. Beel, Bela Gipp
Published in: 2009 Third International Conference on Research Challenges in Information Science (2009-04-22)
DOI: 10.1109/RCIS.2009.5089308 (https://doi.org/10.1109/RCIS.2009.5089308)
Citations: 92
Abstract
Google Scholar is one of the major academic search engines, but its ranking algorithm for academic articles is unknown. In a recent study, we partly reverse-engineered the algorithm. This paper presents the results of our second study. While the previous study provided a broad overview, the current study focuses on the correlation between an article's citation count and its ranking in Google Scholar. For this study, the citation counts and rankings of 1,364,757 articles were analyzed. Some results of our first study were confirmed: citation count is the most heavily weighted factor in Google Scholar's ranking algorithm. Highly cited articles appear significantly more often in higher positions than articles that are cited less often. Therefore, Google Scholar seems more suitable for finding standard literature than for finding gems or articles by authors advancing a view different from the mainstream. However, some search queries produced interesting exceptions. In some cases no correlation existed; in others, bizarre patterns were recognizable, suggesting that citation counts sometimes have no impact at all on articles' rankings.
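The abstract does not say which statistic was used to relate citation counts to result positions. As a minimal sketch, assuming a Spearman rank correlation is a reasonable choice for this kind of per-query comparison, the check could look like the following. The function names (`average_ranks`, `pearson`, `spearman`) and the sample values are hypothetical illustrations, not the authors' method or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has no variation."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Strictly undefined; treated here as "no correlation" by assumption.
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(positions, citations):
    """Spearman correlation = Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(positions), average_ranks(citations))


# Synthetic values for one hypothetical query (not taken from the study):
# result position 1 is the top hit, paired with that article's citation count.
positions = [1, 2, 3, 4, 5]
citations = [900, 450, 120, 30, 4]
rho = spearman(positions, citations)
```

Under this convention, a value of `rho` near -1 means highly cited articles sit in the top positions, while a value near 0 corresponds to the queries where citation counts appear to have no effect on ranking.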