Expert-guided evaluation of medical research may promote publishing low-quality studies and increase research waste: A comparative analysis of Journal Impact Factor and Polish expert-based journal ranking list
Albert Stachura, Łukasz Banaszek, Paweł K. Włodarski
{"title":"Expert-guided evaluation of medical research may promote publishing low-quality studies and increase research waste: A comparative analysis of Journal Impact Factor and Polish expert-based journal ranking list","authors":"Albert Stachura, Łukasz Banaszek, Paweł K. Włodarski","doi":"10.1111/jebm.12615","DOIUrl":null,"url":null,"abstract":"<p>An ever-growing amount of medical literature has created a need for evaluating scientific merit. The Journal Impact Factor (JIF) is a metric relying on citation count and may indicate the prestige of a scientific journal.<span><sup>1</sup></span> A study quality is often assessed based on a publication venue—hence JIF may be indirectly used to evaluate research. Such an approach is not flawless. Using JIF as a surrogate of a journal's quality has been widely criticized.<span><sup>2</sup></span> In the Leiden Manifesto Hicks et al. advocated for putting more emphasis on qualitative assessment, transparency, robust locally relevant research, and accounting for variation by field of study.<span><sup>3</sup></span></p><p>Despite these criticisms, JIF is still used to assess scientific output.<span><sup>4</sup></span> In the United States and Canada, journal ranking is based on indicators such as JIF, CiteScore, SCImago Journal Rank, or Hirsh index. In some countries, journal rankings have been created to assess the research performance of scientists and institutions. 
Two main approaches are prevalent: (<span>1</span>) based solely on metrics or (<span>2</span>) determined by experts who may (or may not) take such metrics into consideration.<span><sup>5</sup></span> The first model is used, for example, in Turkey (the TÜBİTAK Incentive Program for International Scientific Publications) or China (Chinese Academy of Sciences Journal Ranking List), the second one, for example, in Finland (the Publication Forum Journal list), Norway (the Norwegian Register for Scientific Journals, Series and Publishers), Italy (the Ratings of scientific and class A journals), Denmark (the BFI List of Series), and Poland (Polish Journal Ranking).<span><sup>6</sup></span> Though both models rely to some degree on JIF, the latter is more subjective and likely to be shaped by the national science policy objectives. This significantly increases the risk of politicization, which might lead to adjusting the assigned journal rank to own professional goals of experts involved in producing rankings, potentially creating a conflict of interests.<span><sup>5</sup></span></p><p>Funding, grants, and scholarships are awarded to scientists publishing in top journals from the national ranking lists. In Poland, the evaluation system is based on points awarded by the Ministry of Education and Science (MEiN—<i>pol. Ministerstwo Edukacji i Nauki</i>). The latest edition was released on January 5, 2024, more than 1 year after the 2022 Journal Citation Report had been announced (June 2022).<span><sup>7</sup></span> Since JIF is an imperfect surrogate of journal quality, supplementing assessment systems with expert opinion may potentially help promote good research. The objective of this study is to compare the MEiN ranking system with JIF and discuss the consequences of potential discrepancies between the two models.</p><p>A total of 5326 journals appeared both in JCR Clinical Medicine category and on the MEiN ranking list (Medical sciences category). 
Additionally, 582 (10%) were considered in JCR but not included within the MEiN Medical Sciences category (Tables S1 and S2). Some of the omitted titles had a JIF of over 17. Across ranks, minimal JIFs were low (0–2.2) and variations of JIF values were considerable. In extreme cases, journals with Impact Factor of over 30 were assigned 20 MEiN points, while those with JIF of 2.2 were included in the 200 MEiN points group. The number of journals included within each subsequent rank was not decreasing, as one would expect, but was irregular. More journals were assigned 70 or 100 points as compared to only 40. Ranks 140 and 200 were the most elite comprising a total of 690 journals. Additionally, 2219 journals were assigned MEiN rank but were not listed in JCR Clinical Medicine category. Of them, 1092 (49.2%) were assigned 20 points, 353 (15.9%)—40 points, 355 (16%)—70 points, 243 (10.9%)—100 points, 110 (4.9%)—140 points, and 66 (3%)—200 points. JIF of all journals with Impact Factor lower than 40 was plotted against MEiN ranking and presented in Figure 1.</p><p>Within each JCR Clinical Medicine category, we ranked journals from 1 to <i>n</i> (number of journals within a category) based on their JIF. Therefore, a journal with the highest JIF was assigned a rank of 1, second best a rank of 2, and so on. Later we correlated said ranking within each category with MEiN scores (Tables S3 and S4). The results varied considerably with the lowest correlation coefficient noted for Medical Informatics (<i>r</i> = −0.18) and the highest for Neuroimaging (<i>r</i> = −0.93). In more than half of categories (41/59), the correlation coefficient was weaker than −0.7. It suggests experts scoring was informed by more than just JIF-based prestige of a given journal within a field. What guided their decisions remains unknown. For 14 (24%) specialties from the Clinical Medicine category, no journal was assigned 200 MEiN points. 
Strikingly, some of the best journals included in these groups, not assigned a rank of 200, had JIF of between 3.4 and 30.8.</p><p>It seems clear that some of the most prestigious titles in certain fields were undervalued by the experts assigning MEiN ranks, as explained above. What about journals considered worthy of 200 MEiN points? Here are some familiar titles: <i>The New England Journal of Medicine, The Lancet, JAMA, The BMJ</i>, and <i>Annals of Internal Medicine</i>. However, some lesser-known titles were also included in this group: <i>Bioethics</i> (JIF 2.2) or <i>Application of Clinical Genetics</i> (JIF 3.1). Some journals not listed in the JCR Clinical Medicine group were also assigned 200 points: <i>Journal of Quantitative Criminology</i> or <i>Journal of Anthropological Archaeology</i>. Articles published in any of the above-mentioned journals are therefore assessed to be of equal scientific merit.</p><p>Impact Factor was first introduced by Eugene Garfield in 1975 and was meant to become an indicator of usage of scholarly literature, as well as help identify potential venues for publication, especially for interdisciplinary research.<span><sup>8</sup></span> It has been deemed a valid measure of journal quality among researchers and practicing physicians<span><sup>9</sup></span> but also received criticism as a bibliometric indicator.<span><sup>8</sup></span> But for all these criticisms, JIF remains a widely used bibliometric indicator and introduction of any new metric should mitigate rather than replicate its flaws.</p><p>Any ranking system should ideally aim to promote researchers doing “good” research. 
In his famous editorial, Doug Altman argued that “<i>we need less research, better research, and research done for the right reasons</i>” and “(…) <i>much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them</i>.”<span><sup>10</sup></span> The Cochrane Methodology Review Group created a list of outcome measures that would facilitate identifying a good-quality biomedical study. It ideally should be important, useful, relevant, methodologically sound, ethical, complete, and accurate.<span><sup>11</sup></span> Higher JIF is associated with better adherence to reporting guidelines in health care literature<span><sup>12</sup></span>—a surrogate outcome for methodological soundness.<span><sup>11</sup></span></p><p>The process by which the current MEiN ranking list was created lacks transparency. No clear criteria for assigning ranks were published. Much like JIF, MEiN ranking helps identify venues for publication but promotes cleverness rather than skill and hard work. The key question comes to mind: is it policymakers’ job to artificially inflate the value of selected journals and even put them on a par with top medical titles? In our view, more emphasis should be put on improving the quality of peer-review, editorial process, and strict adherence to reporting guidelines to promote transparent, reproducible and locally relevant research. Instead of encouraging scientists to publish poor studies in “high-rank” journals, they should be equipped with skills Altman argued few possessed. Courses in basic medical statistics and statistical inference, research design and critical appraisal should be promoted. The joined effort of policymakers, clinicians, and researchers would likely result in better research addressing significant clinical problems to find better solutions for patients.</p><p>Our analysis had some limitations. 
We only extracted data on journals included in the Medical Sciences category by MEiN. Therefore, some of the titles from JCR Clinical Medicine list that were not present within this MEiN category could be listed elsewhere. We assumed clinical medicine was a narrower term than medical sciences and that the latter should contain all the journals from the former group. As shown above, some journals unrelated to health care sciences were also included by MEiN in the Medical Sciences group. How titles were assigned to categories remains unclear. Another limitation was that authors of this study were Polish scientists who underwent evaluation based on the said ranking and therefore were biased. We aimed to analyze bibliometric data in an objective way and focused on the most extreme outliers, regardless of our own study fields.</p><p>In conclusion, in Poland, medical research assessment is based on a nontransparent and unbalanced system created by experts. It may encourage publishing low-quality research in journals assigned high expert rank but with low JIF. A comprehensive review of the journal ranking list is needed to promote researchers adequately addressing relevant clinical problems. 
It is our hope this analysis will spark discussion in other countries currently using similar expert-based assessment systems.</p><p>The authors declare no conflict of interest.</p><p>The authors have not declared a specific grant for this research from any funding agency in the public, commercial, or not-for-profit sectors.</p>","PeriodicalId":16090,"journal":{"name":"Journal of Evidence‐Based Medicine","volume":null,"pages":null},"PeriodicalIF":3.6000,"publicationDate":"2024-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/jebm.12615","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Evidence‐Based Medicine","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/jebm.12615","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MEDICINE, GENERAL & INTERNAL","Score":null,"Total":0}
Abstract
An ever-growing amount of medical literature has created a need for evaluating scientific merit. The Journal Impact Factor (JIF) is a metric relying on citation counts and may indicate the prestige of a scientific journal.1 Study quality is often assessed based on the publication venue, so JIF may be used indirectly to evaluate research. Such an approach is not flawless. Using JIF as a surrogate for a journal's quality has been widely criticized.2 In the Leiden Manifesto, Hicks et al. advocated putting more emphasis on qualitative assessment, transparency, robust locally relevant research, and accounting for variation by field of study.3
Despite these criticisms, JIF is still used to assess scientific output.4 In the United States and Canada, journal ranking is based on indicators such as JIF, CiteScore, SCImago Journal Rank, or the Hirsch index. In some countries, journal rankings have been created to assess the research performance of scientists and institutions. Two main approaches are prevalent: (1) rankings based solely on metrics, or (2) rankings determined by experts who may (or may not) take such metrics into consideration.5 The first model is used, for example, in Turkey (the TÜBİTAK Incentive Program for International Scientific Publications) and China (the Chinese Academy of Sciences Journal Ranking List); the second, for example, in Finland (the Publication Forum Journal list), Norway (the Norwegian Register for Scientific Journals, Series and Publishers), Italy (the Ratings of scientific and class A journals), Denmark (the BFI List of Series), and Poland (the Polish Journal Ranking).6 Though both models rely to some degree on JIF, the latter is more subjective and likely to be shaped by national science policy objectives. This significantly increases the risk of politicization: experts involved in producing the rankings might adjust assigned journal ranks to their own professional goals, creating a potential conflict of interest.5
Funding, grants, and scholarships are awarded to scientists publishing in top journals from the national ranking lists. In Poland, the evaluation system is based on points awarded by the Ministry of Education and Science (MEiN, pol. Ministerstwo Edukacji i Nauki). The latest edition was released on January 5, 2024, more than a year after the 2022 Journal Citation Reports had been announced (June 2022).7 Since JIF is an imperfect surrogate for journal quality, supplementing assessment systems with expert opinion could potentially help promote good research. The objective of this study is to compare the MEiN ranking system with JIF and discuss the consequences of potential discrepancies between the two models.
A total of 5326 journals appeared both in the JCR Clinical Medicine category and on the MEiN ranking list (Medical Sciences category). Additionally, 582 (10%) were considered in JCR but not included within the MEiN Medical Sciences category (Tables S1 and S2). Some of the omitted titles had a JIF of over 17. Across ranks, minimum JIFs were low (0–2.2) and the spread of JIF values was considerable. In extreme cases, journals with an Impact Factor of over 30 were assigned 20 MEiN points, while journals with a JIF of 2.2 were included in the 200-point group. The number of journals within each subsequent rank did not decrease, as one would expect, but was irregular. More journals were assigned 70 or 100 points than were assigned 40. Ranks 140 and 200 were the most elite, comprising a total of 690 journals. Additionally, 2219 journals were assigned a MEiN rank but were not listed in the JCR Clinical Medicine category. Of them, 1092 (49.2%) were assigned 20 points; 353 (15.9%), 40 points; 355 (16%), 70 points; 243 (10.9%), 100 points; 110 (4.9%), 140 points; and 66 (3%), 200 points. The JIFs of all journals with an Impact Factor below 40 were plotted against MEiN rank and presented in Figure 1.
Within each JCR Clinical Medicine category, we ranked journals from 1 to n (the number of journals within the category) based on their JIF: the journal with the highest JIF was assigned a rank of 1, the second best a rank of 2, and so on. We then correlated this ranking with MEiN points within each category (Tables S3 and S4). The results varied considerably, with the weakest correlation noted for Medical Informatics (r = −0.18) and the strongest for Neuroimaging (r = −0.93). In more than two-thirds of categories (41/59), the correlation coefficient was weaker than −0.7 (i.e., closer to zero). This suggests that the experts' scoring was informed by more than just the JIF-based prestige of a given journal within its field. What guided their decisions remains unknown. For 14 (24%) specialties from the Clinical Medicine category, no journal was assigned 200 MEiN points. Strikingly, some of the best journals in these specialties, though not assigned a rank of 200, had JIFs between 3.4 and 30.8.
It seems clear that some of the most prestigious titles in certain fields were undervalued by the experts assigning MEiN ranks. What about journals considered worthy of 200 MEiN points? Some are familiar titles: The New England Journal of Medicine, The Lancet, JAMA, The BMJ, and Annals of Internal Medicine. However, some lesser-known titles were also included in this group, such as Bioethics (JIF 2.2) and Application of Clinical Genetics (JIF 3.1). Some journals not listed in the JCR Clinical Medicine group were also assigned 200 points, including Journal of Quantitative Criminology and Journal of Anthropological Archaeology. Articles published in any of the above-mentioned journals are therefore assessed to be of equal scientific merit.
The Impact Factor was first introduced by Eugene Garfield in 1975 and was meant to serve as an indicator of the usage of scholarly literature, as well as to help identify potential venues for publication, especially for interdisciplinary research.8 It has been deemed a valid measure of journal quality among researchers and practicing physicians9 but has also received criticism as a bibliometric indicator.8 For all these criticisms, however, JIF remains a widely used bibliometric indicator, and the introduction of any new metric should mitigate rather than replicate its flaws.
Any ranking system should ideally aim to promote researchers doing “good” research. In his famous editorial, Doug Altman argued that “we need less research, better research, and research done for the right reasons” and that “(…) much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them.”10 The Cochrane Methodology Review Group created a list of outcome measures to facilitate identifying a good-quality biomedical study: such a study should ideally be important, useful, relevant, methodologically sound, ethical, complete, and accurate.11 A higher JIF is associated with better adherence to reporting guidelines in the health care literature,12 a surrogate outcome for methodological soundness.11
The process by which the current MEiN ranking list was created lacks transparency: no clear criteria for assigning ranks were published. Much like JIF, the MEiN ranking helps identify venues for publication, but it rewards cleverness rather than skill and hard work. A key question comes to mind: is it policymakers’ job to artificially inflate the value of selected journals and even put them on a par with top medical titles? In our view, more emphasis should be put on improving the quality of peer review and the editorial process, and on strict adherence to reporting guidelines, to promote transparent, reproducible, and locally relevant research. Instead of being encouraged to publish poor studies in “high-rank” journals, scientists should be equipped with the skills Altman argued few possessed. Courses in basic medical statistics and statistical inference, research design, and critical appraisal should be promoted. A joint effort of policymakers, clinicians, and researchers would likely result in better research addressing significant clinical problems and better solutions for patients.
Our analysis had some limitations. We extracted data only on journals included in the Medical Sciences category by MEiN; some titles from the JCR Clinical Medicine list that were absent from this MEiN category could therefore be listed elsewhere. We assumed that clinical medicine was a narrower term than medical sciences and that the latter should contain all journals from the former group. As shown above, some journals unrelated to the health care sciences were also included by MEiN in the Medical Sciences group; how titles were assigned to categories remains unclear. Another limitation is that the authors of this study are Polish scientists who underwent evaluation based on the said ranking and may therefore be biased. We aimed to analyze the bibliometric data objectively and focused on the most extreme outliers, regardless of our own study fields.
In conclusion, medical research assessment in Poland is based on a nontransparent and unbalanced system created by experts. It may encourage publishing low-quality research in journals assigned a high expert rank but with a low JIF. A comprehensive review of the journal ranking list is needed to reward researchers who adequately address relevant clinical problems. It is our hope that this analysis will spark discussion in other countries currently using similar expert-based assessment systems.
The authors declare no conflict of interest.
The authors have not declared a specific grant for this research from any funding agency in the public, commercial, or not-for-profit sectors.