Risk of bias assessment tools often addressed items not related to risk of bias and used numerical scores

Madelin R. Siedler, Hassan Kawtharany, Muayad Azzam, Defne Ezgü, Abrar Alshorman, Ibrahim K. El Mikati, Sadiya Abid, Ali Choaib, Qais Hamarsha, M. Hassan Murad, Rebecca L. Morgan, Yngve Falck-Ytter, Shahnaz Sultan, Philipp Dahm, Reem A. Mustafa

Journal of Clinical Epidemiology, Volume 180, Article 111684. Published January 21, 2025. DOI: 10.1016/j.jclinepi.2025.111684
Abstract
Objectives
We aimed to determine whether existing risk of bias assessment tools addressed constructs other than risk of bias or internal validity, and whether they used numerical scores to express quality, an approach that is discouraged and may be misleading.
Methods
We searched Ovid MEDLINE and Embase to identify quality appraisal tools across all disciplines in human health research. Tools designed specifically to evaluate reporting quality were excluded. Potentially eligible tools were screened by independent pairs of reviewers. We categorized tools according to conceptual constructs and evaluated their scoring methods.
Results
We included 230 tools published from 1995 to 2023. For 63% of tools, access was limited to a peer-reviewed journal article. Most tools (76%) provided signaling questions, whereas 39% produced an overall judgment across multiple domains. Most tools (93%) addressed concepts other than risk of bias, such as the appropriateness of statistical analysis (65%), reporting quality (64%), indirectness (41%), imprecision (38%), and ethical considerations and funding (22%). Numerical scoring was used in 25% of tools.
Conclusion
Currently available study quality assessment tools were not explicit about the constructs addressed by their items or signaling questions and addressed multiple constructs in addition to risk of bias. Many tools used numerical scoring systems, which can be misleading. Limitations of the existing tools make the process of rating the certainty of evidence more difficult.
Plain Language Summary
Many tools have been made to assess how well a scientific study was designed, conducted, and written. We searched for these tools to better understand the types of questions they ask and the types of studies to which they apply. We found 230 tools published between 1995 and 2023. One in every four tools used a numerical scoring system. This approach is not recommended because it does not distinguish well between the different aspects of quality being assessed. Tools assessed quality in a number of different ways, the most common being risk of bias (how a study is designed and run to reduce biased results; 98%), statistical analysis (how the data were analyzed; 65%), and reporting quality (whether important details were included in the article; 64%). People who make tools in the future should carefully consider the aspects of quality that they want the tool to address and distinguish between questions of study design, conduct, analysis, ethics, and reporting.
Journal Introduction
The Journal of Clinical Epidemiology strives to enhance the quality of clinical and patient-oriented healthcare research by advancing and applying innovative methods in conducting, presenting, synthesizing, disseminating, and translating research results into optimal clinical practice. Special emphasis is placed on training new generations of scientists and clinical practice leaders.