Objectives: Non-randomized studies of interventions provide valuable evidence in health sciences but are prone to biases that affect validity. Multiple instruments have been developed for critical appraisal, although their measurement properties remain insufficiently established. These instruments typically aim to evaluate two key theoretical constructs: methodological quality and risk of bias, which reflect different but complementary dimensions of study rigor and internal validity. The COSMIN framework offers internationally recognized standards for evaluating instrument robustness, thus enhancing transparency and comparability in their selection. This systematic review specifically focused on identifying and critically evaluating studies that have empirically tested the measurement properties of instruments developed to assess methodological quality and/or risk of bias in non-randomized studies of interventions.
Study design and setting: This systematic review and meta-analysis was conducted in accordance with the COSMIN initiative and structured in three parts: (1) a literature search reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and the PRISMA search extension; (2) an assessment of methodological quality and measurement properties using the "COSMIN Risk of Bias" tool, together with an evaluation of the certainty of evidence following the "Grading of Recommendations Assessment, Development, and Evaluation" (GRADE) approach; and (3) a meta-analysis of measurement properties when sufficient quantitative data were available across validation studies for a given instrument.
Results: Eleven instruments for critical appraisal of non-randomized studies of interventions were identified. None were evaluated for all COSMIN measurement properties; instrument development, content validity, and reliability were most consistently reported. MINORS, MMERSQI, and ASSESS demonstrated the highest-quality evidence for methodological quality, while ROBINS-I provided the strongest evidence for risk of bias assessment. For instruments with sufficient comparable data, such as ROBINS-I, a meta-analysis of inter-rater reliability coefficients was conducted; it showed moderate agreement for the selection and exposure domains, with substantial heterogeneity across studies.
Conclusion: MINORS emerged as the most robust instrument for critical appraisal of methodological quality, whereas ROBINS-I stood out for risk of bias. ASSESS and MMERSQI provided adequate evidence of content validity, but their other psychometric properties require further assessment. Only a small subset of the available tools for non-randomized studies of interventions (NRSI) has undergone formal psychometric validation. Further research is needed to strengthen the measurement evidence base for these instruments.