Real-world evaluation of interconsensus agreement of risk of bias tools: A case study using risk of bias in nonrandomized studies-of interventions (ROBINS-I)
Samer Saadi, Bashar Hasan, Adel Kanaan, Mohamed Abusalih, Zin Tarakji, Mustafa Sadek, Ayla Shamsi Basha, Mohammed Firwana, Zhen Wang, M. Hassan Murad
Cochrane Evidence Synthesis and Methods, volume 2, issue 7. Published 2024-06-26. DOI: 10.1002/cesm.12094. https://onlinelibrary.wiley.com/doi/10.1002/cesm.12094
Citations: 0
Abstract
Background
Risk of bias (RoB) tools are critical in systematic reviews and affect subsequent decision-making. RoB tools should have adequate interrater reliability and interconsensus agreement. We present an approach for post hoc evaluation of RoB tools using duplicated studies, that is, primary studies included in more than one systematic review.
Methods
Using a back-citation approach, we identified systematic reviews that used the Risk Of Bias In Nonrandomized Studies-of Interventions (ROBINS-I) tool and retrieved all the included primary studies. We selected studies that were appraised by more than one systematic review and calculated observed agreement and unweighted kappa comparing the different systematic reviews' assessments.
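The agreement calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the input layout (`review_id`, `study_id`, `judgment` triples), the function names, and the example data are all hypothetical. It also assumes that a study appraised by more than two reviews contributes every pairwise comparison, and that which review counts as the "first" rater in a pair is arbitrary (here, alphabetical order of review ID); the abstract does not say how either case was handled.

```python
from collections import Counter, defaultdict
from itertools import combinations


def duplicated_pairs(assessments):
    """Pair up judgments of the same study made by different reviews.

    `assessments` is an iterable of (review_id, study_id, judgment) triples
    (a hypothetical layout). Studies appraised by only one review yield no pairs.
    """
    by_study = defaultdict(dict)
    for review_id, study_id, judgment in assessments:
        by_study[study_id][review_id] = judgment

    pairs = []
    for by_review in by_study.values():
        # Assumption: every pair of reviews that assessed the study is compared.
        for (_, j1), (_, j2) in combinations(sorted(by_review.items()), 2):
            pairs.append((j1, j2))
    return pairs


def observed_agreement(pairs):
    """Proportion of pairs in which both reviews gave the same judgment."""
    return sum(a == b for a, b in pairs) / len(pairs)


def cohen_kappa(pairs):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(pairs)
    p_o = observed_agreement(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    # Chance agreement from the marginal frequency of each category.
    p_e = sum(first[c] * second[c] for c in first) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Made-up example data, for illustration only.
example = [
    ("SR1", "studyA", "serious"), ("SR2", "studyA", "serious"),
    ("SR1", "studyB", "moderate"), ("SR3", "studyB", "serious"),
    ("SR2", "studyC", "low"), ("SR3", "studyC", "low"),
    ("SR1", "studyD", "moderate"), ("SR2", "studyD", "moderate"),
    ("SR4", "studyE", "critical"),  # appraised once, so not a duplicated study
]

if __name__ == "__main__":
    example_pairs = duplicated_pairs(example)
    print("pairs:", len(example_pairs))
    print("observed agreement:", observed_agreement(example_pairs))
    print("unweighted kappa:", round(cohen_kappa(example_pairs), 3))
```

Because unweighted kappa treats every disagreement as equally severe, a "low" versus "critical" split counts the same as "moderate" versus "serious"; a weighted kappa would treat them differently, but the abstract reports the unweighted statistic.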
Results
We identified 903 systematic reviews that used the tool, with 51,676 cited references, from which we eventually analyzed 171 duplicated studies assessed using ROBINS-I by different systematic reviewers. The observed agreement on ROBINS-I domains ranged from 54.9% (missing data domain) to 70.3% (deviations from intended interventions domain), and was 63.0% for the overall RoB assessment of the study. The kappa coefficient ranged from 0.131 (measurement of outcome domain) to 0.396 (domains of confounding and deviations from intended interventions), and was 0.404 for the overall RoB assessment of the study.
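For readers relating the two statistics, the standard definition of unweighted kappa is shown below, with the overall RoB figures plugged in. The symbols $p_o$ (observed agreement) and $p_e$ (agreement expected by chance) are notation introduced here, and the implied $p_e$ is back-calculated on the assumption that the reported kappa and observed agreement come from the same set of paired assessments; the abstract does not report it.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa}
    = \frac{0.630 - 0.404}{1 - 0.404}
    = \frac{0.226}{0.596}
    \approx 0.379
```

Under that assumption, roughly 38% of the 63.0% overall agreement would be expected by chance alone. This is why a domain can show moderate observed agreement yet a low kappa when judgments cluster in one or two categories.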
Conclusion
A post hoc evaluation of RoB tools is feasible by focusing on duplicated studies that overlap across systematic reviews. ROBINS-I assessments demonstrated considerable variation in interconsensus agreement among the systematic reviews that assessed the same study and outcome, suggesting the need for more intensive upfront work to calibrate systematic reviewers on how to identify context-specific information and agree on how to judge it.