Quantitative bias analysis (QBA) methods evaluate the impact of biases arising from systematic errors on observational study results. This systematic review aimed to summarize the range and characteristics of QBA methods for summary-level data published in the peer-reviewed literature.
We searched MEDLINE, Embase, Scopus, and Web of Science for English-language articles describing QBA methods. For each QBA method, we recorded key characteristics, including applicable study designs, bias(es) addressed, bias parameters, and publicly available software. The study protocol was preregistered on the Open Science Framework (https://osf.io/ue6vm/).
Our search identified 10,249 records, of which 53 were articles describing 57 QBA methods for summary-level data. Of the 57 QBA methods, 53 (93%) were explicitly designed for observational studies, and 4 (7%) for meta-analyses. There were 29 (51%) QBA methods that addressed unmeasured confounding, 19 (33%) misclassification bias, 6 (11%) selection bias, and 3 (5%) multiple biases. Thirty-eight (67%) QBA methods were designed to generate bias-adjusted effect estimates and 18 (32%) were designed to describe how bias could explain away observed findings. Twenty-two (39%) articles provided code or online tools to implement the QBA methods.
In this systematic review, we identified 57 QBA methods for summary-level epidemiologic data published in the peer-reviewed literature. Future investigators can use this review to identify QBA methods applicable to their study design and the bias(es) of interest.
Quantitative bias analysis (QBA) methods can be used to evaluate the impact of biases on observational study results. However, little is known about the full range and characteristics of available methods in the peer-reviewed literature that can be used to conduct QBA using information reported in manuscripts and other publicly available sources without requiring the raw data from a study. In this systematic review, we identified 57 QBA methods for summary-level data from observational studies. Overall, there were 29 methods that addressed unmeasured confounding, 19 that addressed misclassification bias, six that addressed selection bias, and three that addressed multiple biases. This systematic review may help future investigators identify different QBA methods for summary-level data.
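As a concrete illustration of the kind of summary-level QBA these methods enable, consider the E-value of VanderWeele and Ding, one widely used approach for unmeasured confounding that needs only a reported point estimate. This sketch is illustrative and is not tied to any specific method catalogued in the review:

```python
import math

def e_value(rr: float) -> float:
    """E-value for a risk ratio: the minimum strength of association
    (on the risk-ratio scale) that an unmeasured confounder would need
    with both the exposure and the outcome to fully explain away the
    observed estimate. Protective estimates are inverted first."""
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))

# Hypothetical reported risk ratio of 1.94
print(round(e_value(1.94), 2))  # -> 3.29
```

A large E-value suggests that only a strong unmeasured confounder could account for the observed association; an E-value near 1 means weak confounding could suffice.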
Persistent somatic symptoms (PSS) describe recurrent or continuously occurring symptoms such as fatigue, dizziness, or pain that have persisted for at least several months. These include single symptoms such as chronic pain, combinations of symptoms, or functional disorders such as fibromyalgia or irritable bowel syndrome. While many studies have explored stigmatisation by healthcare professionals toward people with PSS, there is a lack of validated measurement instruments. We recently developed a stigma scale, the Persistent Somatic Symptom Stigma scale for Healthcare Professionals (PSSS-HCP). The aim of this study is to evaluate the measurement properties (validity and reliability) and factor structure of the PSSS-HCP.
The PSSS-HCP was tested with 121 healthcare professionals across the United Kingdom to evaluate its measurement properties. Analysis of the factor structure was conducted using principal component analysis. We calculated Cronbach's alpha to determine the internal consistency of each (sub)scale. Test-retest reliability was assessed in a subsample of participants with a 2-week interval. We evaluated convergent validity by testing the association between the PSSS-HCP and the Medical Condition Regard Scale (MCRS), and the influence of social desirability using the short form of the Marlowe-Crowne Social Desirability Scale (MCSDS).
The PSSS-HCP showed sufficient internal consistency (Cronbach's alpha = 0.84) and sufficient test-retest reliability (intraclass correlation = 0.97; 95% CI 0.94–0.99; P < .001). Convergent validity was sufficient between the PSSS-HCP and the MCRS, and no relationship was found between the PSSS-HCP and the MCSDS. A three-factor structure was identified (othering, uneasiness in interaction, non-disclosure) that accounted for 60.5% of the variance using 13 of the 19 tested items.
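For context, Cronbach's alpha summarizes how consistently a set of scale items measures the same construct. A minimal stdlib-only sketch of the standard formula, using made-up Likert responses rather than the study's data, is:

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score table:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(scores[0])                                   # number of items
    item_var_sum = sum(variance(col) for col in zip(*scores))
    total_var = variance(sum(row) for row in scores)     # variance of respondents' totals
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 5-point Likert responses (4 respondents x 3 items)
responses = [[4, 5, 4],
             [2, 3, 2],
             [3, 4, 3],
             [5, 5, 4]]
print(round(cronbach_alpha(responses), 2))
```

Values closer to 1 indicate higher internal consistency; the 0.84 reported above falls in the range conventionally regarded as good for a new scale.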
The PSSS-HCP can be used to measure PSS stigmatisation by healthcare professionals. It demonstrated sufficient internal consistency, test-retest reliability, and convergent validity, with no evidence of social desirability bias. The PSSS-HCP has the potential to measure important aspects of stigma and to provide a foundation for evaluating stigma reduction interventions.
To quantify the ability of two new comorbidity indices to adjust for confounding by benchmarking a target trial emulation against the randomized controlled trial (RCT) result.
Observational study including 18,316 men from Prostate Cancer data Base Sweden 5.0, diagnosed with prostate cancer between 2008 and 2019 and treated with primary radical prostatectomy (RP, n = 14,379) or radiotherapy (RT, n = 3,937). The adjusted risk of death from any cause after adjustment for comorbidity using two new comorbidity indices, the multidimensional diagnosis-based comorbidity index and the drug comorbidity index, was compared with the risk after adjustment for the Charlson comorbidity index (CCI).
Risk of death was higher after RT than RP (hazard ratio [HR] = 1.94; 95% confidence interval [CI]: 1.70–2.21). The difference decreased when adjusting for age, cancer characteristics, and CCI (HR = 1.32, 95% CI: 1.06–1.66). Adjustment for the two new comorbidity indices further attenuated the difference (HR 1.14, 95% CI 0.91–1.44). Emulation of a hypothetical pragmatic trial in which older men with any type of baseline comorbidity were also included largely confirmed these results (HR 1.10; 95% CI 0.95–1.26).
Adjustment for comorbidity using the two new indices yielded a risk of death from any cause in line with the results of an RCT. Similar results were seen in a broader study population more representative of clinical practice.
The US Agency for Healthcare Research and Quality, through the Evidence-based Practice Center (EPC) Program, aims to provide health system decision makers with the highest-quality evidence to inform clinical decisions. However, limitations in the literature may lead to inconclusive findings in EPC systematic reviews (SRs). The EPC Program conducted pilot projects to understand the feasibility, benefits, and challenges of utilizing health system data to augment SR findings to support confidence in healthcare decision-making based on real-world experiences.
Three contractors (each an EPC located at a different health system) selected a recently completed SR conducted by their center and identified an evidence gap that electronic health record (EHR) data might address. All pilot project topics addressed clinical questions as opposed to care delivery, care organization, or care disparities topics that are common in EPC reports. Topic areas addressed by each EPC included infantile epilepsy, migraine, and hip fracture. EPCs also tracked additional resources needed to conduct supplemental analyses. The workgroup met monthly in 2022–2023 to discuss challenges and lessons learned from the pilot projects.
Two supplemental data analyses filled an evidence gap identified in the SRs (raised certainty of evidence, improved applicability), and the third filled a health system knowledge gap. Project challenges fell under three themes: regulatory and logistical issues, data collection and analysis, and interpretation and presentation of findings. A major limitation was the limited ability to capture key clinical variables, given inconsistent or missing data within the EHR. The workgroup found that conducting supplemental data analysis alongside an SR was feasible but adds considerable time and resources to the review process (estimated total hours to complete the pilot projects ranged from 283 to 595 across EPCs), and that the increased effort and resources added limited incremental value.
Supplementing existing SRs with analyses of EHR data is resource intensive and requires specialized skillsets throughout the process. While using EHR data for research has immense potential to generate real-world evidence and fill knowledge gaps, these data may not yet be ready for routine use alongside SRs.
The Grading of Recommendations, Assessment, Development and Evaluations (GRADE)-ADOLOPMENT methodology has been widely used to adopt, adapt, or develop de novo recommendations from existing or new guideline and evidence synthesis efforts. The objective of this guidance is to refine the operationalization of GRADE-ADOLOPMENT.
Through iterative discussions, online meetings, and email communications, the GRADE-ADOLOPMENT project group drafted the updated guidance. We then conducted a review of handbooks of guideline-producing organizations, and a scoping review of published and planned adolopment guideline projects. The lead authors refined the existing approach based on the scoping review findings and feedback from members of the GRADE working group. We presented the revised approach to the group in November 2022 (approximately 115 people), in May 2023 (approximately 100 people), and twice in September 2023 (approximately 60 and 90 people) for approval.
This GRADE guidance shows how to effectively and efficiently contextualize recommendations using the GRADE-ADOLOPMENT approach by doing the following: (1) showcasing alternative pathways for starting an adolopment effort; (2) elaborating on the essential steps of the approach, such as building on existing evidence-to-decision frameworks (EtDs) when available or developing new EtDs when necessary; and (3) providing examples from adolopment case studies to facilitate application of the approach. We demonstrate how to use contextual evidence to make judgments about EtD criteria and highlight the importance of making the resulting EtDs available to facilitate adolopment efforts by others.
This updated GRADE guidance further operationalizes the application of GRADE-ADOLOPMENT based on over 6 years of experience. It serves to support uptake and application by end users interested in contextualizing recommendations to a local setting or specific reality in a short period of time or with limited resources.