Interpreting positive and negative predictive values of diagnostic tests is crucial for clinical decision-making as they quantify the clinician's confidence in an individual's disease status after testing. For a given diagnostic test, these values depend on the pretest probability (ie, the probability that an individual has the disease before testing), which differs across individuals. Therefore, they should not be presented as a single pair for clinical use. To account for this individual variability in pretest probability and the minimum confidence level required to conclude on an individual's disease status, we propose the use of the minimum and maximum pretest probabilities (“PTP+conf” and “PTP-conf”). These thresholds depend on the test's sensitivity and specificity, as well as the clinician's predefined confidence level. They represent the pretest probability above (or below) which a positive (or negative) test result allows the clinician to reach that minimum confidence level (“conf”) regarding the presence or absence of disease. These “PTP+conf” and “PTP-conf” values can be considered as intrinsic characteristics of a diagnostic test for a given confidence threshold. Clinicians then only need to compare their bedside estimate of the individual's pretest probability with “PTP+conf” (if positive result) or “PTP-conf” (if negative result) to determine whether they can conclude with sufficient confidence after obtaining the test result.
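Because PTP+conf and PTP-conf are determined by sensitivity, specificity, and the required confidence level alone, they can be written down directly from Bayes' theorem. The sketch below is a reconstruction under that reading, not the authors' code: it assumes PTP+conf is the pretest probability at which the positive predictive value reaches "conf", and PTP-conf the pretest probability at which the negative predictive value reaches "conf"; the test characteristics in the example are hypothetical.

```python
# Minimal sketch (not the authors' implementation) of the PTP+conf / PTP-conf
# thresholds implied by Bayes' theorem, assuming:
#   PTP+conf = lowest pretest probability at which PPV >= conf (rule-in threshold)
#   PTP-conf = highest pretest probability at which NPV >= conf (rule-out threshold)

def ptp_thresholds(sensitivity: float, specificity: float, conf: float) -> tuple[float, float]:
    """Return (PTP+conf, PTP-conf) for a test with the given sensitivity and specificity."""
    # Positive result: pretest odds * LR+ must reach conf/(1 - conf), where LR+ = Se / (1 - Sp).
    ptp_plus = (conf * (1 - specificity)) / (conf * (1 - specificity) + (1 - conf) * sensitivity)
    # Negative result: pretest odds * LR- must stay at or below (1 - conf)/conf, where LR- = (1 - Se) / Sp.
    ptp_minus = ((1 - conf) * specificity) / ((1 - conf) * specificity + conf * (1 - sensitivity))
    return ptp_plus, ptp_minus

if __name__ == "__main__":
    # Hypothetical test: sensitivity 0.90, specificity 0.95, required confidence 0.95.
    ptp_plus, ptp_minus = ptp_thresholds(0.90, 0.95, 0.95)
    print(f"PTP+conf = {ptp_plus:.2f}: conclude 'disease present' after a positive result "
          f"only if the bedside pretest estimate is above this value")
    print(f"PTP-conf = {ptp_minus:.2f}: conclude 'disease absent' after a negative result "
          f"only if the bedside pretest estimate is below this value")
```

With these hypothetical figures, PTP+conf is about 0.51 and PTP-conf about 0.33: a positive result supports ruling in only when the bedside pretest estimate exceeds roughly 0.51, and a negative result supports ruling out only when it falls below roughly 0.33, which illustrates why a single PPV/NPV pair cannot serve every patient.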
{"title":"Use of minimum and maximum pretest probabilities to conclude with confidence after obtaining a diagnostic test result","authors":"Loic Desquilbet , Maxime Kurtz , Morgane Canonne-Guibert , Solen Kerneis , Ghita Benchekroun","doi":"10.1016/j.jclinepi.2026.112134","DOIUrl":"10.1016/j.jclinepi.2026.112134","url":null,"abstract":"<div><div>Interpreting positive and negative predictive values of diagnostic tests is crucial for clinical decision-making as they quantify the clinician's confidence in an individual's disease status after testing. For a given diagnostic test, these values depend on the pretest probability (ie, the probability that an individual has the disease before testing), which differs across individuals. Therefore, they should not be presented as a single pair for clinical use. To account for this individual variability in pretest probability and the minimum confidence level required to conclude on an individual's disease status, we propose the use of the minimum and maximum pretest probabilities (“PTP+<sub>conf</sub>” and “PTP-<sub>conf</sub>”). These thresholds depend on the test's sensitivity and specificity, as well as the clinician's predefined confidence level. They represent the pretest probability above (or below) which a positive (or negative) test result allows the clinician to reach that minimum confidence level (“conf”) regarding the presence or absence of disease. These “PTP+<sub>conf</sub>” and “PTP-<sub>conf</sub>” values can be considered as intrinsic characteristics of a diagnostic test for a given confidence threshold. Clinicians then only need to compare their bedside estimate of the individual's pretest probability with “PTP+<sub>conf</sub>” (if positive result) or “PTP-<sub>conf</sub>” (if negative result) to determine whether they can conclude with sufficient confidence after obtaining the test result.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"192 ","pages":"Article 112134"},"PeriodicalIF":5.2,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145953814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-09. DOI: 10.1016/j.jclinepi.2026.112133
Daniel G. Hamilton, Joanne E. McKenzie, Camilla H. Nejstgaard, Sue E. Brennan, David Moher, Matthew J. Page
Background and Objectives
Numerous studies have assessed the adherence of published systematic reviews to the PRISMA 2020 statement. We aimed to summarize the characteristics and methods of development of the tools used to assess adherence in these studies.
Methods
MEDLINE, Embase, and PsycINFO (all via Ovid) were searched on January 20, 2025, to locate studies that assessed adherence of systematic reviews of health interventions to the PRISMA 2020 statement. Two authors independently screened all records and extracted data. We examined three aspects of the tools used to assess adherence to PRISMA 2020: i) characteristics of the assessment tool, ii) methods used to develop and validate the tool, and iii) processes used to apply the tool. We classified a tool as “implementable” by researchers external to the tool developers if authors reported the exact wording of each item, its response options, guidance on how to operationalize all response options, and the algorithms used to aggregate judgments and quantify adherence.
Results
We included 24 meta-research studies that had assessed adherence to PRISMA 2020 in 2766 systematic reviews published between 1989 and 2024. Most authors assessed adherence to PRISMA 2020 in its entirety (N = 15/24, 63%), with the remaining nine (37%) assessing adherence to one or a subset of domains (eg, abstract, search methods, and risk of bias assessment methods). Psychometric testing of the assessment tool was reported by five studies (21%), all of which assessed the inter-rater reliability of the tool. Only one (4%) reported how response options for all items were operationalized. According to our criteria, only one assessment tool was classified as implementable (N = 1/24, 4%). No two authorship teams used the same methods to assess adherence to the PRISMA 2020 statement. However, information on some tool characteristics was unavailable for several studies.
Conclusion
Our findings demonstrate variation and inadequacies in the methods and reporting of tools used to assess adherence to the PRISMA 2020 statement. We have commenced work on a standardized PRISMA 2020 assessment tool to facilitate accurate and consistent assessments of adherence of systematic reviews to PRISMA. In the interim, we provide some recommendations for how meta-researchers interested in assessing adherence of systematic reviews to the PRISMA 2020 statement can transparently report the findings of their assessments.
{"title":"Evaluation of tools used to assess adherence to PRISMA 2020 reveals inconsistent methods and poor tool implementability: part I of a systematic review","authors":"Daniel G. Hamilton , Joanne E. McKenzie , Camilla H. Nejstgaard , Sue E. Brennan , David Moher , Matthew J. Page","doi":"10.1016/j.jclinepi.2026.112133","DOIUrl":"10.1016/j.jclinepi.2026.112133","url":null,"abstract":"<div><h3>Background and Objectives</h3><div>Numerous studies have assessed the adherence of published systematic reviews to the PRISMA 2020 statement. We aimed to summarize the characteristics and methods of development of the tools used to assess adherence in these studies.</div></div><div><h3>Methods</h3><div>MEDLINE, Embase, and PsycINFO (all via Ovid) were searched on January 20, 2025, to locate studies that assessed adherence of systematic reviews of health interventions to the PRISMA 2020 statement. Two authors independently screened all records and extracted data. We examined three aspects of the tools used to assess adherence to PRISMA 2020: i) characteristics of the assessment tool, ii) methods used to develop and validate the tool, and iii) processes used to apply the tool. We classified a tool as “implementable” by researchers external to the tool developers if authors reported the exact wording of each item, its response options, guidance on how to operationalize all response options, and the algorithms used to aggregate judgments and quantify adherence.</div></div><div><h3>Results</h3><div>We included 24 meta-research studies that had assessed adherence to PRISMA 2020 in 2766 systematic reviews published between 1989 and 2024. Most authors assessed adherence to PRISMA 2020 in its entirety (<em>N</em> = 15/24, 63%), with the remaining nine (37%) assessing adherence to one or a subset of domains (eg, abstract, search methods, and risk of bias assessment methods). Psychometric testing of the assessment tool was reported by five studies (21%), all of which assessed the inter-rater reliability of the tool. Only one (4%) reported how response options for all items were operationalized. According to our criteria, only one assessment tool was classified as implementable (<em>N</em> = 1/24, 4%). No authorship team used the same methods to assess adherence to the PRISMA 2020 statement. However, information on some tool characteristics was unavailable for several studies.</div></div><div><h3>Conclusion</h3><div>Our findings demonstrate variation and inadequacies in the methods and reporting of tools used to assess adherence to the PRISMA 2020 statement. We have commenced work on a standardized PRISMA 2020 assessment tool to facilitate accurate and consistent assessments of adherence of systematic reviews to PRISMA. 
In the interim, we provide some recommendations for how meta-researchers interested in assessing adherence of systematic reviews to the PRISMA 2020 statement can transparently report the findings of their assessments.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"192 ","pages":"Article 112133"},"PeriodicalIF":5.2,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145953728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-08. DOI: 10.1016/j.jclinepi.2026.112135
Juliane Kennett, Stephana Julia Moss, Jeanna Parsons Leigh, Niklas Bobrovitz, Henry T Stelfox
Objectives
To map the evidence on factors (eg, research practices) associated with reproducibility of methods and results reported in health sciences research.
Study Design and Setting
Five bibliographic databases were searched from January 2000 to May 2023, with supplemental searches of high-impact journals and relevant records. We included health science records of observational, interventional, or knowledge synthesis studies reporting data on factors related to research reproducibility. Data were coded using inductive qualitative content analysis, and empirical evidence was synthesized with evidence and gap maps. Factors were categorized as modifiable or nonmodifiable; reproducibility outcomes were categorized as related to methods or results. Statistical tests of association between factors and reproducibility outcomes were summarized.
Results
Our review included 148 studies, primarily from biomedical/preclinical (n = 62) and clinical (n = 71) domains. Factors were classified into 12 modifiable (eg, sample size and power) and three nonmodifiable (eg, publication year) categories. Of 234 reported evaluations of factors, 76 (32%) assessed methodological reproducibility and 158 (68%) assessed results reproducibility. The most frequently reported factor was transparency and reporting (38 of 234 assessments). A total of 155 factors (66%) were evaluated for statistical associations with reproducibility outcomes. Statistical tests of association were most frequently conducted for analytical methods (24 of 26 reporting significance), sample size and power (21 of 23 reporting significance), and participant characteristics and study materials (10 of 12 reporting significance).
Conclusion
Several modifiable factors were associated with reproducibility of health sciences research and represent opportunities for intervention. Applying more stringent statistical testing procedures and thresholds, conducting appropriate sample size and power calculations, and improving transparency and completeness of reporting should be top priorities for improving reproducibility. Experimental studies to test interventions to improve reproducibility are needed.
Plain Language Summary
Many research findings in medicine and health cannot be reproduced by other researchers. This makes it harder to know what evidence to trust when making patient care and health policy decisions. We systematically reviewed 148 studies that examined how specific research practices are linked to whether research findings can be reproduced. We found that several modifiable features of study design, analysis, and reporting were often associated with better reproducibility. Using larger sample sizes, applying more stringent statistical methods, and providing transparent, complete descriptions of methods and results were frequently linked to more reproducible findings. These results suggest that
{"title":"Modifiable methodological and reporting practices are associated with reproducibility of health sciences research: a systematic review and evidence and gap map.","authors":"Juliane Kennett, Stephana Julia Moss, Jeanna Parsons Leigh, Niklas Bobrovitz, Henry T Stelfox","doi":"10.1016/j.jclinepi.2026.112135","DOIUrl":"10.1016/j.jclinepi.2026.112135","url":null,"abstract":"<p><strong>Objectives: </strong>Map the evidence on factors (eg, research practices) associated with reproducibility of methods and results reported in health sciences research.</p><p><strong>Study design and setting: </strong>Five bibliographic databases were searched from January 2000 to May 2023 with supplemental searches of high-impact journals and relevant records. We included health science records of observational, interventional, or knowledge synthesis studies reporting data on factors related to research reproducibility. Data were coded using inductive qualitative content analysis, and empirical evidence was synthesized with evidence and gap maps. Factors were categorized as modifiable or nonmodifiable; reproducibility outcomes were categorized as related to methods or results. Statistical tests of association between factors and reproducibility outcomes were summarized.</p><p><strong>Results: </strong>Our review included 148 studies, primarily from biomedical/preclinical (n = 62) and clinical (n = 71) domains. Factors were classified into 12 modifiable (eg, sample size and power) and three nonmodifiable (eg, publication year) categories. Of 234 reported evaluations of factors, 76 (32%) assessed methodological reproducibility and 158 (68%) assessed results reproducibility. The most frequently reported factor was transparency and reporting (38 of 234 assessments). A total of 155 factors (66%) were evaluated for statistical associations with reproducibility outcomes. Statistical associations were most frequently conducted for analytical methods (24 of 26 reporting significance), sample size and power (21 of 23 reporting significance), and participants characteristics and study materials (10 of 12 reporting significance).</p><p><strong>Conclusion: </strong>Several modifiable factors were associated with reproducibility of health sciences research and represent opportunities for intervention. Applying more stringent statistical testing procedures and thresholds, conducting appropriate sample size and power calculations, and improving transparency and completeness of reporting should be top priorities for improving reproducibility. Experimental studies to test interventions to improve reproducibility are needed.</p><p><strong>Plain language summary: </strong>Many research findings in medicine and health cannot be reproduced by other researchers. This makes it harder to know what evidence to trust when making patient care and health policy decisions. We systematically reviewed 148 studies that examined how specific research practices are linked to whether research findings can be reproduced. We found that several modifiable features of study design, analysis, and reporting were often associated with better reproducibility. Using larger sample sizes, applying more stringent statistical methods, and providing transparent, complete descriptions of methods and results were frequently linked to more reproducible findings. 
These results suggest that","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":" ","pages":"112135"},"PeriodicalIF":5.2,"publicationDate":"2026-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145949384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background
In randomized clinical trials (RCTs) for hematological malignancies, patients may undergo allogeneic hematopoietic stem cell transplantation (allo-HSCT) as part of standard clinical pathways. Allo-HSCT is a potentially curative but high-risk procedure performed after randomization and thus constitutes an important intercurrent event that can substantially influence survival outcomes. However, its handling in statistical analyses is not standardized.
Objective
To review current statistical methods used to handle postrandomization allo-HSCT as an intercurrent event in RCTs, and to highlight how each method corresponds to a different estimand, reflecting distinct clinical questions.
Methods
We reviewed 93 RCTs published between January 1, 2014, and April 1, 2024, that reported survival outcomes with postrandomization allo-HSCT.
Results
Three different statistical methods were used to estimate treatment effects: censoring at the time of allo-HSCT (64 analyses), including allo-HSCT as a time-dependent covariate in a Cox model (24 analyses), or ignoring allo-HSCT status (17 analyses). Each method answers a different clinical question and targets a different estimand, with specific assumptions that must be considered when interpreting the results. Censoring corresponds to the "hypothetical" estimand, but its validity rests on two conditions: first, that the likelihood of receiving allo-HSCT is similar across treatment arms; and second, that patients who undergo transplantation have a prognosis similar to those who do not. The time-dependent covariate approach incorporates the effect of allo-HSCT but is not associated with a specific estimand and requires careful interpretation. Ignoring allo-HSCT corresponds to the "treatment policy" strategy, which compares the randomized treatment strategies regardless of whether allo-HSCT is performed and requires no additional assumptions.
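To make the three approaches concrete, here is a minimal sketch on invented data, assuming the Python lifelines library; the patient records, column names, and the small ridge penalty are illustrative choices, not the analyses used in the reviewed trials.

```python
# Toy illustration (invented data, not from the article) of the three strategies
# for handling allo-HSCT described above, using the Python "lifelines" library.
import pandas as pd
from lifelines import CoxPHFitter, CoxTimeVaryingFitter

# One row per patient: randomized arm, time of allo-HSCT (NaN if never transplanted),
# observed follow-up time, and death indicator. All values are hypothetical.
patients = pd.DataFrame({
    "id":        [1, 2, 3, 4, 5, 6],
    "arm":       [1, 1, 1, 0, 0, 0],            # 1 = experimental, 0 = control
    "hsct_time": [6.0, None, 10.0, None, 4.0, None],
    "time":      [24.0, 9.0, 15.0, 18.0, 30.0, 7.0],
    "death":     [0, 1, 1, 1, 0, 1],
})

# 1) "Hypothetical" estimand: censor follow-up at allo-HSCT.
censored = patients.copy()
transplanted = censored["hsct_time"].notna()
censored.loc[transplanted, "death"] = 0
censored.loc[transplanted, "time"] = censored.loc[transplanted, "hsct_time"]
fit_censor = CoxPHFitter(penalizer=0.1).fit(
    censored[["arm", "time", "death"]], duration_col="time", event_col="death")

# 2) Time-dependent covariate: split each transplanted patient's follow-up at allo-HSCT.
rows = []
for p in patients.itertuples():
    if pd.notna(p.hsct_time) and p.hsct_time < p.time:
        rows.append({"id": p.id, "arm": p.arm, "start": 0.0, "stop": p.hsct_time, "hsct": 0, "death": 0})
        rows.append({"id": p.id, "arm": p.arm, "start": p.hsct_time, "stop": p.time, "hsct": 1, "death": p.death})
    else:
        rows.append({"id": p.id, "arm": p.arm, "start": 0.0, "stop": p.time, "hsct": 0, "death": p.death})
fit_tdc = CoxTimeVaryingFitter(penalizer=0.1).fit(
    pd.DataFrame(rows), id_col="id", event_col="death", start_col="start", stop_col="stop")

# 3) "Treatment policy" estimand: ignore allo-HSCT entirely.
fit_policy = CoxPHFitter(penalizer=0.1).fit(
    patients[["arm", "time", "death"]], duration_col="time", event_col="death")

for label, fit in [("censor at HSCT", fit_censor), ("time-dependent HSCT", fit_tdc),
                   ("treatment policy", fit_policy)]:
    print(f"{label}: log hazard ratio for arm = {fit.params_['arm']:+.2f}")
```

The point of the sketch is only how allo-HSCT enters each dataset: it sets the censoring time in strategy 1, becomes a time-updated covariate in strategy 2, and plays no role at all in strategy 3, which is exactly what ties each analysis to a different estimand.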
Conclusion
There is no consensus on handling allo-HSCT as an intercurrent event in survival analyses. Censoring, although common, may introduce bias if treatment or prognostic covariates influence allo-HSCT use. The treatment policy estimand should be preferred when allo-HSCT is part of the therapeutic strategy.
{"title":"Challenges in handling allogeneic stem cell transplantation in randomized clinical trials","authors":"Roxane Couturier , Loïc Vasseur , Nicolas Boissel , Hervé Dombret , Jérôme Lambert , Sylvie Chevret","doi":"10.1016/j.jclinepi.2026.112132","DOIUrl":"10.1016/j.jclinepi.2026.112132","url":null,"abstract":"<div><h3>Background</h3><div>In randomized clinical trials (RCTs) for hematological malignancies, patients may undergo allogeneic hematopoietic stem cell transplantation (allo-HSCT) as part of standard clinical pathways. Allo-HSCT is a potentially curative but high-risk procedure performed after randomization and thus constitutes an important intercurrent event that can substantially influence survival outcomes. However, its handling in statistical analyses is not standardized.</div></div><div><h3>Objective</h3><div>To review current statistical methods used to handle postrandomization allo-HSCT as an intercurrent event in RCTs, and to highlight how each method corresponds to a different estimand, reflecting distinct clinical questions.</div></div><div><h3>Methods</h3><div>We reviewed 93 RCTs published between January 1, 2014, and April 1, 2024 that reported survival outcomes with postrandomization allo-HSCT.</div></div><div><h3>Results</h3><div>Three different statistical methods were employed to estimate the treatment effects: censoring at the time of allo-HSCT (64 analyses), a time-dependent covariate in a Cox model (24 analyses), or ignoring allo-HSCT status (17 analyses). Each method estimates the treatment effect in response to a different clinical question and estimand, with specific assumptions that must be considered when interpreting the results. Censoring corresponds to the “hypothetical” estimand, but its validity requires 2 things: first, that the likelihood of receiving allo-HSCT is similar across treatment arms; and second, that patients who undergo transplantation have a similar prognosis to those who do not. Time-dependent covariate incorporates the effect of allo-HSCT but is not associated with a specific estimand and requires careful interpretation. Ignoring allo-HSCT corresponds to the “treatment policy” strategy, of comparing the treatment strategy, whichever allo-HSCT or not, without additional assumptions.</div></div><div><h3>Conclusion</h3><div>There is no consensus on handling allo-HSCT as an intercurrent event in survival analyses. Censoring, although common, may introduce bias if treatment or prognostic covariates influence allo-HSCT use. The treatment policy estimand should be preferred when allo-HSCT is part of the therapeutic strategy.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112132"},"PeriodicalIF":5.2,"publicationDate":"2026-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145935893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01. DOI: 10.1016/j.jclinepi.2025.112017
Howard Bauchner
{"title":"Systematic reviews and meta-analysis: continued failure to achieve research integrity","authors":"Howard Bauchner","doi":"10.1016/j.jclinepi.2025.112017","DOIUrl":"10.1016/j.jclinepi.2025.112017","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 112017"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145310051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01. DOI: 10.1016/j.jclinepi.2025.112016
Martin Plöderl, Richard Lyus, Mark A. Horowitz, Joanna Moncrieff
Objectives
Fluoxetine is among the most widely used antidepressants for children and adolescents and is frequently recommended as the first-line pharmacological treatment for pediatric depression. However, in contrast to earlier studies and reviews, a Cochrane network meta-analysis from 2021 concluded that the estimated efficacy of fluoxetine was no longer clinically meaningful. We aimed to explain the discrepant findings between the recent Cochrane review and earlier reviews, and to explore whether this was acknowledged in guidelines and treatment recommendations appearing since then.
Study Design and Setting
Meta-analytical aggregation of trial results over time, exploring potential biases, and a nonsystematic search for recent treatment guidelines/recommendations from major medical organizations.
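As a concrete sketch of what "aggregation of trial results over time" can look like, the snippet below runs a cumulative random-effects (DerSimonian-Laird) meta-analysis, adding trials in order of publication year; every year, effect size, and variance is an invented placeholder, and the exact model used by the authors may differ.

```python
# Cumulative random-effects meta-analysis (DerSimonian-Laird), pooling trials
# in order of publication year. All numbers below are invented placeholders;
# this is a generic sketch, not the analysis reported in the article.
import numpy as np

def dersimonian_laird(effects, variances):
    """Pooled effect, its standard error, and tau^2 under a random-effects model."""
    effects, variances = np.asarray(effects, float), np.asarray(variances, float)
    w = 1.0 / variances                          # fixed-effect (inverse-variance) weights
    fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed) ** 2)       # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if len(effects) > 1 else 0.0
    w_re = 1.0 / (variances + tau2)              # random-effects weights
    pooled = np.sum(w_re * effects) / np.sum(w_re)
    return pooled, np.sqrt(1.0 / np.sum(w_re)), tau2

# Hypothetical trials: (year, standardized mean difference vs placebo, variance).
trials = [(1997, -0.60, 0.04), (2002, -0.45, 0.05), (2006, -0.30, 0.03),
          (2014, -0.15, 0.02), (2020, -0.10, 0.02)]
trials.sort(key=lambda t: t[0])

for k in range(1, len(trials) + 1):
    years, effects, variances = zip(*trials[:k])
    pooled, se, _ = dersimonian_laird(effects, variances)
    lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"up to {years[-1]}: pooled SMD = {pooled:+.2f} (95% CI {lo:+.2f} to {hi:+.2f})")
```

Each pass of the loop re-pools everything published up to a given year, so the printed sequence shows how the summary estimate drifts as newer trials enter the evidence base.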
Results
The estimated efficacy of fluoxetine in clinical trials declined over time into the range of clinical equivalence with placebo when more recent studies were included in the analyses and common thresholds of clinical significance were applied. This remains unacknowledged in treatment guidelines and related publications, including some that continue to recommend fluoxetine as first-line pharmacological treatment. Finally, we found that the apparent loss of efficacy over time is likely explained by biases such as novelty bias or by variation in expectancy effects.
Conclusion
The seeming lack of clinically meaningful efficacy of fluoxetine for the treatment of pediatric depression needs to be considered by those who develop treatment recommendations as well as by patients and clinicians. The biases we observed are relevant not only to the evaluation of fluoxetine and other antidepressants for pediatric depression but also to any new treatment.
{"title":"The loss of efficacy of fluoxetine in pediatric depression: explanations, lack of acknowledgment, and implications for other treatments","authors":"Martin Plöderl , Richard Lyus , Mark A. Horowitz , Joanna Moncrieff","doi":"10.1016/j.jclinepi.2025.112016","DOIUrl":"10.1016/j.jclinepi.2025.112016","url":null,"abstract":"<div><h3>Objectives</h3><div>Fluoxetine is among the most used antidepressants for children and adolescents and frequently recommended as first-line pharmacological treatment for pediatric depression. However, in contrast to earlier studies and reviews, a Cochrane network meta-analysis from 2021 concluded that the estimated efficacy of fluoxetine was no longer clinically meaningful. We aimed to explain the discrepant findings between the recent Cochrane review and earlier reviews, and to explore if this was acknowledged in guidelines and treatment recommendations appearing since then.</div></div><div><h3>Study Design and Setting</h3><div>Meta-analytical aggregation of trial results over time, exploring potential biases, and a nonsystematic search for recent treatment guidelines/recommendations from major medical organizations.</div></div><div><h3>Results</h3><div>The estimated efficacy of fluoxetine in clinical trials declined over time into the range of clinical equivalence with placebo when more recent studies were included in analyses and when considering common thresholds of clinical significance. This remains unacknowledged in treatment guidelines and related publications, including some that continue to recommend fluoxetine as first-line pharmacological treatment. Finally, we find that the loss of efficacy over time is likely explained by biases such as the novelty bias or by variations of expectancy effects.</div></div><div><h3>Conclusion</h3><div>The seeming lack of clinically meaningful efficacy of fluoxetine for the treatment of pediatric depression needs to be considered by those who develop treatment recommendations as well as by patients and clinicians. The biases we observed are not only relevant in the evaluation of fluoxetine and other antidepressants for pediatric depression, but also for any new treatment.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 112016"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145976110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01. DOI: 10.1016/j.jclinepi.2025.111960
Brett P. Dyer
It has been proposed that medical research questions can be categorised into three classes: causal, predictive, and descriptive. This distinction was proposed to encourage researchers to think clearly about how study design, analysis, interpretation, and clinical implications should differ according to the type of research question being investigated. This article highlights four common mistakes that remain in observational research regarding the classification of research questions as causal, predictive, or descriptive, and provides suggestions about how they may be rectified. The four common mistakes are (1) Adjustment for “confounders” in predictive and descriptive research, (2) Interpreting “effects” in prediction models, (3) The use of non-specific terminology that does not indicate which class of research question is being investigated, and (4) Prioritising parsimony over confounder adjustment in causal models.
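A small simulation can make mistake (2) concrete. In the sketch below (invented variables and effect sizes, not drawn from the article), the exposure X affects the outcome Y both directly and through a mediator M; a model including M predicts Y better, yet its coefficient on X reflects only the direct effect, so reading it as "the effect of X" would be wrong.

```python
# Simulated illustration of mistake (2): in a model built for prediction, the
# coefficient on an exposure need not equal its causal effect. Here X affects Y
# directly and through a mediator M; including M improves prediction but makes
# the X coefficient reflect only the direct effect (1.0), not the total effect (2.0).
# All variable names and effect sizes are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
m = 1.0 * x + rng.normal(size=n)             # mediator: X -> M
y = 1.0 * x + 1.0 * m + rng.normal(size=n)   # outcome: total effect of X on Y is 2.0

def ols(covariates, outcome):
    """Least-squares coefficients for a design matrix with an intercept prepended."""
    design = np.column_stack([np.ones(len(outcome)), *covariates])
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

causal_model = ols([x], y)         # X only: coefficient ~= 2.0 (total causal effect)
prediction_model = ols([x, m], y)  # X and M: X coefficient ~= 1.0, but better prediction
print(f"coefficient on X, X-only model:  {causal_model[1]:.2f}")
print(f"coefficient on X, X-and-M model: {prediction_model[1]:.2f}")
```

The same logic runs the other way for mistake (1): a model whose only job is to predict or describe has no need of the confounder-adjustment reasoning that a causal model requires.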
{"title":"The distinction between causal, predictive, and descriptive research—there is still room for improvement","authors":"Brett P. Dyer","doi":"10.1016/j.jclinepi.2025.111960","DOIUrl":"10.1016/j.jclinepi.2025.111960","url":null,"abstract":"<div><div>It has been proposed that medical research questions can be categorised into three classes: causal, predictive, and descriptive. This distinction was proposed to encourage researchers to think clearly about how study design, analysis, interpretation, and clinical implications should differ according to the type of research question being investigated. This article highlights four common mistakes that remain in observational research regarding the classification of research questions as causal, predictive, or descriptive, and provides suggestions about how they may be rectified. The four common mistakes are (1) Adjustment for “confounders” in predictive and descriptive research, (2) Interpreting “effects” in prediction models, (3) The use of non-specific terminology that does not indicate which class of research question is being investigated, and (4) Prioritising parsimony over confounder adjustment in causal models.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 111960"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144994314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01. DOI: 10.1016/j.jclinepi.2025.112024
Brian S. Alper, Joanne Dehnbostel, Holger Schünemann, Paul Whaley
{"title":"Resourcing and validation of the GRADE ontology: reply to Dedeepya et al.","authors":"Brian S. Alper, Joanne Dehnbostel, Holger Schünemann, Paul Whaley","doi":"10.1016/j.jclinepi.2025.112024","DOIUrl":"10.1016/j.jclinepi.2025.112024","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 112024"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145370543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Objectives
To 1) assess the frequency of overlapping systematic reviews (SRs) on the same topic, including overlap in outcomes, 2) assess whether SRs meet some key methodological characteristics, and 3) describe discrepancies in results.
Study Design and Setting
For this research-on-research study, we gathered a random sample of SRs with meta-analysis (MA) published in 2022, identified the questions they addressed and, for each question, searched all SRs with MA published from 2018 to 2023 to assess the frequency of overlap. We assessed whether SRs met a minimum set of six key methodological characteristics: protocol registration, search of major electronic databases, search of trial registries, double selection and extraction, use of the Cochrane Risk-of-Bias tool, and Grading of Recommendations, Assessment, Development, and Evaluations (GRADE) assessment.
Results
From a sample of 107 SRs with MA published in 2022, we extracted 105 different questions and identified 123 other SRs with MA published from 2018 to 2023. There were overlapping SRs for 33 questions (31.4%, 95% CI: 22.9–41.3), with a median of three overlapping SRs per question (IQR 2–6; range 2–19). Of the 230 SRs, 15 (6.5%) met the minimum set of six key methodological characteristics, and 12 (11.4%) questions had at least one SR meeting this criterion. Among the 33 questions with overlapping SRs, for 7 (21.2%), the SRs had discrepant results.
Conclusion
One-third of the SRs published in 2022 had at least one overlapping SR published from 2018 to 2023, and most did not meet a minimum set of methodological standards. For one-fifth of the questions with overlapping SRs, the reviews provided discrepant results.
{"title":"Systematic reviews on the same topic are common but often fail to meet key methodological standards: a research-on-research study","authors":"Wilfred Kwok , Titiane Dallant , Guillaume Martin, Gabriel Fournier, Blandine Kervennic, Ophélie Pingeon, Agnès Dechartres","doi":"10.1016/j.jclinepi.2025.112018","DOIUrl":"10.1016/j.jclinepi.2025.112018","url":null,"abstract":"<div><h3>Objectives</h3><div>To 1) assess the frequency of overlapping systematic reviews (SRs) on the same topic including overlap in outcomes, 2) assess whether SRs meet some key methodological characteristics, and 3) describe discrepancies in results.</div></div><div><h3>Study Design and Setting</h3><div>For this research-on-research study, we gathered a random sample of SRs with meta-analysis (MA) published in 2022, identified the questions they addressed and, for each question, searched all SRs with MA published from 2018 to 2023 to assess the frequency of overlap. We assessed whether SRs met a minimum set of six key methodological characteristics: protocol registration, search of major electronic databases, search of trial registries, double selection and extraction, use of the Cochrane Risk-of-Bias tool, and Grading of Recommendations, Assessment, Development, and Evaluations assessment.</div></div><div><h3>Results</h3><div>From a sample of 107 SRs with MA published in 2022, we extracted 105 different questions and identified 123 other SRs with MA published from 2018 to 2023. There were overlapping SRs for 33 questions (31.4%, 95% CI: 22.9–41.3), with a median of three overlapping SRs per question (IQR 2–6; range 2–19). Of the 230 SRs, 15 (6.5%) met the minimum set of six key methodological characteristics, and 12 (11.4%) questions had at least one SR meeting this criterion. Among the 33 questions with overlapping SRs, for 7 (21.2%), the SRs had discrepant results.</div></div><div><h3>Conclusion</h3><div>One-third of the SRs published in 2022 had at least one overlapping SR published from 2018 to 2023, and most did not meet a minimum set of methodological standards. For one-fifth of the questions, overlapping SRs provided discrepant results.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 112018"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145330918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-01. DOI: 10.1016/j.jclinepi.2025.112021
Marc Bennett Stone
{"title":"Statistical power is an essential element for replication","authors":"Marc Bennett Stone","doi":"10.1016/j.jclinepi.2025.112021","DOIUrl":"10.1016/j.jclinepi.2025.112021","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"189 ","pages":"Article 112021"},"PeriodicalIF":5.2,"publicationDate":"2026-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145423378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}