A protocol for the development of a core outcome set for adults with depression
Pub Date: 2025-12-29 | DOI: 10.1016/j.jclinepi.2025.112119
C. Veal, K.R. Krause, E.I. Fried, A. Cipriani, P. Cuijpers, J. Downs, T.A. Furukawa, G. Gartlehner, S.D. Hollon, H. Levy-Soussan, G. Sahlem, A. Tomlinson, S. Touboul, P. Ravaud, V.-T. Tran, A. Chevance
Background and Objective
Heterogeneous outcome measurement limits the comparison and combination of results from randomized controlled trials and observational studies aimed at evaluating therapeutic interventions for depression. We report here the protocol for the development of a Core Outcome Set (COS) for adults with depression.
Methods
Development will follow a multistep approach: (1) generating outcome domains that matter to people with lived experiences of depression, health care professionals, and carers through a large online international survey using open-ended questions; (2) selecting domains based on the preferences of key interest holders through an international online preference elicitation survey; and (3) identifying relevant outcome measures with measurement properties considered sufficient through several systematic reviews conducted according to COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) standards.
Discussion
The protocol describes a proof-of-concept approach to include large numbers of individuals from all key interest holder groups in COS development, which could be replicated in other conditions and contexts.
{"title":"A protocol for the development of a core outcome set for adults with depression","authors":"C. Veal , K.R. Krause , E.I. Fried , A. Cipriani , P. Cuijpers , J. Downs , T.A. Furukawa , G. Gartlehner , S.D. Hollon , H. Levy-Soussan , G. Sahlem , A. Tomlinson , S. Touboul , P. Ravaud , V.-T. Tran , A. Chevance","doi":"10.1016/j.jclinepi.2025.112119","DOIUrl":"10.1016/j.jclinepi.2025.112119","url":null,"abstract":"<div><h3>Background and Objective</h3><div>Heterogeneous outcome measurement limits the comparison and combination of results from randomized controlled trials and observational studies aimed at evaluating therapeutic interventions for depression. We report here the protocol for the development of a Core Outcome Set (COS) for adults with depression.</div></div><div><h3>Methods</h3><div>Development will follow a multistep approach with: (1) generating outcome domains that matter to people with lived experiences of depression, health care professionals, and carers through a large online international survey using open-ended questions; (2). selecting domains based on the preferences of key interest holders through an international online preference elicitation survey; and (3) identifying relevant outcome measures with measurement properties considered sufficient through several systematic reviews conducted according to COnsensus-based Standards for the selection of health Measurement INstruments standards.</div></div><div><h3>Discussion</h3><div>The protocol describes a proof-of-concept approach to include large numbers of individuals from all key interest holder groups in COS development, which could be replicated in other conditions and contexts.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112119"},"PeriodicalIF":5.2,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145879283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An imputation study shows that missing outcome data can substantially bias pooled estimates in systematic reviews of patient-reported outcomes
Pub Date: 2025-12-27 | DOI: 10.1016/j.jclinepi.2025.112120
Yanjiao Shen, Zhengchi Li, Xianlin Gu, Yifan Yao, Sameer Parpia, Diane Heels-Ansdell, Yaping Chang, Ying Wang, Qingyang Shi, Qiukui Hao, Sepideh Mardani Jadid, Tachit Jiravichitchai, Akira Kuriyama, Zuojia Shang, Yuting Wang, Yunli Zhao, Ya Gao, Liang Du, Jin Huang, Gordon Guyatt
Background and Objectives
Missing outcome data (hereafter referred to as "missing data," typically due to loss to follow-up) is a major problem in randomized controlled trials (RCTs) and systematic reviews of RCTs. While prior work has examined the impact of missing binary outcomes, the influence of missing continuous patient-reported outcome measures (PROMs) on pooled effect estimates remains poorly understood. We therefore assessed the risk of bias introduced by missing data in systematic reviews of PROMs.
Study Design and Setting
We selected a representative sample of 100 systematic reviews that included meta-analyses reporting a statistically significant effect on a continuous patient-reported efficacy outcome. We applied four increasingly stringent imputation strategies based on the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach, along with three alternative approaches for handling studies in which investigators had already imputed results for missing data. We also conducted Firth logistic regression analyses to identify factors associated with crossing the null after imputation.
Results
Results from the 100 systematic reviews, which included 1298 RCTs, proved similar across all three approaches to handling already-imputed data. Using the least stringent strategy for imputing missing data, the percentage of meta-analyses in which the 95% CI crossed the null was under 4%. Applying the next most stringent strategy, the percentage of CIs that crossed the null increased to 47.9%. Percentages increased only marginally for the two most stringent approaches, reaching 53.1% with the third strategy and 54.2% with the most stringent. Firth logistic regression identified two significant predictors of crossing the null after imputation: a higher average proportion of missing data (odds ratio [OR] 1.23, 95% CI 1.11–1.43 per 1% increase in missing data) and a larger magnitude of treatment effect, which was associated with lower odds of crossing the null (OR 0.70, 95% CI 0.39–0.91 per 1 standardized mean difference increase). Neither database type (Cochrane vs. non-Cochrane) nor duration of follow-up proved associated with the CI crossing the null.
Conclusion
Applying plausible imputation strategies to test the potential risk of bias from missing data in studies addressing treatment effects on PROMs caused the 95% CIs of a high proportion of meta-analyses initially suggesting benefit to cross the null. The greater the proportion of missing data and the smaller the treatment effect, the more likely the CI crossed the null. Systematic review authors may consider formally testing the robustness of their results with respect to missing data.
Plain Language Summary
When studies included in a systematic review have missing outcome data, the study results may be biased and therefore misleading.
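The GRADE-based strategies operate on whole meta-analyses, but the core move can be sketched for a single hypothetical trial. In the illustrative Python sketch below (the function, numbers, and "shift" parameterization are mine, not the paper's), participants lost to follow-up in the intervention arm are assumed to have scored worse than those observed by a growing fraction of the pooled SD, and the 95% CI of the mean difference is recomputed to see when it crosses the null.

```python
# Illustrative sensitivity analysis for missing continuous outcome data in a
# single hypothetical trial (higher scores = better health). Not the authors'
# code: the "shift" parameter stands in for increasingly stringent
# GRADE-style imputation assumptions.
import numpy as np

def imputed_mean_diff(m_t, sd_t, n_t, miss_t, m_c, sd_c, n_c, miss_c, shift):
    """Recompute the mean difference assuming intervention-arm participants
    lost to follow-up scored `shift` pooled-SDs worse than those observed,
    while missing control participants resembled observed controls."""
    sd_pool = np.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    m_t_all = (n_t * m_t + miss_t * (m_t - shift * sd_pool)) / (n_t + miss_t)
    m_c_all = (n_c * m_c + miss_c * m_c) / (n_c + miss_c)
    md = m_t_all - m_c_all
    se = sd_pool * np.sqrt(1 / (n_t + miss_t) + 1 / (n_c + miss_c))
    lo, hi = md - 1.96 * se, md + 1.96 * se
    return md, lo, hi, lo <= 0 <= hi  # True if the 95% CI now crosses the null

# Hypothetical trial showing benefit, with roughly 15% loss to follow-up
for shift in [0.0, 0.25, 0.5, 1.0]:  # increasingly stringent assumptions
    md, lo, hi, crosses = imputed_mean_diff(62, 18, 85, 15, 55, 18, 88, 12, shift)
    print(f"shift={shift:4.2f} SD: MD={md:5.2f} (95% CI {lo:5.2f} to {hi:5.2f}), crosses null: {crosses}")
```

Applied to a pooled estimate rather than a single trial, the same check is what produces percentages like the 4% to 54.2% range reported above.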
{"title":"An imputation study shows that missing outcome data can substantially bias pooled estimates in systematic reviews of patient-reported outcomes","authors":"Yanjiao Shen , Zhengchi Li , Xianlin Gu , Yifan Yao , Sameer Parpia , Diane Heels-Ansdell , Yaping Chang , Ying Wang , Qingyang Shi , Qiukui Hao , Sepideh Mardani Jadid , Tachit Jiravichitchai , Akira Kuriyama , Zuojia Shang , Yuting Wang , Yunli Zhao , Ya Gao , Liang Du , Jin Huang , Gordon Guyatt","doi":"10.1016/j.jclinepi.2025.112120","DOIUrl":"10.1016/j.jclinepi.2025.112120","url":null,"abstract":"<div><h3>Background and Objectives</h3><div>Missing outcome data (hereafter referred to as “missing data,” typically due to loss to follow-up) is a major problem in randomized controlled trials (RCTs) and systematic reviews of RCTs. While prior work has examined the impact of missing binary outcomes, the influence of missing continuous patient-reported outcome measures (PROMs) on pooled effect estimates remains poorly understood. We therefore assessed the risk of bias introduced by missing data in systematic reviews of PROMs.</div></div><div><h3>Study Design and Setting</h3><div>We selected a representative sample of 100 systematic reviews that included meta-analyses reporting a statistically significant effect on a continuous patient-reported efficacy outcome. We applied four increasingly stringent imputation strategies based on the grading of recommendations assessment, development, and evaluation (GRADE) approach, along with three alternative approaches for handling studies in which investigators had already imputed results for missing data. We also conducted Firth logistic regression analyses to identify factors associated with crossing the null after imputation.</div></div><div><h3>Results</h3><div>Results from 100 systematic reviews that included 1298 RCTs proved similar across all three approaches to addressing imputed data. Using the least stringent strategy for imputing missing data, the percentage of meta-analyses in which the 95% CI crossed the null proved under 4%. Applying the next most stringent strategy, the percentage of CIs that crossed the null increased to 47.9%. Percentages crossing the null increased only marginally for the two most stringent approaches, crossing up to 53.1% in the next most stringent and 54.2% in the most stringent. Firth logistic regression identified two significant predictors of crossing the null after imputation: a higher average missing data (odds ratio [OR] 1.23, 95% CI: 1.11–1.43 per 1% increase in missing data) and a larger magnitude of the treatment effect, which was associated with lower odds of crossing the null (OR 0.70, 95% CI: 0.39–0.91 per 1 standardized mean difference increase). Neither database type (Cochrane vs. non-Cochrane) nor duration of follow-up proved associated with CI crossing the null.</div></div><div><h3>Conclusion</h3><div>A plausible imputation approach to test the potential risk of bias as a result of missing data in studies addressing treatment effects on PROMs resulted in 95% CIs in a high proportion of studies initially suggesting benefit crossing the null. The greater the proportion of missing data and the smaller the treatment effect, the more likely the CI crossed the null. 
Systematic review authors may consider formally testing the robustness of their results with respect to missing data.</div></div><div><h3>Plain Language Summary</h3><div>When studies included in a systematic review have missing outcome data, the study results may be biased and therefore misleading. I","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112120"},"PeriodicalIF":5.2,"publicationDate":"2025-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145858961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Defining survival epidemiology: postdiagnosis population science for people living with disease
Pub Date: 2025-12-27 | DOI: 10.1016/j.jclinepi.2025.112122
Raphael E. Cuomo
Objectives
Epidemiology is largely organized to explain who becomes ill, yet many clinical and public health decisions occur after diagnosis. I introduce and formally define survival epidemiology as a new branch of science focused on assessing how people live longer and better with established disease, and I argue that prevention estimates should not be assumed to apply postdiagnosis.
Study Design and Setting
Conceptual and methodological commentary synthesizing evidence across cardiovascular, renal, oncologic, pulmonary, and hepatic conditions and integrating causal-inference and time-to-event principles for postdiagnosis questions.
Results
Across diseases, associations measured for incidence often fail to reproduce, and sometimes reverse, among patients with established disease. Diagnosis acts as a causal threshold that changes time scales and bias structures, including conditioning on disease (collider stratification), time-dependent confounding, immortal time bias, and reverse causation. Credible postdiagnosis inference requires designs that emulate randomized trials; explicit alignment of time zero with clinical decision points; strategies defined as used in practice; and handling of competing risks, multistate transitions, and longitudinal biomarkers (including joint models when appropriate). Essential postdiagnosis data include stage, molecular subtype, prior therapy lines, dose intensity and modifications, adverse events, performance status, and patient-reported outcomes. Recommended practice is parallel estimation of prevention and postdiagnosis survival effects for the same exposure–disease pairs and routine reporting of heterogeneity by stage, subtype, treatment pathway, and time since diagnosis.
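The collider-stratification mechanism can be made concrete with a toy simulation of my own (not drawn from the article): an exposure that raises disease incidence but has no effect on death after diagnosis nevertheless looks protective among diagnosed patients, because conditioning on disease induces a negative association between the exposure and an unmeasured cause of both disease and death.

```python
# Minimal collider-stratification simulation (illustrative, not from the paper).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.binomial(1, 0.5, n)        # exposure (eg, a lifestyle factor)
u = rng.binomial(1, 0.5, n)        # unmeasured cause of disease and death

# Disease incidence depends on both x and u
d = rng.binomial(1, 0.05 + 0.10 * x + 0.10 * u)
# Death after diagnosis depends on u only; x has no true survival effect
death = rng.binomial(1, 0.20 + 0.30 * u)

# Among the diagnosed, x and u become negatively associated (the collider),
# so x appears "protective" for postdiagnosis death.
sel = d == 1
print(f"P(U=1 | D=1, X=1) = {u[sel & (x == 1)].mean():.3f}  vs  "
      f"P(U=1 | D=1, X=0) = {u[sel & (x == 0)].mean():.3f}")
print(f"postdiagnosis death risk: X=1 {death[sel & (x == 1)].mean():.3f}  vs  "
      f"X=0 {death[sel & (x == 0)].mean():.3f}")
```

With these parameters, the unmeasured factor is rarer among diagnosed exposed patients (about 0.62 vs 0.75), so postdiagnosis death risk appears lower with exposure (about 0.39 vs 0.43) despite no true survival effect.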
Conclusion
Prevention and postdiagnosis survival are distinct inferential targets. Journals should require clarity on whether claims pertain to prevention or survival and report target-trial elements; guideline bodies should distinguish prevention from survival recommendations when evidence allows; and funders, training programs, and public communication should support survival-focused methods, data standards, and context-specific messaging for people living with disease.
{"title":"Defining survival epidemiology: postdiagnosis population science for people living with disease","authors":"Raphael E. Cuomo","doi":"10.1016/j.jclinepi.2025.112122","DOIUrl":"10.1016/j.jclinepi.2025.112122","url":null,"abstract":"<div><h3>Objectives</h3><div>Epidemiology is largely organized to explain who becomes ill, yet many clinical and public health decisions occur after diagnosis. I introduce and formally define survival epidemiology as a new branch of science focused on assessing how people live longer and better with established disease, and I provide justification that prevention estimates should not be assumed to apply postdiagnosis.</div></div><div><h3>Study Design and Setting</h3><div>Conceptual and methodological commentary synthesizing evidence across cardiovascular, renal, oncologic, pulmonary, and hepatic conditions and integrating causal-inference and time-to-event principles for postdiagnosis questions.</div></div><div><h3>Results</h3><div>Across diseases, associations measured for incidence often fail to reproduce, and sometimes reverse, among patients with established disease. Diagnosis acts as a causal threshold that changes time scales and bias structures, including conditioning on disease (collider stratification), time-dependent confounding, immortal time bias, and reverse causation. Credible postdiagnosis inference requires designs that emulate randomized trials; explicit alignment of time zero with clinical decision points; strategies defined as used in practice; and handling of competing risks, multistate transitions, and longitudinal biomarkers (including joint models when appropriate). Essential postdiagnosis data include stage, molecular subtype, prior therapy lines, dose intensity and modifications, adverse events, performance status, and patient-reported outcomes. Recommended practice is parallel estimation of prevention and postdiagnosis survival effects for the same exposure–disease pairs and routine reporting of heterogeneity by stage, subtype, treatment pathway, and time since diagnosis.</div></div><div><h3>Conclusion</h3><div>Prevention and postdiagnosis survival are distinct inferential targets. Journals should require clarity on whether claims pertain to prevention or survival and report target-trial elements; guideline bodies should distinguish prevention from survival recommendations when evidence allows; and funders, training programs, and public communication should support survival-focused methods, data standards, and context-specific messaging for people living with disease.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112122"},"PeriodicalIF":5.2,"publicationDate":"2025-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145859002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Systematic reviews of quasi-experimental studies: challenges and considerations
Pub Date: 2025-12-27 | DOI: 10.1016/j.jclinepi.2025.112121
Sarah B. Windle, Sam Harper, Jasleen Arneja, Peter Socha, Arijit Nandi
Background
In contrast to other observational study designs, quasi-experimental approaches (eg, difference-in-differences, interrupted time series, regression discontinuity, instrumental variable, synthetic control) account for some sources of unmeasured confounding and can estimate causal effects under weaker assumptions. Studies applying quasi-experimental approaches have grown increasingly popular in recent decades; therefore, investigators conducting systematic reviews of observational studies, particularly in biomedical, public health, or epidemiologic content areas, must be prepared to encounter and appropriately assess these approaches.
Objective
Our objective is to describe key methodological challenges and considerations for systematic reviews that include quasi-experimental studies, with attention to current recommendations and approaches that have been applied in previous reviews.
Conclusion
Recommendations for authors of systematic reviews: We recommend that individuals conducting systematic reviews that include quasi-experimental studies: (1) search a broad range of bibliographic databases and gray literature, including preprint repositories; (2) avoid search strategies that require specific study design terms for identification, given inconsistent nomenclature and poor database indexing for quasi-experimental studies; (3) ensure that their review team includes several individuals with expertise in quasi-experimental designs for screening and risk of bias assessment in duplicate; (4) use an approach to risk of bias assessment that is sufficiently granular to identify the studies most likely to report unbiased estimates of causal effects (eg, a modified Risk Of Bias In Nonrandomized Studies of Interventions [ROBINS-I] tool); and (5) consider the implications of varied estimands when interpreting estimates from different quasi-experimental designs. Researchers may also consider restricting systematic review inclusion to quasi-experimental studies for feasibility when addressing research questions with large bodies of literature. However, a more inclusive approach is preferred, as well-designed studies using a variety of methodological approaches may be more credible than a quasi-experiment that violates causal assumptions.
Recommendations for the research community: Many of the challenges faced in conducting systematic reviews of quasi-experimental studies would be ameliorated by improved consistency in nomenclature, as well as greater transparency from authors in describing their research designs. The broader community (eg, research networks, journals) should consider the creation and implementation of reporting standards and protocol registration for quasi-experimental studies to improve study identification in systematic reviews.
{"title":"Systematic reviews of quasi-experimental studies: challenges and considerations","authors":"Sarah B. Windle , Sam Harper , Jasleen Arneja , Peter Socha , Arijit Nandi","doi":"10.1016/j.jclinepi.2025.112121","DOIUrl":"10.1016/j.jclinepi.2025.112121","url":null,"abstract":"<div><h3>Background</h3><div>In contrast to other observational study designs, quasi-experimental approaches (eg, difference-in-differences, interrupted time series, regression discontinuity, instrumental variable, synthetic control) account for some sources of unmeasured confounding and can estimate causal effects under weaker assumptions. Studies which apply quasi-experimental approaches have increased in popularity in recent decades, therefore investigators conducting systematic reviews of observational studies, particularly in biomedical, public health, or epidemiologic content areas, must be prepared to encounter and appropriately assess these approaches.</div></div><div><h3>Objective</h3><div>Our objective is to describe key methodological challenges and considerations for systematic reviews including quasi-experimental studies, with attention to current recommendations and approaches which have been applied in previous reviews.</div></div><div><h3>Conclusion</h3><div><em>Recommendations for authors of systematic reviews:</em> We recommend that individuals conducting systematic reviews including quasi-experimental studies: (1) search a broad range of bibliographic databases and gray literature, including preprint repositories; (2) do not use search strategies which require specific terms for study design for identification, given inconsistent nomenclature and poor database indexing for quasi-experimental studies; (3) ensure that their review team includes several individuals with expertise in quasi-experimental designs for screening and risk of bias assessment in duplicate; (4) use an approach to risk of bias assessment which is sufficiently granular to identify studies most likely to report unbiased estimates of causal effects (eg, modified Risk Of Bias In Nonrandomized Studies - of Interventions); and (5) consider the implications of varied estimands when interpreting estimates from different quasi-experimental designs. Researchers may also consider restricting systematic review inclusion to quasi-experimental studies for feasibility when addressing research questions with large bodies of literature. However, a more inclusive approach is preferred, as well-designed studies using a variety of methodological approaches may be more credible than a quasi-experiment which violates causal assumptions.</div><div><em>Recommendations for the research community:</em> Many of the challenges faced in conducting systematic reviews of quasi-experimental studies would be ameliorated by improved consistency in nomenclature, as well as greater transparency from authors in describing their research designs. 
The broader community (eg, research networks, journals) should consider the creation and implementation of reporting standards and protocol registration for quasi-experimental studies to improve study identification in systematic reviews.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112121"},"PeriodicalIF":5.2,"publicationDate":"2025-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145858997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corrigendum to "Impact of active placebo controls on estimated drug effects in randomized trials: a meta-epidemiological study" [Journal of Clinical Epidemiology 188 (2025) 111998]
Pub Date: 2025-12-26 | DOI: 10.1016/j.jclinepi.2025.112091
David Ruben Teindl Laursen, Mihaela Ivosevic Broager, Mathias Weis Damkjær, Andreas Halgreen Eiset, Mia Elkjær, Erlend Faltinsen, Ingrid Rose MacLean-Nyegaard, Camilla Hansen Nejstgaard, Asger Sand Paludan-Müller, Lasse Adrup Benné Petersen, Søren Viborg Vestergaard, Asbjørn Hróbjartsson
{"title":"Corrigendum to \"Impact of active placebo controls on estimated drug effects in randomized trials: a meta-epidemiological study\" [Journal of Clinical Epidemiology 188 (2025) 111998].","authors":"David Ruben Teindl Laursen, Mihaela Ivosevic Broager, Mathias Weis Damkjær, Andreas Halgreen Eiset, Mia Elkjær, Erlend Faltinsen, Ingrid Rose MacLean-Nyegaard, Camilla Hansen Nejstgaard, Asger Sand Paludan-Müller, Lasse Adrup Benné Petersen, Søren Viborg Vestergaard, Asbjørn Hróbjartsson","doi":"10.1016/j.jclinepi.2025.112091","DOIUrl":"https://doi.org/10.1016/j.jclinepi.2025.112091","url":null,"abstract":"","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":" ","pages":"112091"},"PeriodicalIF":5.2,"publicationDate":"2025-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145846980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adherence to TRIPOD+AI guideline: an updated reporting assessment tool
Pub Date: 2025-12-23 | DOI: 10.1016/j.jclinepi.2025.112118
Emilie de Kanter, Tabea Kaul, Pauline Heus, Tom M. de Groot, René Harmen Kuijten, Johannes B. Reitsma, Gary S. Collins, Lotty Hooft, Karel G.M. Moons, Johanna A.A. Damen
Objectives
Incomplete reporting of research limits its usefulness and contributes to research waste. Numerous reporting guidelines have been developed to support complete and accurate reporting of health-care research studies. Completeness of reporting can be measured by evaluating adherence to reporting guidelines. However, assessing adherence to a reporting guideline often lacks uniformity. In 2019, we developed a reporting adherence tool for the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. With recent advances in regression and artificial intelligence (AI)/machine learning (ML)–based methods, TRIPOD+AI (www.tripod-statement.org) was developed to replace the TRIPOD statement. The aim of this study was to develop an updated adherence tool for TRIPOD+AI.
Study Design and Setting
Based on the TRIPOD+AI full reporting guideline, including the accompanying explanation and elaboration light, and TRIPOD+AI for abstracts, we updated and expanded the original TRIPOD adherence tool and refined the adherence elements and their scoring rules through discussions within the author team and a pilot test.
Results
The updated tool comprises 37 main items and 136 adherence elements and includes several automated scoring rules. We developed separate TRIPOD+AI adherence tools for model development, model evaluation, and studies describing both in a single paper.
Conclusion
A uniform approach to assessing reporting adherence to TRIPOD+AI allows comparisons across fields, enables monitoring of reporting over time, and incentivizes primary study authors to comply.
Plain Language Summary
Accurate and complete reporting is crucial in biomedical research to ensure findings can be effectively used. To support researchers in reporting their findings well, reporting guidelines have been developed for different study types. One such guideline is TRIPOD, which focuses on research studies about medical prediction tools. In 2024, TRIPOD was updated to TRIPOD+AI to address the increasing use of AI and ML in prediction model studies. In 2019, we developed a scoring system to evaluate how well research papers on prediction tools adhered to the TRIPOD guideline, resulting in a reporting completeness score. This score allows for easier comparison of reporting completeness across various medical fields and for monitoring improvement in reporting over time. With the introduction of TRIPOD+AI, an update of the scoring system was required to align with the new reporting recommendations. We achieved this by reviewing our previous scoring system and incorporating the new items from TRIPOD+AI to better suit studies involving AI. We believe that this system will facilitate comparisons of prediction model reporting completeness.
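To illustrate the kind of element-level, partly automated scoring such a tool standardizes, here is a minimal hypothetical sketch; the item names, applicability handling, and equal-weight aggregation are invented for demonstration and are not the TRIPOD+AI adherence tool's actual rules.

```python
# Hypothetical element-level adherence scoring (illustration only; not the
# TRIPOD+AI adherence tool's actual items or rules).
from dataclasses import dataclass

@dataclass
class Element:
    item: str                 # main reporting item the element belongs to
    adhered: bool             # was the element reported?
    applicable: bool = True   # "not applicable" elements are excluded

def adherence_scores(elements):
    """Per-item adherence = adhered / applicable elements; overall score =
    unweighted mean across items (one possible aggregation choice)."""
    by_item = {}
    for e in elements:
        if not e.applicable:
            continue
        num, den = by_item.get(e.item, (0, 0))
        by_item[e.item] = (num + int(e.adhered), den + 1)
    item_scores = {item: num / den for item, (num, den) in by_item.items()}
    overall = sum(item_scores.values()) / len(item_scores)
    return item_scores, overall

elements = [
    Element("Title", True), Element("Title", False),
    Element("Outcome definition", True),
    Element("Fairness assessment", False, applicable=False),  # auto-scored N/A
]
print(adherence_scores(elements))  # ({'Title': 0.5, 'Outcome definition': 1.0}, 0.75)
```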
{"title":"Adherence to TRIPOD+AI guideline: an updated reporting assessment tool","authors":"Emilie de Kanter , Tabea Kaul , Pauline Heus , Tom M. de Groot , René Harmen Kuijten , Johannes B. Reitsma , Gary S. Collins , Lotty Hooft , Karel G.M. Moons , Johanna A.A. Damen","doi":"10.1016/j.jclinepi.2025.112118","DOIUrl":"10.1016/j.jclinepi.2025.112118","url":null,"abstract":"<div><h3>Objectives</h3><div>Incomplete reporting of research limits its usefulness and contributes to research waste. Numerous reporting guidelines have been developed to support complete and accurate reporting of health-care research studies. Completeness of reporting can be measured by evaluating the adherence to reporting guidelines. However, assessing adherence to a reporting guideline often lacks uniformity. In 2019, we developed a reporting adherence tool for the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. With recent advances in regression and artificial intelligence (AI)/machine learning (ML)–based methods, TRIPOD + AI (<span><span>www.tripod-statment.org</span><svg><path></path></svg></span>) was developed to replace the TRIPOD statement. The aim of this study was to develop an updated adherence tool for TRIPOD + AI.</div></div><div><h3>Study Design and Setting</h3><div>Based on the TRIPOD + AI full reporting guideline, including the accompanying explanation and elaboration light, and TRIPOD + AI for abstracts, we updated and expanded the original TRIPOD adherence tool and refined the adherence elements and their scoring rules through discussions within the author team and a pilot test.</div></div><div><h3>Results</h3><div>The updated tool comprises of 37 main items and 136 adherence elements and includes several automated scoring rules. We developed separate TRIPOD + AI adherence tools for model development, model evaluation, and for studies describing both in a single paper.</div></div><div><h3>Conclusion</h3><div>A uniform approach to assessing reporting adherence of TRIPOD + AI allows for comparisons across various fields, monitor reporting over time, and incentivizes primary study authors to comply.</div></div><div><h3>Plain Language Summary</h3><div>Accurate and complete reporting is crucial in biomedical research to ensure findings can be effectively used. To support researchers in reporting their findings well, reporting guidelines have been developed for different study types. One such guideline is TRIPOD, which focuses on research studies about medical prediction tools. In 2024, TRIPOD was updated to TRIPOD + AI to address the increasing use of AI and ML in prediction model studies. In 2019, we developed a scoring system to evaluate how well research papers on prediction tools adhered to the TRIPOD guideline, resulting in a reporting completeness score. This score allows for easier comparison of reporting completeness across various medical fields, and to monitor improvement in reporting over time. With the introduction of TRIPOD + AI, an update of the scoring system was required to align with the new reporting recommendations. We achieved this by reviewing our previous scoring system and incorporating the new items from TRIPOD + AI to better suit studies involving AI. 
We believe that this system will facilitate comparisons of prediction model reporting co","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112118"},"PeriodicalIF":5.2,"publicationDate":"2025-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145835262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The interplay between PROM score distributions and treatment effect detection likelihood in randomized controlled trials – a metaepidemiologic study
Pub Date: 2025-12-19 | DOI: 10.1016/j.jclinepi.2025.112114
Valtteri Panula, Antti Saarinen, Matias Vaajala, Rasmus Liukkonen, Oskari Pakarinen, Juho Laaksonen, Ville Ponkilainen, Ilari Kuitunen, Mikko Uimonen
Objectives
We hypothesized that, in musculoskeletal randomized controlled trials (RCTs) using patient-reported outcome measures (PROMs), higher baseline scores and the clustering of follow-up scores near the upper bound (ie, ceiling effect) compress variability and attenuate measurable between-group differences, thereby lowering the likelihood of observing a statistically significant effect. We therefore examined how score distributions at pretreatment and follow-up influence the likelihood of detecting between-group differences.
Study Design and Setting
We conducted a metaepidemiologic study of RCTs, published between 2015 and 2024, that compared treatment effects on musculoskeletal disorders between two study groups using PROMs. The observed distributions of the PROM scores at baseline and follow-up were collected from the included studies. All PROM scores were rescaled to 0-100 with higher scores indicating better health. The likelihood of observing a statistically significant difference in PROM scores between the study groups was examined by calculating the score difference required to achieve a P value <.05.
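On one reading of this methods description, the required score difference comes from inverting a two-sample test. The sketch below (assumptions and numbers mine, not the authors' code) computes the smallest between-group difference on the 0-100 rescaled PROM that reaches P < .05 given the observed SDs and group sizes; when follow-up scores bunch near the ceiling, the headroom left on the scale may be smaller than even this required difference.

```python
# Sketch of the detectable-difference calculation (one plausible reading of
# the methods, not the authors' code): smallest between-group difference on a
# 0-100 rescaled PROM reaching P < .05 in a two-sample t test.
import numpy as np
from scipy import stats

def min_detectable_difference(sd1, n1, sd2, n2, alpha=0.05):
    se = np.sqrt(sd1**2 / n1 + sd2**2 / n2)
    df = n1 + n2 - 2  # equal-variance Student t degrees of freedom
    return stats.t.ppf(1 - alpha / 2, df) * se

# Hypothetical groups of 60: wide vs ceiling-compressed follow-up scores
print(min_detectable_difference(20, 60, 20, 60))  # ~7.2 points required
print(min_detectable_difference(10, 60, 10, 60))  # ~3.6 points required
# With group means already near 90/100, even 3.6 points of separation
# may be hard to realize on the bounded scale.
```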
Results
A total of 255 RCTs were included. PROM scores improved from baseline to follow-up in most studies (98%), with a mean change of +28 points. The correlation coefficient between the mean baseline score and the mean score change was −0.66 (95% CI −0.72 to −0.59), indicating that higher baseline scores were associated with smaller score changes. In addition, there was a moderate correlation between the mean and SD of PROM scores at follow-up (−0.39; 95% CI −0.48 to −0.28). The mean likelihood of detecting a difference was 65% (SD 11%) at baseline and 65% (SD 11%) at follow-up. The likelihood reached the 80% benchmark in only 8.5% and 8.1% of the studies at baseline and follow-up, respectively.
Conclusion
The concentration of PROM score distributions toward the high end of the scale, especially when higher baseline scores are present, diminishes the likelihood of detecting significant differences between study groups, particularly at follow-up assessments in studies analyzing musculoskeletal complaints. This underscores the importance of critically evaluating the conclusions drawn from these studies.
The use of guidelines in multimorbidity-related practice: an exploratory questionnaire survey
Pub Date: 2025-12-19 | DOI: 10.1016/j.jclinepi.2025.112115
Zijun Wang, Hongfeng He, Sergey K. Zyryanov, Liliya E. Ziganshina, Akihiko Ozaki, Natalia Dorofeeva, Myeong Soo Lee, Ivan D. Florez, Etienne Ngeh, Abhilasha Sharma, Ekaterina V. Yudina, Barbara C. van Munster, Jako S. Burgers, Opeyemi O. Babatunde, Yaolong Chen, Janne Estill
Objectives
The use of guidelines in multimorbidity-related practice has not yet been extensively investigated. We aimed to explore how health-care professionals use guidelines when managing individuals with multimorbidity.
Methods
We conducted an exploratory survey among a convenience sample of medical professionals with clinical experience. The questionnaire addressed whether and how different types of guidelines are used in multimorbidity-related practice, the reasons for not using specific types of guidelines, and other approaches to inform multimorbidity practice. It was distributed through the investigators’ contact networks. The results were presented descriptively.
Results
We received 311 valid responses: 136 from the World Health Organization European Region, 137 from the Western Pacific Region, and 38 from other regions. Most participants were familiar with the concept of multimorbidity (n = 245, 79%). Among the 269 respondents who reported using guidelines in multimorbidity practice, 124 (46%) used guidelines specifically focusing on combinations of diseases, and 148 (55%) used multiple single-disease guidelines together. Lack of availability was the main reason for not using guidelines that address multimorbidity itself; the high number of guidelines (n = 76, 40%) and possible interactions between conditions or treatments (n = 62, 38%) were the main reasons for not using single-disease guidelines. Respondents frequently consult experts or refer to systematic reviews and primary studies when existing guidelines do not meet their needs. The development of a tool or method to guide the use of multiple guidelines ranked highest among possible actions to improve multimorbidity practice.
Conclusion
Although the medical professionals in our sample were generally familiar with the use of guidelines, there are many unmet needs and tool gaps related to guideline-informed multimorbidity-related practice.
{"title":"The use of guidelines in multimorbidity-related practice: an exploratory questionnaire survey","authors":"Zijun Wang , Hongfeng He , Sergey K. Zyryanov , Liliya E. Ziganshina , Akihiko Ozaki , Natalia Dorofeeva , Myeong Soo Lee , Ivan D. Florez , Etienne Ngeh , Abhilasha Sharma , Ekaterina V. Yudina , Barbara C. van Munster , Jako S. Burgers , Opeyemi O. Babatunde , Yaolong Chen , Janne Estill","doi":"10.1016/j.jclinepi.2025.112115","DOIUrl":"10.1016/j.jclinepi.2025.112115","url":null,"abstract":"<div><h3>Objectives</h3><div>The use of guidelines in multimorbidity-related practice has not yet been extensively investigated. We aimed to explore how health-care professionals use guidelines when managing individuals with multimorbidity.</div></div><div><h3>Methods</h3><div>We conducted an exploratory survey among a convenience sample of medical professionals with clinical experience. The questionnaire addressed whether and how different types of guidelines are used in multimorbidity-related practice, the reasons for not using specific types of guidelines, and other approaches to inform multimorbidity practice. It was distributed through the investigators’ contact networks. The results were presented descriptively.</div></div><div><h3>Results</h3><div>We received 311 valid responses: 136 from the World Health Organization European Region, 137 from the Western Pacific Region, and 38 from other regions. Most participants were familiar with the concept of multimorbidity (<em>n</em> = 245, 79%). Among the 269 respondents who reported using guidelines in multimorbidity practice, 124 (46%) used guidelines specifically focusing on combinations of diseases, and 148 (55%) multiple single-disease guidelines together. Lack of availability was the main reason for not using guidelines that address multimorbidity itself, and the high number of guidelines (<em>n</em> = 76, 40%) and possible interactions between conditions or treatments (<em>n</em> = 62, 38%) for not using single-disease guidelines. Respondents frequently consult experts or refer to systematic reviews and primary studies when existing guidelines do not meet their needs. The development of a tool or method to guide the use of multiple guidelines ranked highest among possible actions to improve multimorbidity practice.</div></div><div><h3>Conclusion</h3><div>Although the medical professionals in our sample were generally familiar with the use of guidelines, there are many unmet needs and tool gaps related to guideline-informed multimorbidity-related practice.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112115"},"PeriodicalIF":5.2,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145806152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sequential sample size calculations and learning curves safeguard the robust development of a clinical prediction model for individuals
Pub Date: 2025-12-19 | DOI: 10.1016/j.jclinepi.2025.112117
Amardeep Legha, Joie Ensor, Rebecca Whittle, Lucinda Archer, Ben Van Calster, Evangelia Christodoulou, Kym I.E. Snell, Mohsen Sadatsafavi, Gary S. Collins, Richard D. Riley
Background and Objectives
When recruiting participants to a new study developing a clinical prediction model (CPM), sample size calculations are typically conducted before data collection based on sensible assumptions. This leads to a fixed sample size, but if the assumptions are inaccurate, the actual sample size required to develop a reliable model may be higher or even lower. To safeguard against this, adaptive sample size approaches have been proposed, based on sequential evaluation of (changes in) a model's predictive performance. The objective of this study was to illustrate and extend sequential sample size calculations for CPM development by (i) proposing stopping rules for prospective data collection based on minimizing uncertainty (instability) and misclassification of individual-level predictions and (ii) showcasing how the approach safeguards against inaccurate fixed sample size calculations.
Methods
The sequential approach repeats the predefined model development strategy every time a chosen number (eg, 100) of additional participants are recruited and adequately followed up. At each stage, CPM performance is evaluated using bootstrapping, leading to prediction and classification stability statistics and plots, alongside optimism-adjusted measures of calibration and discrimination. Learning curves display the trend of results against sample size, and recruitment is stopped when a chosen stopping rule is met.
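A minimal sketch of such a sequential stopping rule, under assumptions of my own (simulated data, a plain logistic model, and a single criterion of bootstrap-corrected calibration slope ≥0.9), might look as follows; the authors' procedure additionally tracks individual-level prediction and classification instability.

```python
# Illustrative sequential stopping rule on simulated data (assumptions mine,
# not the authors' code): recruit in batches of 100, refit the model, and stop
# once the bootstrap-corrected calibration slope reaches 0.9.
import numpy as np
from sklearn.linear_model import LogisticRegression  # penalty=None needs sklearn >= 1.2

rng = np.random.default_rng(1)

def simulate_batch(n, n_pred=10):
    """Stand-in for recruiting n new participants with adequate follow-up."""
    X = rng.normal(size=(n, n_pred))
    lp = -2 + X @ np.linspace(0.1, 0.5, n_pred)
    return X, rng.binomial(1, 1 / (1 + np.exp(-lp)))

def calibration_slope(y, p):
    """Slope from a logistic recalibration of y on the linear predictor."""
    lp = np.log(p / (1 - p)).reshape(-1, 1)
    return LogisticRegression(penalty=None, max_iter=1000).fit(lp, y).coef_[0, 0]

def bootstrap_corrected_slope(X, y, B=100):
    """Mean slope of bootstrap-refitted models evaluated on the original data;
    for maximum-likelihood logistic regression (apparent slope = 1) this
    approximates the optimism-corrected calibration slope."""
    slopes = []
    for _ in range(B):
        idx = rng.integers(0, len(y), len(y))
        m = LogisticRegression(penalty=None, max_iter=1000).fit(X[idx], y[idx])
        p = np.clip(m.predict_proba(X)[:, 1], 1e-6, 1 - 1e-6)
        slopes.append(calibration_slope(y, p))
    return float(np.mean(slopes))

X, y = simulate_batch(100)
while True:
    slope = bootstrap_corrected_slope(X, y)
    print(f"n={len(y)}: bootstrap-corrected calibration slope = {slope:.2f}")
    if slope >= 0.9 or len(y) >= 3000:   # stopping rule, with a safety cap
        break
    X_new, y_new = simulate_batch(100)   # recruit the next batch of 100
    X, y = np.vstack([X, X_new]), np.concatenate([y, y_new])
```

Printing the slope at every stage yields exactly the kind of learning curve the abstract describes: performance plotted against accumulating sample size until the stopping rule is met.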
Results
Our approach is illustrated for model development of acute kidney injury using (penalized) logistic regression CPMs. Before recruitment based on perceived sensible assumptions, the fixed sample size calculation suggests recruiting 342 patients to minimize overfitting; however, during data collection, the sequential approach reveals that a much larger sample size of 1100 is required to minimize overfitting (targeting a bootstrap-corrected calibration slope ≥0.9). If the stopping rule criteria also target small uncertainty and misclassification probability of individual predictions, the sequential approach suggests an even larger sample size of about 1800.
Conclusion
For CPM development studies involving prospective data collection, a sequential sample size approach allows users to dynamically monitor individual-level prediction and classification instability. This helps determine when enough participants have been recruited and safeguards against using inaccurate assumptions in a sample size calculation before data collection. Engagement with patients and other stakeholders is crucial to identify sensible context-specific stopping rules for robust individual predictions.
{"title":"Sequential sample size calculations and learning curves safeguard the robust development of a clinical prediction model for individuals","authors":"Amardeep Legha , Joie Ensor , Rebecca Whittle , Lucinda Archer , Ben Van Calster , Evangelia Christodoulou , Kym I.E. Snell , Mohsen Sadatsafavi , Gary S. Collins , Richard D. Riley","doi":"10.1016/j.jclinepi.2025.112117","DOIUrl":"10.1016/j.jclinepi.2025.112117","url":null,"abstract":"<div><h3>Background and Objectives</h3><div>When recruiting participants to a new study developing a clinical prediction model (CPM), sample size calculations are typically conducted before data collection based on sensible assumptions. This leads to a fixed sample size, but if the assumptions are inaccurate, the actual sample size required to develop a reliable model may be higher or even lower. To safeguard against this, adaptive sample size approaches have been proposed, based on sequential evaluation of (changes in) a model's predictive performance. The objective of the study was to illustrate and extend sequential sample size calculations for CPM development by (i) proposing stopping rules for prospective data collection based on minimizing uncertainty (instability) and misclassification of individual-level predictions and (ii) showcasing how it safeguards against inaccurate fixed sample size calculations.</div></div><div><h3>Methods</h3><div>Using the sequential approach repeats the predefined model development strategy every time a chosen number (eg, 100) of participants are recruited and adequately followed up. At each stage, CPM performance is evaluated using bootstrapping, leading to prediction and classification stability statistics and plots, alongside optimism-adjusted measures of calibration and discrimination. Learning curves display the trend of results against sample size and recruitment is stopped when a chosen stopping rule is met.</div></div><div><h3>Results</h3><div>Our approach is illustrated for model development of acute kidney injury using (penalized) logistic regression CPMs. Before recruitment based on perceived sensible assumptions, the fixed sample size calculation suggests recruiting 342 patients to minimize overfitting; however, during data collection, the sequential approach reveals that a much larger sample size of 1100 is required to minimize overfitting (targeting a bootstrap-corrected calibration slope ≥0.9). If the stopping rule criteria also target small uncertainty and misclassification probability of individual predictions, the sequential approach suggests an even larger sample size of about 1800.</div></div><div><h3>Conclusion</h3><div>For CPM development studies involving prospective data collection, a sequential sample size approach allows users to dynamically monitor individual-level prediction and classification instability. This helps determine when enough participants have been recruited and safeguards against using inaccurate assumptions in a sample size calculation before data collection. 
Engagement with patients and other stakeholders is crucial to identify sensible context-specific stopping rules for robust individual predictions.</div></div>","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112117"},"PeriodicalIF":5.2,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145806008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Text message incentives increased patient-reported outcomes survey response in emergency care: SWAT findings
Pub Date: 2025-12-19 | DOI: 10.1016/j.jclinepi.2025.112116
Gemma Altinger, Sweekriti Sharma, Qiang Li, Anthony Devaux, Samantha Darby, Aidan van Wyk, Caitlin M.P. Jones, Chris G. Maher, Adrian C. Traeger
Objectives
To determine if text message–based behavioral interventions could increase response rates to a patient-reported outcomes survey in the emergency department (ED).
Study Design and Setting
We conducted a study within a trial (SWAT): a 3-arm randomized controlled trial (RCT) nested within the NUDG-ED trial, which aimed to reduce low-value care for patients with back pain presenting to eight EDs in Sydney, Australia. After discharge from the ED, patients were randomized to receive one of three text message invitations to complete a follow-up patient-reported outcome survey: a standard control message, or one of two behaviorally informed messages featuring either a prize draw incentive or prosocial framing. Our primary outcome measure was the response rate in each study group. We fit a linear mixed-effects model controlling for hospital heterogeneity and patient characteristics to estimate the mean difference (MD) in proportions with 95% CI, to determine the effectiveness of the behavioral interventions.
Results
A total of 1494 patients were randomized between May 15, 2024 and January 29, 2025. Of these, 52% were women, the median age was 46 years (IQR 35, 62), 43% were from disadvantaged areas, and 51% were triaged with a clinically urgent condition. Baseline characteristics were balanced across all groups. Our primary analysis found that, compared to the control, the prize draw incentive increased response rates (n = 997 patients, MD = 6.9%, 95% CI 1.8% to 11.9%, P = .007). Our adjusted mixed-effects model also found a significant increase in response rates (n = 979 patients, MD = 6.4%, 95% CI 1.3% to 11.4%, P = .013). Compared to the control, the prosocial framing message may have slightly increased response rates, but the result was not statistically significant (n = 996 patients, 17.2% vs 21.1%, MD = 3.9%, 95% CI −1.1% to 8.9%).
Conclusion
In this randomized trial, a prize draw incentive modestly improved response rates to a patient-reported outcomes survey in routine emergency care settings. Prosocial framing may have slightly increased response rates, but the effect was uncertain. Both behavioral approaches warrant further testing in routine care settings.
Plain Language Summary
Patient-reported outcomes, such as surveys about how people feel and recover after care, are important for understanding what matters most to patients. However, response rates to these surveys are often very low, especially in real clinical settings. This makes it difficult to draw strong conclusions about whether treatments are helping patients, so researchers and health services need to find ways to improve response rates. This study looked at whether simple text message strategies could encourage more patients to complete these surveys.
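A minimal sketch of the kind of analysis described, with invented data and column names (not the NUDG-ED dataset), might look as follows: a linear mixed-effects model of the binary response indicator with a random intercept per hospital, so that arm coefficients estimate adjusted differences in response proportions relative to the control message.

```python
# Minimal sketch of a linear mixed-effects model for differences in response
# proportions (data, column names, and rates are invented for illustration).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 1494
df = pd.DataFrame({
    "arm": rng.choice(["control", "prize_draw", "prosocial"], size=n),
    "hospital": rng.integers(0, 8, size=n),   # eight EDs
    "age": rng.normal(46, 15, size=n),        # illustrative covariate
})
true_rates = {"control": 0.17, "prize_draw": 0.24, "prosocial": 0.21}
df["responded"] = rng.binomial(1, df["arm"].map(true_rates))

# Random intercept for hospital absorbs between-site heterogeneity;
# Treatment("control") sets the control arm as the reference category.
model = smf.mixedlm('responded ~ C(arm, Treatment("control")) + age',
                    data=df, groups=df["hospital"])
print(model.fit().summary())  # arm coefficients ~= differences in proportions
```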
{"title":"Text message incentives increased patient-reported outcomes survey response in emergency care: SWAT findings","authors":"Gemma Altinger , Sweekriti Sharma , Qiang Li , Anthony Devaux , Samantha Darby , Aidan van Wyk , Caitlin M.P. Jones , Chris G. Maher , Adrian C. Traeger","doi":"10.1016/j.jclinepi.2025.112116","DOIUrl":"10.1016/j.jclinepi.2025.112116","url":null,"abstract":"<div><h3>Objectives</h3><div>To determine if text message–based behavioral interventions could increase response rates to a patient-reported outcomes survey in the emergency department (ED).</div></div><div><h3>Study Design and Setting</h3><div>We conducted a study within a trial (SWAT), within the NUDG-ED trial. The NUDG-ED trial aimed to reduce low-value care for patients with back pain presenting to eight EDs in Sydney, Australia. This SWAT was a 3-arm randomized controlled trial (RCT) nested within the NUDG-ED trial. After discharge from the ED, patients were randomized to receive one of the three text message invitations to complete a follow-up patient-reported outcome survey: a standard control message, or one of two behaviorally informed messages including either a prize draw incentive or prosocial framing. Our primary outcome measure was the response rate in each study group. We performed a linear mixed-effects model controlling for hospital heterogeneity and patient characteristics to estimate the mean difference (MD) in proportions with 95% CI, to determine the effectiveness of the behavioral interventions.</div></div><div><h3>Results</h3><div>A total of 1494 patients were randomized between May 15, 2024 and January 29, 2025. Of these, 52% were women, the median age was 46 years (IQR 35, 62), 43% were from disadvantaged areas and 51% were triaged with a clinically urgent condition. Baseline characteristics were balanced across all groups. Our primary analysis found that compared to the control, the prize draw incentive increased response rates (<em>n</em> = 997 patients, MD = 6.9%, 95% CI: 1.8% to 11.9%, <em>P</em> = .007). Our adjusted mixed-effects model also found a significant increase in response rates (<em>n</em> = 979 patients, MD = 6.4%, 95% CI 1.3% to 11.4%, <em>P</em> = .013). Compared to the control, the prosocial framing message may have slightly increased response rates, but the results were not statistically significant (<em>n</em> = 996 patients, 17.2% vs 21.1%, MD = 3.9%, 95% CI: −1.1% to 8.9%).</div></div><div><h3>Conclusion</h3><div>In this randomized trial, a prize draw incentive modestly improved response rates to a patient-reported outcomes survey in routine emergency care settings. Prosocial framing may have slightly increased response rates, but the effect was uncertain. Both behavioral approaches warrant further testing in routine care settings.</div></div><div><h3>Plain Language Summary</h3><div>Patient-reported outcomes, such as surveys about how people feel and recover after care, are important for understanding what matters most to patients. However, response rates to these surveys are often very low, especially in real clinical settings. This makes it difficult to draw strong conclusions about whether treatments are helping patients. So, researchers and health services need to find ways to improve response rates. 
This study looked at whether simple text message strategies could encourage more patients to com","PeriodicalId":51079,"journal":{"name":"Journal of Clinical Epidemiology","volume":"191 ","pages":"Article 112116"},"PeriodicalIF":5.2,"publicationDate":"2025-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145806204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}