Pub Date: 2026-03-01 | Epub Date: 2025-12-23 | DOI: 10.1016/j.jclinepi.2025.112118
Emilie de Kanter, Tabea Kaul, Pauline Heus, Tom M. de Groot, René Harmen Kuijten, Johannes B. Reitsma, Gary S. Collins, Lotty Hooft, Karel G.M. Moons, Johanna A.A. Damen
<div><h3>Objectives</h3><div>Incomplete reporting of research limits its usefulness and contributes to research waste. Numerous reporting guidelines have been developed to support complete and accurate reporting of health-care research studies. Completeness of reporting can be measured by evaluating adherence to reporting guidelines. However, assessing adherence to a reporting guideline often lacks uniformity. In 2019, we developed a reporting adherence tool for the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. With recent advances in regression and artificial intelligence (AI)/machine learning (ML)–based methods, TRIPOD + AI (<span><span>www.tripod-statement.org</span></span>) was developed to replace the TRIPOD statement. The aim of this study was to develop an updated adherence tool for TRIPOD + AI.</div></div><div><h3>Study Design and Setting</h3><div>Based on the TRIPOD + AI full reporting guideline, including the accompanying explanation and elaboration light, and TRIPOD + AI for abstracts, we updated and expanded the original TRIPOD adherence tool and refined the adherence elements and their scoring rules through discussions within the author team and a pilot test.</div></div><div><h3>Results</h3><div>The updated tool comprises 37 main items and 136 adherence elements and includes several automated scoring rules. We developed separate TRIPOD + AI adherence tools for model development, model evaluation, and studies describing both in a single paper.</div></div><div><h3>Conclusion</h3><div>A uniform approach to assessing reporting adherence to TRIPOD + AI allows for comparisons across various fields and monitoring of reporting over time, and incentivizes primary study authors to comply.</div></div><div><h3>Plain Language Summary</h3><div>Accurate and complete reporting is crucial in biomedical research to ensure findings can be effectively used. 
To support researchers in reporting their findings well, reporting guidelines have been developed for different study types. One such guideline is TRIPOD, which focuses on research studies about medical prediction tools. In 2024, TRIPOD was updated to TRIPOD + AI to address the increasing use of AI and ML in prediction model studies. In 2019, we developed a scoring system to evaluate how well research papers on prediction tools adhered to the TRIPOD guideline, resulting in a reporting completeness score. This score allows for easier comparison of reporting completeness across various medical fields and for monitoring improvement in reporting over time. With the introduction of TRIPOD + AI, an update of the scoring system was required to align with the new reporting recommendations. We achieved this by reviewing our previous scoring system and incorporating the new items from TRIPOD + AI to better suit studies involving AI. We believe that this system will facilitate comparisons of prediction model reporting completeness.</div></div>
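A reporting completeness score of the kind described above can be sketched as the share of applicable adherence elements a paper satisfies. This is a minimal illustration only: it assumes elements are scored adhered/not adhered/not applicable, with not-applicable elements dropped from the denominator, and the element names are hypothetical, not taken from the TRIPOD + AI tool.

```python
from typing import Mapping, Optional

def completeness_score(elements: Mapping[str, Optional[bool]]) -> float:
    """Reporting completeness as the percentage of applicable adherence
    elements that are adhered to. Elements scored None are treated as
    not applicable and excluded from the denominator (an assumption of
    this sketch, not a rule quoted from the tool)."""
    applicable = [v for v in elements.values() if v is not None]
    if not applicable:
        raise ValueError("no applicable adherence elements")
    return 100.0 * sum(applicable) / len(applicable)

# Hypothetical element-level scores for a single paper
paper = {
    "title_1a": True,      # adhered
    "abstract_2a": True,   # adhered
    "methods_8b": False,   # not adhered
    "fairness_18c": None,  # not applicable to this study
}
score = completeness_score(paper)  # 2 of 3 applicable elements
```

With item-level scores in hand, scores can be averaged across papers to compare fields or track reporting over time, which is the use case the abstract describes.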
Title: Adherence to TRIPOD+AI guideline: an updated reporting assessment tool (Journal of Clinical Epidemiology, vol. 191, Article 112118).
Pub Date: 2026-03-01 | Epub Date: 2025-12-08 | DOI: 10.1016/j.jclinepi.2025.112098
Shannon M. Ruzycki, Kirstie C. Lithgow, Claire Song, Sarah Taylor, Abinaya Subramanian, Miriam Li, Stephanie Happ, Mark Shea, Debby Oladimeji, Wayne Clark, Dean A. Fergusson, Sarina R. Isenberg, Patricia Li, Sangeeta Mehta, Stuart G. Nicholls, Courtney L. Pollock, Louise Pilote, Amity E. Quinn, Syamala Buragadda, David Collister
<div><h3>Objectives</h3><div>To describe the demographic and social identities of participants in contemporary Canadian randomized clinical trials (RCTs).</div></div><div><h3>Study Design and Setting</h3><div>This meta-epidemiologic study included published reports of phase 2 and 3 RCTs that exclusively recruited adults living in Canada and were registered on ClinicalTrials.gov between January 1, 2010, and December 31, 2019. Study design and participant demographics were abstracted from eligible articles in duplicate using frameworks for understanding participant diversity such as PROGRESS-PLUS.</div></div><div><h3>Results</h3><div>We identified 118 RCTs with 17,387 participants. Most reported participant sex (<em>n</em> = 105, 89.0%), few reported gender (<em>n</em> = 12, 10.2%), and none reported both. Among articles reporting sex, there were 11,066 female (63.6%), 5402 male (32.8%), and one intersex (<0.1%) participants. Among articles reporting gender, there were 477 women (54.1%) and 404 men (45.9%) participants. No studies reported gender-diverse participants. When excluding studies that only recruited one sex and/or gender, 51.8% of participants were male (<em>n</em> = 4774/9219) and 47.5% were men (<em>n</em> = 446/850). Race and/or ethnicity was reported for 4124 participants (23.7%) in 31 of 118 (26.3%) RCTs; of these, 72.0% were White (<em>n</em> = 2969), 2.7% were Black (<em>n</em> = 113), and 0.2% were Indigenous (<em>n</em> = 7). Eligibility criteria related to specific PROGRESS-PLUS factors were rare except for cognition (<em>n</em> = 42, 35.6%), substance use (<em>n</em> = 25, 21.7%), pregnancy (<em>n</em> = 29, 24.5%), breastfeeding (<em>n</em> = 16, 13.6%), and older age (<em>n</em> = 26, 22.0%).</div></div><div><h3>Conclusion</h3><div>The data are encouraging regarding representation of female and women participants in Canadian trials. Due to underreporting of other identities, we cannot identify additional groups who may be underrepresented. 
Work to improve reporting of race and/or ethnicity, among other identities, is needed.</div></div><div><h3>Plain Language Summary</h3><div>Clinical trials tell us what drugs and procedures are helpful for patients. In certain specialties, like cancer and heart disease, clinical trials are made up mostly of men, White people, and younger people. This means that the results of these trials may be different for other groups of people, especially older people, women, and racialized people, who are more likely to have these diseases. We looked at the demographic identities of all participants in 118 Canadian clinical trials that were done between 2010 and 2019. Of the 17,387 participants, there were 11,066 female, 5402 male, 477 women, 404 men, and one intersex participant. We could find the race and/or ethnicity of only 4124 participants in 31 of the trials. Most participants (72.0%) were White, and only 2.7% were Black and 0.2% were Indigenous. These results tell us that reporting of identities in Canadian clinical trials needs to improve.</div></div>
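The counts and percentages quoted above can be cross-checked with simple arithmetic. The sketch below recomputes a few of the reported n/% pairs, rounding to one decimal place as the abstract does; it is a verification aid, not part of the study's analysis.

```python
def pct(n: int, total: int) -> float:
    """Percentage of n out of total, rounded to one decimal place
    (the rounding convention used in the abstract)."""
    return round(100.0 * n / total, 1)

# Cross-checking figures quoted in the Results section
assert pct(105, 118) == 89.0    # trials reporting participant sex
assert pct(12, 118) == 10.2     # trials reporting gender
assert pct(31, 118) == 26.3     # trials reporting race and/or ethnicity
assert pct(4774, 9219) == 51.8  # male, excluding single-sex/gender trials
assert pct(2969, 4124) == 72.0  # White participants
```

The same one-line helper reproduces the eligibility-criteria percentages as well (e.g., 42 of 118 trials is 35.6%).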
Title: Participant diversity and inclusive trial design: a meta-epidemiologic study of Canadian randomized clinical trials (Journal of Clinical Epidemiology, vol. 191, Article 112098).
Pub Date: 2026-03-01 | Epub Date: 2025-12-15 | DOI: 10.1016/j.jclinepi.2025.112102
Declan Devane, Johanna Pope, Paula Byrne, Evan Forde, Isabel O'Byrne, Steven Woloshin, Eileen Culloty, Darren Dahly, Ingeborg Hess Elgersma, Heather Munthe-Kaas, Conor Judge, Martin O'Donnell, Finn Krewer, Sandra Galvin, Nikita N. Burke, Theresa Tierney, KM Saif-Ur-Rahman, Tom Conway, James Thomas
<div><h3>Objectives</h3><div>To compare the comprehension, readability, quality, safety, and trustworthiness of artificial intelligence (AI)-assisted vs human-generated plain language summaries (PLSs) for Cochrane systematic reviews.</div></div><div><h3>Study Design</h3><div>Randomized, parallel-group, two-arm, noninferiority trial (ISRCTN85699985).</div></div><div><h3>Setting</h3><div>Online survey platform, September 2025.</div></div><div><h3>Participants</h3><div>Adults aged 18 years or older with a minimum English reading proficiency of 7 out of 10, recruited via Prolific. Of the 500 individuals screened, 465 were randomized and 453 were included in the per-protocol analysis.</div></div><div><h3>Interventions</h3><div>Participants were randomly assigned to three AI-assisted PLSs developed with ChatGPT and human-in-the-loop verification, or to three published human-generated Cochrane PLSs for the same reviews.</div></div><div><h3>Outcomes</h3><div>Primary: comprehension (10-item questionnaire, noninferiority margin 10%). Secondary: readability, quality and safety, trustworthiness, and authorship perception.</div></div><div><h3>Results</h3><div>Mean comprehension scores were 88.9% (<em>n</em> = 228) in the AI-assisted group and 89.0% (<em>n</em> = 225) in the human-generated group (mean difference −0.03 percentage points, 95% CI: −1.9 to 2.0 percentage points); the upper CI bound (2.0 percentage points) did not exceed the +10 percentage-point noninferiority margin, demonstrating noninferiority. Flesch-Kincaid Grade Level showed no significant difference (8.20 vs 8.38, <em>P</em> = .722), although formal noninferiority was not established (the upper 95% CI bound of 1.72 exceeded the 1.0 grade-level margin). AI-assisted summaries scored higher on Flesch Reading Ease (63.33 vs 50.00, <em>P</em> = .008) and lower on the Coleman-Liau Index. All summaries met prespecified quality and safety standards (100% in both groups). 
Trustworthiness scores were comparable (3.98 vs 3.91, difference 0.068, 95% CI: −0.043 to 0.179; meeting noninferiority). Participants demonstrated limited ability to distinguish authorship, correctly identifying AI-assisted summaries in 56.3% of cases and human-generated summaries in 34.7% (≈ chance for a three-option question), with 55.4% of human-generated summaries misattributed as AI-assisted. Exploratory subgroup analysis showed an age interaction (<em>P</em> = .023), though based on a small subgroup (<em>n</em> = 14, 3%).</div></div><div><h3>Conclusion</h3><div>AI-assisted PLSs with human oversight achieved comprehension levels noninferior to those of human-generated Cochrane summaries, with comparable quality, safety, and trust ratings. AI summaries were largely indistinguishable from those generated by humans. Pretrial verification identified and corrected numerical errors, confirming the need for human oversight. These findings support human-in-the-loop AI workflows for PLS production, though formal evaluation of the time and resource implications is needed to establish efficiency gains over traditional manual approaches.</div></div>
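The noninferiority logic reported above reduces to comparing a confidence-interval bound against a prespecified margin. The sketch below illustrates that decision rule with the bounds and margins quoted in the abstract; the `wald_ci` helper is a generic normal-approximation interval for illustration, not the trial's exact analysis.

```python
def wald_ci(diff: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Generic two-sided 95% normal-approximation CI for a difference.
    Illustrative only: the trial's SEs are not given in the abstract."""
    return diff - z * se, diff + z * se

def noninferior(ci_upper: float, margin: float) -> bool:
    """On a scale where a larger difference disfavours the new approach,
    noninferiority is shown when the CI upper bound stays below the
    prespecified margin."""
    return ci_upper < margin

# Comprehension: upper bound 2.0 pp vs 10 pp margin -> noninferior
assert noninferior(2.0, 10.0)
# Flesch-Kincaid grade: upper bound 1.72 vs 1.0 grade margin -> not shown
assert not noninferior(1.72, 1.0)
```

This also makes the abstract's apparent tension concrete: a nonsignificant difference (P = .722) and a failed noninferiority test are compatible, because noninferiority is judged against the margin, not against zero.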
Title: Comparison of AI-assisted and human-generated plain language summaries for Cochrane reviews: a randomised non-inferiority trial (HIET-1) [Registered Report - stage II] (Journal of Clinical Epidemiology, vol. 191, Article 112102).
Pub Date: 2026-03-01 | Epub Date: 2025-12-27 | DOI: 10.1016/j.jclinepi.2025.112122
Raphael E. Cuomo
Objectives
Epidemiology is largely organized to explain who becomes ill, yet many clinical and public health decisions occur after diagnosis. I introduce and formally define survival epidemiology as a new branch of science focused on assessing how people live longer and better with established disease, and I provide justification that prevention estimates should not be assumed to apply postdiagnosis.
Study Design and Setting
Conceptual and methodological commentary synthesizing evidence across cardiovascular, renal, oncologic, pulmonary, and hepatic conditions and integrating causal-inference and time-to-event principles for postdiagnosis questions.
Results
Across diseases, associations measured for incidence often fail to reproduce, and sometimes reverse, among patients with established disease. Diagnosis acts as a causal threshold that changes time scales and bias structures, including conditioning on disease (collider stratification), time-dependent confounding, immortal time bias, and reverse causation. Credible postdiagnosis inference requires designs that emulate randomized trials; explicit alignment of time zero with clinical decision points; strategies defined as used in practice; and handling of competing risks, multistate transitions, and longitudinal biomarkers (including joint models when appropriate). Essential postdiagnosis data include stage, molecular subtype, prior therapy lines, dose intensity and modifications, adverse events, performance status, and patient-reported outcomes. Recommended practice is parallel estimation of prevention and postdiagnosis survival effects for the same exposure–disease pairs and routine reporting of heterogeneity by stage, subtype, treatment pathway, and time since diagnosis.
Conclusion
Prevention and postdiagnosis survival are distinct inferential targets. Journals should require clarity on whether claims pertain to prevention or survival and report target-trial elements; guideline bodies should distinguish prevention from survival recommendations when evidence allows; and funders, training programs, and public communication should support survival-focused methods, data standards, and context-specific messaging for people living with disease.
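Collider stratification, one of the bias structures named above, can be demonstrated with a small simulation: two factors that are independent in the whole population and that each raise disease risk become negatively associated once analysis is restricted to diagnosed patients. All parameters below are illustrative, not drawn from any study cited here.

```python
import random

def simulate(n: int = 100_000, seed: int = 1):
    """Exposure E and factor F are independent in the population; each
    independently raises the probability of disease D. Returns (E, F)
    pairs for the whole population and for diagnosed cases only."""
    rng = random.Random(seed)
    pop, cases = [], []
    for _ in range(n):
        e = rng.random() < 0.5
        f = rng.random() < 0.5
        p_d = 0.05 + 0.30 * e + 0.30 * f  # both raise incidence
        pop.append((e, f))
        if rng.random() < p_d:
            cases.append((e, f))
    return pop, cases

def risk_ratio(pairs):
    """P(F | E) / P(F | not E) in the given sample."""
    f_given_e = [f for e, f in pairs if e]
    f_given_not_e = [f for e, f in pairs if not e]
    return (sum(f_given_e) / len(f_given_e)) / (
        sum(f_given_not_e) / len(f_given_not_e))

pop, cases = simulate()
rr_pop = risk_ratio(pop)      # near 1.0: E and F independent in the population
rr_cases = risk_ratio(cases)  # well below 1.0: conditioning on D distorts it
```

Among cases the ratio is pushed below 1 (analytically about 0.74 with these parameters), even though no causal link between E and F exists, which is exactly why incidence-era associations need not carry over to patients with established disease.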
Title: Defining survival epidemiology: postdiagnosis population science for people living with disease (Journal of Clinical Epidemiology, vol. 191, Article 112122).
Pub Date: 2026-03-01 | DOI: 10.1016/j.jclinepi.2026.112132
Background
In randomized clinical trials (RCTs) for hematological malignancies, patients may undergo allogeneic hematopoietic stem cell transplantation (allo-HSCT) as part of standard clinical pathways. Allo-HSCT is a potentially curative but high-risk procedure performed after randomization and thus constitutes an important intercurrent event that can substantially influence survival outcomes. However, its handling in statistical analyses is not standardized.
Objective
To review current statistical methods used to handle postrandomization allo-HSCT as an intercurrent event in RCTs, and to highlight how each method corresponds to a different estimand, reflecting distinct clinical questions.
Methods
We reviewed 93 RCTs, published between January 1, 2014, and April 1, 2024, that reported survival outcomes and postrandomization allo-HSCT.
Results
Three different statistical methods were employed to estimate the treatment effects: censoring at the time of allo-HSCT (64 analyses), a time-dependent covariate in a Cox model (24 analyses), or ignoring allo-HSCT status (17 analyses). Each method estimates the treatment effect in response to a different clinical question and estimand, with specific assumptions that must be considered when interpreting the results. Censoring corresponds to the “hypothetical” estimand, but its validity requires two conditions: first, that the likelihood of receiving allo-HSCT is similar across treatment arms; and second, that patients who undergo transplantation have a similar prognosis to those who do not. The time-dependent covariate approach incorporates the effect of allo-HSCT but is not associated with a specific estimand and requires careful interpretation. Ignoring allo-HSCT corresponds to the “treatment policy” strategy, which compares the treatment strategies as a whole, whether or not allo-HSCT occurs, without additional assumptions.
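The contrast between censoring at allo-HSCT and the treatment-policy approach can be made concrete with a hand-rolled Kaplan-Meier (product-limit) estimator on hypothetical patient data: censoring removes transplanted patients from the risk set at transplantation, so the two analyses estimate different quantities. The data and implementation below are illustrative, not taken from the review.

```python
def km_surv(times, events, t):
    """Kaplan-Meier survival estimate at time t.
    times: follow-up times; events: 1 = death, 0 = censored."""
    s = 1.0
    event_times = sorted({tt for tt, e in zip(times, events) if e and tt <= t})
    for u in event_times:
        at_risk = sum(1 for tt in times if tt >= u)
        deaths = sum(1 for tt, e in zip(times, events) if e and tt == u)
        s *= 1 - deaths / at_risk
    return s

# Hypothetical patients: (death/censor time, event, allo-HSCT time or None)
patients = [(12, 1, None), (20, 0, None), (18, 1, 6), (24, 0, 8), (15, 1, 5)]

# Treatment policy: follow-up and events kept regardless of allo-HSCT
tp_times = [t for t, e, h in patients]
tp_events = [e for t, e, h in patients]

# Hypothetical estimand via censoring: follow-up cut at allo-HSCT
c_times = [h if h is not None else t for t, e, h in patients]
c_events = [0 if h is not None else e for t, e, h in patients]

s_tp = km_surv(tp_times, tp_events, 18)  # survival at month 18, policy view
s_c = km_surv(c_times, c_events, 18)     # survival at month 18, censored view
```

Even on this tiny dataset the two estimates differ, and the censored version silently assumes transplanted and nontransplanted patients are exchangeable, which is precisely the validity condition discussed above.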
Conclusion
Challenges in handling allogeneic stem cell transplantation in randomized clinical trials
Roxane Couturier , Loïc Vasseur , Nicolas Boissel , Hervé Dombret , Jérôme Lambert , Sylvie Chevret
Pub Date : 2026-03-01 DOI: 10.1016/j.jclinepi.2026.112132
Background
In randomized clinical trials (RCTs) for hematological malignancies, patients may undergo allogeneic hematopoietic stem cell transplantation (allo-HSCT) as part of standard clinical pathways. Allo-HSCT is a potentially curative but high-risk procedure performed after randomization and thus constitutes an important intercurrent event that can substantially influence survival outcomes. However, its handling in statistical analyses is not standardized.
Objective
To review current statistical methods used to handle postrandomization allo-HSCT as an intercurrent event in RCTs, and to highlight how each method corresponds to a different estimand, reflecting a distinct clinical question.
Methods
We reviewed 93 RCTs published between January 1, 2014, and April 1, 2024, that reported survival outcomes with postrandomization allo-HSCT.
Results
Three statistical methods were employed to estimate treatment effects: censoring at the time of allo-HSCT (64 analyses), including allo-HSCT as a time-dependent covariate in a Cox model (24 analyses), or ignoring allo-HSCT status (17 analyses). Each method estimates the treatment effect for a different clinical question and estimand, with specific assumptions that must be considered when interpreting the results. Censoring corresponds to the “hypothetical” estimand, but its validity requires two assumptions: first, that the likelihood of receiving allo-HSCT is similar across treatment arms; and second, that patients who undergo transplantation have a prognosis similar to those who do not. A time-dependent covariate incorporates the effect of allo-HSCT but is not associated with a specific estimand and requires careful interpretation. Ignoring allo-HSCT corresponds to the “treatment policy” strategy, comparing the randomized treatment strategies regardless of whether allo-HSCT occurs, without additional assumptions.
Conclusion
There is no consensus on handling allo-HSCT as an intercurrent event in survival analyses. Censoring, although common, may introduce bias if treatment or prognostic covariates influence allo-HSCT use. The treatment policy estimand should be preferred when allo-HSCT is part of the therapeutic strategy.
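The three handlings contrasted in this abstract (treatment policy, censoring at allo-HSCT, time-dependent covariate) differ simply in how a patient's follow-up record is encoded before modeling. The sketch below is a hypothetical, minimal illustration of that encoding on invented data; the paper itself provides no code, and all names here are the illustration's own.

```python
# Hypothetical sketch: how one patient record is encoded under each of the
# three handlings of post-randomization allo-HSCT described in the abstract.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Patient:
    followup: float             # months from randomization to death or last contact
    died: bool                  # True if death was observed at `followup`
    hsct_time: Optional[float]  # months to allo-HSCT; None if never transplanted

def treatment_policy(p: Patient) -> Tuple[float, bool]:
    # "Treatment policy" estimand: allo-HSCT is ignored; the observed
    # follow-up time and event status are used as-is.
    return (p.followup, p.died)

def censor_at_hsct(p: Patient) -> Tuple[float, bool]:
    # "Hypothetical" estimand via censoring: follow-up is cut at allo-HSCT
    # and the patient is treated as event-free from that point on.
    if p.hsct_time is not None and p.hsct_time < p.followup:
        return (p.hsct_time, False)
    return (p.followup, p.died)

def time_varying_rows(p: Patient) -> List[Tuple[float, float, int, bool]]:
    # Time-dependent covariate: (start, stop, hsct_status, event) intervals
    # suitable for an extended Cox model; the status flips to 1 at transplant.
    if p.hsct_time is not None and p.hsct_time < p.followup:
        return [(0.0, p.hsct_time, 0, False),
                (p.hsct_time, p.followup, 1, p.died)]
    return [(0.0, p.followup, 0, p.died)]

if __name__ == "__main__":
    p = Patient(followup=18.0, died=True, hsct_time=6.0)
    print(treatment_policy(p))   # (18.0, True)
    print(censor_at_hsct(p))     # (6.0, False)
    print(time_varying_rows(p))  # two intervals, status 0 then 1
```

Note how the same transplanted patient contributes a death under the treatment policy encoding but only 6 event-free months under censoring; this is exactly why the abstract warns that censoring can bias results when allo-HSCT use depends on treatment or prognosis.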
Pub Date : 2026-03-01 Epub Date: 2025-12-19 DOI: 10.1016/j.jclinepi.2025.112115
The use of guidelines in multimorbidity-related practice: an exploratory questionnaire survey
Zijun Wang , Hongfeng He , Sergey K. Zyryanov , Liliya E. Ziganshina , Akihiko Ozaki , Natalia Dorofeeva , Myeong Soo Lee , Ivan D. Florez , Etienne Ngeh , Abhilasha Sharma , Ekaterina V. Yudina , Barbara C. van Munster , Jako S. Burgers , Opeyemi O. Babatunde , Yaolong Chen , Janne Estill
Objectives
The use of guidelines in multimorbidity-related practice has not yet been extensively investigated. We aimed to explore how health-care professionals use guidelines when managing individuals with multimorbidity.
Methods
We conducted an exploratory survey among a convenience sample of medical professionals with clinical experience. The questionnaire addressed whether and how different types of guidelines are used in multimorbidity-related practice, the reasons for not using specific types of guidelines, and other approaches to inform multimorbidity practice. It was distributed through the investigators’ contact networks. The results were presented descriptively.
Results
We received 311 valid responses: 136 from the World Health Organization European Region, 137 from the Western Pacific Region, and 38 from other regions. Most participants were familiar with the concept of multimorbidity (n = 245, 79%). Among the 269 respondents who reported using guidelines in multimorbidity practice, 124 (46%) used guidelines specifically focusing on combinations of diseases, and 148 (55%) used multiple single-disease guidelines together. Lack of availability was the main reason for not using guidelines that address multimorbidity itself; the high number of guidelines (n = 76, 40%) and possible interactions between conditions or treatments (n = 62, 38%) were the main reasons for not using single-disease guidelines. Respondents frequently consult experts or refer to systematic reviews and primary studies when existing guidelines do not meet their needs. The development of a tool or method to guide the use of multiple guidelines ranked highest among possible actions to improve multimorbidity practice.
Conclusion
Although the medical professionals in our sample were generally familiar with the use of guidelines, there are many unmet needs and tool gaps in guideline-informed multimorbidity practice.
Pub Date : 2026-03-01 Epub Date: 2025-12-08 DOI: 10.1016/j.jclinepi.2025.112100
Response to letter to the editor “Most methodological characteristics do not exaggerate effect estimates in nutrition randomized trials: findings from a metaepidemiological study”
Gina Bantle, Julia Stadelmaier, Maria Petropoulou, Joerg J. Meerpohl, Lukas Schwingshackl
Pub Date : 2026-03-01 Epub Date: 2026-03-03 DOI: 10.1016/j.jclinepi.2026.112190
Editors' Choice: March 2026
Andrea C. Tricco, David Tovey
Pub Date : 2026-03-01 Epub Date: 2025-12-17 DOI: 10.1016/j.jclinepi.2025.112111
The opacity and exemption of artificial intelligence or the epic of explainable artificial intelligence, reply to commentary by Rattanapitoon et al
Manuel Marques-Cruz, Rafael José Vieira, Sara Gil Mata, Bernardo Sousa-Pinto
Pub Date : 2026-03-01 Epub Date: 2025-12-27 DOI: 10.1016/j.jclinepi.2025.112121
Systematic reviews of quasi-experimental studies: challenges and considerations
Sarah B. Windle , Sam Harper , Jasleen Arneja , Peter Socha , Arijit Nandi
Background
In contrast to other observational study designs, quasi-experimental approaches (eg, difference-in-differences, interrupted time series, regression discontinuity, instrumental variable, synthetic control) account for some sources of unmeasured confounding and can estimate causal effects under weaker assumptions. Studies applying quasi-experimental approaches have grown in popularity in recent decades; investigators conducting systematic reviews of observational studies, particularly in biomedical, public health, or epidemiologic content areas, must therefore be prepared to encounter and appropriately assess these approaches.
Objective
Our objective is to describe key methodological challenges and considerations for systematic reviews that include quasi-experimental studies, with attention to current recommendations and approaches that have been applied in previous reviews.
Conclusion
Recommendations for authors of systematic reviews: We recommend that individuals conducting systematic reviews including quasi-experimental studies: (1) search a broad range of bibliographic databases and gray literature, including preprint repositories; (2) avoid search strategies that require specific study-design terms for identification, given inconsistent nomenclature and poor database indexing for quasi-experimental studies; (3) ensure that the review team includes several individuals with expertise in quasi-experimental designs for screening and risk of bias assessment in duplicate; (4) use an approach to risk of bias assessment that is sufficiently granular to identify the studies most likely to report unbiased estimates of causal effects (eg, a modified Risk Of Bias In Nonrandomized Studies - of Interventions [ROBINS-I] tool); and (5) consider the implications of varied estimands when interpreting estimates from different quasi-experimental designs. Researchers may also consider restricting systematic review inclusion to quasi-experimental studies for feasibility when addressing research questions with large bodies of literature. However, a more inclusive approach is preferred, as well-designed studies using a variety of methodological approaches may be more credible than a quasi-experiment that violates causal assumptions.
Recommendations for the research community: Many of the challenges faced in conducting systematic reviews of quasi-experimental studies would be ameliorated by more consistent nomenclature, as well as greater transparency from authors in describing their research designs. The broader community (eg, research networks, journals) should consider creating and implementing reporting standards and protocol registration for quasi-experimental studies to improve study identification in systematic reviews.