Diagnostic and prognostic research: latest articles

Development and internal validation of a new life expectancy estimator for multimorbid older adults.

Diagnostic and prognostic research

Pub Date : 2025-03-04 DOI: 10.1186/s41512-025-00185-9

Viktoria Gastens, Arnaud Chiolero, Martin Feller, Douglas C Bauer, Nicolas Rodondi, Cinzia Del Giovane

Background: As populations are aging, the number of older patients with multiple chronic diseases demanding complex care increases. Although clinical guidelines recommend care to be personalized accounting for life expectancy, there are no tools to estimate life expectancy among multimorbid patients. Our objective was therefore to develop and internally validate a life expectancy estimator specifically for older multimorbid adults.

Methods: We analyzed data from the OPERAM (OPtimising thERapy to prevent avoidable hospital admissions in multimorbid older people) study in Bern, Switzerland. Participants aged 70 years old or more with multimorbidity (3 or more chronic medical conditions) and polypharmacy (use of 5 drugs or more for > 30 days) were included. All-cause mortality was assessed during 3 years of follow-up. We built a 3-year mortality prognostic index and transformed this index into a life expectancy estimator. Mortality risk candidate predictors included demographic variables (age, sex), clinical characteristics (metastatic cancer, number of drugs, body mass index, weight loss), smoking, functional status variables (Barthel-Index, falls, nursing home residence), and hospitalization. We internally validated and optimism corrected the model using bootstrapping techniques. We transformed the mortality prognostic index into a life expectancy estimator using the Gompertz survival function.

Results: Eight hundred five participants were included in the analysis. During 3 years of follow-up, 292 participants (36%) died. Age, metastatic cancer, number of drugs, lower body mass index, weight loss, number of hospitalizations, and lower Barthel-Index (functional impairment) were selected as predictors in the final multivariable model. Our model showed moderate discrimination with an optimism-corrected C statistic of 0.70. The optimism-corrected calibration slope was 0.96. The Gompertz-predicted mean life expectancy in our sample was 5.4 years (standard deviation 3.5 years). Categorization into three life expectancy groups led to visually good separation in Kaplan-Meier curves. We also developed a web application that calculates an individual's life expectancy estimation.

Conclusion: A life expectancy estimator for multimorbid older adults based on an internally validated 3-year mortality risk index was developed. Further validation of the score among various populations of multimorbid patients is needed before its implementation into practice.

Trial registration: ClinicalTrials.gov NCT02986425. First submitted 21/10/2016. First posted 08/12/2016.
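
The step from a 3-year mortality risk to a life expectancy, described in the Methods above, can be illustrated with a minimal sketch. It assumes a Gompertz hazard h(t) = a * exp(b * t), a fixed shape parameter b, and numerical integration of the survival curve. The shape value of 0.1, the function names, and the integration settings are illustrative assumptions; the abstract does not report the authors' actual parameterisation, so this is not their implementation.

```python
import math


def gompertz_rate_from_risk(risk: float, horizon: float, shape: float) -> float:
    """Return the baseline hazard a that reproduces `risk` at `horizon` years.

    Uses the Gompertz cumulative hazard H(t) = (a / b) * (exp(b * t) - 1)
    and the identity risk = 1 - exp(-H(horizon)).
    """
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must be strictly between 0 and 1")
    cumulative_hazard = -math.log(1.0 - risk)
    return cumulative_hazard * shape / math.expm1(shape * horizon)


def gompertz_survival(t: float, rate: float, shape: float) -> float:
    """Gompertz survival function S(t) = exp(-(a / b) * (exp(b * t) - 1))."""
    return math.exp(-(rate / shape) * math.expm1(shape * t))


def life_expectancy(risk: float, horizon: float = 3.0, shape: float = 0.1,
                    max_years: float = 60.0, step: float = 0.01) -> float:
    """Approximate the mean remaining lifetime as the area under S(t).

    `shape` (assumed, not taken from the paper) controls how fast the hazard
    grows with time; the integral is evaluated with the trapezoidal rule.
    """
    rate = gompertz_rate_from_risk(risk, horizon, shape)
    n_steps = int(round(max_years / step))
    total = 0.0
    previous = gompertz_survival(0.0, rate, shape)
    for i in range(1, n_steps + 1):
        current = gompertz_survival(i * step, rate, shape)
        total += 0.5 * (previous + current) * step
        previous = current
    return total


if __name__ == "__main__":
    # Hypothetical individual risks, not values from the study.
    for risk in (0.15, 0.36, 0.60):
        print(f"3-year risk {risk:.2f} -> about {life_expectancy(risk):.1f} years")
```

Under these assumptions a higher predicted 3-year risk always maps to a shorter estimated life expectancy, which is the property the estimator relies on.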

Citations: 0

Against reflexive recalibration: towards a causal framework for addressing miscalibration.

Diagnostic and prognostic research

Pub Date : 2025-02-11 DOI: 10.1186/s41512-024-00184-2

Akshay Swaminathan, Ujwal Srivastava, Lucia Tu, Ivan Lopez, Nigam H Shah, Andrew J Vickers

Citations: 0

Models for predicting risk of endometrial cancer: a systematic review.

Diagnostic and prognostic research

Pub Date : 2025-02-04 DOI: 10.1186/s41512-024-00178-0

Bea Harris Forder, Anastasia Ardasheva, Karyna Atha, Hannah Nentwich, Roxanna Abhari, Christiana Kartsonaki

Background: Endometrial cancer (EC) is the most prevalent gynaecological cancer in the UK with a rising incidence. Various models exist to predict the risk of developing EC, for different settings and prediction timeframes. This systematic review aims to provide a summary of models and assess their characteristics and performance.

Methods: A systematic search of the MEDLINE and Embase (OVID) databases was used to identify risk prediction models related to EC and studies validating these models. Papers relating to predicting the risk of a future diagnosis of EC were selected for inclusion. Study characteristics, variables included in the model, methods used, and model performance, were extracted. The Prediction model Risk-of-Bias Assessment Tool was used to assess model quality.

Results: Twenty studies describing 19 models were included. Ten were designed for the general population and nine for high-risk populations. Three models were developed for premenopausal women and two for postmenopausal women. Logistic regression was the most used development method. Three models, all in the general population, had a low risk of bias and all models had high applicability. Most models had moderate (area under the receiver operating characteristic curve (AUC) 0.60-0.80) or high predictive ability (AUC > 0.80) with AUCs ranging from 0.56 to 0.92. Calibration was assessed for five models. Two of these, the Hippisley-Cox and Coupland QCancer models, had high predictive ability and were well calibrated; these models also received a low risk of bias rating.

Conclusions: Several models of moderate-high predictive ability exist for predicting the risk of EC, but study quality varies, with most models at high risk of bias. External validation of well-performing models in large, diverse cohorts is needed to assess their utility.

Registration: The protocol for this review is available on PROSPERO (CRD42022303085).
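
As a rough illustration of the discrimination measure used in the Results above, the sketch below computes an AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, and then applies the review's bands (moderate 0.60-0.80, high > 0.80). The "low" label for values below 0.60 and all function names are assumptions made for this example, and the scores are made up.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def ability_band(value: float) -> str:
    """Map an AUC to the review's bands; 'low' below 0.60 is an assumed label."""
    if value > 0.80:
        return "high"
    if value >= 0.60:
        return "moderate"
    return "low"


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes (1 = developed EC).
    predicted = [0.05, 0.10, 0.20, 0.30, 0.40, 0.70]
    observed = [0, 0, 1, 0, 1, 1]
    value = auc(predicted, observed)
    print(f"AUC = {value:.2f} ({ability_band(value)})")
```

The pairwise form is quadratic in sample size, which is fine for a small illustration; a rank-based formula gives the same value more efficiently on large cohorts.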

Citations: 0

Risk prediction tools for pressure injury occurrence: an umbrella review of systematic reviews reporting model development and validation methods.

Diagnostic and prognostic research

Pub Date : 2025-01-14 DOI: 10.1186/s41512-024-00182-4

Bethany Hillier, Katie Scandrett, April Coombe, Tina Hernandez-Boussard, Ewout Steyerberg, Yemisi Takwoingi, Vladica Velickovic, Jacqueline Dinnes

Background: Pressure injuries (PIs) place a substantial burden on healthcare systems worldwide. Risk stratification of those who are at risk of developing PIs allows preventive interventions to be focused on patients who are at the highest risk. The considerable number of risk assessment scales and prediction models available underscores the need for a thorough evaluation of their development, validation, and clinical utility. Our objectives were to identify and describe available risk prediction tools for PI occurrence, their content and the development and validation methods used.

Methods: The umbrella review was conducted according to Cochrane guidance. MEDLINE, Embase, CINAHL, EPISTEMONIKOS, Google Scholar, and reference lists were searched to identify relevant systematic reviews. The risk of bias was assessed using adapted AMSTAR-2 criteria. Results were described narratively. All included reviews contributed to building a comprehensive list of risk prediction tools.

Results: We identified 32 eligible systematic reviews, only seven of which described the development and validation of risk prediction tools for PI. Nineteen reviews assessed the prognostic accuracy of the tools and 11 assessed clinical effectiveness. Of the seven reviews reporting model development and validation, six included only machine learning models. Two reviews included external validations of models, although only one review reported any details on external validation methods or results. This was also the only review to report measures of both discrimination and calibration. Five reviews presented measures of discrimination, such as the area under the curve (AUC), sensitivities, specificities, F1 scores, and G-means. In the four reviews that assessed risk of bias using the PROBAST tool, all models but one were found to be at high or unclear risk of bias.

Conclusions: Available tools do not meet current standards for the development or reporting of risk prediction models. The majority of tools have not been externally validated. Standardised and rigorous approaches to risk prediction model development and validation are needed.

Trial registration: The protocol was registered on the Open Science Framework ( https://osf.io/tepyk ).

Citations: 0

Rehabilitation outcomes after comprehensive post-acute inpatient rehabilitation following moderate to severe acquired brain injury-study protocol for an overall prognosis study based on routinely collected health data.

Diagnostic and prognostic research

Pub Date : 2025-01-07 DOI: 10.1186/s41512-024-00183-3

Uwe M Pommerich, Peter W Stubbs, Jørgen Feldbæk Nielsen

Background: The initial theme of the PROGRESS framework for prognosis research is termed overall prognosis research. Its aim is to describe the most likely course of health conditions in the context of current care. These average group-level prognoses may be used to inform patients, health policies, trial designs, or further prognosis research. Acquired brain injury, such as stroke, traumatic brain injury or encephalopathy, is a major cause of disability and functional limitations, worldwide. Rehabilitation aims to maximize independent functioning and meaningful participation in society post-injury. While some observational studies can allow for an inference of the overall prognosis of the level of independent functioning, the context for the provision of rehabilitation is rarely described. The aim of this protocol is to provide a detailed account of the clinical context to aid the interpretation of our upcoming overall prognosis study.

Methods: The study will occur at a Danish post-acute inpatient rehabilitation facility providing specialised inpatient rehabilitation for individuals with moderate to severe acquired brain injury. Routinely collected electronic health data will be extracted from the healthcare provider's database and deterministically linked on an individual level to construct the study cohort. The study period spans from March 2011 to December 2022. Four outcomes will measure the level of functioning. Rehabilitation needs will also be described. Outcomes and rehabilitation needs will be described for the entire cohort, across rehabilitation complexity levels and stratified for relevant demographic and clinical parameters. Descriptive statistics will be used to estimate average prognoses for the level of functioning at discharge from post-acute rehabilitation. The patterns of missing data will be investigated.

Discussion: This protocol is intended to provide transparency in our upcoming study based on routinely collected clinical data. It will help readers interpret the overall prognosis estimates within the context of our current clinical practice and independently assess potential sources of bias.

Citations: 0

Validation of prognostic models predicting mortality or ICU admission in patients with COVID-19 in low- and middle-income countries: a global individual participant data meta-analysis.

Diagnostic and prognostic research

Pub Date : 2024-12-19 DOI: 10.1186/s41512-024-00181-5

Johanna A A Damen, Banafsheh Arshi, Maarten van Smeden, Silvia Bertagnolio, Janet V Diaz, Ronaldo Silva, Soe Soe Thwin, Laure Wynants, Karel G M Moons

Background: We evaluated the performance of prognostic models for predicting mortality or ICU admission in hospitalized patients with COVID-19 in the World Health Organization (WHO) Global Clinical Platform, a repository of individual-level clinical data of patients hospitalized with COVID-19, including in low- and middle-income countries (LMICs).

Methods: We identified eligible multivariable prognostic models for predicting overall mortality and ICU admission during hospital stay in patients with confirmed or suspected COVID-19 from a living review of COVID-19 prediction models. These models were evaluated using data contributed to the WHO Global Clinical Platform for COVID-19 from nine LMICs (Burkina Faso, Cameroon, Democratic Republic of Congo, Guinea, India, Niger, Nigeria, Zambia, and Zimbabwe). Model performance was assessed in terms of discrimination and calibration.

Results: Out of 144 eligible models, 140 were excluded due to a high risk of bias, predictors unavailable in LMICs, or insufficient model description. Among 11,338 participants, the remaining models showed good discrimination for predicting in-hospital mortality (3 models), with areas under the curve (AUCs) ranging between 0.76 (95% CI 0.71-0.81) and 0.84 (95% CI 0.77-0.89). An AUC of 0.74 (95% CI 0.70-0.78) was found for predicting ICU admission risk (one model). All models showed signs of miscalibration and overfitting, with extensive heterogeneity between countries.

Conclusions: Among the available COVID-19 prognostic models, only a few could be validated on data collected from LMICs, mainly due to limited predictor availability. Despite their discriminative ability, selected models for mortality prediction or ICU admission showed varying and suboptimal calibration.

Citations: 0

Reported prevalence and comparison of diagnostic approaches for Candida africana: a systematic review with meta-analysis.

Diagnostic and prognostic research

Pub Date : 2024-12-05 DOI: 10.1186/s41512-024-00180-6

Bwambale Jonani, Emmanuel Charles Kasule, Herman Roman Bwire, Gerald Mboowa

This systematic review and meta-analysis evaluated the reported prevalence of Candida africana, an opportunistic yeast associated with vaginal and oral candidiasis, and the diagnostic methods used to identify it. A comprehensive literature search yielded 53 studies meeting the inclusion criteria, 2 of which were case studies. The pooled prevalence of C. africana among 20,571 participants was 0.9% (95% CI: 0.7-1.3%), with significant heterogeneity observed (I² = 79%, p < 0.01). Subgroup analyses revealed regional variations, with North America showing the highest prevalence (4.6%, 95% CI: 1.8-11.2%). Most C. africana isolates (84.52%) came from vaginal samples, followed by 8.37% from oral samples, 3.77% from urine, 2.09% from glans penis swabs, and 0.42% each from rectal swabs, nasal swabs, and respiratory tract expectorations. No C. africana has been isolated from nail samples. Hyphal wall protein 1 (HWP1) gene PCR was the most widely used diagnostic method, accounting for the identification of 70% of the isolates. A comparison of methods revealed that the Vitek-2 system consistently failed to differentiate C. africana from Candida albicans, whereas MALDI-TOF misidentified several isolates compared with HWP1 PCR. Factors beyond diagnostic methodology may influence C. africana detection rates. We highlight the importance of adapting molecular methods for resource-limited settings or developing equally accurate but more accessible alternatives for the identification and differentiation of highly similar and cryptic Candida species such as C. africana.
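
To make the pooled prevalence and the I² statistic quoted above more concrete, here is a minimal sketch of one common approach: inverse-variance pooling of logit-transformed proportions, with I² derived from Cochran's Q. The abstract does not state which pooling model the authors used, so the transformation, the 0.5 continuity correction, the function names, and the study counts below are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def logit_prevalence(events: int, total: int) -> Tuple[float, float]:
    """Return (logit estimate, variance) for one study.

    A 0.5 continuity correction (an assumption of this sketch) keeps the
    logit finite when a study reports zero isolates.
    """
    positives = events + 0.5
    negatives = (total - events) + 0.5
    estimate = math.log(positives / negatives)
    variance = 1.0 / positives + 1.0 / negatives
    return estimate, variance


def pool_prevalence(studies: Sequence[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Fixed-effect pooled prevalence, Cochran's Q, and I² (in percent)."""
    estimates: List[Tuple[float, float]] = [logit_prevalence(e, n) for e, n in studies]
    weights = [1.0 / variance for _, variance in estimates]
    pooled_logit = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled_logit) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(studies) - 1
    # I² is the share of total variation attributed to between-study heterogeneity.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    pooled = 1.0 / (1.0 + math.exp(-pooled_logit))
    return pooled, q, i_squared


if __name__ == "__main__":
    # Hypothetical (isolates, participants) pairs, not data from the review.
    example = [(2, 400), (5, 350), (0, 120), (12, 260)]
    prevalence, q, i_squared = pool_prevalence(example)
    print(f"pooled prevalence = {prevalence:.3%}, Q = {q:.2f}, I² = {i_squared:.0f}%")
```

With an I² as high as the 79% reported in the review, a random-effects model would usually be preferred over the fixed-effect pooling shown here; the sketch keeps the simpler form only to expose how Q and I² are computed.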


The relative data hungriness of unpenalized and penalized logistic regression and ensemble-based machine learning methods: the case of calibration.

Diagnostic and prognostic research

Pub Date : 2024-11-05 DOI: 10.1186/s41512-024-00179-z

Peter C Austin, Douglas S Lee, Bo Wang

Background: Machine learning methods are increasingly being used to predict clinical outcomes. Optimism is the difference in model performance between derivation and validation samples. The term "data hungriness" refers to the sample size needed for a modelling technique to generate a prediction model with minimal optimism. Our objective was to compare the relative data hungriness of different statistical and machine learning methods when assessed using calibration.

Methods: We used Monte Carlo simulations to assess the effect of the number of events per variable (EPV) on the optimism of six learning methods when assessing model calibration: unpenalized logistic regression, ridge regression, lasso regression, bagged classification trees, random forests, and stochastic gradient boosting machines using trees as the base learners. We performed simulations in two large cardiovascular datasets, each of which comprised an independent derivation and validation sample: patients hospitalized with acute myocardial infarction and patients hospitalized with heart failure. We used six data-generating processes, each based on one of the six learning methods. We varied the sample size so that the number of EPV ranged from 10 to 200 in increments of 10. We applied the six prediction methods in each of the simulated derivation samples and evaluated calibration in the simulated validation samples using the integrated calibration index, the calibration intercept, and the calibration slope. We also examined Nagelkerke's R², the scaled Brier score, and the c-statistic.
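To make two of the calibration measures named in the Methods concrete, the following is a minimal sketch of how a calibration intercept and calibration slope can be estimated in a validation sample. It follows one common convention (intercept from a logistic model with the logit of the prediction as an offset, slope from a logistic model with the logit of the prediction as the only covariate); the abstract does not state which estimator the authors used, and the function names and the toy data are hypothetical.

```python
import math

def _logit(p):
    return math.log(p / (1.0 - p))

def _expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def calibration_intercept(y, p, n_iter=100, tol=1e-12):
    """Intercept a in logit(P(y=1)) = a + logit(p), with the slope fixed at 1."""
    lp = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(n_iter):
        mu = [_expit(a + x) for x in lp]
        score = sum(yi - m for yi, m in zip(y, mu))   # first derivative
        info = sum(m * (1.0 - m) for m in mu)         # negative second derivative
        step = score / info
        a += step
        if abs(step) < tol:
            break
    return a

def calibration_slope(y, p, n_iter=100, tol=1e-12):
    """Slope b in logit(P(y=1)) = a + b * logit(p), fitted by Newton-Raphson."""
    lp = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, x in zip(y, lp):
            m = _expit(a + b * x)
            w = m * (1.0 - m)
            g0 += yi - m
            g1 += (yi - m) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det   # solve the 2x2 Newton system
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return b

# Hypothetical validation sample (not data from the study): predictions of
# 0.1, 0.5 and 0.9 with observed event rates of 0.25, 0.50 and 0.75.
p_hat = [0.1] * 4 + [0.5] * 4 + [0.9] * 4
y_obs = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]

slope = calibration_slope(y_obs, p_hat)
intercept = calibration_intercept(y_obs, p_hat)
```

In this toy sample the predictions are more extreme than the observed rates, so the slope falls below 1, which is the usual signature of overfitting. Optimism in the sense of the abstract would then be the difference between such a measure in the derivation sample and in the validation sample.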

Results: Across all 12 scenarios (2 diseases × 6 data-generating processes), penalized logistic regression displayed very low optimism even when the number of EPV was very low. Random forests and bagged trees tended to be the most data hungry and displayed the greatest optimism.

Conclusions: When assessed using calibration, penalized logistic regression was substantially less data hungry than methods from the machine learning literature.


Understanding overfitting in random forest for probability estimation: a visualization and simulation study.

Diagnostic and prognostic research

Pub Date : 2024-09-27 DOI: 10.1186/s41512-024-00177-1

Lasai Barreñada, Paula Dhiman, Dirk Timmerman, Anne-Laure Boulesteix, Ben Van Calster

Background: Random forests have become popular for clinical risk prediction modeling. In a case study on predicting ovarian malignancy, we observed training AUCs close to 1. Although this suggests overfitting, performance was competitive on test data. We aimed to understand the behavior of random forests for probability estimation by (1) visualizing data space in three real-world case studies and (2) a simulation study.

Methods: For the case studies, multinomial risk estimates were visualized using heatmaps in a 2-dimensional subspace. The simulation study included 48 logistic data-generating mechanisms (DGM), varying the predictor distribution, the number of predictors, the correlation between predictors, the true AUC, and the strength of true predictors. For each DGM, 1000 training datasets of size 200 or 4000 with binary outcomes were simulated, and random forest models were trained with minimum node size 2 or 20 using the ranger R package, resulting in 192 scenarios in total. Model performance was evaluated on large test datasets (N = 100,000).
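The scenario count in the Methods follows from a full factorial design: 48 data-generating mechanisms, 2 training set sizes, and 2 minimum node sizes. The sketch below only enumerates that grid. The dictionary keys are hypothetical names, and the individual factor levels that make up the 48 mechanisms are not given in the abstract, so they are represented by an index.

```python
from itertools import product

N_DGM = 48                  # logistic data-generating mechanisms (from the abstract)
TRAIN_SIZES = (200, 4000)   # training set sizes (from the abstract)
MIN_NODE_SIZES = (2, 20)    # minimum node sizes for the forest (from the abstract)

# One entry per simulation scenario; "dgm" is just an index, because the
# abstract does not list the levels of each data-generating factor.
scenarios = [
    {"dgm": dgm, "n_train": n_train, "min_node_size": min_node}
    for dgm, n_train, min_node in product(
        range(1, N_DGM + 1), TRAIN_SIZES, MIN_NODE_SIZES
    )
]

n_scenarios = len(scenarios)  # 48 * 2 * 2
```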

Results: The visualizations suggested that the model learned "spikes of probability" around events in the training set. A cluster of events created a bigger peak or plateau (signal), whereas isolated events created local peaks (noise). In the simulation study, median training AUCs were between 0.97 and 1 unless there were 4 binary predictors or 16 binary predictors with a minimum node size of 20. The median discrimination loss, i.e., the difference between the median test AUC and the true AUC, was 0.025 (range 0.00 to 0.13). Median training AUCs had Spearman correlations of around 0.70 with discrimination loss. Median test AUCs were higher with higher events per variable, higher minimum node size, and binary predictors. Median training calibration slopes were always above 1 and were not correlated with median test slopes across scenarios (Spearman correlation -0.11). Median test slopes were higher with higher true AUC, higher minimum node size, and higher sample size.
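The discrimination loss reported above can be illustrated with a small sketch. It computes the AUC as the proportion of event/non-event pairs ranked correctly (ties count one half) and takes the loss as the true AUC minus the median test AUC. That sign convention is an assumption made so that the loss is non-negative, and the function names and numbers in the example are hypothetical.

```python
from statistics import median

def auc(y_true, scores):
    """Probability that a random event is scored above a random non-event."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    # Count correctly ordered pairs; tied pairs contribute one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def discrimination_loss(test_aucs, true_auc):
    """True AUC minus the median test AUC over simulation repetitions."""
    return true_auc - median(test_aucs)

# Hypothetical example values, not results from the study.
example_auc = auc([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80])
example_loss = discrimination_loss([0.70, 0.72, 0.74], true_auc=0.75)
```

The pairwise loop is quadratic in the sample size, so for test sets as large as those in the study a rank-based implementation would be the practical choice.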

Conclusions: Random forests learn local probability peaks that often yield near perfect training AUCs without strongly affecting AUCs on test data. When the aim is probability estimation, the simulation results go against the common recommendation to use fully grown trees in random forest models.


A review of methods for the analysis of diagnostic tests performed in sequence.

Diagnostic and prognostic research

Pub Date : 2024-09-03 DOI: 10.1186/s41512-024-00175-3

Thomas R Fanshawe, Brian D Nicholson, Rafael Perera, Jason L Oke

Background: Many clinical pathways for the diagnosis of disease are based on diagnostic tests that are performed in sequence. The performance of the full diagnostic sequence is dictated by the diagnostic performance of each test in the sequence as well as the conditional dependence between them, given true disease status. Resulting estimates of performance, such as the sensitivity and specificity of the test sequence, are key parameters in health-economic evaluations. We conducted a methodological review of statistical methods for assessing the performance of diagnostic tests performed in sequence, with the aim of guiding data analysts towards classes of methods that may be suitable given the design and objectives of the testing sequence.
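As an illustration of how conditional dependence enters the performance of a sequence, the sketch below combines two binary tests using the standard covariance parameterisation: the covariance between the two test results among diseased patients (`cov_d`) and among non-diseased patients (`cov_nd`). This is a generic textbook formulation rather than a method taken from the review. The function name, the rule labels, and the example values are hypothetical.

```python
def sequence_accuracy(se1, sp1, se2, sp2, cov_d=0.0, cov_nd=0.0, rule="both"):
    """Sensitivity and specificity of a two-test sequence.

    rule="both":   sequence is positive only if both tests are positive
                   (e.g. test 2 is done only after a positive test 1).
    rule="either": sequence is positive if at least one test is positive.
    cov_d / cov_nd: covariance of the two test results given disease /
                    given no disease; 0 means conditional independence.
    """
    p_both_pos_d = se1 * se2 + cov_d      # P(T1+, T2+ | diseased)
    p_both_neg_nd = sp1 * sp2 + cov_nd    # P(T1-, T2- | non-diseased)
    if rule == "both":
        se = p_both_pos_d
        sp = sp1 + sp2 - p_both_neg_nd    # P(T1- or T2- | non-diseased)
    elif rule == "either":
        se = se1 + se2 - p_both_pos_d     # P(T1+ or T2+ | diseased)
        sp = p_both_neg_nd
    else:
        raise ValueError("rule must be 'both' or 'either'")
    if not (0.0 <= se <= 1.0 and 0.0 <= sp <= 1.0):
        raise ValueError("covariances are outside their feasible range")
    return se, sp

# Hypothetical test characteristics.
independent = sequence_accuracy(0.9, 0.8, 0.8, 0.9)
dependent = sequence_accuracy(0.9, 0.8, 0.8, 0.9, cov_d=0.05)
```

With a positive covariance among diseased patients, the "both" rule loses less sensitivity than the independence calculation suggests, which is why ignoring conditional dependence can misstate the performance of the full sequence.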

Methods: We searched PubMed, Scopus and Web of Science for relevant papers describing methodology for analysing sequences of diagnostic tests. Papers were classified by the characteristics of the method used, and these were used to group methods into themes. We illustrate some of the methods using data from a cohort study of repeat faecal immunochemical testing for colorectal cancer in symptomatic patients, to highlight the importance of allowing for conditional dependence in test sequences and adjustment for an imperfect reference standard.

Results: Five overall themes were identified, detailing methods for combining multiple tests in sequence, estimating conditional dependence, analysing sequences of diagnostic tests used for risk assessment, analysing test sequences in conjunction with an imperfect or incomplete reference standard, and meta-analysis of test sequences.

Conclusions: This methodological review can be used to help researchers identify suitable analytic methods for studies that use diagnostic tests performed in sequence.
