Ventilatory efficiency as a predictor of 1-year mortality after non-cardiac surgery: showing clinical utility by applying decision curve analysis
Thomas Vetsch, Markus Huber
Anaesthesia. Published 2025-02-05. DOI: 10.1111/anae.16554 (https://doi.org/10.1111/anae.16554)
Abstract
The study by Arina et al. represents an innovative and commendable effort to develop a clinical prediction model to predict 1-year mortality (the primary outcome) after major non-cardiac surgery using pre-operative data [1]. We would like to highlight and elaborate on two aspects of their study related to the clinical interpretation and utility of the prediction model for decision-making.
First, with respect to clinical interpretation, the authors highlight the importance of both clinical and physiological data in predicting the primary outcome. In particular, the inclusion of data obtained from cardiopulmonary exercise testing (CPET) – reflecting objectively measured fitness – is clinically reasonable. Among the many parameters derived from CPET, ventilatory efficiency (expressed as the ratio of minute ventilation (l.min⁻¹) to carbon dioxide output (ml.kg⁻¹.min⁻¹)) is a frequently reported predictor of short-term postoperative mortality. It can be reported as a slope to the secondary ventilatory threshold (VE.VCO₂⁻¹ slope); as a ratio at the first ventilatory threshold (VE.VCO₂⁻¹ VT1/AT); or as a ratio at peak exercise (VE.VCO₂⁻¹ peak). Because submaximal testing may be appropriate for some patients, we recommend reporting the VE.VCO₂⁻¹ slope due to its insensitivity to ventilatory thresholds [2]. It is plausible that ventilatory efficiency is even more relevant for predicting mid- to long-term mortality than short-term mortality, given its close relationship with comorbidity. The results presented by Arina et al. therefore constitute a valuable contribution to the quest to identify novel predictors of mid- to long-term outcomes.
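As a rough illustration of the slope form of ventilatory efficiency, the slope is commonly obtained as the least-squares slope of minute ventilation (VE) regressed on carbon dioxide output (VCO2) over the exercise samples. The sketch below assumes paired samples reported in consistent units; the function name `ve_vco2_slope` and the sample values are invented for illustration and are not taken from Arina et al.

```python
def ve_vco2_slope(ve, vco2):
    """Ordinary least-squares slope of VE (y-axis) regressed on VCO2 (x-axis)."""
    if len(ve) != len(vco2) or len(ve) < 2:
        raise ValueError("need at least two paired samples")
    n = len(ve)
    mean_x = sum(vco2) / n
    mean_y = sum(ve) / n
    # Sum of squared deviations of VCO2 and cross-deviations with VE.
    sxx = sum((x - mean_x) ** 2 for x in vco2)
    if sxx == 0:
        raise ValueError("VCO2 values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(vco2, ve))
    return sxy / sxx


# Hypothetical samples constructed so that VE = 5 + 30 * VCO2 exactly.
vco2_samples = [0.5, 1.0, 1.5, 2.0, 2.5]
ve_samples = [20.0, 35.0, 50.0, 65.0, 80.0]
slope = ve_vco2_slope(ve_samples, vco2_samples)
print(f"VE/VCO2 slope: {slope:.1f}")
```

Because the slope is fitted across the available samples rather than read at a single threshold, it can still be computed from a submaximal test, which is the property the recommendation above relies on.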
Second, with respect to clinical utility, the authors primarily evaluate a clinical prediction model based on the multi-objective symbolic regression (MOSR) approach. Given the broad clinical readership of Anaesthesia, it would be helpful to introduce and explain this novel algorithm in more detail and, if possible, with some illustrative, practical examples. This would help avoid the introduction of another so-called ‘black-box’ medical algorithm [3]. Additionally, the suite of prediction models in the study could be combined in a so-called super learner [4].
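To make the super learner idea more concrete, the following is a minimal sketch of its meta-learning step only: given out-of-fold (cross-validated) predicted risks from two candidate models, it searches for the convex combination with the lowest Brier score. The outcome vector, the two prediction vectors, and the function names are hypothetical; a full super learner would also generate the out-of-fold predictions and typically combine more than two learners.

```python
def brier(y_true, pred):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, pred)) / len(y_true)


def fit_convex_weight(y_true, oof_a, oof_b, steps=100):
    """Grid-search w in [0, 1] minimising the Brier score of w*a + (1-w)*b."""
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blend = [w * a + (1 - w) * b for a, b in zip(oof_a, oof_b)]
        loss = brier(y_true, blend)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Hypothetical outcomes and out-of-fold risks from two candidate models.
y = [1, 0, 1, 0, 0, 1]
oof_model_a = [0.9, 0.4, 0.6, 0.2, 0.3, 0.5]
oof_model_b = [0.6, 0.1, 0.8, 0.3, 0.1, 0.7]

weight_a, blended_loss = fit_convex_weight(y, oof_model_a, oof_model_b)
print(f"weight on model A: {weight_a:.2f}, blended Brier score: {blended_loss:.4f}")
```

Since the grid includes w = 0 and w = 1, the blended score on these predictions can never be worse than that of the better single model, which is the basic motivation for combining a suite of models rather than selecting one.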
The performance of the models is evaluated with the F1-score. For another performance metric – the area under the receiver operating characteristic curve (AUROC) – it has been emphasised that the relative gain in AUROC of a new or updated prediction model provides only a very limited perspective on its added clinical utility. We would argue similarly here: the added benefit of the MOSR approach for clinical decision-making, relative to logistic regression or machine-learning methods, can only be partly examined with traditional performance metrics such as the F1-score.
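For readers less familiar with the metric, the F1-score is the harmonic mean of precision and recall, which reduces to 2TP / (2TP + FP + FN). The short sketch below computes it from binary labels; the labels and predictions are invented for illustration. Note that true negatives do not enter the formula and that false positives and false negatives are weighted equally, which is one reason the metric says little about clinical consequences at a given decision threshold.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention used here: return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Hypothetical outcomes (1 = died within 1 year) and model classifications.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
f1 = f1_score(observed, predicted)
print(f"F1-score: {f1:.3f}")
```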
In this context, decision curve analysis provides a suitable framework that allows the superior performance of the MOSR approach to be examined and helps clinicians to understand the practical implications of applying the model in real-world settings [5]. This well-described and endorsed method evaluates the clinical utility of prediction models by weighing the benefit of true positives against the harms of false positives across a range of decision thresholds. Considering the good calibration of the MOSR models in the different predictor domains (clinical, fitness, combined) reported in supporting Figure S1 [1], it would be interesting to see the result of a decision curve analysis. This analysis would be of particular interest for the MOSR approach, as the study by Arina et al. constitutes one of the very first applications of this statistical method.
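The weighing described above can be sketched with the standard net benefit formula, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the decision threshold. The example below compares a model with the "treat all" and "treat none" reference strategies at a few thresholds. The outcomes, predicted risks, thresholds, and function names are hypothetical and chosen only to show the mechanics; they do not reproduce any result from Arina et al.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # Odds of the threshold: how many false positives one true positive is worth.
    weight = threshold / (1 - threshold)
    return tp / n - weight * fp / n


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical cohort of 10 patients: observed outcome and predicted risk.
outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
risks = [0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]

# A decision curve is net benefit plotted over thresholds; "treat none" is always 0.
curve = {
    pt: (net_benefit(outcomes, risks, pt), net_benefit_treat_all(outcomes, pt))
    for pt in (0.1, 0.2, 0.5)
}
for pt, (nb_model, nb_all) in curve.items():
    print(f"threshold {pt:.2f}: model {nb_model:.3f}, treat all {nb_all:.3f}, treat none 0.000")
```

A model is clinically useful at a given threshold only if its net benefit exceeds both reference strategies, so comparing several models on the same curve shows directly where, if anywhere, one of them would change decisions for the better.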
Overall, for datasets with limited sample size, we recommend not relying on a single algorithm, given concerns about the generalisability of the results, and performing a decision curve analysis to evaluate the relative clinical utility of a suite of prediction models.
About the journal:
Anaesthesia is the official journal of the Association of Anaesthetists. It is a comprehensive international publication covering general and regional anaesthesia, intensive care, and pain therapy. It publishes peer-reviewed original articles on all aspects of these fields, including research on equipment.