Machine learning models for reinjury risk prediction using cardiopulmonary exercise testing (CPET) data: optimizing athlete recovery.

BioData Mining (IF 6.1; Region 3, Biology; JCR Q1, Mathematical & Computational Biology). Pub Date: 2025-02-17. DOI: 10.1186/s13040-025-00431-2

Arezoo Abasi, Ahmad Nazari, Azar Moezy, Seyed Ali Fatemi Aghda

BioData Mining, vol. 18, no. 1, p. 16. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11834553/pdf/

Citations: 0

Machine learning models for reinjury risk prediction using cardiopulmonary exercise testing (CPET) data: optimizing athlete recovery.

Background: Cardiopulmonary Exercise Testing (CPET) provides detailed insights into athletes' cardiovascular and pulmonary function, making it a valuable tool in assessing recovery and injury risks. However, traditional statistical models often fail to leverage the full potential of CPET data in predicting reinjury. Machine learning (ML) algorithms offer promising capabilities in uncovering complex patterns within this data, allowing for more accurate injury risk assessment.

Objective: This study aimed to develop machine learning models to predict reinjury risk among elite soccer players using CPET data. Specifically, we sought to identify key physiological and performance variables that correlate with reinjury and to evaluate the performance of various ML algorithms in generating accurate predictions.

Methods: A dataset of 256 elite soccer players from 16 national and top-tier teams in Iran was analyzed, incorporating physiological variables and categorical data. Several machine learning models, including CatBoost, SVM, Random Forest, and XGBoost, were employed to predict reinjury risk. Model performance was assessed using metrics such as accuracy, precision, recall, F1-score, AUC, and SHAP values to ensure robust evaluation and interpretability.
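The evaluation metrics named in the Methods can be illustrated on a toy example. The sketch below assumes binary labels (1 = reinjury, 0 = no reinjury) and uses made-up labels and scores, not the study's data. The function names are hypothetical, and the code does not reproduce the authors' pipeline.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 4 reinjured and 4 non-reinjured players (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]   # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # threshold of 0.5 is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F1 depend on the chosen decision threshold, while AUC is computed from the raw scores. This is one reason a model can lead on AUC while another leads on accuracy and F1, as in the reported results.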

Results: CatBoost and SVM exhibited the best performance, with CatBoost achieving the highest accuracy (0.9138) and F1-score (0.9148), and SVM achieving the highest AUC (0.9725). A significant association was found between a history of concussion and reinjury risk (χ² = 13.0360, p = 0.0015), highlighting the importance of neurological recovery in preventing future injuries. Heart rate metrics, particularly HRmax and HR2, were also significantly lower in players who experienced reinjury, indicating reduced cardiovascular capacity in this group.
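The reported p-value can be sanity-checked from the χ² statistic. The abstract does not state the degrees of freedom; the sketch below assumes df = 2 (for instance, a 2×3 contingency table), for which the chi-square upper-tail probability reduces to exp(-χ²/2). The function name `chi2_sf_df2` is hypothetical.

```python
import math


def chi2_sf_df2(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    With df = 2 the chi-square distribution is exponential with mean 2,
    so P(X >= x) = exp(-x / 2). df = 2 is an assumption, not stated in the abstract.
    """
    if statistic < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.exp(-statistic / 2.0)


reported_statistic = 13.0360  # chi-square value quoted in the abstract
p_value = chi2_sf_df2(reported_statistic)
print(f"p = {p_value:.4f}")
```

Under this assumption the result rounds to 0.0015, which agrees with the reported p-value. With df = 1 the same statistic would give a smaller p-value, so the check is only as good as the assumed degrees of freedom.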

Conclusion: Machine learning models, particularly CatBoost and SVM, provide promising tools for predicting reinjury risk using CPET data. These models offer clinicians more precise, data-driven insights into athlete recovery and risk management. Future research should explore the integration of external factors such as training load and psychological readiness to further refine these predictions and enhance injury prevention protocols.

Source journal

BioData Mining (Mathematical & Computational Biology)

CiteScore

7.90

Self-citation rate

0.00%

Articles published

Review time

23 weeks

Journal description: BioData Mining is an open access, open peer-reviewed journal encompassing research on all aspects of data mining applied to high-dimensional biological and biomedical data, focusing on computational aspects of knowledge discovery from large-scale genetic, transcriptomic, genomic, proteomic, and metabolomic data. Topical areas include, but are not limited to:
- Development, evaluation, and application of novel data mining and machine learning algorithms.
- Adaptation, evaluation, and application of traditional data mining and machine learning algorithms.
- Open-source software for the application of data mining and machine learning algorithms.
- Design, development and integration of databases, software and web services for the storage, management, retrieval, and analysis of data from large scale studies.
- Pre-processing, post-processing, modeling, and interpretation of data mining and machine learning results for biological interpretation and knowledge discovery.