Rafael Garcia Carretero, Luis Vigil-Medina, Oscar Barquero-Perez, Inmaculada Mora-Jimenez, Cristina Soguero-Ruiz, Javier Ramos-Lopez
Informatics for Health & Social Care, vol. 46, no. 4, pp. 355-369. Published 2021-12-02 (Epub 2021-04-01). Journal Article. DOI: 10.1080/17538157.2021.1896524 (https://doi.org/10.1080/17538157.2021.1896524)
Machine learning approaches to constructing predictive models of vitamin D deficiency in a hypertensive population: a comparative study.
Objective: Given the association between vitamin D deficiency and risk for cardiovascular disease, we used machine learning approaches to establish a model to predict the probability of deficiency. Determination of serum levels of 25-hydroxy vitamin D (25(OH)D) provided the best assessment of vitamin D status, but such tests are not always widely available or feasible. Thus, our study established predictive models with high sensitivity to identify patients either unlikely to have vitamin D deficiency or who should undergo 25(OH)D testing.
Methods: We collected data from 1002 hypertensive patients from a Spanish university hospital. The elastic net regularization approach was applied to reduce the dimensionality of the dataset. The issue of determining vitamin D status was addressed as a classification problem; thus, the following classifiers were applied: logistic regression, support vector machine (SVM), random forest, naive Bayes, and Extreme Gradient Boost methods. Classification accuracy, sensitivity, specificity, and predictive values were computed to assess the performance of each method.
Results: The SVM-based method with radial kernel performed better than the other algorithms in terms of sensitivity (98%), negative predictive value (71%), and classification accuracy (73%).
Conclusion: The combination of a feature-selection method such as elastic net regularization and a classification approach produced well-fitted models. The SVM approach yielded better predictions than the other algorithms. This combination approach allowed us to develop a predictive model with high sensitivity but low specificity, to identify the population that could benefit from laboratory determination of serum levels of 25(OH)D.
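The performance measures named in the abstract (classification accuracy, sensitivity, specificity, and predictive values) all derive from the four cells of a binary confusion matrix. The sketch below is an illustration only, not the authors' code: the `ConfusionCounts` and `screening_metrics` names are hypothetical, and the counts in the usage lines are made-up numbers, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # deficient patients the model flags as deficient
    fn: int  # deficient patients the model misses
    fp: int  # non-deficient patients the model flags as deficient
    tn: int  # non-deficient patients the model correctly clears


def screening_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard screening metrics for one classifier."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        # Share of truly deficient patients that are detected.
        "sensitivity": c.tp / (c.tp + c.fn),
        # Share of non-deficient patients that are correctly cleared.
        "specificity": c.tn / (c.tn + c.fp),
        # Probability that a flagged patient is truly deficient.
        "ppv": c.tp / (c.tp + c.fp),
        # Probability that a cleared patient is truly non-deficient.
        "npv": c.tn / (c.tn + c.fn),
        # Share of all patients classified correctly.
        "accuracy": (c.tp + c.tn) / total,
    }


# Illustrative counts only (assumed, not taken from the paper).
example = ConfusionCounts(tp=90, fn=10, fp=50, tn=50)
for name, value in screening_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

A model tuned for screening accepts many false positives (low specificity) in exchange for few false negatives (high sensitivity), so that patients it clears are unlikely to be deficient and the rest can be referred for 25(OH)D testing.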
About the journal:
Informatics for Health & Social Care promotes evidence-based informatics as applied to the domain of health and social care. It showcases informatics research and practice within the many and diverse contexts of care; it takes personal information, both its direct and indirect use, as its central focus.
The scope of the Journal is broad, encompassing both the properties of care information and the life-cycle of associated information systems.
Consideration of the properties of care information will necessarily include the data itself, its representation, structure, and associated processes, as well as the context of its use, highlighting the related communication, computational, cognitive, social and ethical aspects.
Consideration of the life-cycle of care information systems includes the full range from requirements, specifications, theoretical models and conceptual design through to sustainable implementations, and the valuation of impacts. Empirical evidence and experiences related to implementation are particularly welcome.
Informatics for Health & Social Care seeks to consolidate and add to the core knowledge within the disciplines of Health and Social Care Informatics. The Journal therefore welcomes scientific papers, case studies and literature reviews. Examples of novel approaches are particularly welcome. Articles might, for example, show how care data is collected and transformed into useful and usable information, how informatics research is translated into practice, how specific results can be generalised, or perhaps provide case studies that facilitate learning from experience.