Machine learning's performance in classifying postmenopausal osteoporosis Thai patients

Intelligence-based medicine, vol. 7, Article 100099. Pub Date: 2023-01-01. DOI: 10.1016/j.ibmed.2023.100099

Kittisak Thawnashom, Pornsarp Pornsawad, Bunjira Makond

Citations: 0

Abstract

This work investigates the performance of different machine learning (ML) methods for classifying osteoporosis in postmenopausal Thai patients. Our dataset contains 377 samples compiled retrospectively from the medical records of postmenopausal Thai women at the obstetrics and gynecology clinic, Ramathibodi Hospital, Bangkok, Thailand. Missing data imputation, feature selection, and techniques for handling class imbalance are independently applied as pre-processing approaches. The performance of different ML algorithms, including k-nearest neighbors (k-NN), neural network (NN), naïve Bayes (NB), Bayesian network (BN), support vector machine (SVM), random forest (RF), and decision tree (DT), is compared between the pre-processed and original data. The results demonstrate that different ML algorithms combined with pre-processing techniques achieve varying results. In terms of accuracy, the three best-performing methods are the NN, NB, and RF models when a wrapper approach is used with an appropriate learner. In terms of specificity, the DT model achieves the best performance when the synthetic minority oversampling technique (SMOTE) is applied. When feature selection techniques are applied, the k-NN, BN, and SVM algorithms obtain the best sensitivity, whereas the NN shows the best area under the curve (AUC). Overall, in comparison with the original dataset, the pre-processing approaches improved model performance. Therefore, proper pre-processing techniques should be considered when developing ML classifiers to identify the most appropriate model.
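The abstract ranks models by accuracy, sensitivity, and specificity, and names synthetic minority oversampling as one of the pre-processing steps. The sketch below shows how these quantities are commonly defined and how SMOTE-style interpolation creates new minority samples. It is an illustration only, not the authors' implementation: the function names, the toy labels, the toy points, and the parameters `k` and `seed` are all assumptions, and the abstract does not state which software or settings were used.

```python
import math
import random


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from predicted labels.

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
    }


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy labels (1 = osteoporosis, 0 = no osteoporosis).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Hypothetical minority-class points in a two-feature space.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote_like_oversample(minority_points, n_new=4)
print(synthetic)
```

With an imbalanced class distribution, a model can score well on accuracy while missing many positives, which is one reason the abstract reports different best-performing models for different metrics.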
