Critical Analysis of Validation Methods for Machine Learning Models Used in E-health Applications
Hakan Yekta Yatbaz, Adnan Yazici, E. Ever
2022 International Conference on Information Science and Communications Technologies (ICISCT), 2022-09-28
DOI: 10.1109/ICISCT55600.2022.10146901
Sensor-based networks and wearable devices are used to build a variety of data sets for human activity recognition, which can be applied effectively in the e-health domain. Various machine learning models and algorithms are used to perform this recognition with high accuracy. The validation method used to compute that accuracy may vary with the characteristics of the data set. Correct validation is essential for assessing model performance reliably, especially when the data analysed relate to health. Activity recognition systems based on sliding windows with different overlap ratios are commonly validated with cross-validation methods such as k-fold, leave-one-out and leave-one-subject-out. In this study, validation methods commonly used for windowing-based activity recognition systems using wearable devices are analyzed. The advantages and disadvantages of each method are discussed with respect to various parameters. A case study on the well-known MHEALTH data set is presented using state-of-the-art machine learning approaches. With a 50% window overlap, the highest accuracies were obtained with a one-second window and 10-fold cross-validation (96.71%), a five-second window and leave-one-out cross-validation (95.65%), and a one-second window and 5-fold cross-validation (95%).
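
To make the windowing and validation terms in the abstract concrete, the following is a minimal Python sketch of sliding-window segmentation with a fractional overlap and of a leave-one-subject-out split over the resulting windows. It is not the authors' implementation. The function names, the 50 Hz sampling rate, and the recording lengths are assumptions chosen for illustration only.

```python
from collections import defaultdict


def sliding_windows(n_samples, window, overlap):
    """Return (start, end) sample-index pairs for fixed-size windows.

    `window` is the window length in samples; `overlap` is the fraction
    of each window shared with the next one (0.5 means 50% overlap).
    """
    if window <= 0 or not 0 <= overlap < 1:
        raise ValueError("window must be positive and overlap in [0, 1)")
    # Step between consecutive window starts; at least one sample.
    step = max(1, round(window * (1 - overlap)))
    return [(start, start + window)
            for start in range(0, n_samples - window + 1, step)]


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    `subject_ids[i]` names the subject that window i was recorded from.
    Each fold tests on all windows of one subject and trains on the rest.
    """
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    RATE_HZ = 50  # assumed sampling rate, for illustration only
    # Hypothetical recording lengths in samples, one entry per subject.
    recordings = {"subject1": 1000, "subject2": 1000, "subject3": 750}

    owners = []  # subject label for every window, in window order
    for subject, n_samples in recordings.items():
        for _span in sliding_windows(n_samples, window=5 * RATE_HZ, overlap=0.5):
            owners.append(subject)

    for held_out, train_idx, test_idx in leave_one_subject_out(owners):
        print(held_out, "train windows:", len(train_idx),
              "test windows:", len(test_idx))
```

One general observation about this design, offered as background and not as a finding of the paper: with 50% overlap, neighbouring windows from the same recording share half of their raw samples, so a split made over individual windows (k-fold or leave-one-out) can place near-duplicate windows on both sides of the train/test boundary. A subject-wise split keeps all windows of a subject on the same side by construction.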