A Comparison of Sampling Methods for Dealing with Imbalanced Wearable Sensor Data in Human Activity Recognition using Deep Learning

International Journal of Advanced Computer Science and Applications (IF 0.7; JCR Q3, Computer Science, Theory & Methods). Pub Date: 2023-01-01. DOI: 10.14569/ijacsa.2023.0141032

Mariam El Ghazi, Noura Aknin

Citations: 0

Abstract

Human Activity Recognition (HAR) has significant applications across diverse domains, including healthcare, sports analytics, and human-computer interaction. Deep learning models show great potential in HAR, but their performance is often hindered by imbalanced datasets. This study investigates the impact of class imbalance on deep learning models in HAR and compares several sampling techniques for mitigating it. The experiments use the PAMAP2 dataset, which contains data collected from wearable sensors. The research comprises four primary experiments. First, a performance baseline is established by training four deep learning models on the imbalanced dataset. Next, the Synthetic Minority Over-sampling Technique (SMOTE), random under-sampling, and a hybrid sampling approach are each used to rebalance the dataset. In every experiment, Bayesian optimization is used to tune hyperparameters. The findings underscore the importance of dataset balance: rebalancing yields substantial improvements in accuracy, F1 score, precision, and recall. Notably, the hybrid technique, which combines SMOTE with random under-sampling, emerges as the most effective method, surpassing the other approaches. The research highlights the necessity of addressing class imbalance in deep learning models for HAR, and the results offer practical insights for developing HAR systems with better accuracy and reliability in real-world applications. Future work will explore other public datasets, more complex deep learning models, and additional sampling techniques to further improve HAR systems.
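The hybrid approach named in the abstract (SMOTE for under-represented classes plus random under-sampling for over-represented ones) can be illustrated with a minimal, dependency-free sketch. The abstract does not state which library, target class size, or neighbour count the authors used, so everything here is an assumption: the function names `smote` and `hybrid_resample`, the `target` and `k` parameters, and the toy two-feature data with illustrative activity labels are all hypothetical. In practice one would more likely reach for an existing implementation such as the `SMOTE` and `RandomUnderSampler` classes in the imbalanced-learn package.

```python
import random
from collections import Counter


def smote(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the same class (squared Euclidean).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def hybrid_resample(X, y, target, k=3, seed=0):
    """Bring every class to `target` samples: SMOTE-style over-sampling for
    classes below the target, random under-sampling for classes above it.
    (`target` is an assumed knob, not a value taken from the paper.)"""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(tuple(features))
    X_out, y_out = [], []
    for label, samples in by_class.items():
        if len(samples) > target:
            samples = rng.sample(samples, target)  # drop majority samples at random
        elif len(samples) < target:
            samples = samples + smote(samples, target - len(samples), k, rng)
        X_out.extend(samples)
        y_out.extend([label] * len(samples))
    return X_out, y_out


# Toy, made-up data: 200 majority windows vs. 20 minority windows, 2 features each.
gen = random.Random(42)
X = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
X += [(gen.gauss(3, 1), gen.gauss(3, 1)) for _ in range(20)]
y = ["walking"] * 200 + ["rope_jumping"] * 20

X_bal, y_bal = hybrid_resample(X, y, target=80)
print(Counter(y), "->", Counter(y_bal))
```

After the call, both classes hold 80 samples: the majority class is cut down and the minority class is topped up with interpolated points. To avoid leaking synthetic samples into evaluation, resampling of this kind is normally applied to the training split only.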

Source journal

International Journal of Advanced Computer Science and Applications (Computer Science, Theory & Methods)

CiteScore: 2.30

Self-citation rate: 22.20%

Articles published: 519

About the journal: IJACSA is a scholarly computer science journal representing the best in research. Its mission is to provide an outlet for quality research to be publicised and published to a global audience. The journal aims to publish papers selected through rigorous double-blind peer review to ensure originality, timeliness, relevance, and readability. In sync with the Journal's vision "to be a respected publication that publishes peer reviewed research articles, as well as review and survey papers contributed by International community of Authors", we have drawn reviewers and editors from institutions and universities across the globe. A double-blind peer review process is conducted to ensure that we retain high standards. At IJACSA, we stand strong because we know that global challenges make way for new innovations, new ways and new talent. International Journal of Advanced Computer Science and Applications publishes carefully refereed research, review and survey papers which offer a significant contribution to the computer science literature, and which are of interest to a wide audience. Coverage extends to all mainstream branches of computer science and related applications.