Human Action Recognition (HAR) Classification Using MediaPipe and Long Short-Term Memory (LSTM)
Authors: Ichsan Arsyi Putra, O. Nurhayati, D. Eridani
Journal: Teknik
Publication date: 2022-08-26
DOI: 10.14710/teknik.v43i2.46439 (https://doi.org/10.14710/teknik.v43i2.46439)
Citations: 2
Abstract
Human Action Recognition (HAR) is an important research topic in the Machine Learning and Computer Vision domains. One proposed approach combines the MediaPipe library with Long Short-Term Memory (LSTM) networks, using testing accuracy and training duration as indicators of model performance. This research adapted previously proposed LSTM models to implement HAR using image features extracted by the MediaPipe library, and compared the models on their testing accuracy and training duration. The research followed the OSEMN method (Obtain, Scrub, Explore, Model, and iNterpret). The dataset was the Weizmann dataset, to which data preprocessing and data augmentation were applied. Video features extracted by MediaPipe Pose were used to train and validate neural network models built around LSTM layers. The models were then evaluated by interpreting confusion matrices and calculating accuracy, error rate, precision, recall, and F1 score. This research yielded seven LSTM model variants; the highest testing accuracy was 82%, achieved with a training duration of 10 minutes and 50 seconds.
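The evaluation step named in the abstract (accuracy, error rate, precision, recall, and F1 score derived from a confusion matrix) can be illustrated with a minimal sketch. It is not the authors' code. The function name `classification_metrics`, the convention that rows are true classes and columns are predicted classes, and the 3-class example matrix are all assumptions made for illustration; the numbers are invented and are not results from the paper.

```python
def classification_metrics(cm):
    """Compute overall and per-class metrics from a square confusion matrix.

    Assumption: cm[i][j] counts samples whose true class is i and whose
    predicted class is j (rows = true, columns = predicted).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})

    return {
        "accuracy": correct / total,
        "error_rate": (total - correct) / total,
        "per_class": per_class,
    }


# Hypothetical 3-class confusion matrix (illustrative values only).
example_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]

metrics = classification_metrics(example_cm)
print(f"accuracy={metrics['accuracy']:.3f} error_rate={metrics['error_rate']:.3f}")
for k, m in enumerate(metrics["per_class"]):
    print(f"class {k}: precision={m['precision']:.3f} "
          f"recall={m['recall']:.3f} f1={m['f1']:.3f}")
```

Per-class values are reported separately here; the abstract does not say whether the paper averages them (for example macro or weighted averaging), so no averaging is applied in this sketch.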