Aldi Sidik Permana, E. C. Djamal, Fikri Nugraha, Fatan Kasyidi
{"title":"基于单流空间卷积神经网络的手部运动识别","authors":"Aldi Sidik Permana, E. C. Djamal, Fikri Nugraha, Fatan Kasyidi","doi":"10.23919/EECSI50503.2020.9251896","DOIUrl":null,"url":null,"abstract":"Human-robot interaction can be through several ways, such as through device control, sounds, brain, and body, or hand gesture. There are two main issues: the ability to adapt to extreme settings and the number of frames processed concerning memory capabilities. Although it is necessary to be careful with the selection of the number of frames so as not to burden the memory, this paper proposed identifying hand gesture of video using Spatial Convolutional Neural Networks (CNN). The sequential image's spatial arrangement is extracted from the frames contained in the video so that each frame can be identified as part of one of the hand movements. The research used VGG16, as CNN architecture is concerned with the depth of learning where there are 13 layers of convolution and three layers of identification. Hand gestures can only be identified into four movements, namely ‘right’, ‘left’, ‘grab’, and ‘phone’. Hand gesture identification on the video using Spatial CNN with an initial accuracy of 87.97%, then the second training increased to 98.05%. Accuracy was obtained after training using 5600 training data and 1120 test data, and the improvement occurred after manual noise reduction was performed.","PeriodicalId":6743,"journal":{"name":"2020 7th International Conference on Electrical Engineering, Computer Sciences and Informatics (EECSI)","volume":"28 1","pages":"172-176"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Hand Movement Identification Using Single-Stream Spatial Convolutional Neural Networks\",\"authors\":\"Aldi Sidik Permana, E. C. 
Djamal, Fikri Nugraha, Fatan Kasyidi\",\"doi\":\"10.23919/EECSI50503.2020.9251896\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Human-robot interaction can be through several ways, such as through device control, sounds, brain, and body, or hand gesture. There are two main issues: the ability to adapt to extreme settings and the number of frames processed concerning memory capabilities. Although it is necessary to be careful with the selection of the number of frames so as not to burden the memory, this paper proposed identifying hand gesture of video using Spatial Convolutional Neural Networks (CNN). The sequential image's spatial arrangement is extracted from the frames contained in the video so that each frame can be identified as part of one of the hand movements. The research used VGG16, as CNN architecture is concerned with the depth of learning where there are 13 layers of convolution and three layers of identification. Hand gestures can only be identified into four movements, namely ‘right’, ‘left’, ‘grab’, and ‘phone’. Hand gesture identification on the video using Spatial CNN with an initial accuracy of 87.97%, then the second training increased to 98.05%. 
Accuracy was obtained after training using 5600 training data and 1120 test data, and the improvement occurred after manual noise reduction was performed.\",\"PeriodicalId\":6743,\"journal\":{\"name\":\"2020 7th International Conference on Electrical Engineering, Computer Sciences and Informatics (EECSI)\",\"volume\":\"28 1\",\"pages\":\"172-176\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 7th International Conference on Electrical Engineering, Computer Sciences and Informatics (EECSI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/EECSI50503.2020.9251896\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 7th International Conference on Electrical Engineering, Computer Sciences and Informatics (EECSI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/EECSI50503.2020.9251896","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Hand Movement Identification Using Single-Stream Spatial Convolutional Neural Networks
Human-robot interaction can occur in several ways, such as through device control, sound, brain signals, body movement, or hand gestures. There are two main issues: the ability to adapt to extreme settings, and the number of frames that can be processed given memory constraints. Since the number of frames must be chosen carefully so as not to overburden memory, this paper proposed identifying hand gestures in video using a spatial Convolutional Neural Network (CNN). The spatial arrangement of the sequential images is extracted from the frames contained in the video so that each frame can be identified as part of one of the hand movements. The research used VGG16 as the CNN architecture, a deep network with 13 convolutional layers and three fully connected layers for identification. Hand gestures were identified as one of four movements, namely 'right', 'left', 'grab', and 'phone'. Hand gesture identification on video using the spatial CNN achieved an initial accuracy of 87.97%, which a second round of training increased to 98.05%. These accuracies were obtained after training on 5,600 training samples and 1,120 test samples, and the improvement occurred after manual noise reduction was performed.
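The frame-level decision step described above — assigning each video frame to one of the four gesture classes — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `softmax` head and the hypothetical logit values stand in for the output of the VGG16 network's final identification layer, which is not specified in the abstract.

```python
import numpy as np

# The four gesture classes reported in the paper.
GESTURES = ["right", "left", "grab", "phone"]

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_frames(frame_logits):
    """Map per-frame logits of shape (n_frames, 4) to gesture labels,
    so each frame is identified as part of one hand movement."""
    probs = softmax(np.asarray(frame_logits, dtype=float))
    return [GESTURES[i] for i in probs.argmax(axis=-1)]

# Hypothetical logits for three frames (illustrative values only).
logits = np.array([[2.0, 0.1, 0.0, -1.0],
                   [0.0, 3.0, 0.2, 0.1],
                   [0.1, 0.0, 0.5, 2.5]])
print(classify_frames(logits))  # -> ['right', 'left', 'phone']
```

In a full pipeline, the logits would come from a CNN applied to each extracted frame; the argmax over the softmax probabilities then yields one of the four movement labels per frame.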