Zhengyou Wang, Chengyu Du, Yunpeng Zhang, Jing Bai, Shanna Zhuang
{"title":"EM-Gait:利用运动激励和特征嵌入自我关注进行步态识别","authors":"Zhengyou Wang , Chengyu Du , Yunpeng Zhang , Jing Bai , Shanna Zhuang","doi":"10.1016/j.jvcir.2024.104266","DOIUrl":null,"url":null,"abstract":"<div><p>Gait recognition, which can realize long-distance and contactless identification, is an important biometric technology. Recent gait recognition methods focus on learning the pattern of human movement or appearance during walking, and construct the corresponding spatio-temporal representations. However, different individuals have their own laws of movement patterns, simple spatial–temporal features are difficult to describe changes in motion of human parts, especially when confounding variables such as clothing and carrying are included, thus distinguishability of features is reduced. To this end, we propose the Embedding and Motion (EM) block and Fine Feature Extractor (FFE) to capture the motion mode of walking and enhance the difference of local motion rules. The EM block consists of a Motion Excitation (ME) module to capture the changes of temporal motion and an Embedding Self-attention (ES) module to enhance the expression of motion rules. Specifically, without introducing additional parameters, ME module learns the difference information between frames and intervals to obtain the dynamic change representation of walking for frame sequences with uncertain length. By contrast, ES module divides the feature map hierarchically based on element values, blurring the difference of elements to highlight the motion track. Furthermore, we present the FFE, which independently learns the spatio-temporal representations of human body according to different horizontal parts of individuals. Benefiting from EM block and our proposed motion branch, our method innovatively combines motion change information, significantly improving the performance of the model under cross appearance conditions. On the popular dataset CASIA-B, our proposed EM-Gait is better than the existing single-modal gait recognition methods.</p></div>","PeriodicalId":54755,"journal":{"name":"Journal of Visual Communication and Image Representation","volume":"103 ","pages":"Article 104266"},"PeriodicalIF":2.6000,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EM-Gait: Gait recognition using motion excitation and feature embedding self-attention\",\"authors\":\"Zhengyou Wang , Chengyu Du , Yunpeng Zhang , Jing Bai , Shanna Zhuang\",\"doi\":\"10.1016/j.jvcir.2024.104266\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Gait recognition, which can realize long-distance and contactless identification, is an important biometric technology. Recent gait recognition methods focus on learning the pattern of human movement or appearance during walking, and construct the corresponding spatio-temporal representations. However, different individuals have their own laws of movement patterns, simple spatial–temporal features are difficult to describe changes in motion of human parts, especially when confounding variables such as clothing and carrying are included, thus distinguishability of features is reduced. To this end, we propose the Embedding and Motion (EM) block and Fine Feature Extractor (FFE) to capture the motion mode of walking and enhance the difference of local motion rules. 
The EM block consists of a Motion Excitation (ME) module to capture the changes of temporal motion and an Embedding Self-attention (ES) module to enhance the expression of motion rules. Specifically, without introducing additional parameters, ME module learns the difference information between frames and intervals to obtain the dynamic change representation of walking for frame sequences with uncertain length. By contrast, ES module divides the feature map hierarchically based on element values, blurring the difference of elements to highlight the motion track. Furthermore, we present the FFE, which independently learns the spatio-temporal representations of human body according to different horizontal parts of individuals. Benefiting from EM block and our proposed motion branch, our method innovatively combines motion change information, significantly improving the performance of the model under cross appearance conditions. On the popular dataset CASIA-B, our proposed EM-Gait is better than the existing single-modal gait recognition methods.</p></div>\",\"PeriodicalId\":54755,\"journal\":{\"name\":\"Journal of Visual Communication and Image Representation\",\"volume\":\"103 \",\"pages\":\"Article 104266\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Visual Communication and Image Representation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1047320324002220\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Visual Communication and Image Representation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1047320324002220","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
EM-Gait: Gait recognition using motion excitation and feature embedding self-attention
Gait recognition, which enables long-distance, contactless identification, is an important biometric technology. Recent gait recognition methods focus on learning patterns of human movement or appearance during walking and constructing the corresponding spatio-temporal representations. However, different individuals follow their own movement patterns, and simple spatio-temporal features struggle to describe the motion changes of body parts, especially when confounding variables such as clothing and carried items are involved, which reduces the distinguishability of the features. To this end, we propose the Embedding and Motion (EM) block and the Fine Feature Extractor (FFE) to capture the motion mode of walking and enhance the differences among local motion patterns. The EM block consists of a Motion Excitation (ME) module that captures temporal motion changes and an Embedding Self-attention (ES) module that enhances the expression of motion patterns. Specifically, without introducing additional parameters, the ME module learns difference information between adjacent frames and across intervals to obtain a dynamic-change representation of walking for frame sequences of arbitrary length. In turn, the ES module partitions the feature map hierarchically according to element values, blurring differences among elements to highlight the motion trajectory. Furthermore, we present the FFE, which independently learns spatio-temporal representations for different horizontal parts of the human body. Benefiting from the EM block and the proposed motion branch, our method incorporates motion-change information and significantly improves performance under cross-appearance conditions. On the popular CASIA-B dataset, the proposed EM-Gait outperforms existing single-modal gait recognition methods.
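The abstract describes the ME module only at a high level. As a rough illustration, the following is a minimal, hypothetical PyTorch sketch of a parameter-free frame-difference excitation step in the spirit of that description (adjacent-frame and interval differences used to reweight the features); the function name, tensor layout, interval choice, and residual combination are assumptions for illustration, not the authors' implementation.

```python
import torch


def motion_excitation(features: torch.Tensor, interval: int = 2) -> torch.Tensor:
    """Hypothetical, parameter-free motion-excitation sketch.

    `features` is assumed to be a gait feature sequence shaped
    (batch, frames, channels, height, width). Frame-to-frame and
    interval differences are turned into an excitation map that
    reweights the original features; the paper's exact formulation
    may differ.
    """
    # Difference between consecutive frames (zero-padded to keep length).
    adjacent = features[:, 1:] - features[:, :-1]
    adjacent = torch.cat([adjacent, torch.zeros_like(features[:, :1])], dim=1)

    # Difference across a longer interval to capture slower motion changes.
    skipped = features[:, interval:] - features[:, :-interval]
    skipped = torch.cat([skipped, torch.zeros_like(features[:, :interval])], dim=1)

    # Combine both cues into an excitation map and reweight the input.
    excitation = torch.sigmoid(adjacent + skipped)
    return features + features * excitation


if __name__ == "__main__":
    x = torch.randn(2, 30, 64, 16, 11)   # e.g. a 30-frame silhouette feature sequence
    print(motion_excitation(x).shape)     # torch.Size([2, 30, 64, 16, 11])
```

Because the sketch uses only subtraction, padding, and a sigmoid, it adds no learnable parameters, which is consistent with the abstract's claim that the ME module introduces none.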
Journal introduction:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.