Skeleton-based gait recognition has improved significantly with the advent of graph convolutional networks (GCNs). Nevertheless, the classical ST-GCN has a key drawback: its limited receptive field cannot capture global correlations among joints, restricting its ability to model long-range dependencies. To address this, we present GSCTN, a GCN and self-attention contemporary network with temporal convolution. The method combines a GCN with a self-attention (SA) mechanism through a learnable weighted fusion: the GCN contributes local joint details while self-attention supplies broader context, yielding a strong representation of skeletal movement. Our approach employs decoupled self-attention (DSA), which splits the tightly coupled (TiC) SA module into two learnable components, unary and pairwise SA, to model joint relationships separately. The unary SA captures the global relationship between a single key joint and all query joints, while the pairwise SA captures local gait features from each pair of body joints. We also introduce a Depthwise Multi-scale Temporal Convolutional Network (DMS-TCN) that captures the temporal dynamics of joint movements, efficiently handling both short-term and long-term motion patterns. To further strengthen the model's ability to fuse spatial and temporal joint features dynamically, we apply Global Aware Attention (GAA) within the GSCTN module. We evaluate our method on the OUMVLP-Pose, CASIA-B, and GREW datasets. The proposed method achieves strong accuracy on the widely used CASIA-B dataset: 97.9% for normal walking, 94.8% for carrying a bag, and 91.91% under clothing change. On OUMVLP-Pose and GREW it attains rank-1 accuracies of 93.5% and 75.7%, respectively. The experimental results demonstrate that the proposed model offers a holistic approach to gait recognition, using GCN, DSA, and GAA with DMS-TCN to capture both the temporal and spatial aspects of human locomotion.
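As a rough illustration of the fusion described above, the sketch below shows one possible way a GCN branch and a decoupled self-attention branch could be combined with a learnable weight. All module names, tensor shapes, the placeholder adjacency, and the single fusion scalar are assumptions for illustration only; this is a minimal sketch and does not reproduce the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledSA(nn.Module):
    """Illustrative decoupled self-attention: a unary term (each key joint scored on
    its own, shared across all query joints) plus a pairwise query-key affinity term.
    Names and shapes are assumptions, not the paper's code."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Linear(channels, channels)
        self.k = nn.Linear(channels, channels)
        self.v = nn.Linear(channels, channels)
        self.unary = nn.Linear(channels, 1)  # per-key-joint score, independent of the query

    def forward(self, x):  # x: (batch, joints, channels)
        q, k, v = self.q(x), self.k(x), self.v(x)
        pairwise = torch.matmul(q, k.transpose(-2, -1)) / (x.size(-1) ** 0.5)  # (B, J, J)
        unary = self.unary(k).transpose(-2, -1)        # (B, 1, J), broadcast over queries
        attn = F.softmax(pairwise + unary, dim=-1)     # recombine the two decoupled terms
        return torch.matmul(attn, v)

class GCNSAFusion(nn.Module):
    """Learnable weighted fusion of a (placeholder) GCN branch and the SA branch."""
    def __init__(self, channels, adjacency):
        super().__init__()
        self.register_buffer("A", adjacency)           # (joints, joints) normalized adjacency
        self.gcn = nn.Linear(channels, channels)
        self.sa = DecoupledSA(channels)
        self.alpha = nn.Parameter(torch.tensor(0.5))   # learnable fusion weight

    def forward(self, x):  # x: (batch, joints, channels)
        local = F.relu(self.gcn(torch.matmul(self.A, x)))  # local joint detail from the graph
        global_ctx = self.sa(x)                             # broader context from self-attention
        return self.alpha * local + (1.0 - self.alpha) * global_ctx

# Toy usage: 17 joints with 64-channel features and an identity adjacency as a stand-in.
A = torch.eye(17)
block = GCNSAFusion(64, A)
out = block(torch.randn(2, 17, 64))  # -> (2, 17, 64)
```

In this sketch a single scalar balances the two branches; a per-channel or per-joint weighting would be an equally plausible reading of "learnable weighted fusion".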
