KiRV: Robust Human Identification via Multimodal Learning Based on Kinetic Gait Features of Radar and Vision
Lang Deng; Jifang Pei; Yuansen Song; Weibo Huo; Yin Zhang; Yulin Huang
IEEE Internet of Things Journal, vol. 12, no. 12, pp. 22224-22242
DOI: 10.1109/JIOT.2025.3550532
Published: 2025-03-12
https://ieeexplore.ieee.org/document/10924240/
Citations: 0
Abstract
Gait is an appealing biometric pattern that aims to identify individuals by the way they walk. Gait recognition, a passive human identification technology that works at a distance without subject cooperation, plays a considerable role in life monitoring, crime prevention, security assurance, and other identity recognition applications. Although vision-based methods dominate the state of the art, their performance degrades under poor illumination. In contrast, radar signals are unaffected by light and are more sensitive to micro-motion information. In this article, we design a Kinetic feature-based Radar-Vision fused (KiRV) gait recognition method, which leverages millimeter-wave radar echo signals and video for illumination-robust human identification. In KiRV, we propose a novel kinetic gait feature representation framework based on radar micro-Doppler and visual optical flow information, which are direct expressions of the gait motion process. The physical meaning of the kinetic features under the two modalities is similar, while their semantic information is complementary; therefore, the two features can be effectively fused. To learn robust gait information, we propose two lightweight 2-D residual CNN-based backbone networks to encode the kinetic features of each modality, and further propose a two-stream cross-correlated fusion method comprising radar-vision cross-correlated fusion (RVCF) and radar-vision gate unit (RVGU) modules. The RVCF adaptively adjusts the attention paid to radar and vision for better recognition performance, while the RVGU controls the contribution of each modality to the fused feature to improve the robustness of the model. Finally, the gait retrieval task is achieved through this model together with joint loss calculation at different feature levels. Extensive experiments in real-world and semi-simulated settings demonstrate that KiRV outperforms state-of-the-art gait recognition methods while remaining robust to illumination changes.
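The gating idea behind the RVGU can be illustrated with a minimal sketch. Everything below is hypothetical (the function name, scalar weights, and toy vectors are not from the paper, whose RVGU operates on learned CNN feature maps); it only shows how a sigmoid gate blends two modality features so that each modality's contribution is bounded:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fusion(radar_feat, vision_feat, w_r=1.0, w_v=1.0, b=0.0):
    """Fuse two feature vectors with a per-element sigmoid gate:
    g = sigmoid(w_r * r + w_v * v + b), fused = g * r + (1 - g) * v.
    Because g lies in (0, 1), each fused value is a convex combination
    of the radar and vision features."""
    fused = []
    for r, v in zip(radar_feat, vision_feat):
        g = sigmoid(w_r * r + w_v * v + b)
        fused.append(g * r + (1.0 - g) * v)
    return fused

# Toy radar/vision embeddings (illustrative values only).
radar = [0.2, -1.0, 0.5]
vision = [1.0, 0.3, -0.4]
fused = gated_fusion(radar, vision)
```

Since the gate stays strictly between 0 and 1, every fused element lies between the corresponding radar and vision values, so neither modality can dominate unboundedly; in the actual model the gate parameters are learned, letting the network down-weight the less reliable modality (e.g., vision under poor illumination).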
Journal description:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.