Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark

IF 20.8 | Region 1, Computer Science | Q1 (Computer Science, Artificial Intelligence) | IEEE Transactions on Pattern Analysis and Machine Intelligence | Pub Date: 2022-06-28 | DOI: 10.48550/arXiv.2206.13964
Chao Fan, Saihui Hou, Jilong Wang, Yongzhen Huang, Shiqi Yu
{"title":"Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark","authors":"Chao Fan, Saihui Hou, Jilong Wang, Yongzhen Huang, Shiqi Yu","doi":"10.48550/arXiv.2206.13964","DOIUrl":null,"url":null,"abstract":"Gait depicts individuals' unique and distinguishing walking patterns and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of completely annotated data that is costly and insatiable. This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning, aiming to learn the general gait representation from massive unlabelled walking videos for practical applications via offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences and propose a conceptually simple yet empirically powerful baseline model GaitSSB. Experimentally, we evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D with or without transfer learning. The unsupervised results are comparable to or even better than the early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases, and also showcases the superior generalization capacity. Further experiments indicate that the pre-training can save about 50% and 80% annotation costs of GREW and Gait3D. Theoretically, we discuss the critical issues for gait-specific contrastive framework and present some insights for further study. As far as we know, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method that achieves remarkable unsupervised results on the aforementioned benchmarks. The source code of GaitSSB and anonymous data of GaitLU-1M is available at https://github.com/ShiqiYu/OpenGait.","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":" ","pages":""},"PeriodicalIF":20.8000,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.48550/arXiv.2206.13964","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 10

Abstract

Gait depicts an individual's unique and distinguishing walking pattern and has become one of the most promising biometric features for human identification. As a fine-grained recognition task, gait recognition is easily affected by many factors and usually requires a large amount of fully annotated data, which is costly and hard to obtain. This paper proposes a large-scale self-supervised benchmark for gait recognition based on contrastive learning, aiming to learn a general gait representation from massive unlabelled walking videos for practical applications by offering informative walking priors and diverse real-world variations. Specifically, we collect a large-scale unlabelled gait dataset, GaitLU-1M, consisting of 1.02M walking sequences, and propose a conceptually simple yet empirically powerful baseline model, GaitSSB. Experimentally, we evaluate the pre-trained model on four widely used gait benchmarks, CASIA-B, OU-MVLP, GREW and Gait3D, with and without transfer learning. The unsupervised results are comparable to or even better than those of early model-based and GEI-based methods. After transfer learning, GaitSSB outperforms existing methods by a large margin in most cases and also demonstrates superior generalization capacity. Further experiments indicate that pre-training can save about 50% and 80% of the annotation costs on GREW and Gait3D, respectively. Theoretically, we discuss the critical issues of a gait-specific contrastive framework and present some insights for further study. To the best of our knowledge, GaitLU-1M is the first large-scale unlabelled gait dataset, and GaitSSB is the first method to achieve remarkable unsupervised results on the aforementioned benchmarks. The source code of GaitSSB and the anonymized data of GaitLU-1M are available at https://github.com/ShiqiYu/OpenGait.
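To make the pre-training idea concrete, the sketch below shows one contrastive (InfoNCE-style) pre-training step on a batch of unlabelled silhouette sequences. It is a minimal illustration only, not the authors' GaitSSB implementation (see the OpenGait repository above for that): the `GaitEncoder` architecture, the toy augmentations, and all hyperparameters here are assumptions chosen for brevity.

```python
# Minimal sketch of contrastive pre-training on unlabelled gait sequences.
# NOT the official GaitSSB code; encoder, augmentations and hyperparameters
# are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitEncoder(nn.Module):
    """Toy encoder: maps a silhouette sequence (T frames) to one embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, dim)

    def forward(self, x):                            # x: (B, T, 1, H, W)
        b, t = x.shape[:2]
        f = self.conv(x.flatten(0, 1)).flatten(1)    # per-frame features: (B*T, 64)
        f = f.view(b, t, -1).max(dim=1).values       # temporal max pooling: (B, 64)
        return F.normalize(self.proj(f), dim=1)      # unit-norm embeddings: (B, dim)

def info_nce(z1, z2, tau=0.07):
    """InfoNCE loss: two views of the same sequence are the positive pair;
    all other sequences in the batch serve as negatives."""
    logits = z1 @ z2.t() / tau                       # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))                # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# One pre-training step on a batch of unlabelled sequences (shapes are toy values):
encoder = GaitEncoder()
seqs = torch.rand(8, 30, 1, 64, 44)                  # 8 sequences, 30 frames, 64x44 silhouettes
view1 = seqs + 0.05 * torch.randn_like(seqs)         # toy augmentation: pixel noise
view2 = seqs.flip(1)                                 # toy augmentation: reversed frame order
loss = info_nce(encoder(view1), encoder(view2))
loss.backward()
```

The key property this illustrates is that no identity labels are needed: the supervision signal comes entirely from matching two augmented views of the same walking sequence, which is what allows pre-training on a dataset as large as GaitLU-1M.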
Source Journal

CiteScore: 28.40
Self-citation rate: 3.00%
Articles published per year: 885
Review time: 8.5 months
Journal Introduction: The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition, and relevant specialized hardware and/or software architectures are also covered.
Latest Articles in This Journal

Practical Compact Deep Compressed Sensing
Fine-Grained Visual Text Prompting
Correlation Verification for Image Retrieval and Its Memory Footprint Optimization
Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification
Streaming quanta sensors for online, high-performance imaging and vision