SkeletonCLIP: Recognizing Skeleton-based Human Actions with Text Prompts

Lin Yuan, Zhen He, Qianqian Wang, Leiyang Xu, Xiang Ma
2022 8th International Conference on Systems and Informatics (ICSAI). Published 2022-12-10. DOI: 10.1109/ICSAI57119.2022.10005459 (https://doi.org/10.1109/ICSAI57119.2022.10005459)

Abstract

Human action recognition has been an active research topic for decades. Mainstream supervised frameworks pair a feature-extraction backbone with a softmax classifier to predict daily human actions. When the number of classes in the dataset changes, the classifier must be retrained on top of the already-trained backbone. This extra training period restricts the generalization and transfer ability of the model. Moreover, replacing action labels with plain numeric labels discards useful semantic information and leaves a classifier whose outputs carry no semantic meaning. In this work, we present SkeletonCLIP, a model for skeleton-based human action recognition. We add a text encoder to extract semantic information from the labels while keeping the original sequence encoder. In place of the traditional classifier head and cross-entropy loss, we use the dot product to measure the similarity of sequence-text pairs. Experiments on three human action datasets show that, when the network is trained from scratch, our framework reaches higher recognition accuracy with the help of semantic information. The code is available at eunseo-v/SkeletonCLIP.
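The dot-product matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny 3-dimensional vectors stand in for encoder outputs, the label strings and the helper names `l2_normalize`, `dot`, and `predict` are hypothetical, and the L2 normalization step is an assumption added here so that the dot product behaves like cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumption: not stated in the abstract)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def predict(sequence_embedding, label_embeddings):
    """Pick the label whose text embedding has the largest dot product
    with the skeleton-sequence embedding; also return every score."""
    seq = l2_normalize(sequence_embedding)
    scores = {
        label: dot(seq, l2_normalize(emb))
        for label, emb in label_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical embeddings standing in for text-encoder outputs.
label_embeddings = {
    "drink water": [1.0, 0.0, 0.0],
    "sit down": [0.0, 1.0, 0.0],
}
# Hypothetical embedding standing in for a sequence-encoder output.
sequence_embedding = [0.9, 0.1, 0.2]

best, scores = predict(sequence_embedding, label_embeddings)

# A new class only needs one more text embedding in the lookup table;
# no fixed-size classifier layer has to be resized.
label_embeddings["wave hand"] = [0.0, 0.0, 1.0]
best_after, scores_after = predict([0.1, 0.2, 0.95], label_embeddings)
```

In this sketch the set of candidate classes is just the set of label texts that have been embedded, which is the property the abstract contrasts with a fixed softmax head.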