An efficient pedestrian attributes recognition system under challenging conditions

Ha X. Nguyen, Dong N. Hoang, Tuan A. Tran, Tuan M. Dang
Machine Graphics and Vision. Published 2023-08-21. DOI: 10.22630/mgv.2023.32.2.1 (https://doi.org/10.22630/mgv.2023.32.2.1)

Abstract

This work introduces an efficient pedestrian attribute recognition system. The system is built on a novel processing pipeline that combines the best-performing attribute extraction model with an efficient attribute filtering algorithm based on human-pose keypoints. The attribute extraction models are developed via transfer learning from several state-of-the-art deep networks: ResNet50, Swin Transformer, and ConvNeXt. Pre-trained models of these networks are fine-tuned on the Ensemble Pedestrian Attribute Recognition (EPAR) dataset. Several optimization techniques, including the Adam optimizer with decoupled weight decay regularization (AdamW), Random Erasing (RE), and weighted loss functions, are adopted to address data imbalance and challenging conditions such as partially visible and occluded bodies. Experiments are performed on EPAR, which contains 26,993 images of 1,477 person IDs, most of them captured in challenging conditions. The results show that ConvNeXt-v2-B outperforms the other networks: its mean accuracy (mA) reaches 85.57%, and its other metrics are also the highest. Adding AdamW or RE improves accuracy by 1-2%. The weighted loss functions mitigate data imbalance, improving the accuracy of attributes with few training samples by up to 14% in the best case. Most significantly, applying the attribute filtering algorithm raises mA to 94.85%. Combining a state-of-the-art attribute extraction model with these optimization techniques, a large-scale and diverse dataset, and attribute filtering is therefore an effective approach with high potential for practical applications.
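The abstract reports results as mean accuracy (mA) and credits weighted loss functions with handling data imbalance, but gives no formulas. The sketch below is an illustration under stated assumptions, not the authors' implementation. It assumes mA is the label-based metric commonly used in pedestrian attribute recognition (per attribute, the average of positive recall and negative recall, then averaged over attributes), and it assumes an exponential, ratio-based weighting for the binary cross-entropy loss; the abstract does not say which weighting the paper uses. All function names and the toy data are hypothetical.

```python
import math


def mean_accuracy(y_true, y_pred):
    """Label-based mA: mean over attributes of (positive recall + negative recall) / 2.

    y_true and y_pred are lists of rows; each row holds one 0/1 value per attribute.
    """
    n_attrs = len(y_true[0])
    per_attr = []
    for j in range(n_attrs):
        pos = [p[j] for t, p in zip(y_true, y_pred) if t[j] == 1]
        neg = [p[j] for t, p in zip(y_true, y_pred) if t[j] == 0]
        tpr = sum(pos) / len(pos) if pos else 0.0
        tnr = sum(1 - v for v in neg) / len(neg) if neg else 0.0
        per_attr.append((tpr + tnr) / 2)
    return sum(per_attr) / n_attrs


def positive_ratios(y_true):
    """Fraction of samples in which each attribute is present."""
    n = len(y_true)
    return [sum(row[j] for row in y_true) / n for j in range(len(y_true[0]))]


def weighted_bce(y_true, probs, ratios, eps=1e-7):
    """Binary cross-entropy with per-attribute weights (an assumed scheme).

    A positive label of an attribute with positive ratio r is weighted
    exp(1 - r); a negative label is weighted exp(r). Rare positives
    therefore contribute more to the loss than common ones.
    """
    total = 0.0
    for t_row, p_row in zip(y_true, probs):
        for t, p, r in zip(t_row, p_row, ratios):
            p = min(max(p, eps), 1 - eps)  # clamp to keep log() finite
            w = math.exp(1 - r) if t == 1 else math.exp(r)
            total -= w * (t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Toy data: 4 samples, 2 attributes (the second attribute is rarer).
labels = [[1, 0], [0, 0], [1, 1], [0, 0]]
preds = [[1, 0], [0, 0], [0, 1], [1, 0]]
ratios = positive_ratios(labels)     # [0.5, 0.25]
ma = mean_accuracy(labels, preds)    # 0.75
```

Because mA averages positive and negative recall per attribute, a model that always predicts "absent" for a rare attribute scores only 50% on it, which is why the metric is sensitive to the imbalance the weighted loss is meant to correct.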
Source journal: Machine Graphics and Vision
Subject area: Computer Science, Computer Graphics and Computer-Aided Design
CiteScore: 0.40
Self-citation rate: 0.00%
Articles published: 1
About the journal: Machine GRAPHICS & VISION (MGV) is a refereed international journal, published quarterly, that provides a forum for scientific exchange and an authoritative source of information on pictorial information exchange between computers and their environment, including applications of visual and graphical computer systems. The journal concentrates on theoretical and computational models underlying computer-generated, analysed, or otherwise processed imagery, in particular: image processing; scene analysis, modeling, and understanding; machine vision; pattern matching and pattern recognition; and image synthesis, including three-dimensional imaging and solid modeling.