Adaptive Multi-Task Learning for Multi-PAR in Real World

IEEE Journal of Radio Frequency Identification (IF 2.3; JCR Q2, Engineering, Electrical & Electronic) Pub Date: 2024-02-29 DOI: 10.1109/JRFID.2024.3371881
Haoyun Sun;Hongwei Zhao;Weishan Zhang;Liang Xu;Hongqing Guan
Volume 8, pages 357-366. Full text: https://ieeexplore.ieee.org/document/10454582/
Citations: 0

Abstract

Adaptive Multi-Task Learning for Multi-PAR in Real World
Multi-pedestrian attribute recognition (Multi-PAR) is a vital task for smart city surveillance applications: it requires identifying various attributes of multiple pedestrians in a single image. However, most existing methods are limited by complex backgrounds and time-consuming pedestrian detection preprocessing in real-world scenarios, and cannot achieve satisfactory accuracy and efficiency. In this paper, we present a novel end-to-end solution, named Adaptive Multi-Task Network (AMTN), which jointly performs multiple tasks and leverages an adaptive feature re-extraction (AFRE) module to optimize them. Specifically, we integrate pedestrian detection into AMTN to perform PAR preprocessing, and incorporate a person re-identification (ReID) task branch to track pedestrians in video streams, so that only the clearest video frames are selected for analysis rather than every frame, improving both analysis efficiency and recognition accuracy. Moreover, we design a dynamic weight fitting loss (DWFL) function to prevent gradient explosions and balance tasks during training. We conduct extensive experiments to evaluate the accuracy and efficiency of our approach and compare it with state-of-the-art methods. The experimental results demonstrate that our method outperforms other state-of-the-art algorithms, achieving a 1.5%-4.9% improvement in accuracy on Multi-PAR. The experiments also show that AMTN greatly improves preprocessing efficiency by sharing basic features, which saves feature-extraction computation. Compared with the state-of-the-art detection algorithm YOLOv5s, it improves efficiency by 42%.
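The abstract names a dynamic weight fitting loss (DWFL) that balances tasks during training but does not give its formula. The sketch below is therefore not DWFL. It is a generic inverse-magnitude weighting scheme, shown only to illustrate what "balancing tasks" can mean when detection, ReID, and attribute losses differ in scale. The function names `balanced_weights` and `combined_loss`, the `eps` floor, and the sample loss values are all assumptions made for illustration.

```python
from typing import List


def balanced_weights(losses: List[float], eps: float = 1e-8) -> List[float]:
    """Hypothetical helper: weight each task inversely to its current loss.

    The weights are rescaled to sum to the number of tasks, so equal
    losses yield a weight of 1.0 for every task.
    """
    # Floor each loss at eps so a near-zero loss cannot cause division by zero.
    inverse = [1.0 / max(loss, eps) for loss in losses]
    total = sum(inverse)
    n_tasks = len(losses)
    return [n_tasks * value / total for value in inverse]


def combined_loss(losses: List[float]) -> float:
    """Hypothetical helper: sum of per-task losses under the balanced weights."""
    weights = balanced_weights(losses)
    return sum(w * loss for w, loss in zip(weights, losses))


# Illustrative per-task losses (detection, ReID, attribute); values are made up.
task_losses = [4.0, 1.0, 0.5]
weights = balanced_weights(task_losses)
weighted_terms = [w * loss for w, loss in zip(weights, task_losses)]
total_loss = combined_loss(task_losses)
```

With these weights every weighted term has the same magnitude, so no single task dominates the summed loss. In an actual training loop the weights would normally be treated as constants (excluded from gradient computation); how the paper's DWFL handles this, and how it prevents gradient explosions, is not stated in the abstract.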