Yifei Qian, Liangfei Zhang, Zhongliang Guo, Xiaopeng Hong, Ognjen Arandjelović, Carl R. Donovan
Title: Perspective-assisted prototype-based learning for semi-supervised crowd counting
Journal: Pattern Recognition, volume 158, Article 111073 (impact factor 7.5; JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1016/j.patcog.2024.111073
Published: 2024-10-10
URL: https://www.sciencedirect.com/science/article/pii/S0031320324008240
Citations: 0
Abstract
To alleviate the burden of labeling data to train crowd counting models, we propose a prototype-based learning approach for semi-supervised crowd counting with an embedded understanding of perspective. Our key idea is that image patches with the same density of people are likely to exhibit coherent appearance changes under similar perspective distortion, but differ significantly under varying distortions. Motivated by this observation, we construct multiple prototypes for each density level to capture variations in perspective. For labeled data, the prototype-based learning assists the regression task by regularizing the feature space and modeling the relationships within and across different density levels. For unlabeled data, the learnt perspective-embedded prototypes enhance differentiation between samples of the same density level, allowing for a more nuanced assessment of the predictions. By incorporating regression results, we categorize unlabeled samples as reliable or unreliable and apply tailored consistency learning strategies to each group to enhance model accuracy and generalization. Since perspective information is often unavailable, we propose a novel pseudo-label assigner based on perspective self-organization; it requires no additional annotations and assigns image regions to distinct spatial density groups that mainly reflect differences in average density among regions. Extensive experiments on four crowd counting benchmarks demonstrate the effectiveness of our approach.
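As a rough illustration of the idea of keeping several prototypes per density level and checking them against the regression output, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the function names `assign_prototypes` and `split_reliable`, the cosine-similarity matching, the fixed density bin edges, and the rule that a sample counts as reliable when the prototype-derived level agrees with the level implied by the regressed density.

```python
import numpy as np


def assign_prototypes(features, prototypes):
    """Match each patch feature to its most similar prototype.

    features:   (N, D) array of patch features.
    prototypes: (L, K, D) array, K prototypes for each of L density levels
                (the K prototypes stand in for different perspectives).
    Returns (level, proto_idx), each of shape (N,).
    Hypothetical helper; cosine similarity is an assumed matching rule.
    """
    num_levels, per_level, dim = prototypes.shape
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes.reshape(num_levels * per_level, dim)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    sim = f @ p.T                      # (N, L*K) cosine similarities
    best = sim.argmax(axis=1)          # flat index of the nearest prototype
    return best // per_level, best % per_level


def split_reliable(proto_levels, regressed_density, level_edges):
    """Flag samples whose prototype level matches the regressed level.

    level_edges: increasing bin edges that map a density to a level index
                 (assumed fixed here purely for illustration).
    Returns a boolean mask of shape (N,), True meaning "reliable".
    """
    reg_levels = np.digitize(regressed_density, level_edges)
    return proto_levels == reg_levels


if __name__ == "__main__":
    # Toy setup: 3 density levels, 2 prototypes each, 6-dimensional features.
    protos = np.eye(6).reshape(3, 2, 6)
    feats = np.eye(6)[[0, 3, 5]] * 2.0
    levels, idx = assign_prototypes(feats, protos)
    mask = split_reliable(levels, np.array([0.5, 3.0, 0.2]), np.array([1.0, 5.0]))
    print(levels, idx, mask)
```

In a sketch like this, the reliable and unreliable subsets would then be passed to different consistency losses; how those strategies are defined is not stated in the abstract and is left out here.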
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.