PIDLNet: A Physics-Induced Deep Learning Network for Characterization of Crowd Videos

2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). Pub Date: 2021-11-16. DOI: 10.1109/AVSS52988.2021.9663817

S. Behera, T. K. Vijay, H. M. Kausik, D. P. Dogra


Citations: 3

Abstract



Human visual perception of crowd gatherings can provide valuable information about behavioral movements. Empirical analysis of how people perceive crowds has shown that orderly moving crowds are structured in nature, with a relatively higher order parameter and lower entropy than unstructured crowds, which show the opposite pattern. This paper proposes the Physics-Induced Deep Learning Network (PIDLNet), a deep learning framework trained on conventional 3D convolutional features combined with physics-based features. We compute frame-level entropy and order parameter from the motion flows extracted from crowd videos. These features are then integrated with the 3D convolutional features at a later stage of the feature extraction pipeline to aid crowd characterization. Experiments show that the proposed network characterizes video segments depicting crowd movements with accuracy as high as 91.63%. We obtain an overall AUC of 0.9913 on a highly challenging, publicly available video dataset. The method outperforms existing deep learning frameworks and conventional crowd characterization frameworks by a notable margin.
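
The abstract does not give the formulas behind the two physics-based features, so the following is only a minimal sketch under stated assumptions. It assumes the order parameter is the magnitude of the mean unit flow vector (a common definition in collective-motion studies) and that entropy is the Shannon entropy of a histogram of flow directions. The function names, the bin count, and the list-of-tuples flow format are hypothetical and are not taken from the paper.

```python
import math


def order_parameter(flows):
    """Magnitude of the mean unit flow vector (assumed definition).

    Close to 1.0 when all vectors point the same way, close to 0.0
    when directions cancel out.
    """
    units = []
    for dx, dy in flows:
        norm = math.hypot(dx, dy)
        if norm > 0:  # ignore points with no motion
            units.append((dx / norm, dy / norm))
    if not units:
        return 0.0
    mean_x = sum(u[0] for u in units) / len(units)
    mean_y = sum(u[1] for u in units) / len(units)
    return math.hypot(mean_x, mean_y)


def direction_entropy(flows, bins=8):
    """Shannon entropy (in bits) of the flow-direction histogram (assumed definition)."""
    counts = [0] * bins
    for dx, dy in flows:
        if dx == 0 and dy == 0:
            continue  # ignore points with no motion
        angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
        idx = min(int(angle / (2 * math.pi) * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts if c)


# Hypothetical per-frame flow vectors (dx, dy), e.g. sampled from optical flow.
structured = [(1.0, 0.0)] * 4                                  # everyone moves right
unstructured = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]  # four opposing directions

print(order_parameter(structured), direction_entropy(structured))
print(order_parameter(unstructured), direction_entropy(unstructured))
```

Under these assumed definitions, the aligned frame yields a high order parameter and low entropy, and the frame with opposing directions yields the reverse, which matches the qualitative relationship stated in the abstract. In a pipeline like the one described, such per-frame values would be combined with the 3D convolutional features at a later stage; how the paper performs that combination is not specified here.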
