PIDLNet: A Physics-Induced Deep Learning Network for Characterization of Crowd Videos
S. Behera, T. K. Vijay, H. M. Kausik, D. P. Dogra
2021 17th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), published 2021-11-16
DOI: https://doi.org/10.1109/AVSS52988.2021.9663817
Human visual perception of crowd gatherings can provide valuable information about behavioral movements. Empirical analysis of how orderly moving crowds are perceived has revealed that such movements are often structured, with a relatively higher order parameter and lower entropy than unstructured crowds; unstructured crowds show the opposite pattern. This paper proposes the Physics-Induced Deep Learning Network (PIDLNet), a deep learning framework trained on conventional 3D convolutional features combined with physics-based features. We compute frame-level entropy and order parameter from the motion flows extracted from crowd videos. These features are then integrated with the 3D convolutional features at a later stage of the feature extraction pipeline to aid crowd characterization. Experiments show that the proposed network can characterize video segments depicting crowd movements with accuracy as high as 91.63%. We obtain an overall AUC of 0.9913 on a highly challenging, publicly available video dataset. The method outperforms existing deep learning frameworks and conventional crowd characterization frameworks by a notable margin.
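The abstract does not give the exact definitions of the two physics-based features, so the following is only a minimal sketch under stated assumptions. It assumes a dense motion-flow field for one frame, stored as an array of shape (H, W, 2). It takes the order parameter to be the magnitude of the mean unit velocity vector (as in Vicsek-style models of collective motion) and the entropy to be the Shannon entropy of a histogram of flow directions. The function names, the bin count, and the `eps` threshold are hypothetical choices for illustration, not the authors' settings.

```python
import numpy as np


def order_parameter(flow, eps=1e-8):
    """Magnitude of the mean unit velocity vector (assumed definition).

    Returns a value in [0, 1]: close to 1 when all moving pixels share a
    direction, close to 0 when directions are spread out.
    """
    v = flow.reshape(-1, 2)
    speed = np.linalg.norm(v, axis=1)
    moving = speed > eps  # ignore (near-)static pixels
    if not moving.any():
        return 0.0
    unit = v[moving] / speed[moving, None]
    return float(np.linalg.norm(unit.mean(axis=0)))


def direction_entropy(flow, bins=8, eps=1e-8):
    """Shannon entropy in bits of the flow-direction histogram (assumed definition)."""
    v = flow.reshape(-1, 2)
    speed = np.linalg.norm(v, axis=1)
    moving = speed > eps
    if not moving.any():
        return 0.0
    angles = np.arctan2(v[moving, 1], v[moving, 0])
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing to the entropy
    return float(np.sum(p * np.log2(1.0 / p)))


# Synthetic frames: one perfectly aligned flow, one with random directions.
rng = np.random.default_rng(0)
aligned = np.zeros((32, 32, 2))
aligned[..., 0] = 1.0  # every pixel moves along +x
disordered = rng.normal(size=(32, 32, 2))

aligned_features = (order_parameter(aligned), direction_entropy(aligned))
disordered_features = (order_parameter(disordered), direction_entropy(disordered))
print("aligned:   ", aligned_features)
print("disordered:", disordered_features)
```

On these synthetic inputs the aligned frame yields a high order parameter and low entropy, while the disordered frame yields the reverse, which matches the structured versus unstructured contrast described in the abstract. In a pipeline like the one described, such per-frame values would be combined with the 3D convolutional features at a later stage; how that fusion is done is not specified here.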