Dual generators and dual discriminators generative adversarial network for video anomaly detection
Authors: Kang Chen, Changming Song, Dongxu Cheng, Hao Li
Journal: Journal of Intelligent & Fuzzy Systems
Publication date: 2024-03-22
DOI: 10.3233/jifs-237831 (https://doi.org/10.3233/jifs-237831)
Citation count: 0
Abstract
Video anomaly detection (VAD) has garnered substantial attention from researchers due to its broad applications, including fire detection, drop detection, and vibration detection. In the current context of VAD, existing methods prioritize detection efficiency but overlook the impact of motion and appearance information. Additionally, achieving accurate predictions while retaining motion and appearance information poses a significant challenge. This paper proposes a novel semi-supervised method for VAD based on Generative Adversarial Network (GAN) structures with dual generators and dual discriminators, namely Dual-GAN. The future frame generator utilizes an improved encoder-decoder network to preserve more spatial information. Motion information for the future flow generator is obtained by estimating optical flow between reconstruction frames, complementing the optical flow between prediction frames. The introduction of a frame discriminator and a motion discriminator against the frame generator enhances the realism of prediction frames, which facilitates the identification of unexpected abnormal events. This method significantly outperforms comparative approaches in synthesizing video frames and predicting future flows, showcasing its effectiveness in handling diverse video data. Extensive experiments are performed on four publicly available datasets to ensure a comprehensive evaluation of the model performance. Further exploration could include refining the model architecture, exploring additional datasets, and adapting the methodology to specific application domains.
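The abstract states that more realistic prediction frames make unexpected events easier to identify, but it does not say how a frame is scored. A minimal sketch follows, assuming the PSNR-based scoring that is common in prediction-based VAD: a frame that the generator predicts poorly gets a low PSNR and therefore a high anomaly score. The function names `psnr` and `anomaly_scores`, the flat-list frame representation, and the min-max normalisation are all illustrative assumptions, not details taken from the paper.

```python
import math


def psnr(pred, real, max_val=1.0):
    """PSNR between a predicted and a real frame.

    Frames are flat lists of pixel intensities in [0, max_val]
    (an assumption made to keep the sketch dependency-free).
    """
    if len(pred) != len(real) or not pred:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((p - r) ** 2 for p, r in zip(pred, real)) / len(pred)
    mse = max(mse, 1e-10)  # avoid division by zero for a perfect prediction
    return 10.0 * math.log10(max_val ** 2 / mse)


def anomaly_scores(psnr_values):
    """Min-max normalise PSNR over one video and invert it.

    Returns values in [0, 1]; higher means more anomalous.
    """
    lo, hi = min(psnr_values), max(psnr_values)
    if hi == lo:
        return [0.0] * len(psnr_values)
    return [1.0 - (v - lo) / (hi - lo) for v in psnr_values]


# Toy usage: three 4-pixel "frames"; the third prediction is far off.
real_frames = [[0.5] * 4, [0.5] * 4, [0.5] * 4]
pred_frames = [
    [0.5, 0.5, 0.5, 0.6],
    [0.5, 0.4, 0.6, 0.5],
    [0.9, 0.1, 0.9, 0.1],
]
scores = anomaly_scores([psnr(p, r) for p, r in zip(pred_frames, real_frames)])
print(scores)
```

In a two-stream setup like the one described, a similar error term computed on the predicted optical flow could be combined with the frame score; how the paper weights the two is not stated in the abstract.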