Dynamic Erasing Network With Adaptive Temporal Modeling for Weakly Supervised Video Anomaly Detection
Authors: Chen Zhang; Guorong Li; Yuankai Qi; Hanhua Ye; Laiyun Qing; Ming-Hsuan Yang; Qingming Huang
Journal: IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 9, pp. 16706-16720
Published: 2025-04-08
DOI: 10.1109/TNNLS.2025.3553556
URL: https://ieeexplore.ieee.org/document/10957754/
Impact factor: 8.9; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Weakly supervised video anomaly detection aims to learn a detection model using only video-level labels. Prior studies ignore the complexity and duration of the anomalies present in abnormal videos during temporal modeling. Moreover, existing works usually detect only the most abnormal segments, potentially overlooking the completeness of anomalies. We propose a dynamic erasing network (DE-Net) for weakly supervised video anomaly detection, which learns video-specific temporal features via adaptive temporal modeling (ATM) to address these limitations. Specifically, to handle variations in the duration of abnormal events, we propose an ATM module that adaptively selects and aggregates the K most appropriate temporal-scale features for each video. We then design a dynamic erasing (DE) strategy that dynamically assesses the completeness of the detected anomalies and erases prominent abnormal segments, encouraging the model to discover less prominent ("gentle") abnormal segments. The proposed method achieves favorable performance compared to several state-of-the-art approaches on the widely used XD-Violence, TAD, and UCF-Crime datasets.
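The two ideas in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract gives no architectural details, so everything here is an assumption made for illustration. Temporal scales are modeled as centered moving averages, the per-video scale preference is a hypothetical `gate_logits` vector standing in for a learned selector, and erasing is a fixed-ratio threshold rather than the paper's dynamic completeness assessment. The function names `temporal_scale`, `aggregate_top_k`, and `erase_prominent` are likewise invented.

```python
import math
from typing import List, Sequence


def temporal_scale(feats: Sequence[float], window: int) -> List[float]:
    """Average each segment with its neighbours inside a centered window.

    A larger window stands in for a coarser temporal scale (assumption).
    """
    half = window // 2
    out = []
    for i in range(len(feats)):
        lo, hi = max(0, i - half), min(len(feats), i + half + 1)
        out.append(sum(feats[lo:hi]) / (hi - lo))
    return out


def aggregate_top_k(
    feats: Sequence[float],
    windows: Sequence[int],
    gate_logits: Sequence[float],  # hypothetical per-video scale preferences
    k: int,
) -> List[float]:
    """Keep the k scales with the largest logits and blend them by softmax weight."""
    top = sorted(range(len(windows)), key=lambda j: gate_logits[j], reverse=True)[:k]
    peak = max(gate_logits[j] for j in top)  # subtract the max for numerical stability
    exp = {j: math.exp(gate_logits[j] - peak) for j in top}
    total = sum(exp.values())
    scaled = {j: temporal_scale(feats, windows[j]) for j in top}
    return [
        sum(exp[j] / total * scaled[j][i] for j in top)
        for i in range(len(feats))
    ]


def erase_prominent(scores: Sequence[float], ratio: float = 0.8) -> List[float]:
    """Zero out segments scoring at least ratio * max (a fixed rule, assumed here)."""
    cutoff = ratio * max(scores)
    return [0.0 if s >= cutoff else s for s in scores]


if __name__ == "__main__":
    segment_feats = [0.0, 0.0, 1.0, 0.0, 0.0]
    blended = aggregate_top_k(segment_feats, [1, 3, 5], [0.0, 2.0, 2.0], k=2)
    print(blended)

    anomaly_scores = [0.1, 0.2, 0.9, 0.6, 0.1]
    print(erase_prominent(anomaly_scores))
```

In this sketch, erasing the top-scoring segment leaves the 0.6 segment as the new maximum, which is the sense in which erasing prominent segments pushes attention toward weaker ones.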
About the journal:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.