Human Violence Detection in Videos Using Key Frame Identification and 3D CNN with Convolutional Block Attention Module

IF 1.8 | CAS Tier 3 (Engineering & Technology) | JCR Q3 (ENGINEERING, ELECTRICAL & ELECTRONIC) | Circuits, Systems and Signal Processing | Pub Date: 2024-08-13 | DOI: 10.1007/s00034-024-02824-w
Venkatesh Akula, Ilaiah Kavati
Citations: 0

Abstract


In recent years, there has been an increase in demand for intelligent automatic surveillance systems to detect abnormal activities at various places, such as schools, hospitals, prisons, psychiatric centers, and public gatherings. The availability of video surveillance cameras in such places enables techniques for automatically identifying violent actions and alerting the authorities to minimize loss. Deep learning-based models, such as Convolutional Neural Networks (CNNs), have shown better performance in detecting violent activities by utilizing the spatiotemporal features of video frames. In this work, we propose a violence detection model based on a 3D CNN, which employs a DenseNet architecture for enhanced spatiotemporal feature capture. First, the video's redundant frames are discarded by identifying the key frames in the video. We exploit the Multi-Scale Structural Similarity Index Measure (MS-SSIM) technique to identify the key frames of the video, which contain significant information about the video. Key frame identification helps to reduce the complexity of the model. Next, the identified video key frames with the lowest MS-SSIM are forwarded to the 3D CNN to extract spatiotemporal features. Furthermore, we exploit the Convolutional Block Attention Module (CBAM) to increase the representational capability of the 3D CNN. The results on different benchmark datasets show that the proposed violence detection method performs better than most existing methods. The source code for the proposed method is publicly available at https://github.com/venkateshakula19/violence-detection-using-keyframe-extraction-and-CNN-with-attention-CBAM
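The key-frame step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it substitutes a simplified single-scale, whole-frame SSIM for the paper's MS-SSIM (no sliding window, no multi-scale pyramid), and the function names `global_ssim` and `select_key_frames` are hypothetical. The idea matches the abstract: frames whose similarity to their predecessor is lowest carry the most new information and are kept.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Simplified single-scale SSIM computed over whole grayscale frames.
    Stand-in for the paper's MS-SSIM, for illustration only."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def select_key_frames(frames, num_key_frames):
    """Return indices of frames least similar to their predecessor
    (lowest SSIM = largest change); frame 0 is kept as the reference."""
    scores = [global_ssim(frames[i - 1], frames[i]) for i in range(1, len(frames))]
    # indices (offset by 1) of the largest changes, restored to temporal order
    changed = sorted(np.argsort(scores)[:num_key_frames - 1] + 1)
    return [0] + [int(i) for i in changed]
```

On a toy clip where the scene changes once, the selector keeps the first frame and the frame at the cut, discarding the near-duplicate frames, which is the redundancy reduction the abstract describes.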
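The CBAM stage can likewise be sketched in miniature. The block below is an assumption-laden toy, not the paper's module: it applies channel attention (average- and max-pooled descriptors through a shared two-layer MLP) followed by spatial attention, as in the original CBAM design, but replaces the 7×7 convolution of the spatial branch with a per-pixel linear mix of the channel-avg and channel-max maps, uses random untrained weights, and operates on a single (C, H, W) map rather than the 3D CNN's (C, T, H, W) tensors. The class name `TinyCBAM` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

class TinyCBAM:
    """Toy CBAM for a (C, H, W) feature map: channel attention from pooled
    statistics through a shared 2-layer MLP, then spatial attention.
    The 7x7 conv of the original design is reduced to a per-pixel linear
    mix of the channel-avg and channel-max maps (illustration only)."""

    def __init__(self, channels, reduction=4):
        hidden = max(channels // reduction, 1)
        self.w1 = rng.standard_normal((hidden, channels)) * 0.1
        self.w2 = rng.standard_normal((channels, hidden)) * 0.1
        self.ws = rng.standard_normal(2) * 0.1  # spatial mixing weights

    def _mlp(self, z):
        return self.w2 @ np.maximum(self.w1 @ z, 0.0)

    def __call__(self, x):
        # channel attention: shared MLP over avg- and max-pooled descriptors
        mc = sigmoid(self._mlp(x.mean(axis=(1, 2))) + self._mlp(x.max(axis=(1, 2))))
        x = x * mc[:, None, None]
        # spatial attention from channel-wise avg and max maps of the refined x
        ms = sigmoid(self.ws[0] * x.mean(axis=0) + self.ws[1] * x.max(axis=0))
        return x * ms[None, :, :]
```

Because both attention maps pass through a sigmoid, the module rescales each feature between 0 and its original magnitude, which is how CBAM lets the network emphasize informative channels and spatial regions without changing the feature map's shape.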

Source journal: Circuits, Systems and Signal Processing (Engineering: Electrical & Electronic)
CiteScore: 4.80
Self-citation rate: 13.00%
Articles per year: 321
Average review time: 4.6 months
Journal description: Rapid developments in the analog and digital processing of signals for communication, control, and computer systems have made the theory of electrical circuits and signal processing a burgeoning area of research and design. The aim of Circuits, Systems, and Signal Processing (CSSP) is to help meet the need for outlets for significant research papers and state-of-the-art review articles in the area. The scope of the journal is broad, ranging from mathematical foundations to practical engineering design. It encompasses, but is not limited to, such topics as linear and nonlinear networks, distributed circuits and systems, multi-dimensional signals and systems, analog filters and signal processing, digital filters and signal processing, statistical signal processing, multimedia, computer aided design, graph theory, neural systems, communication circuits and systems, and VLSI signal processing. The Editorial Board is international, and papers are welcome from throughout the world. The journal is devoted primarily to research papers, but survey, expository, and tutorial papers are also published. Circuits, Systems, and Signal Processing (CSSP) is published twelve times annually.