Naz Dündar, Ali Seydi Keçeli, Aydın Kaya, Hayri Sever
DOI: 10.1016/j.eij.2024.100455
Journal: Egyptian Informatics Journal (impact factor 5.0; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-03-21
Article page: https://www.sciencedirect.com/science/article/pii/S1110866524000185
Citations: 0
A shallow 3D convolutional neural network for violence detection in videos
With public violence statistics rising worldwide in recent years, automated violence detection in surveillance camera footage has become a matter of high importance. This work introduces an end-to-end trainable 3D Convolutional Neural Network (3D CNN) for detecting violence in video footage. The proposed network inherently processes both spatial and temporal information, obviating the need for additional models that would add computational cost and complexity. The work makes two main contributions: 1) a lightweight 3D CNN suitable for inference on edge devices such as mobile systems, and 2) a comprehensive explanation of all components of a CNN model, which enhances model interpretability. Experiments assessed the performance of the proposed model on a consolidated dataset that combines four benchmark datasets. The experimental results support the asserted contributions, which are discussed in detail.
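The abstract does not give the network's layer sizes, so the following is only an illustrative sketch of why a 3D convolution covers spatial and temporal information at once and how its parameter count stays small in a shallow network. The `Conv3dSpec` class, the 16-frame 112x112 RGB clip, and both layer configurations are hypothetical values chosen for the example, not the authors' architecture. It uses the standard convolution arithmetic: per axis, output size = floor((input + 2 * padding - kernel) / stride) + 1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Conv3dSpec:
    """Hypothetical 3D convolution layer; none of these sizes come from the paper."""
    in_channels: int
    out_channels: int
    kernel: Tuple[int, int, int]                 # (time, height, width)
    stride: Tuple[int, int, int] = (1, 1, 1)
    padding: Tuple[int, int, int] = (0, 0, 0)

    def param_count(self) -> int:
        # One weight per kernel element for every (input, output) channel pair,
        # plus one bias per output channel.
        kt, kh, kw = self.kernel
        return kt * kh * kw * self.in_channels * self.out_channels + self.out_channels

    def output_shape(self, frames: int, height: int, width: int) -> Tuple[int, ...]:
        # The same formula is applied to the time axis and both spatial axes,
        # which is what lets a single layer mix motion and appearance cues.
        dims = (frames, height, width)
        return tuple(
            (d + 2 * p - k) // s + 1
            for d, k, s, p in zip(dims, self.kernel, self.stride, self.padding)
        )


# Assumed input: a clip of 16 RGB frames at 112x112 pixels.
layer1 = Conv3dSpec(3, 16, kernel=(3, 3, 3), padding=(1, 1, 1))
layer2 = Conv3dSpec(16, 32, kernel=(3, 3, 3), stride=(2, 2, 2), padding=(1, 1, 1))

shape1 = layer1.output_shape(16, 112, 112)   # padding of 1 preserves every axis
shape2 = layer2.output_shape(*shape1)        # stride of 2 halves every axis
total_params = layer1.param_count() + layer2.param_count()

print(shape1, shape2, total_params)
```

Under these assumed sizes the two layers together hold about fifteen thousand weights, which gives a feel for the scale a "lightweight" model aimed at edge devices might target; the real figures depend on the architecture described in the paper itself.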
About the journal:
The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Submissions of innovative, previously unpublished work in subjects covered by the Journal are encouraged, whether from academic, research or commercial sources.