Title: Utilizing Age-Adaptive Deep Learning Approaches for Detecting Inappropriate Video Content
Authors: Iftikhar Alam, Abdul Basit, Riaz Ahmad Ziar
Journal: Human Behavior and Emerging Technologies (Q1, Psychology, Multidisciplinary; impact factor 4.3)
DOI: 10.1155/2024/7004031
Published: 2024-06-19 (journal article)
Publisher PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1155/2024/7004031
Article page: https://onlinelibrary.wiley.com/doi/10.1155/2024/7004031
Citation count: 0

Abstract:
The exponential growth of video-sharing platforms such as YouTube and Netflix has made videos available to everyone with minimal restrictions. While this proliferation offers a wide variety of content, it also introduces challenges, notably the increased exposure of children and adolescents to potentially harmful material, especially explicit content. Despite efforts to develop content moderation tools, a research gap remains in building comprehensive solutions that can reliably estimate users' ages and accurately classify the many forms of inappropriate video content. This study aims to bridge that gap by introducing VideoTransformer, which combines the strengths of two existing models, AgeNet and MobileNetV2. To evaluate the effectiveness of the proposed approach, the study uses a manually annotated video dataset collected from YouTube, covering multiple categories: safe, real violence, drugs, nudity, simulated violence, kissing, pornography, and terrorism. Compared with existing models, VideoTransformer demonstrates significant performance improvements in two distinct accuracy evaluations. It achieves 96.89% accuracy in a 5-fold cross-validation setup, outperforming NasNet (92.6%), EfficientNet-B7 (87.87%), GoogLeNet (85.1%), and VGG-19 (92.83%). In a single run, it maintains a consistent accuracy of 90%. The model also attains an F1-score of 90.34%, indicating a well-balanced trade-off between precision and recall. These findings highlight the potential of the proposed approach for advancing content moderation and enhancing user safety on video-sharing platforms. We envision deploying the methodology in real-time video streaming to mitigate the spread of inappropriate content, thereby raising online safety standards.
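The abstract describes an "age-adaptive" system that pairs a user-age estimate with a content-category prediction. As an illustrative sketch only (not the paper's implementation), the decision logic at the end of such a pipeline could look like the gate below; the category names come from the paper's dataset, while the adult-age threshold of 18 and the function names are assumptions introduced here.

```python
# Hypothetical gating step for an age-adaptive moderation pipeline.
# Upstream models (AgeNet-style age estimation, MobileNetV2-style
# content classification) are assumed to supply the two inputs.

UNSAFE_CATEGORIES = {
    "real violence", "drugs", "nudity", "simulated violence",
    "kissing", "pornography", "terrorism",
}

def allow_playback(category: str, estimated_age: int, adult_age: int = 18) -> bool:
    """Return True if content in `category` may be shown to this viewer.

    Content labeled "safe" is allowed for everyone; every other label,
    including unrecognized ones, is treated conservatively and shown
    only to viewers at or above the adult-age threshold.
    """
    if category == "safe":
        return True
    return estimated_age >= adult_age
```

The conservative default (unknown labels behave like unsafe ones) is a design choice worth noting: in moderation settings, a false block is usually cheaper than a false allow.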
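The abstract reports two evaluation figures: accuracy under 5-fold cross-validation and an F1-score balancing precision and recall. A minimal, library-free sketch of how these metrics are conventionally computed (these are the standard definitions, not the paper's code):

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def kfold_indices(n, k=5):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample appears in exactly one test fold; the reported
    cross-validation accuracy is the mean accuracy over the k folds.
    """
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        test = idx[i * fold:(i + 1) * fold] if i < k - 1 else idx[i * fold:]
        test_set = set(test)
        train = [j for j in idx if j not in test_set]
        yield train, test
```

With 5 folds, every video is used for testing once and for training four times, which is why a 5-fold figure (96.89% here) is generally a steadier estimate than a single train/test split (the 90% single-run figure).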
Journal introduction:
Human Behavior and Emerging Technologies is an interdisciplinary journal dedicated to publishing high-impact research that enhances understanding of the complex interactions between diverse human behavior and emerging digital technologies.