Title: Utilizing Age-Adaptive Deep Learning Approaches for Detecting Inappropriate Video Content
Authors: Iftikhar Alam, Abdul Basit, Riaz Ahmad Ziar
Journal: Human Behavior and Emerging Technologies (Q1, Psychology, Multidisciplinary; impact factor 4.3)
DOI: 10.1155/2024/7004031
Published: 2024-06-19 (journal article)
Publisher PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1155/2024/7004031
Article page: https://onlinelibrary.wiley.com/doi/10.1155/2024/7004031
Citation count: 0

Abstract:
The exponential growth of video-sharing platforms such as YouTube and Netflix has made videos available to everyone with minimal restrictions. While this proliferation offers a wide variety of content, it also introduces challenges, notably the increased exposure of children and adolescents to potentially harmful material, especially explicit content. Despite efforts to develop content moderation tools, a research gap remains in building comprehensive solutions that can reliably estimate users' ages and accurately classify the many forms of inappropriate video content. This study aims to bridge that gap by introducing VideoTransformer, which combines the strengths of two existing models, AgeNet and MobileNetV2. To evaluate the effectiveness of the proposed approach, the study uses a manually annotated video dataset collected from YouTube, covering multiple categories: safe, real violence, drugs, nudity, simulated violence, kissing, pornography, and terrorism. Compared with existing models, VideoTransformer demonstrates significant performance improvements in two distinct accuracy evaluations. It achieves 96.89% accuracy in a 5-fold cross-validation setup, outperforming NasNet (92.6%), EfficientNet-B7 (87.87%), GoogLeNet (85.1%), and VGG-19 (92.83%). In a single run, it maintains a consistent accuracy of 90%. The model also attains an F1-score of 90.34%, indicating a well-balanced trade-off between precision and recall. These findings highlight the potential of the proposed approach for advancing content moderation and enhancing user safety on video-sharing platforms. We envision deploying the methodology in real-time video streaming to mitigate the spread of inappropriate content, thereby raising online safety standards.
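The abstract describes an "age-adaptive" system that pairs a user-age estimate with a content-category prediction. As an illustrative sketch only (not the paper's implementation), the decision logic at the end of such a pipeline could look like the gate below; the category names come from the paper's dataset, while the adult-age threshold of 18 and the function names are assumptions introduced here.

```python
# Hypothetical gating step for an age-adaptive moderation pipeline.
# Upstream models (AgeNet-style age estimation, MobileNetV2-style
# content classification) are assumed to supply the two inputs.

UNSAFE_CATEGORIES = {
    "real violence", "drugs", "nudity", "simulated violence",
    "kissing", "pornography", "terrorism",
}

def allow_playback(category: str, estimated_age: int, adult_age: int = 18) -> bool:
    """Return True if content in `category` may be shown to this viewer.

    Content labeled "safe" is allowed for everyone; every other label,
    including unrecognized ones, is treated conservatively and shown
    only to viewers at or above the adult-age threshold.
    """
    if category == "safe":
        return True
    return estimated_age >= adult_age
```

The conservative default (unknown labels behave like unsafe ones) is a design choice worth noting: in moderation settings, a false block is usually cheaper than a false allow.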
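The abstract reports two evaluation figures: accuracy under 5-fold cross-validation and an F1-score balancing precision and recall. A minimal, library-free sketch of how these metrics are conventionally computed (these are the standard definitions, not the paper's code):

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def kfold_indices(n, k=5):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each sample appears in exactly one test fold; the reported
    cross-validation accuracy is the mean accuracy over the k folds.
    """
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        test = idx[i * fold:(i + 1) * fold] if i < k - 1 else idx[i * fold:]
        test_set = set(test)
        train = [j for j in idx if j not in test_set]
        yield train, test
```

With 5 folds, every video is used for testing once and for training four times, which is why a 5-fold figure (96.89% here) is generally a steadier estimate than a single train/test split (the 90% single-run figure).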
Journal introduction:
Human Behavior and Emerging Technologies is an interdisciplinary journal dedicated to publishing high-impact research that enhances understanding of the complex interactions between diverse human behavior and emerging digital technologies.