DeepHSAR: Semi-supervised fine-grained learning for multi-label human sexual activity recognition

Information Processing & Management (IF 7.4; Zone 1, Management; JCR Q1, Computer Science, Information Systems). Pub Date: 2024-06-27. DOI: 10.1016/j.ipm.2024.103800
Abhishek Gangwar, Víctor González-Castro, Enrique Alegre, Eduardo Fidalgo, Alicia Martínez-Mendoza

Abstract

The identification of sexual activities in images can help detect the severity level of content and can assist pornography detectors in filtering specific types of content. In this paper, we propose a Deep Learning-based framework, named DeepHSAR, for semi-supervised fine-grained multi-label Human Sexual Activity Recognition (HSAR). To the best of our knowledge, this is the first work to propose an approach to HSAR. We also introduce a new multi-label dataset, named SexualActs-150k, containing 150k images manually labeled with 19 types of sexual activities. DeepHSAR has two multi-label classification streams: one for global image representation and another for fine-grained representation. To perform fine-grained image classification without ground-truth bounding box annotations, we propose a novel semi-supervised approach for multi-label fine-grained recognition, which learns through an iterative clustering and iterative CNN training process. Fusing both streams yields a significant performance gain over either stream working separately, reaching an overall F1-score of 79.29%. The experiments demonstrate that the proposed framework clearly outperforms baseline and state-of-the-art approaches. In addition, the proposed framework obtains state-of-the-art or competitive results in semi-supervised multi-label learning experiments on the NUS-WIDE and MS-COCO datasets, with overall F1-scores of 75.98% and 85.17%, respectively. Furthermore, DeepHSAR has been assessed on the NPDI Pornography-2k video dataset, achieving a new state-of-the-art accuracy of 99.85%.
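The abstract reports results as an overall F1-score after fusing two multi-label streams, without stating how the fusion is done. The following is a minimal sketch, not the authors' implementation. It assumes that the fusion is a weighted average of per-label scores (late fusion) and that "overall F1" is the micro-averaged F1 commonly reported in multi-label work, where true positives, false positives, and false negatives are pooled over all images and labels. The function names, the `weight` parameter, and the 0.5 threshold are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def fuse_scores(
    global_scores: Sequence[float],
    fine_scores: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Weighted average of per-label scores from two streams.

    Assumption: simple late fusion; the abstract does not specify the
    fusion rule actually used by DeepHSAR.
    """
    if len(global_scores) != len(fine_scores):
        raise ValueError("both streams must score the same set of labels")
    return [weight * g + (1.0 - weight) * f
            for g, f in zip(global_scores, fine_scores)]


def binarize(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn per-label scores into 0/1 predictions (hypothetical threshold)."""
    return [1 if s >= threshold else 0 for s in scores]


def overall_f1(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
) -> float:
    """Micro-averaged F1: pool TP, FP, FN over every image and every label."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives in either truth or prediction.
    return 2 * tp / denom if denom else 0.0
```

With per-image score lists from each stream, the fused predictions would be `binarize(fuse_scores(g, f))` for each image, and `overall_f1` can then be compared across the global stream, the fine-grained stream, and the fused output.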

Source journal
Information Processing & Management (Engineering & Technology, Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.