{"title":"DeepHSAR:针对多标签人类性活动识别的半监督细粒度学习","authors":"Abhishek Gangwar , Víctor González-Castro , Enrique Alegre , Eduardo Fidalgo , Alicia Martínez-Mendoza","doi":"10.1016/j.ipm.2024.103800","DOIUrl":null,"url":null,"abstract":"<div><p>The identification of sexual activities in images can be helpful in detecting the level of content severity and can assist pornography detectors in filtering specific types of content. In this paper, we propose a Deep Learning-based framework, named DeepHSAR, for semi-supervised fine-grained multi-label Human Sexual Activity Recognition (HSAR). To the best of our knowledge, this is the first work to propose an approach to HSAR. We also introduce a new multi-label dataset, named SexualActs-150k, containing 150k images manually labeled with 19 types of sexual activities. DeepHSAR has two multi-label classification streams: one for global image representation and another for fine-grained representation. To perform fine-grained image classification without ground-truth bounding box annotations, we propose a novel semi-supervised approach for multi-label fine-grained recognition, which learns through an iterative clustering and iterative CNN training process. We obtained a significant performance gain by fusing both streams (i.e., overall F1-score of 79.29%), compared to when they work separately. The experiments demonstrate that the proposed framework explicitly outperforms baseline and state-of-the-art approaches. In addition, the proposed framework also obtains state-of-the-art or competitive results in semi-supervised multi-label learning experiments on the NUS-WIDE and MS-COCO datasets with overall F1-scores of 75.98% and 85.17%, respectively. Furthermore, the proposed DeepHSAR has been assessed on the NPDI Pornography-2k video dataset, achieving a new state-of-the-art with 99.85% accuracy.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4000,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DeepHSAR: Semi-supervised fine-grained learning for multi-label human sexual activity recognition\",\"authors\":\"Abhishek Gangwar , Víctor González-Castro , Enrique Alegre , Eduardo Fidalgo , Alicia Martínez-Mendoza\",\"doi\":\"10.1016/j.ipm.2024.103800\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The identification of sexual activities in images can be helpful in detecting the level of content severity and can assist pornography detectors in filtering specific types of content. In this paper, we propose a Deep Learning-based framework, named DeepHSAR, for semi-supervised fine-grained multi-label Human Sexual Activity Recognition (HSAR). To the best of our knowledge, this is the first work to propose an approach to HSAR. We also introduce a new multi-label dataset, named SexualActs-150k, containing 150k images manually labeled with 19 types of sexual activities. DeepHSAR has two multi-label classification streams: one for global image representation and another for fine-grained representation. To perform fine-grained image classification without ground-truth bounding box annotations, we propose a novel semi-supervised approach for multi-label fine-grained recognition, which learns through an iterative clustering and iterative CNN training process. We obtained a significant performance gain by fusing both streams (i.e., overall F1-score of 79.29%), compared to when they work separately. The experiments demonstrate that the proposed framework explicitly outperforms baseline and state-of-the-art approaches. In addition, the proposed framework also obtains state-of-the-art or competitive results in semi-supervised multi-label learning experiments on the NUS-WIDE and MS-COCO datasets with overall F1-scores of 75.98% and 85.17%, respectively. Furthermore, the proposed DeepHSAR has been assessed on the NPDI Pornography-2k video dataset, achieving a new state-of-the-art with 99.85% accuracy.</p></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457324001596\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457324001596","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
DeepHSAR: Semi-supervised fine-grained learning for multi-label human sexual activity recognition
The identification of sexual activities in images can be helpful in detecting the level of content severity and can assist pornography detectors in filtering specific types of content. In this paper, we propose a Deep Learning-based framework, named DeepHSAR, for semi-supervised fine-grained multi-label Human Sexual Activity Recognition (HSAR). To the best of our knowledge, this is the first work to propose an approach to HSAR. We also introduce a new multi-label dataset, named SexualActs-150k, containing 150k images manually labeled with 19 types of sexual activities. DeepHSAR has two multi-label classification streams: one for global image representation and another for fine-grained representation. To perform fine-grained image classification without ground-truth bounding box annotations, we propose a novel semi-supervised approach for multi-label fine-grained recognition, which learns through an iterative clustering and iterative CNN training process. We obtained a significant performance gain by fusing both streams (i.e., overall F1-score of 79.29%), compared to when they work separately. The experiments demonstrate that the proposed framework explicitly outperforms baseline and state-of-the-art approaches. In addition, the proposed framework also obtains state-of-the-art or competitive results in semi-supervised multi-label learning experiments on the NUS-WIDE and MS-COCO datasets with overall F1-scores of 75.98% and 85.17%, respectively. Furthermore, the proposed DeepHSAR has been assessed on the NPDI Pornography-2k video dataset, achieving a new state-of-the-art with 99.85% accuracy.
期刊介绍:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.