Xin Li, Jia-Min Gu, Xiaoyuan Lu, Yan Ning, L. Zhang, Peiyi Shen, Chaochen Gu
J. Comput. Methods Sci. Eng., pp. 1913-1927. Published 2023-04-28. DOI: 10.3233/jcm-226762 (https://doi.org/10.3233/jcm-226762)
Variable-length sequence model for attribute detection in the image
Holistic scene understanding is a challenging problem in computer vision. Most recent research in this field has focused on object detection, semantic segmentation, and relationship detection. Attributes provide meaningful information about an object instance, allowing it to be described in more detail during scene understanding. However, most research on attributes has been limited to special conditions. For example, several studies focus only on the attributes of a specific object class; because these solutions target limited scenarios, they are hard to generalize to others. We also find that most work on multi-attribute detection treats each attribute as a binary class and simply applies a set of independent binary classifiers. These strategies do not consider the relations between pairs of attributes. As a result, they struggle on “imperfect” attribute datasets (those labeled with missing and incomplete annotations) and perform poorly on long-tail attribute classes (those with fewer annotations and more missing labels). In this paper, we focus on multi-attribute detection across a variety of object classes and take the relations between attributes into consideration. We propose a GRU-based model that detects a variable-length attribute sequence, with a customized loss computation method to address the “imperfect” attribute dataset problem. Furthermore, we perform ablation studies to demonstrate the effectiveness of each part of our method. Finally, we compare our model with several existing multi-attribute detection methods on the VG (Visual Genome) and CUB200 bird datasets to demonstrate the superior performance of the proposed model.
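As a rough illustration of what decoding a variable-length attribute sequence with a GRU can look like, the sketch below runs a single GRU cell greedily until it emits an end token. It is not the authors' implementation. The abstract does not describe the architecture, so the one-hot input encoding, the use of an object feature vector as the initial hidden state, the END token, the masking of already-emitted attributes, the random (untrained) weights, and all names (GRUCell, decode_attributes, W_out) are assumptions made purely for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Standard GRU cell (update gate z, reset gate r); biases omitted for brevity."""

    def __init__(self, input_size, hidden_size, rng):
        self.Wz = rand_matrix(hidden_size, input_size, rng)
        self.Uz = rand_matrix(hidden_size, hidden_size, rng)
        self.Wr = rand_matrix(hidden_size, input_size, rng)
        self.Ur = rand_matrix(hidden_size, hidden_size, rng)
        self.Wn = rand_matrix(hidden_size, input_size, rng)
        self.Un = rand_matrix(hidden_size, hidden_size, rng)

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.Wn, x), matvec(self.Un, reset_h))]
        # Blend the previous state and the candidate state with the update gate.
        return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]


def decode_attributes(cell, W_out, features, num_attrs, max_len=5):
    """Greedily emit attribute indices until END is chosen or max_len is reached."""
    end = num_attrs            # hypothetical end-of-sequence index
    h = list(features)         # assumption: object feature initialises the state
    prev = end                 # assumption: END doubles as the start token
    emitted = []
    for _ in range(max_len):
        x = [1.0 if i == prev else 0.0 for i in range(num_attrs + 1)]
        h = cell.step(x, h)
        scores = matvec(W_out, h)
        for a in emitted:      # never emit the same attribute twice
            scores[a] = -math.inf
        best = max(range(num_attrs + 1), key=scores.__getitem__)
        if best == end:
            break
        emitted.append(best)
        prev = best
    return emitted


rng = random.Random(0)
NUM_ATTRS, HIDDEN = 6, 4
cell = GRUCell(NUM_ATTRS + 1, HIDDEN, rng)
W_out = rand_matrix(NUM_ATTRS + 1, HIDDEN, rng)
features = [0.2, -0.1, 0.4, 0.3]  # stand-in for a per-object visual feature
attrs = decode_attributes(cell, W_out, features, NUM_ATTRS, max_len=5)
print(attrs)
```

Because each step conditions on the previously emitted attribute through the hidden state, a decoder of this kind can, in principle, capture dependencies between attributes that independent binary classifiers ignore, and the end token lets the number of predicted attributes vary per object. With the untrained weights used here the printed indices carry no meaning.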