Noisy Tag Alignment with Image Regions
Yang Liu, Jing Liu, Zechao Li, Hanqing Lu
2012 IEEE International Conference on Multimedia and Expo, 2012-07-09
DOI: 10.1109/ICME.2012.143 (https://doi.org/10.1109/ICME.2012.143)
With the spread of Web 2.0, large collections of user-contributed, tagged images are readily available on social websites. Aligning these social tags with image regions without additional human intervention is challenging, but valuable: the alignment provides more detailed semantic information about each image and can improve the accuracy of image retrieval. To this end, we propose a large-margin discriminative model that automatically assigns unaligned and possibly noisy image-level tags to their corresponding regions; the model is optimized using the concave-convex procedure (CCCP). In the model, each image is treated as a bag of segmented regions associated with a set of candidate labeling vectors, where each labeling vector encodes one possible arrangement of labels over the regions of the image. To keep the number of admissible labeling vectors tractable, we exploit the consistency between visual similarity and semantic correlation to generate a more compact set of labeling vectors. Extensive experiments on the MSRC and SAIAPR TC-12 datasets show encouraging performance compared with baseline methods.
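To make the idea of candidate labeling vectors concrete, the following is a minimal sketch, not the authors' procedure. It enumerates every assignment of one image-level tag per region, which grows as (number of tags) to the power of (number of regions), and then keeps only the highest-scoring vectors. The scoring rule `consistency`, the function names, and the inputs `visual_sim` and `tag_corr` are all assumptions introduced here for illustration; the abstract states only that the compact set is built from the consistency between visual similarity and semantic correlation.

```python
from itertools import product


def candidate_labelings(num_regions, tags):
    """Enumerate all labeling vectors: one tag per region (len(tags) ** num_regions)."""
    return list(product(tags, repeat=num_regions))


def consistency(labeling, visual_sim, tag_corr):
    """Hypothetical score (an assumption, not taken from the paper).

    For each pair of regions, reward agreement between their visual
    similarity and the semantic correlation of the tags assigned to them.
    """
    score = 0.0
    n = len(labeling)
    for i in range(n):
        for j in range(i + 1, n):
            corr = tag_corr[labeling[i]][labeling[j]]
            score += 1.0 - abs(visual_sim[i][j] - corr)
    return score


def compact_candidates(num_regions, tags, visual_sim, tag_corr, keep):
    """Keep the `keep` labeling vectors with the highest consistency score."""
    ranked = sorted(
        candidate_labelings(num_regions, tags),
        key=lambda y: consistency(y, visual_sim, tag_corr),
        reverse=True,
    )
    return ranked[:keep]


if __name__ == "__main__":
    # Toy image: regions 0 and 1 look alike, region 2 looks different.
    tags = ["sky", "grass"]
    visual_sim = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
    tag_corr = {
        "sky": {"sky": 1.0, "grass": 0.1},
        "grass": {"sky": 0.1, "grass": 1.0},
    }
    for labeling in compact_candidates(3, tags, visual_sim, tag_corr, keep=2):
        print(labeling)
```

In this toy setting the two retained vectors give regions 0 and 1 the same tag and region 2 the other tag, which is the arrangement most consistent with the visual similarities. A learning step such as the large-margin model described above would then choose among the retained vectors; that step is not sketched here.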