Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo
{"title":"基于联合矩阵分解的多标签答案聚合","authors":"Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo","doi":"10.1109/ICDM.2018.00067","DOIUrl":null,"url":null,"abstract":"Crowdsourcing is a useful and economic approach to data annotation. To obtain annotation of high quality, various aggregation approaches have been developed, which take into account different factors that impact the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks, and they ignore the inter-correlation between labels, and thus may have compromised quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it takes advantage of the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (zero) weights to their annotation matrices. In addition, it takes advantage of the correlation among labels by leveraging the shared low-rank matrix, and of the similarity between annotators using the individual low-rank matrices to guide the factorization. ML-JMF pursues the low-rank matrices via a unified objective function, and introduces an iterative technique to optimize it. ML-JMF finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground truth labels. Our approach can identify unreliable annotators, and is robust against their misleading answers through the assignment of small (zero) weights to their annotation.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Multi-label Answer Aggregation Based on Joint Matrix Factorization\",\"authors\":\"Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo\",\"doi\":\"10.1109/ICDM.2018.00067\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Crowdsourcing is a useful and economic approach to data annotation. To obtain annotation of high quality, various aggregation approaches have been developed, which take into account different factors that impact the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks, and they ignore the inter-correlation between labels, and thus may have compromised quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it takes advantage of the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (zero) weights to their annotation matrices. In addition, it takes advantage of the correlation among labels by leveraging the shared low-rank matrix, and of the similarity between annotators using the individual low-rank matrices to guide the factorization. ML-JMF pursues the low-rank matrices via a unified objective function, and introduces an iterative technique to optimize it. ML-JMF finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground truth labels. Our approach can identify unreliable annotators, and is robust against their misleading answers through the assignment of small (zero) weights to their annotation.\",\"PeriodicalId\":286444,\"journal\":{\"name\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2018.00067\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2018.00067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-label Answer Aggregation Based on Joint Matrix Factorization
Crowdsourcing is a useful and economic approach to data annotation. To obtain annotation of high quality, various aggregation approaches have been developed, which take into account different factors that impact the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks, and they ignore the inter-correlation between labels, and thus may have compromised quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it takes advantage of the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (zero) weights to their annotation matrices. In addition, it takes advantage of the correlation among labels by leveraging the shared low-rank matrix, and of the similarity between annotators using the individual low-rank matrices to guide the factorization. ML-JMF pursues the low-rank matrices via a unified objective function, and introduces an iterative technique to optimize it. ML-JMF finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground truth labels. Our approach can identify unreliable annotators, and is robust against their misleading answers through the assignment of small (zero) weights to their annotation.