Multi-label Answer Aggregation Based on Joint Matrix Factorization

Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo
{"title":"基于联合矩阵分解的多标签答案聚合","authors":"Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo","doi":"10.1109/ICDM.2018.00067","DOIUrl":null,"url":null,"abstract":"Crowdsourcing is a useful and economic approach to data annotation. To obtain annotation of high quality, various aggregation approaches have been developed, which take into account different factors that impact the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks, and they ignore the inter-correlation between labels, and thus may have compromised quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it takes advantage of the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (zero) weights to their annotation matrices. In addition, it takes advantage of the correlation among labels by leveraging the shared low-rank matrix, and of the similarity between annotators using the individual low-rank matrices to guide the factorization. ML-JMF pursues the low-rank matrices via a unified objective function, and introduces an iterative technique to optimize it. ML-JMF finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground truth labels. Our approach can identify unreliable annotators, and is robust against their misleading answers through the assignment of small (zero) weights to their annotation.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Multi-label Answer Aggregation Based on Joint Matrix Factorization\",\"authors\":\"Jinzheng Tu, Guoxian Yu, C. Domeniconi, J. Wang, Guoqiang Xiao, Maozu Guo\",\"doi\":\"10.1109/ICDM.2018.00067\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Crowdsourcing is a useful and economic approach to data annotation. To obtain annotation of high quality, various aggregation approaches have been developed, which take into account different factors that impact the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks, and they ignore the inter-correlation between labels, and thus may have compromised quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it takes advantage of the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (zero) weights to their annotation matrices. In addition, it takes advantage of the correlation among labels by leveraging the shared low-rank matrix, and of the similarity between annotators using the individual low-rank matrices to guide the factorization. 
ML-JMF pursues the low-rank matrices via a unified objective function, and introduces an iterative technique to optimize it. ML-JMF finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground truth labels. Our approach can identify unreliable annotators, and is robust against their misleading answers through the assignment of small (zero) weights to their annotation.\",\"PeriodicalId\":286444,\"journal\":{\"name\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2018.00067\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2018.00067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 18

Abstract

Crowdsourcing is a useful and economical approach to data annotation. To obtain high-quality annotations, various aggregation approaches have been developed that account for the different factors affecting the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks and ignore the correlations among labels, which can compromise aggregation quality. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. As such, it exploits the robustness of low-rank matrix approximation to noise, and reduces the impact of unreliable annotators by assigning small (or zero) weights to their annotation matrices. In addition, it captures the correlation among labels through the shared low-rank matrix, and uses the similarity between annotators, reflected in the individual low-rank matrices, to guide the factorization. ML-JMF learns the low-rank matrices and weights by optimizing a unified objective function with an iterative technique, and finally uses the optimized low-rank matrices and weights to infer the ground-truth labels. Our experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground-truth labels. Our approach identifies unreliable annotators and is robust to their misleading answers, since their annotations receive small (or zero) weights.
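The abstract describes ML-JMF only at a high level; the paper's actual objective function and update rules are not reproduced on this page. As a rough illustration of the general idea, the sketch below (a hypothetical aggregate_multilabel function, NumPy only) jointly factorizes each annotator's sample-label matrix A_k into an individual factor U_k times a shared label factor V, down-weights annotators whose answers lie far from the weighted consensus, and thresholds the consensus scores to infer labels. The squared-error loss, the alternating least-squares updates, the rank r, the regularization lam, and the softmax-style re-weighting are all assumptions made for illustration, not the authors' formulation.

```python
# A rough, self-contained NumPy sketch of weighted joint matrix factorization
# for multi-label answer aggregation, in the spirit of the abstract above.
# The loss, ALS updates, rank r, lam, and re-weighting rule are illustrative
# assumptions, not the ML-JMF algorithm from the paper.
import numpy as np


def aggregate_multilabel(answers, r=4, lam=0.1, iters=50, seed=0):
    """answers: list of (n_samples x n_labels) 0/1 matrices, one per annotator."""
    rng = np.random.default_rng(seed)
    m = len(answers)
    n, q = answers[0].shape
    V = rng.standard_normal((q, r))      # shared label factor (captures label correlation)
    w = np.full(m, 1.0 / m)              # annotator reliability weights, sum to 1
    I_r = np.eye(r)

    for _ in range(iters):
        # Alternating least-squares updates for the weighted objective
        #   sum_k w_k ||A_k - U_k V^T||_F^2 + lam (sum_k ||U_k||_F^2 + ||V||_F^2).
        # A near-zero weight w_k shrinks U_k toward 0, so that annotator's
        # answers barely influence the shared factor or the consensus.
        U = [w[k] * A @ V @ np.linalg.inv(w[k] * V.T @ V + lam * I_r)
             for k, A in enumerate(answers)]
        G = sum(w[k] * U[k].T @ U[k] for k in range(m)) + lam * I_r
        H = sum(w[k] * U[k].T @ A for k, A in enumerate(answers))
        V = np.linalg.solve(G, H).T

        # Consensus score matrix from the current factors
        Z = sum(w[k] * (U[k] @ V.T) for k in range(m))
        # Heuristic re-weighting: annotators far from the consensus get small weight
        res = np.array([((A - Z) ** 2).sum() for A in answers])
        w = np.exp(-res / (res.mean() + 1e-12))
        w /= w.sum()

    return (Z > 0.5).astype(int), w      # thresholded label estimate + final weights


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Correlated ground-truth labels (latent low-rank structure), 100 samples x 6 labels
    truth = (rng.random((100, 2)) @ rng.random((2, 6)) > 0.5).astype(int)

    def corrupt(p):  # flip each entry of the truth with probability p
        return np.abs(truth - (rng.random(truth.shape) < p).astype(int))

    est, weights = aggregate_multilabel([corrupt(0.05), corrupt(0.10), corrupt(0.45)])
    print("annotator weights:", np.round(weights, 3))
    print("agreement with truth:", (est == truth).mean())
```

In this toy run, the third (heavily corrupted) annotator should receive a much smaller weight than the other two, mirroring the abstract's claim that unreliable annotators are down-weighted; the exact numbers depend on the assumed hyperparameters.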