Jointly clustering rows and columns of binary matrices: algorithms and trade-offs

Jiaming Xu, Rui Wu, Kai Zhu, B. Hajek, R. Srikant, Lei Ying
Measurement and Modeling of Computer Systems, published 2013-10-01. DOI: 10.1145/2591971.2592005 (https://doi.org/10.1145/2591971.2592005)
Citations: 42

Abstract

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running times and compare the number of observations each needs for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity as more observations become available.
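
As a rough illustration of the setting described above, the following Python sketch generates a binary matrix with both row and column cluster structure and reveals only a random fraction of its entries, each possibly flipped by noise. The abstract does not specify the model, so everything here is an assumption made for illustration: the block-constant structure, the independent flip noise, the uniform random sampling of entries, and all names and parameters (`sample_observations`, `flip_prob`, `obs_prob`). It is not one of the paper's three algorithms; it only produces the kind of input such algorithms would work on.

```python
import random


def sample_observations(n_rows, n_cols, r, c, flip_prob, obs_prob, seed=0):
    """Sample noisy, partial observations of a block-constant binary matrix.

    Assumed model (hypothetical, for illustration only):
      - each row belongs to one of r row clusters,
      - each column belongs to one of c column clusters,
      - the true entry (i, j) depends only on the pair of cluster labels,
      - each entry is observed independently with probability obs_prob,
      - each observed entry is flipped independently with probability flip_prob.
    """
    rng = random.Random(seed)

    # Hidden cluster labels that a recovery algorithm would try to infer.
    row_labels = [rng.randrange(r) for _ in range(n_rows)]
    col_labels = [rng.randrange(c) for _ in range(n_cols)]

    # One binary value per (row cluster, column cluster) block.
    block = [[rng.randrange(2) for _ in range(c)] for _ in range(r)]

    observed = {}
    for i in range(n_rows):
        for j in range(n_cols):
            if rng.random() < obs_prob:
                value = block[row_labels[i]][col_labels[j]]
                if rng.random() < flip_prob:
                    value = 1 - value  # noise: flip the observed bit
                observed[(i, j)] = value

    return row_labels, col_labels, block, observed
```

Under these assumptions, exact recovery means reconstructing `row_labels` and `col_labels` (up to relabeling of clusters) from `observed` alone. The expected number of observations is `obs_prob * n_rows * n_cols`, which is the quantity traded off against running time.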