Low rank approximation with entrywise l1-norm error

Zhao Song, David P. Woodruff, Peilin Zhong
{"title":"Low rank approximation with entrywise l1-norm error","authors":"Zhao Song, David P. Woodruff, Peilin Zhong","doi":"10.1145/3055399.3055431","DOIUrl":null,"url":null,"abstract":"We study the ℓ1-low rank approximation problem, where for a given n x d matrix A and approximation factor α ≤ 1, the goal is to output a rank-k matrix  for which ‖A-Â‖1 ≤ α · min rank-k matrices A′ ‖A-A′‖1, where for an n x d matrix C, we let ‖C‖1 = ∑i=1n ∑j=1d |Ci,j|. This error measure is known to be more robust than the Frobenius norm in the presence of outliers and is indicated in models where Gaussian assumptions on the noise may not apply. The problem was shown to be NP-hard by Gillis and Vavasis and a number of heuristics have been proposed. It was asked in multiple places if there are any approximation algorithms. We give the first provable approximation algorithms for ℓ1-low rank approximation, showing that it is possible to achieve approximation factor α = (logd) #183; poly(k) in nnz(A) + (n+d) poly(k) time, where nnz(A) denotes the number of non-zero entries of A. If k is constant, we further improve the approximation ratio to O(1) with a poly(nd)-time algorithm. Under the Exponential Time Hypothesis, we show there is no poly(nd)-time algorithm achieving a (1+1/log1+γ(nd))-approximation, for γ > 0 an arbitrarily small constant, even when k = 1. We give a number of additional results for ℓ1-low rank approximation: nearly tight upper and lower bounds for column subset selection, CUR decompositions, extensions to low rank approximation with respect to ℓp-norms for 1 ≤ p < 2 and earthmover distance, low-communication distributed protocols and low-memory streaming algorithms, algorithms with limited randomness, and bicriteria algorithms. We also give a preliminary empirical evaluation.","PeriodicalId":20615,"journal":{"name":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","volume":"19 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"96","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3055399.3055431","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 96

Abstract

We study the ℓ1-low rank approximation problem, where for a given n x d matrix A and approximation factor α ≤ 1, the goal is to output a rank-k matrix  for which ‖A-Â‖1 ≤ α · min rank-k matrices A′ ‖A-A′‖1, where for an n x d matrix C, we let ‖C‖1 = ∑i=1n ∑j=1d |Ci,j|. This error measure is known to be more robust than the Frobenius norm in the presence of outliers and is indicated in models where Gaussian assumptions on the noise may not apply. The problem was shown to be NP-hard by Gillis and Vavasis and a number of heuristics have been proposed. It was asked in multiple places if there are any approximation algorithms. We give the first provable approximation algorithms for ℓ1-low rank approximation, showing that it is possible to achieve approximation factor α = (logd) #183; poly(k) in nnz(A) + (n+d) poly(k) time, where nnz(A) denotes the number of non-zero entries of A. If k is constant, we further improve the approximation ratio to O(1) with a poly(nd)-time algorithm. Under the Exponential Time Hypothesis, we show there is no poly(nd)-time algorithm achieving a (1+1/log1+γ(nd))-approximation, for γ > 0 an arbitrarily small constant, even when k = 1. We give a number of additional results for ℓ1-low rank approximation: nearly tight upper and lower bounds for column subset selection, CUR decompositions, extensions to low rank approximation with respect to ℓp-norms for 1 ≤ p < 2 and earthmover distance, low-communication distributed protocols and low-memory streaming algorithms, algorithms with limited randomness, and bicriteria algorithms. We also give a preliminary empirical evaluation.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
低秩近似与入口方向11范数误差
我们研究了1-低秩近似问题,其中对于给定的n x d矩阵a和近似因子α≤1,目标是输出一个秩-k矩阵Â,其中‖a -Â‖1≤α·最小秩-k矩阵a′‖a - a′‖1,其中对于n x d矩阵C,我们令‖C‖1 =∑i=1n∑j=1d |Ci,j|。在存在异常值的情况下,这种误差测量已知比Frobenius范数更稳健,并且在对噪声的高斯假设可能不适用的模型中表示。Gillis和Vavasis证明了这个问题是np困难的,并提出了许多启发式方法。很多地方都问过是否有近似算法。我们给出了第一个可证明的l_1 -低秩近似的逼近算法,表明可以实现近似因子α = (logd) #183;poly(k) in nnz(A) + (n+d) poly(k) time,其中nnz(A)表示A的非零条目数。如果k为常数,我们进一步使用poly(nd) time算法将近似比提高到O(1)。在指数时间假设下,我们证明没有多(nd)时间算法实现(1+1/log1+γ(nd))-近似,即使当k = 1时,γ > 0是一个任意小的常数。我们给出了一些额外的结果:列子集选择的近紧上界和下界,CUR分解,关于1≤p < 2和土方距离的低秩近似的扩展,低通信分布式协议和低内存流算法,有限随机性算法和双准则算法。并给出了初步的实证评价。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Online service with delay A simpler and faster strongly polynomial algorithm for generalized flow maximization Low rank approximation with entrywise l1-norm error Fast convergence of learning in games (invited talk) Surviving in directed graphs: a quasi-polynomial-time polylogarithmic approximation for two-connected directed Steiner tree
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1