{"title":"采用迭代 SVD 的 DEIM-CUR 因式分解法","authors":"Perfect Y. Gidisu, Michiel E. Hochstenbach","doi":"10.1016/j.jcmds.2024.100095","DOIUrl":null,"url":null,"abstract":"<div><p>A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of <em>one-round sampling</em> and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify <span><math><mi>A</mi></math></span> after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100095"},"PeriodicalIF":0.0000,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000063/pdfft?md5=16d9fd47f077d52851c28e4d876eb3c0&pid=1-s2.0-S2772415824000063-main.pdf","citationCount":"0","resultStr":"{\"title\":\"A DEIM-CUR factorization with iterative SVDs\",\"authors\":\"Perfect Y. Gidisu, Michiel E. Hochstenbach\",\"doi\":\"10.1016/j.jcmds.2024.100095\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of <em>one-round sampling</em> and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify <span><math><mi>A</mi></math></span> after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.</p></div>\",\"PeriodicalId\":100768,\"journal\":{\"name\":\"Journal of Computational Mathematics and Data Science\",\"volume\":\"12 \",\"pages\":\"Article 100095\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2772415824000063/pdfft?md5=16d9fd47f077d52851c28e4d876eb3c0&pid=1-s2.0-S2772415824000063-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computational Mathematics and Data Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2772415824000063\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computational Mathematics and Data Science","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772415824000063","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
CUR 因式分解经常被用来替代奇异值分解(SVD),尤其是在奇异向量的具体解释具有挑战性的情况下。此外,如果原始数据矩阵具有非负性和稀疏性等特性,CUR 分解与 SVD 相比能更好地保留这些特性。这种方法的一个重要方面是从原始矩阵中选择列和行子集的方法。本研究调查了单轮采样和迭代子选择技术的有效性,并在迭代 SVD 的基础上引入了新的迭代子选择策略。在构建 CUR 因式分解时,离散经验插值法(DEIM)是一种可证明的合适索引选择技术。我们的贡献旨在通过多轮迭代调用 DEIM 方案来提高其近似质量,即我们根据之前选择的列和行来选择后续的列和行。因此,我们在每次迭代后都会修改 A,删除之前选定的列和行所捕获的信息。我们还讨论了如何将计算大型数据矩阵几个奇异向量的迭代程序与新的迭代子选择策略相结合。我们介绍了数值实验的结果,对单轮采样和迭代子选择技术进行了比较,并证明使用后者可以提高近似质量。
A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of one-round sampling and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.