Fast Tucker Factorization for Large-Scale Tensor Completion

Dongha Lee, Jaehyung Lee, Hwanjo Yu
{"title":"大规模张量补全的快速Tucker分解","authors":"Dongha Lee, Jaehyung Lee, Hwanjo Yu","doi":"10.1109/ICDM.2018.00142","DOIUrl":null,"url":null,"abstract":"Tensor completion is the task of completing multi-aspect data represented as a tensor by accurately predicting missing entries in the tensor. It is mainly solved by tensor factorization methods, and among them, Tucker factorization has attracted considerable interests due to its powerful ability to learn latent factors and even their interactions. Although several Tucker methods have been developed to reduce the memory and computational complexity, the state-of-the-art method still 1) generates redundant computations and 2) cannot factorize a large tensor that exceeds the size of memory. This paper proposes FTcom, a fast and scalable Tucker factorization method for tensor completion. FTcom performs element-wise updates for factor matrices based on coordinate descent, and adopts a novel caching algorithm which stores frequently-required intermediate data. It also uses a tensor file for disk-based data processing and loads only a small part of the tensor at a time into the memory. Experimental results show that FTcom is much faster and more scalable compared to all other competitors. It significantly shortens the training time of Tucker factorization, especially on real-world tensors, and it can be executed on a billion-scale tensor which is bigger than the memory capacity within a single machine.","PeriodicalId":286444,"journal":{"name":"2018 IEEE International Conference on Data Mining (ICDM)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Fast Tucker Factorization for Large-Scale Tensor Completion\",\"authors\":\"Dongha Lee, Jaehyung Lee, Hwanjo Yu\",\"doi\":\"10.1109/ICDM.2018.00142\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Tensor completion is the task of completing multi-aspect data represented as a tensor by accurately predicting missing entries in the tensor. It is mainly solved by tensor factorization methods, and among them, Tucker factorization has attracted considerable interests due to its powerful ability to learn latent factors and even their interactions. Although several Tucker methods have been developed to reduce the memory and computational complexity, the state-of-the-art method still 1) generates redundant computations and 2) cannot factorize a large tensor that exceeds the size of memory. This paper proposes FTcom, a fast and scalable Tucker factorization method for tensor completion. FTcom performs element-wise updates for factor matrices based on coordinate descent, and adopts a novel caching algorithm which stores frequently-required intermediate data. It also uses a tensor file for disk-based data processing and loads only a small part of the tensor at a time into the memory. Experimental results show that FTcom is much faster and more scalable compared to all other competitors. 
It significantly shortens the training time of Tucker factorization, especially on real-world tensors, and it can be executed on a billion-scale tensor which is bigger than the memory capacity within a single machine.\",\"PeriodicalId\":286444,\"journal\":{\"name\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE International Conference on Data Mining (ICDM)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2018.00142\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2018.00142","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 12

Abstract

Tensor completion is the task of completing multi-aspect data represented as a tensor by accurately predicting its missing entries. It is mainly solved by tensor factorization methods, among which Tucker factorization has attracted considerable interest due to its powerful ability to learn latent factors and even their interactions. Although several Tucker methods have been developed to reduce memory usage and computational cost, the state-of-the-art method still 1) performs redundant computations and 2) cannot factorize a tensor that exceeds the size of memory. This paper proposes FTcom, a fast and scalable Tucker factorization method for tensor completion. FTcom performs element-wise updates of the factor matrices based on coordinate descent and adopts a novel caching algorithm that stores frequently required intermediate data. It also uses a tensor file for disk-based data processing, loading only a small part of the tensor into memory at a time. Experimental results show that FTcom is much faster and more scalable than all competitors: it significantly shortens the training time of Tucker factorization, especially on real-world tensors, and it can run on a billion-scale tensor that is larger than the memory capacity of a single machine.
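
For background, the objective that Tucker-based completion methods optimize is the standard one sketched below; this formulation is general context, not text taken from the paper, and the ridge regularizer is an assumed (common) choice. For a 3-way tensor \mathcal{X} \in \mathbb{R}^{I \times J \times K} observed only on an index set \Omega, one learns a core tensor \mathcal{G} \in \mathbb{R}^{P \times Q \times R} and factor matrices A \in \mathbb{R}^{I \times P}, B \in \mathbb{R}^{J \times Q}, C \in \mathbb{R}^{K \times R} by minimizing

\min_{\mathcal{G}, A, B, C} \; \sum_{(i,j,k) \in \Omega} \Big( \mathcal{X}_{ijk} - \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} \mathcal{G}_{pqr} A_{ip} B_{jq} C_{kr} \Big)^2 + \lambda \big( \lVert A \rVert_F^2 + \lVert B \rVert_F^2 + \lVert C \rVert_F^2 \big)

The core entries \mathcal{G}_{pqr} weight every combination of latent factors, which is what "learn latent factors and even their interactions" refers to, and summing only over \Omega is what makes this a completion objective rather than a plain decomposition.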
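To make the two algorithmic ingredients concrete (element-wise coordinate-descent updates and caching of frequently required intermediate data), here is a minimal NumPy sketch. It illustrates the general technique under stated assumptions, not FTcom's actual implementation: every name (delta_for_A, update_A, lam, the toy sizes) is hypothetical, the cached quantity is one natural choice of intermediate term and may differ from what FTcom caches, and a real disk-based version would stream the observed entries from a tensor file instead of keeping them in memory.

import numpy as np

# Minimal sketch of Tucker-based tensor completion with element-wise
# coordinate descent (an illustration, not FTcom's actual code).
rng = np.random.default_rng(0)
I, J, K = 30, 25, 20          # tensor dimensions (toy sizes)
P, Q, R = 4, 3, 2             # Tucker ranks
lam = 0.1                     # ridge regularization strength (assumed)

A = rng.normal(scale=0.1, size=(I, P))
B = rng.normal(scale=0.1, size=(J, Q))
C = rng.normal(scale=0.1, size=(K, R))
G = rng.normal(scale=0.1, size=(P, Q, R))

# Observed entries as coordinate lists; a disk-based method would stream
# these from a tensor file chunk by chunk instead of holding them in RAM.
n_obs = 2000
obs = np.stack([rng.integers(0, d, n_obs) for d in (I, J, K)], axis=1)
x = rng.normal(size=n_obs)

def delta_for_A(j, k):
    # d[p] = sum_{q,r} G[p,q,r] * B[j,q] * C[k,r]. Every observed entry that
    # shares (j, k) reuses this vector, so it is worth caching during a sweep
    # over A (B, C, and G are fixed then, so the cached values stay valid).
    return np.einsum('pqr,q,r->p', G, B[j], C[k])

def update_A():
    cache = {}                              # intermediate vectors keyed by (j, k)
    for i in range(I):
        rows = np.flatnonzero(obs[:, 0] == i)
        if rows.size == 0:
            continue
        D = np.empty((rows.size, P))
        for n, (j, k) in enumerate(obs[rows][:, 1:]):
            key = (j, k)
            if key not in cache:            # compute once, reuse afterwards
                cache[key] = delta_for_A(j, k)
            D[n] = cache[key]
        xi = x[rows]
        for p in range(P):                  # element-wise coordinate descent:
            # closed-form ridge update of the single coordinate A[i, p],
            # holding every other parameter fixed.
            pred_wo_p = D @ A[i] - D[:, p] * A[i, p]
            A[i, p] = (D[:, p] @ (xi - pred_wo_p)) / (lam + D[:, p] @ D[:, p])

update_A()  # one sweep over A; updates of B, C, and the core G are analogous

Each coordinate update solves a one-dimensional least-squares problem exactly, which is why no step size is needed; the cache pays off whenever many observed entries share the same (j, k) pair, the kind of redundancy the abstract's "frequently-required intermediate data" points at.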