模型驱动的高阶张量稀疏CP分解

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) Pub Date : 2017-05-01 DOI:10.1109/IPDPS.2017.80

Jiajia Li, Jee W. Choi, Ioakeim Perros, Jimeng Sun, R. Vuduc

{"title":"模型驱动的高阶张量稀疏CP分解","authors":"Jiajia Li, Jee W. Choi, Ioakeim Perros, Jimeng Sun, R. Vuduc","doi":"10.1109/IPDPS.2017.80","DOIUrl":null,"url":null,"abstract":"Given an input tensor, its CANDECOMP/PARAFAC decomposition (or CPD) is a low-rank representation. CPDs are of particular interest in data analysis and mining, especially when the data tensor is sparse and of higher order (dimension). This paper focuses on the central bottleneck of a CPD algorithm, which is evaluating a sequence of matricized tensor times Khatri-Rao products (MTTKRPs). To speed up the MTTKRP sequence, we propose a novel, adaptive tensor memoization algorithm, AdaTM. Besides removing redundant computations within the MTTKRP sequence, which potentially reduces its overall asymptotic complexity, our technique also allows a user to make a space-time tradeoff by automatically tuning algorithmic and machine parameters using a model-driven framework. Our method improves as the tensor order grows, making its performance more scalable for higher-order data problems. We show speedups of up to 8× and 820× on real sparse data tensors with orders as high as 85 over the SPLATT package and Tensor Toolbox library respectively; and on a full CPD algorithm (CP-ALS), AdaTM can be up to 8× faster than state-of-the-art method implemented in SPLATT.","PeriodicalId":209524,"journal":{"name":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"53","resultStr":"{\"title\":\"Model-Driven Sparse CP Decomposition for Higher-Order Tensors\",\"authors\":\"Jiajia Li, Jee W. Choi, Ioakeim Perros, Jimeng Sun, R. Vuduc\",\"doi\":\"10.1109/IPDPS.2017.80\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Given an input tensor, its CANDECOMP/PARAFAC decomposition (or CPD) is a low-rank representation. CPDs are of particular interest in data analysis and mining, especially when the data tensor is sparse and of higher order (dimension). This paper focuses on the central bottleneck of a CPD algorithm, which is evaluating a sequence of matricized tensor times Khatri-Rao products (MTTKRPs). To speed up the MTTKRP sequence, we propose a novel, adaptive tensor memoization algorithm, AdaTM. Besides removing redundant computations within the MTTKRP sequence, which potentially reduces its overall asymptotic complexity, our technique also allows a user to make a space-time tradeoff by automatically tuning algorithmic and machine parameters using a model-driven framework. Our method improves as the tensor order grows, making its performance more scalable for higher-order data problems. We show speedups of up to 8× and 820× on real sparse data tensors with orders as high as 85 over the SPLATT package and Tensor Toolbox library respectively; and on a full CPD algorithm (CP-ALS), AdaTM can be up to 8× faster than state-of-the-art method implemented in SPLATT.\",\"PeriodicalId\":209524,\"journal\":{\"name\":\"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"53\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2017.80\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2017.80","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 53

摘要

给定一个输入张量，它的CANDECOMP/PARAFAC分解(或CPD)是一个低秩表示。cpd在数据分析和挖掘中特别有趣，特别是当数据张量是稀疏的和高阶(维)的时候。本文重点研究了一种CPD算法的中心瓶颈，即矩阵张量乘Khatri-Rao积(MTTKRPs)序列的求值。为了加快MTTKRP序列的速度，我们提出了一种新的自适应张量记忆算法AdaTM。除了消除MTTKRP序列中的冗余计算(这可能会降低其总体渐近复杂性)之外，我们的技术还允许用户通过使用模型驱动框架自动调整算法和机器参数来进行时空权衡。我们的方法随着张量阶的增长而改进，使其性能在高阶数据问题上更具可扩展性。我们分别在SPLATT包和Tensor Toolbox库上显示了高达85阶的真实稀疏数据张量的加速高达8倍和820倍;在全CPD算法(CP-ALS)上，AdaTM可以比SPLATT中实现的最先进方法快8倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Model-Driven Sparse CP Decomposition for Higher-Order Tensors

Given an input tensor, its CANDECOMP/PARAFAC decomposition (or CPD) is a low-rank representation. CPDs are of particular interest in data analysis and mining, especially when the data tensor is sparse and of higher order (dimension). This paper focuses on the central bottleneck of a CPD algorithm, which is evaluating a sequence of matricized tensor times Khatri-Rao products (MTTKRPs). To speed up the MTTKRP sequence, we propose a novel, adaptive tensor memoization algorithm, AdaTM. Besides removing redundant computations within the MTTKRP sequence, which potentially reduces its overall asymptotic complexity, our technique also allows a user to make a space-time tradeoff by automatically tuning algorithmic and machine parameters using a model-driven framework. Our method improves as the tensor order grows, making its performance more scalable for higher-order data problems. We show speedups of up to 8× and 820× on real sparse data tensors with orders as high as 85 over the SPLATT package and Tensor Toolbox library respectively; and on a full CPD algorithm (CP-ALS), AdaTM can be up to 8× faster than state-of-the-art method implemented in SPLATT.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

自引率

0.00%

发文量