Data Partitioning and Placement Schemes for Matrix Multiplications on a PIM Architecture

2008 International Symposium on Parallel and Distributed Computing | Pub Date: 2008-07-01 | DOI: 10.1109/ISPDC.2008.7

J. Cha, S. Gupta


Citations: 1

Abstract

Data-intensive applications require massive data transfers between storage and processing units. VLSI scaling has increased the size of dynamic memories, as well as the speed and capability of processing units, to the point where, for many such applications, storage and computational capacity are no longer the main limiting factors. Despite this, most current architectures fail to meet the performance requirements of such applications. In this paper, we describe a PIM architecture that harnesses the benefits of VLSI scaling to accelerate the matrix operations that constitute the core of many data-intensive applications. We then present data partitioning and placement schemes that are efficient in terms of computational complexity and internode communication cost. These schemes are evaluated and analyzed under various computing environments. We also discuss how to apply such partitioning and placement schemes to each matrix when a task consists of a chain of matrix operations.
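The abstract does not spell out the partitioning and placement schemes themselves. As a generic illustration of the trade-off it names (local computation versus internode communication), the sketch below multiplies two matrices under a simple 1D block placement and counts the matrix elements that would have to move between nodes. The placement rule, the function names `split_ranges` and `partitioned_matmul`, and the cost model (one "word" per matrix element fetched from another node) are all assumptions made for this example; they are not taken from the paper.

```python
def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous, near-equal (start, stop) ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def partitioned_matmul(A, B, num_nodes):
    """Multiply A (n x k) by B (k x m) under a hypothetical 1D block placement.

    Assumed placement (illustrative only): node i stores row block i of A
    and column block i of B, and computes row block i of C. To do so it
    must fetch every column block of B that it does not store.

    Returns (C, words_moved), where words_moved counts the elements of B
    fetched from other nodes.
    """
    n, k, m = len(A), len(B), len(B[0])
    row_blocks = split_ranges(n, num_nodes)
    col_blocks = split_ranges(m, num_nodes)
    C = [[0] * m for _ in range(n)]
    words_moved = 0
    for node, (r0, r1) in enumerate(row_blocks):
        if r0 == r1:
            continue  # this node owns no rows of A, so it computes nothing
        for owner, (c0, c1) in enumerate(col_blocks):
            if owner != node:
                # Remote column block of B: k rows by (c1 - c0) columns.
                words_moved += k * (c1 - c0)
            for i in range(r0, r1):
                for j in range(c0, c1):
                    C[i][j] = sum(A[i][t] * B[t][j] for t in range(k))
    return C, words_moved


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C, moved = partitioned_matmul(A, B, num_nodes=2)
print(C)      # [[19, 22], [43, 50]]
print(moved)  # 4
```

Under this assumed placement, when every node owns at least one row of A, the traffic is k * m * (p - 1) elements for p nodes, so it grows with the node count while the work per node shrinks. A partitioning scheme has to balance exactly this kind of tension.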


Source journal

2008 International Symposium on Parallel and Distributed Computing

Self-citation rate

0.00%

Number of publications