Multigrain parallelism for eigenvalue computations on networks of clusters

James R. McCombs, A. Stathopoulos
{"title":"Multigrain parallelism for eigenvalue computations on networks of clusters","authors":"James R. McCombs, A. Stathopoulos","doi":"10.1109/HPDC.2002.1029912","DOIUrl":null,"url":null,"abstract":"Clusters of workstations have become a cost-effective means of performing scientific computations. However, large network latencies, resource sharing, and heterogeneity found in networks of clusters and Grids can impede the performance of applications not specifically tailored for use in such environments. A typical example is the traditional fine grain implementations of Krylov-like iterative methods, a central component in many scientific applications. To exploit the potential of these environments, advances in networking technology must be complemented by advances in parallel algorithmic design. In this paper, we present an algorithmic technique that increases the granularity of parallel block iterative methods by inducing additional work during the preconditioning (inexact solution) phase of the iteration. During this phase, each vector in the block is preconditioned by a different subgroup of processors, yielding a much coarser granularity. The rest of the method comprises a small portion of the total time and is still implemented in fine grain. We call this combination of fine and coarse grain parallelism multigrain. We apply this idea to the block Jacobi-Davidson eigensolver, and present experimental data that shows the significant reduction of latency effects on networks of clusters of roughly equal capacity and size. We conclude with a discussion on how multigrain can be applied dynamically based on runtime network performance monitoring.","PeriodicalId":279053,"journal":{"name":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 11th IEEE International Symposium on High Performance Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPDC.2002.1029912","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Clusters of workstations have become a cost-effective means of performing scientific computations. However, large network latencies, resource sharing, and heterogeneity found in networks of clusters and Grids can impede the performance of applications not specifically tailored for use in such environments. A typical example is the traditional fine grain implementations of Krylov-like iterative methods, a central component in many scientific applications. To exploit the potential of these environments, advances in networking technology must be complemented by advances in parallel algorithmic design. In this paper, we present an algorithmic technique that increases the granularity of parallel block iterative methods by inducing additional work during the preconditioning (inexact solution) phase of the iteration. During this phase, each vector in the block is preconditioned by a different subgroup of processors, yielding a much coarser granularity. The rest of the method comprises a small portion of the total time and is still implemented in fine grain. We call this combination of fine and coarse grain parallelism multigrain. We apply this idea to the block Jacobi-Davidson eigensolver, and present experimental data that shows the significant reduction of latency effects on networks of clusters of roughly equal capacity and size. We conclude with a discussion on how multigrain can be applied dynamically based on runtime network performance monitoring.
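The multigrain idea sketched in the abstract can be illustrated with a short, hypothetical example. The sketch below assumes mpi4py and uses a communicator split to carve one processor subgroup out of the full communicator per block vector, so that the expensive preconditioning (inexact solve) step communicates only inside a subgroup, while a cheap fine-grain operation (a global reduction) still runs over all processors. The names precondition, block_size, and n_local are illustrative placeholders, not identifiers from the authors' solver.

```python
# A minimal sketch of the multigrain idea, assuming mpi4py is available.
# The identifiers below (precondition, block_size, n_local) are illustrative,
# not taken from the authors' implementation.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()

block_size = 2      # number of vectors in the block (hypothetical value)
n_local = 1000      # local piece of each distributed vector (hypothetical value)

# Coarse grain: split the processors into block_size disjoint subgroups,
# one subgroup per block vector to be preconditioned.
color = rank % block_size
subcomm = comm.Split(color=color, key=rank)

def precondition(v_local, subcomm):
    """Stand-in for the inexact solve of one block vector.

    In the real solver this would be an inner iterative solve parallelized
    only over subcomm, so no messages cross subgroup (cluster) boundaries.
    """
    return v_local / 2.0

# Within each subgroup, the block vector of index `color` is distributed
# over the subgroup's processes; each process holds an n_local piece.
v_local = np.ones(n_local)
t_local = precondition(v_local, subcomm)

# Fine grain: the remaining, comparatively cheap steps (orthogonalization,
# projections, inner products) still run over the full communicator.
# The allreduce below yields the squared Frobenius norm of the whole
# preconditioned block, summed across all subgroups.
local_dot = float(np.dot(t_local, t_local))
global_dot = comm.allreduce(local_dot, op=MPI.SUM)

if rank == 0:
    print(f"{nprocs} procs, {block_size} subgroups, ||T||_F^2 = {global_dot}")
```

The point of the split is that the dominant cost, the inner solve, exchanges messages only within a subgroup (ideally confined to a single cluster), so the high-latency links between clusters are crossed only by the comparatively cheap fine-grain operations such as the reduction shown at the end.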