
[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation: Latest Articles

A parallel implementation of the symmetric tridiagonal QR algorithm
P. Arbenz, K. Gates, C. Sprenger
The authors propose a novel and simple way to parallelize the QR algorithm for computing eigenvalues and eigenvectors of real symmetric tridiagonal matrices. This approach is suitable for all parallel computers, ranging from multiprocessor supercomputers with shared memory to massively parallel computers with local memory. The authors report on numerical experiments completed on a Cray Y-MP, an Alliant FX-80, a Sequent Symmetry S81b, an nCUBE 2, a Thinking Machines CM200, and a cluster of Sun SPARCstations. The numerical results indicate that the proposed algorithm is suitable for parallel execution on the whole range of parallel computers. While the results obtained on the computers with vector facilities did not show very high efficiencies, those obtained with multiprocessor computers with scalar CPUs had very good speedups.
DOI: https://doi.org/10.1109/FMPC.1992.234936 (published 1992-10-19)
Citations: 8
A graph-based subcube allocation and task migration in hypercube systems
O. Kang, B.M. Kim, H. Yoon, S. Maeng, J. Cho
The authors propose a task migration scheme based on the HSA (heuristic subcube allocation) strategy to solve the fragmentation problem in a hypercube. This scheme, called CSC (complementary subcube coalescence), uses a heuristic and an undirected graph, called the SC (subcube) graph. If an incoming request cannot be satisfied because of system fragmentation, the task migration scheme is performed to generate higher-dimension subcubes. Simulation results show that the HSA strategy gives better efficiency than the Buddy and GC strategies in the adaptive mode. Moreover, the HSA strategy has a significantly lower migration cost than the Buddy and GC strategies.
DOI: https://doi.org/10.1109/FMPC.1992.234931 (published 1992-10-19)
Citations: 3
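To make the fragmentation problem in the abstract above concrete, here is a minimal sketch of a buddy-style allocator, the baseline the abstract names, and not the authors' HSA or CSC scheme. It assumes nodes are numbered 0 to 2^n - 1 and that a k-dimensional request is served by the first aligned block of 2^k idle nodes. The name `buddy_allocate` and the boolean free list are illustrative choices, not taken from the paper.

```python
def buddy_allocate(free, k):
    """Allocate a k-dimensional subcube from a buddy-style free list.

    free: list of booleans of length 2**n; free[i] is True if node i is idle.
    Returns the base address of the allocated block, or None if no
    aligned block of 2**k idle nodes exists.
    """
    size = 1 << k
    for base in range(0, len(free), size):
        if all(free[base:base + size]):
            for node in range(base, base + size):
                free[node] = False  # mark the whole block busy
            return base
    return None


# A 3-cube in which only nodes 000, 011, 101 and 110 are idle.
# Half the machine is free, yet every pair of idle nodes differs in
# two bits, so no two idle nodes are neighbours and a 1-cube request fails.
free = [node in (0b000, 0b011, 0b101, 0b110) for node in range(8)]
fragmented_result = buddy_allocate(free, 1)

# Hypothetical migration: move the task on node 001 to node 011.
# Nodes 000 and 001 are now both idle and the same request succeeds.
free[0b001], free[0b011] = True, False
migrated_result = buddy_allocate(free, 1)
```

The example only illustrates why migrating a running task can create a usable subcube; how to choose which task to move at low cost is the question the abstract's scheme addresses.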
Embedding the hypercube into the 3-dimension mesh
S. L. Scott, J. Baker
A constant time and space algorithm for embedding the hypercube architecture into the 3-dimension mesh (3D-mesh) is presented. This enables the $\mathrm{cube}_i$ operation to be performed on the embedded hypercube machine, where the interprocessor communication function $\mathrm{cube}_i$ is defined on the embedded hypercube's PEs as $\mathrm{cube}_i(b_{n-1} \ldots b_i \ldots b_0) = b_{n-1} \ldots \bar{b}_i \ldots b_0$ and $\bar{b}_i$ is the binary complement of $b_i$.
DOI: https://doi.org/10.1109/FMPC.1992.234916 (published 1992-10-19)
Citations: 3
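The cube function defined in the abstract above amounts to flipping one bit of a PE address. The following minimal sketch shows only that function on integer addresses; it does not attempt the 3D-mesh embedding itself, and the names `cube` and `hypercube_neighbors` are illustrative.

```python
def cube(i, address):
    """Return the PE address obtained by complementing bit i of `address`."""
    return address ^ (1 << i)


def hypercube_neighbors(address, n):
    """List the n neighbours of a PE in an n-dimensional hypercube."""
    return [cube(i, address) for i in range(n)]


# PE 101 in a 3-cube: complementing bits 0, 1 and 2 in turn.
neighbors_of_5 = hypercube_neighbors(0b101, 3)
```

Applying `cube` twice with the same `i` returns the original address, which is why the operation describes a bidirectional link between two PEs.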
On the physical design of butterfly networks for PRAMs
R. Drefenstedt, D. Schmidt
The design of networks for massively parallel computers is strongly influenced by available technology. The network latency, critical for many applications, is significantly increased by packaging constraints, i.e. many connections between switches involving pad drivers or even line drivers. The authors concentrate on reducing those influences for a butterfly network related to Ranade's routing algorithm. Such a network is being implemented for a parallel RAM (PRAM) with 128 physical processors and 128 K logical processors. The required throughput makes it critical to use shared buses and improves the problem of space. While delays caused by switches can only be hidden by mapping many virtual processors to some physical ones, connection latency may be reduced by additional registers (shorter clock cycle time) and suitable mapping schemes (less space). Suitable clustering of processor modules and network parts may additionally decrease the network delay.
DOI: https://doi.org/10.1109/FMPC.1992.234958 (published 1992-10-19)
Citations: 8
The virtual-time data-parallel machine
S. Shen, L. Kleinrock
The authors propose the virtual-time data-parallel machine to execute SIMD (single instruction multiple data) programs asynchronously. They first illustrate how asynchronous execution is more efficient than synchronous execution. For a simple model, they show that asynchronous execution outperforms synchronous execution roughly by a factor of (ln N), where N is the number of processors in the system. They then explore how to execute SIMD programs asynchronously without violating the SIMD semantics. They design a first in, first out (FIFO) priority cache, one for each processing element, to record the recent history of all variables. The cache, which is stacked between the processor and the memory, supports asynchronous execution in hardware efficiently and preserves the SIMD semantics of the software transparently. Analysis and simulation results indicate that the virtual-time data-parallel machine can achieve linear speed-up for computation-intensive data-parallel programs when the number of processors is large.
DOI: https://doi.org/10.1109/FMPC.1992.234906
Citations: 7
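The ln N factor in the abstract above can be motivated with a back-of-the-envelope model. The model is an assumption made here for illustration and is not necessarily the one the authors analyze: each processor's time for a step is an independent exponential random variable with mean 1. A barrier-synchronized step then costs the expected maximum of N such times, which is the harmonic number H_N, approximately ln N + 0.5772, while an asynchronous processor advances at a mean cost of 1 per step.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n.

    Under the assumed model this is the expected maximum of n i.i.d.
    exponential step times with mean 1, i.e. the expected duration of
    one barrier-synchronized step.
    """
    return sum(1.0 / k for k in range(1, n + 1))


def sync_to_async_ratio(n):
    """Expected synchronous step time divided by the asynchronous mean (1)."""
    return harmonic(n) / 1.0


# Compare the exact ratio with the ln N + gamma approximation.
for n in (16, 1024, 65536):
    print(n, round(sync_to_async_ratio(n), 3), round(math.log(n) + EULER_GAMMA, 3))
```

Other step-time distributions give a different growth rate, so the logarithmic factor should be read as a property of this particular assumption.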