Vector computations on an orthogonal memory access multiprocessing system
I. Scherson, Yiming Ma
1987 IEEE 8th Symposium on Computer Arithmetic (ARITH), 1987-05-18
DOI: 10.1109/ARITH.1987.6158715 (https://doi.org/10.1109/ARITH.1987.6158715)
Citations: 11
Abstract
An Orthogonal Memory Access system allows multiple processors to concurrently access distinct rows or columns of a rectangular array of data elements. The resulting tightly coupled multiprocessing system is feasible with current technology and has even been suggested for VLSI as a “reduced mesh”. In this paper we introduce the architecture and concentrate on its application to a number of basic vector and numerical computations. Matrix multiplication, LU decomposition, polynomial evaluation, and the solution of linear systems and partial differential equations all show a speed-up of O(n) on an n-processor system. The flexibility in the choice of the number of processing elements (PEs) makes the architecture a strong competitor among special-purpose parallel systems. In fact, we prove that the machine's performance is within a factor of 3 of that of any other system with the same number of processors.
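To make the access model concrete, here is a minimal Python sketch of a toy orthogonal-access memory and a matrix-vector product on it. It is an illustration under stated assumptions, not the algorithm from the paper: the class `OMAMemory`, the function `matvec`, the single shared row/column mode, and the assumption that every PE holds its own copy of the vector `x` are all hypothetical. The step counts only illustrate why n PEs working on distinct rows could plausibly give an O(n) speed-up over a single processor.

```python
from typing import List, Tuple

Matrix = List[List[float]]


class OMAMemory:
    """Toy model of an n x n orthogonal-access memory (hypothetical API)."""

    def __init__(self, data: Matrix) -> None:
        self.n = len(data)
        self.data = [row[:] for row in data]
        self.mode = "row"  # assumption: all PEs share one access mode at a time

    def set_mode(self, mode: str) -> None:
        if mode not in ("row", "col"):
            raise ValueError("mode must be 'row' or 'col'")
        self.mode = mode

    def read(self, pe: int, k: int) -> float:
        # PE `pe` sees row `pe` in row mode and column `pe` in column mode,
        # so no two PEs ever touch the same element in the same step.
        return self.data[pe][k] if self.mode == "row" else self.data[k][pe]


def matvec(mem: OMAMemory, x: List[float]) -> Tuple[List[float], int, int]:
    """Compute y = A x; return (y, parallel_steps, sequential_steps)."""
    mem.set_mode("row")
    n = mem.n
    y = [0.0] * n
    parallel_steps = 0
    for k in range(n):        # one lock-step time step per k
        for pe in range(n):   # conceptually concurrent across the n PEs
            y[pe] += mem.read(pe, k) * x[k]
        parallel_steps += 1
    # A single processor would need n * n multiply-add steps.
    return y, parallel_steps, n * n


if __name__ == "__main__":
    mem = OMAMemory([[1.0, 2.0], [3.0, 4.0]])
    y, par, seq = matvec(mem, [1.0, 1.0])
    print(y, par, seq)
```

In this toy model the n PEs finish in n lock-step steps where one processor would need n², which matches the O(n) speed-up the abstract reports only in spirit. Switching to column mode with `set_mode("col")` lets the same PEs read the transpose without moving any data.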