Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18040
K.W. Przytula, J. Nash
Two sequences of operations necessary for high-resolution image formation in the strip and spotlight modes of synthetic-aperture radar (SAR) are presented. The sequences are mapped onto a systolic/cellular architecture. The mapping includes parallel implementation of all the basic operations and the pertinent data communication. Detailed estimates of the computation times are provided.
Title: Implementation of synthetic aperture radar algorithms on a systolic/cellular architecture
Venue: [1988] Proceedings. International Conference on Systolic Arrays
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18094
S. Bandyopadhyay, G. Jullien, A. Sengupta
Fault detection and correction using the Chinese remainder theorem for decoding is investigated. It is shown that this approach is well suited for implementation by VLSI circuits for digital signal processing using systolic architectures. A systolic array for multioperand residue addition is considered, and its application in error-tolerant digital signal processing is presented. It is shown that the array can easily be used to compare a set of residues S = (x_0, x_1, ..., x_{N-1}) efficiently with a known constant. This algorithm has been used to detect errors by checking whether S lies in the illegitimate range. The multioperand residue adder has been modified to design a variable modulus adder. An error-tolerant RNS finite-impulse response filter has been designed using this variable modulus adder. Three schemes for error detection and correction are proposed.
Title: A systolic array for fault tolerant digital signal processing using a residue number system approach
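The range check described in this abstract can be illustrated with a small software sketch. It is not the authors' systolic array: it decodes with ordinary integer arithmetic, and the moduli, the split into information and redundant moduli, and all function names are assumptions chosen for illustration. A residue vector is flagged as faulty when its Chinese-remainder decoding falls outside the range spanned by the information moduli.

```python
from math import prod

# Hypothetical pairwise-coprime moduli. The first three carry information;
# the last one is redundant and is present only to expose errors.
INFO_MODULI = (7, 9, 11)
REDUNDANT_MODULI = (13,)
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGITIMATE_RANGE = prod(INFO_MODULI)  # valid values are 0 .. 692


def to_residues(x):
    """Encode an integer as one residue per modulus."""
    return tuple(x % m for m in MODULI)


def crt_decode(residues):
    """Decode a residue vector with the Chinese remainder theorem."""
    total_range = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        partial = total_range // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total_range


def has_error(residues):
    """True when the decoded value lies in the illegitimate range."""
    return crt_decode(residues) >= LEGITIMATE_RANGE


word = to_residues(500)
corrupted = (word[0], (word[1] + 1) % MODULI[1], word[2], word[3])
print(has_error(word), has_error(corrupted))
```

With a redundant modulus larger than every information modulus, as assumed here, a fault confined to a single residue always moves the decoded value into the illegitimate range, so detection reduces to one comparison against a known constant.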
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18072
D. J. Evans, G. Megson
A systolic array implementation for solving parabolic equations numerically is presented. The finite-difference methods used are stable asymmetric approximations to the partial differential equations, which when coupled in groups of two adjacent points on the grid result in implicit equations that are easily converted to explicit form, thus offering many advantages suitable for solution by VLSI techniques. The regularity obtained from the grid structure and locality of data from groups of small size, combined with the attributes of truncation error cancellations and alternating the strategies of grid points, give unconditional stability and an efficient, systolic design.
Title: Systolic arrays for group explicit methods for solving parabolic partial differential equations
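The coupling of two adjacent grid points into an explicitly solvable pair can be sketched as follows. This is a sequential illustration only, under assumptions not stated in the abstract: the model problem is the heat equation u_t = u_xx with fixed boundary values, the asymmetric formulas are the standard Saul'yev-type ones, r = dt/dx^2 is the mesh ratio, the number of interior points is even, and the function name is hypothetical. The alternation of groupings between time steps mentioned in the abstract is omitted.

```python
def group_explicit_step(u, r):
    """Advance u by one time step; u[0] and u[-1] are fixed boundary values.

    Interior points are taken in pairs (i, i+1). Each pair satisfies the
    2x2 implicit system
        (1 + r) * v[i]     - r * v[i + 1] = r * u[i - 1] + (1 - r) * u[i]
        -r * v[i] + (1 + r) * v[i + 1]    = (1 - r) * u[i + 1] + r * u[i + 2]
    whose matrix has the closed-form inverse used below.
    """
    interior = len(u) - 2
    if interior % 2 != 0:
        raise ValueError("this sketch needs an even number of interior points")
    new = list(u)
    det = 1.0 + 2.0 * r  # determinant of [[1 + r, -r], [-r, 1 + r]]
    for i in range(1, len(u) - 1, 2):
        b_left = r * u[i - 1] + (1.0 - r) * u[i]
        b_right = (1.0 - r) * u[i + 1] + r * u[i + 2]
        # Explicit solution of the 2x2 system for the pair (i, i+1).
        new[i] = ((1.0 + r) * b_left + r * b_right) / det
        new[i + 1] = (r * b_left + (1.0 + r) * b_right) / det
    return new


profile = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
print(group_explicit_step(profile, 1.0))
```

Each pair depends only on old values at four neighbouring points, so all pairs can be updated independently, which is the locality and regularity a systolic mapping would rely on.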
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18044
V.K.P. Kumar, Yi-Chen Tsai
A method is proposed for designing a family of linear systolic arrays for matrix-oriented problems for which two-dimensional arrays have been designed. The design exhibits a tradeoff between local storage, s, and number of processing elements, n. The arrays are linear, with each processor having storage O(s), 1 ≤ s ≤ n, for n × n matrix problems. The input matrices are fed as two-speed data streams using fast and slow channels to satisfy the dependencies in the algorithm. The technique leads to simpler designs with fewer processors and improved delay from input to output, compared to a previous family of linear arrays.
Title: Synthesizing optimal family of linear systolic arrays for matrix computations
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18098
H. Ueda, K. Kato, H. Matsushima, K. Kaneko, M. Ejiri
A general-purpose image processor (GPIP) consisting of 64 digital signal processors (DSPs) in a 0.31-m^3 box is proposed to perform a wide range of image processing tasks. A high-speed DSP called DSP-i has been specially developed for this purpose. It has a highly parallel architecture with a two-level instruction hierarchy, multibank cache, and multiprocessor interface. The DSP-i machine cycle is 50 ns. A novel ring shift register bus architecture offers a flexible structure and an efficient data-exchange method for the system. Along with four proposed operation modes, it cuts the multiprocessing overhead to as little as 20%. The performance of the GPIP is 1000 MOPS (million operations per second).
Title: A multiprocessor system utilizing enhanced DSPs for image processing
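The figures quoted in this abstract can be checked against each other with a short calculation. The abstract does not say how the 1000 MOPS figure was derived, so this is only a consistency check under two assumptions: each DSP completes one operation per machine cycle, and the best-case 20% multiprocessing overhead applies.

```python
NUM_DSPS = 64
CYCLE_TIME_NS = 50
OVERHEAD = 0.20      # best-case multiprocessing overhead quoted in the abstract
OPS_PER_CYCLE = 1    # assumption: not stated in the abstract

# A 50 ns cycle corresponds to 1000 / 50 = 20 million cycles per second.
mops_per_dsp = OPS_PER_CYCLE * 1000 / CYCLE_TIME_NS
peak_mops = NUM_DSPS * mops_per_dsp
effective_mops = peak_mops * (1 - OVERHEAD)

print(f"peak: {peak_mops:.0f} MOPS, after overhead: {effective_mops:.0f} MOPS")
```

Under these assumptions the peak is 1280 MOPS and the figure after overhead is about 1024 MOPS, which is consistent with the quoted 1000 MOPS.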
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18079
Weijia Shang, J. Fortes
The problem of identifying the time-optimal linear schedules for uniform dependence algorithms with any convex-polyhedron index set is addressed. Optimization procedures are proposed, and the class of algorithms is identified for which the total execution time under the optimal linear schedule equals that under the free schedule, which executes each computation as soon as its operands are available. This method is useful in mapping algorithms onto systolic/MIMD (multiple-instruction, multiple-data stream) systems.
Title: Time optimal linear schedules for algorithms with uniform dependencies
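The comparison between a linear schedule and the free schedule can be made concrete on a small case. The sketch below does not use the paper's optimization procedures; it substitutes a brute-force search over small integer schedule vectors. Everything specific is an assumption for illustration: the three unit dependence vectors (as in matrix multiplication), the cubic index set, unit-time computations, the requirement that the schedule vector advance each dependence by at least one step, and all function names.

```python
from itertools import product


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def schedule_time(pi, index_points):
    """Total steps used by the linear schedule t(j) = pi . j."""
    times = [dot(pi, point) for point in index_points]
    return max(times) - min(times) + 1


def is_valid(pi, dependences):
    """Every dependence must be delayed by at least one time step."""
    return all(dot(pi, dep) >= 1 for dep in dependences)


def best_linear_schedule(dependences, index_points, bound=2):
    """Exhaustive search over integer vectors with entries in [-bound, bound]."""
    best = None
    for pi in product(range(-bound, bound + 1), repeat=len(dependences[0])):
        if is_valid(pi, dependences):
            t = schedule_time(pi, index_points)
            if best is None or t < best[1]:
                best = (pi, t)
    return best


def free_schedule_time(dependences, index_points):
    """Steps needed when each point runs as soon as its operands exist."""
    points = set(index_points)
    level = {}
    # Lexicographic order is a valid topological order here because every
    # dependence vector in this example is lexicographically positive.
    for point in sorted(points):
        preds = [tuple(j - d for j, d in zip(point, dep)) for dep in dependences]
        level[point] = 1 + max((level[q] for q in preds if q in points), default=0)
    return max(level.values())


N = 4
DEPENDENCES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
INDEX_POINTS = list(product(range(N), repeat=3))

best_pi, best_time = best_linear_schedule(DEPENDENCES, INDEX_POINTS)
free_time = free_schedule_time(DEPENDENCES, INDEX_POINTS)
print(best_pi, best_time, free_time)
```

For this example the best linear schedule found is (1, 1, 1), and its total time matches the free schedule, so the example belongs to the class for which the two execution times are equal.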
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18061
H. Hellwagner
A linear systolic array for computing the generalized Fourier transform is proposed. The transform, which is an extension of the discrete Fourier transform, is briefly reviewed. The basic architecture is formally presented and proved, and an example is given. Some implementation issues are addressed. The array is versatile in the sense that it can compute a variety of different transforms. The array is programmed by simply adapting control input streams to the specific transform to be executed. Loading programs into the cells is not required. The design has constant I/O bandwidth requirements as a result of its dual systolic architecture.
Title: A systolic array with constant I/O bandwidth for the generalized Fourier transform
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18096
R. Goodman, A. McAuley
An efficient asynchronous serial-parallel multiplier architecture is presented. It offers significant advantages over conventional clocked versions, without some of the drawbacks normally associated with similar asynchronous techniques, such as excessive area. It is shown how a general asynchronous communication element can be designed, and the design is illustrated with a CMOS multiplier chip implementation. It is also shown how the multiplier could form the basis for a faster and more robust implementation of the Rivest-Shamir-Adleman (RSA) public-key cryptosystem.
Title: An efficient asynchronous multiplier
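The arithmetic performed by a serial-parallel multiplier, and its role in RSA, can be sketched in software. This models only the bit-level computation, not the asynchronous handshaking or the chip described in the abstract. The function names, the `width` parameter, and the choice to feed the serial operand least-significant bit first are assumptions for illustration.

```python
def serial_parallel_multiply(a, b, width):
    """Multiply a by b: a is held in parallel, b arrives one bit per step."""
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in `width` bits")
    acc = 0
    for step in range(width):
        bit = (b >> step) & 1      # next serial bit, least significant first
        if bit:
            acc += a << step       # add the shifted parallel operand
    return acc


def modexp(base, exponent, modulus, width):
    """Square-and-multiply exponentiation built on the multiplier above."""
    result = 1 % modulus
    base %= modulus
    while exponent:
        if exponent & 1:
            result = serial_parallel_multiply(result, base, width) % modulus
        base = serial_parallel_multiply(base, base, width) % modulus
        exponent >>= 1
    return result


# Toy RSA parameters, far too small for real use.
n, e, d = 3233, 17, 2753
message = 65
cipher = modexp(message, e, n, width=16)
print(cipher, modexp(cipher, d, n, width=16))
```

RSA encryption and decryption are modular exponentiations, which reduce to long chains of multiplications; that is why the speed of the underlying multiplier matters for such an implementation.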
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18075
W. Lin
A generic data-flow architecture for mapping large computation problems is designed. The architecture is based on reconfigurable shuffle buses, which greatly simplify interprocessor communication. The issues of representing the computation problems, deriving routing schemes for a generic linear array, and resolving the pipelining of multiple data flows are addressed. It is shown that the shuffle bus provides a very efficient interconnection network for both data shuffling and I/O interface.
Title: Mapping systolic algorithms into shuffle arrays
Pub Date : 1988-05-25
DOI: 10.1109/ARRAYS.1988.18045
W. Liu, R. Cavin, T. Hughes
A theory is presented for rasterizing a class of two-dimensional problems including signal/image processing, computer vision, and linear algebra. The rasterization theory is steered by an isomorphic relationship between the multidimensional shuffle-exchange network (mDSE) and the multidimensional butterfly network (mDBN). Many important multidimensional signal-processing problems can be solved on an mDSE with a solution time approaching known theoretical lower bounds. The isomorphism between mDSE and mDBN is exploited by transforming an mDSE solution into its equivalent mDBN solution. A methodology for rasterizing the mDBN solution is developed. It turns out that not all mD algorithms can be rasterized. A sufficient condition for algorithm rasterization is given.
Title: Theory for systolizing global computational problems