M. Valero-García, J. Navarro, J. Llabería, M. Valero
{"title":"Implementation of systolic algorithms using pipelined functional units","authors":"M. Valero-García, J. Navarro, J. Llabería, M. Valero","doi":"10.1109/ASAP.1990.145464","DOIUrl":null,"url":null,"abstract":"The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of a SA so that it can be efficiently executed using PFUs. The method is based on two temporal transformations (slowdown and retiming) and one spatial transformation (coalescing). The temporal transformations permit the modification of the SA in such a way that dependences established by the PFU are preserved. The spatial transformation improves the hardware utilization. The method was applied to 1-D SAs with data contraflow. To demonstrate the effectiveness of the method, the authors describe an efficient implementation of a non-time-homogeneous SA with data contraflow for QR decomposition.<<ETX>>","PeriodicalId":438078,"journal":{"name":"[1990] Proceedings of the International Conference on Application Specific Array Processors","volume":"50 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1990-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1990] Proceedings of the International Conference on Application Specific Array Processors","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASAP.1990.145464","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of a SA so that it can be efficiently executed using PFUs. The method is based on two temporal transformations (slowdown and retiming) and one spatial transformation (coalescing). The temporal transformations permit the modification of the SA in such a way that dependences established by the PFU are preserved. The spatial transformation improves the hardware utilization. The method was applied to 1-D SAs with data contraflow. To demonstrate the effectiveness of the method, the authors describe an efficient implementation of a non-time-homogeneous SA with data contraflow for QR decomposition.<>