{"title":"Flexibility Inlining into Arithmetic Data-paths Exploiting A Regular Interconnection Scheme","authors":"S. Xydis, G. Economakos, K. Pekmestzi","doi":"10.1109/ICSAMOS.2007.4285744","DOIUrl":null,"url":null,"abstract":"This paper presents a design technique for coarse grained reconfigurable cores targeting mostly DSP applications. The proposed technique inlines flexibility into custom carry-save-arithmetic (CSA) datapaths exploiting a stable and canonical interconnection scheme. The canonical interconnection is revealed by a uniformity transformation imposed on the basic architectures of CSA multipliers and CSA chain-adders/subtractors. The design flow for the implementation of the core is analyzed in detail, and a novel reconfigurable architecture prototype is presented. The paper concludes with the experimental results showing that our architecture performs an average latency reduction of 32.63%, compared with datapaths of primitive computational resources, with a tolerable overhead in hardware utilization.","PeriodicalId":106933,"journal":{"name":"2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2007-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAMOS.2007.4285744","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
This paper presents a design technique for coarse grained reconfigurable cores targeting mostly DSP applications. The proposed technique inlines flexibility into custom carry-save-arithmetic (CSA) datapaths exploiting a stable and canonical interconnection scheme. The canonical interconnection is revealed by a uniformity transformation imposed on the basic architectures of CSA multipliers and CSA chain-adders/subtractors. The design flow for the implementation of the core is analyzed in detail, and a novel reconfigurable architecture prototype is presented. The paper concludes with the experimental results showing that our architecture performs an average latency reduction of 32.63%, compared with datapaths of primitive computational resources, with a tolerable overhead in hardware utilization.