{"title":"turbo译码的低功耗VLSI架构","authors":"Seok-Jun Lee, Naresh R Shanbhag, A. Singer","doi":"10.1145/871506.871599","DOIUrl":null,"url":null,"abstract":"Presented in this paper is a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computation between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25 /spl mu/m CMOS process.","PeriodicalId":355883,"journal":{"name":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","volume":"261 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":"{\"title\":\"A low-power VLSI architecture for turbo decoding\",\"authors\":\"Seok-Jun Lee, Naresh R Shanbhag, A. Singer\",\"doi\":\"10.1145/871506.871599\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Presented in this paper is a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computation between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25 /spl mu/m CMOS process.\",\"PeriodicalId\":355883,\"journal\":{\"name\":\"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.\",\"volume\":\"261 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2003-08-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"17\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/871506.871599\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003. ISLPED '03.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/871506.871599","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17
摘要
本文提出了一种用于并行级联卷积码turbo译码的低功耗结构。所提出的架构是通过块交错计算的概念推导出来的,然后是折叠、重新定时和电压缩放。块交错计算可以应用于任何对数据块进行操作的数据处理单元,并满足以下三个属性:1)块之间的计算是独立的;2)一个块可以被分割成计算独立的子块;3)子块内的计算是递归的。块交错计算、折叠和重定时的应用使MAP解码器的添加比较选择(ACS)内核的关键路径延迟降低了50% ~ 84%,而面积开销为14% ~ 70%。随后的应用电压缩放导致高达65%的电力节省块交错深度为6。通过晶体管级时序和功率分析工具获得的实验结果表明,在0.25 /spl μ m CMOS工艺中,块交错深度为2可节省20%-44%的功耗。
Presented in this paper is a low-power architecture for turbo decoding of parallel concatenated convolutional codes. The proposed architecture is derived via the concept of block-interleaved computation followed by folding, retiming and voltage scaling. Block-interleaved computation can be applied to any data processing unit that operates on data blocks and satisfies the following three properties: 1) computation between blocks are independent; 2) a block can be segmented into computationally independent sub-blocks; and 3) computation within a sub-block is recursive. The application of block-interleaved computation, folding and retiming reduces the critical path delay in the add-compare-select (ACS) kernel of MAP decoders by 50%-84% with an area overhead of 14%-70%. Subsequent application of voltage scaling results in up to 65% savings in power for a block-interleaving depth of 6. Experimental results obtained by transistor-level timing and power analysis tools demonstrate power savings of 20%-44% for a block-interleaving depth of 2 in a 0.25 /spl mu/m CMOS process.