Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419845
H. Fujiwara, M. Obien
In this paper, we first introduce extended de Bruijn graphs to design extended shift registers that are functionally equivalent but not structurally equivalent to shift registers. Using the extended shift registers, we present a new secure and testable scan design approach that aims to satisfy both testability and security of digital circuits. The approach is only to replace the original scan registers to modified scan registers called extended scan registers. This method requires very little area overhead and no performance overhead. New concepts of scan security and scan testability are also introduced.
{"title":"Secure and testable scan design using extended de Bruijn graphs","authors":"H. Fujiwara, M. Obien","doi":"10.1109/ASPDAC.2010.5419845","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419845","url":null,"abstract":"In this paper, we first introduce extended de Bruijn graphs to design extended shift registers that are functionally equivalent but not structurally equivalent to shift registers. Using the extended shift registers, we present a new secure and testable scan design approach that aims to satisfy both testability and security of digital circuits. The approach is only to replace the original scan registers to modified scan registers called extended scan registers. This method requires very little area overhead and no performance overhead. New concepts of scan security and scan testability are also introduced.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122789675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419815
Dongkeun Oh, N. Kim, C. C. Chen, A. Davoodi, Y. Hu
Technology scaling has allowed integration of multiple cores into a single die. However, high power consumption of each core leads to very high heat density, limiting the throughput of thermal-constrained multi-core processors. To maximize the throughput, various software-based dynamic thermal management and optimization techniques have been proposed, many of which depend on accurate temperature sensing of each core. However, the decision for dynamic thermal management and throughput optimization only based on the temperature of each core can result in less optimal throughput in certain circumstances according to our investigation. In this paper, we propose 1) a dynamic power estimation method using a single thermal sensor for each core in multi-core processors, 2) a die temperature reconstruction method using the estimated power, and 3) a throughput optimization method based the estimated power instead of the temperature. According to our experiment using 90nm technology, the proposed method results in less than 3% error in estimating power and hot-spot temperature of a multi-core processor. Furthermore, the proposed throughput optimization method based on the estimated power leads to up to 4% higher throughput than a temperature-based optimization method.
{"title":"Runtime temperature-based power estimation for optimizing throughput of thermal-constrained multi-core processors","authors":"Dongkeun Oh, N. Kim, C. C. Chen, A. Davoodi, Y. Hu","doi":"10.1109/ASPDAC.2010.5419815","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419815","url":null,"abstract":"Technology scaling has allowed integration of multiple cores into a single die. However, high power consumption of each core leads to very high heat density, limiting the throughput of thermal-constrained multi-core processors. To maximize the throughput, various software-based dynamic thermal management and optimization techniques have been proposed, many of which depend on accurate temperature sensing of each core. However, the decision for dynamic thermal management and throughput optimization only based on the temperature of each core can result in less optimal throughput in certain circumstances according to our investigation. In this paper, we propose 1) a dynamic power estimation method using a single thermal sensor for each core in multi-core processors, 2) a die temperature reconstruction method using the estimated power, and 3) a throughput optimization method based the estimated power instead of the temperature. According to our experiment using 90nm technology, the proposed method results in less than 3% error in estimating power and hot-spot temperature of a multi-core processor. Furthermore, the proposed throughput optimization method based on the estimated power leads to up to 4% higher throughput than a temperature-based optimization method.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122942892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Next-decade computing power and interconnect bottle-neck challenge conventional IC design due to the ever increasing demands for high frequency and great bandwidth. Three-dimensional large-scale integration (3D-LSI) provides an opportunity to realize such high performance cores while reducing long latency. In this paper, we present a reference flow for the implementation of 3D via-last ICs in scalable face-to-back bonding style which leverages a mature set of 2D IC physical design tools. The first enabling technology of 3D-LSI is through-silicon via (TSV). Two kinds of TSV diameters are exemplified in the flow, namely, 5µm and 50µm. We propose an easy-to-adopt method to address the TSV-aware mixed-sized placement by considering the obstructions generated from adjacent-tier's floorplan, subject to certain TSV alignment constraints. Furthermore, the technique of clock tree synthesis (CTS) for a homogeneous die stack is developed to dramatically reduce the clock latency and skew. The mixed-sized placement and CTS of each tier can be done without iteration. To the best of our knowledge, no work has ever been published in literature discussing CTS for 3D via-last integration in a face-to-back fashion. Finally, to complete the proposed flow 2D timing-driven routing and modified off-line design rule check (DRC) and layout versus schematic (LVS) verification are performed very well.
{"title":"CAD reference flow for 3D via-last integrated circuits","authors":"Chang-Tzu Lin, D. Kwai, Yung-Fa Chou, Ting-Sheng Chen, Wen Ching Wu","doi":"10.1109/ASPDAC.2010.5419898","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419898","url":null,"abstract":"Next-decade computing power and interconnect bottle-neck challenge conventional IC design due to the ever increasing demands for high frequency and great bandwidth. Three-dimensional large-scale integration (3D-LSI) provides an opportunity to realize such high performance cores while reducing long latency. In this paper, we present a reference flow for the implementation of 3D via-last ICs in scalable face-to-back bonding style which leverages a mature set of 2D IC physical design tools. The first enabling technology of 3D-LSI is through-silicon via (TSV). Two kinds of TSV diameters are exemplified in the flow, namely, 5µm and 50µm. We propose an easy-to-adopt method to address the TSV-aware mixed-sized placement by considering the obstructions generated from adjacent-tier's floorplan, subject to certain TSV alignment constraints. Furthermore, the technique of clock tree synthesis (CTS) for a homogeneous die stack is developed to dramatically reduce the clock latency and skew. The mixed-sized placement and CTS of each tier can be done without iteration. To the best of our knowledge, no work has ever been published in literature discussing CTS for 3D via-last integration in a face-to-back fashion. Finally, to complete the proposed flow 2D timing-driven routing and modified off-line design rule check (DRC) and layout versus schematic (LVS) verification are performed very well.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123658692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419820
Jungsoo Kim, Younghoon Lee, S. Yoo, C. Kyung
Success of workload prediction, which is critical in achieving low energy consumption via dynamic voltage and frequency scaling (DVFS), depends on the accuracy of modeling the major sources of workload variation. Among them, memory stall time, whose variation is significant especially in case of memory-bound applications, has been mostly neglected or handled in too simplistic assumptions in previous works. In this paper, we present an analytical DVFS method which takes into account variations in both computation and memory stall cycles. The proposed method reduces leakage power consumption as well as switching power consumption through combined Vdd/Vbb scaling. Experimental results on MPEG4 and H.264 decoder have shown that, compared to previous methods [3] and [6], our method achieves up to additional 30.0% and 15.8% energy reductions, respectively.
{"title":"An analytical dynamic scaling of supply voltage and body bias exploiting memory stall time variation","authors":"Jungsoo Kim, Younghoon Lee, S. Yoo, C. Kyung","doi":"10.1109/ASPDAC.2010.5419820","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419820","url":null,"abstract":"Success of workload prediction, which is critical in achieving low energy consumption via dynamic voltage and frequency scaling (DVFS), depends on the accuracy of modeling the major sources of workload variation. Among them, memory stall time, whose variation is significant especially in case of memory-bound applications, has been mostly neglected or handled in too simplistic assumptions in previous works. In this paper, we present an analytical DVFS method which takes into account variations in both computation and memory stall cycles. The proposed method reduces leakage power consumption as well as switching power consumption through combined Vdd/Vbb scaling. Experimental results on MPEG4 and H.264 decoder have shown that, compared to previous methods [3] and [6], our method achieves up to additional 30.0% and 15.8% energy reductions, respectively.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128973129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419785
Li Li, Yuchun Ma, N. Xu, Yu Wang, Xianlong Hong
As technology advances, the voltage (IR) drop in the Power/Ground (P/G) network becomes a serious problem in modern IC design. The P/G network co-design with floorplan can improve the power design quality. Different with traditional approaches which analyze P/G network during the floorplanning iterations, in this paper, an efficient pattern selection method is used to provide gradient information for fast signal-integrity estimation. We also propose a novel P/G aware incremental algorithm which can intelligently fix the violations during the floorplanning process. The P/G pin assignment and wire sizing method are adopted during the floorplanning process so that the power routing resource can be minimized with the constraints of IR drop and electron migration (EM) considered. Experimental results based on the MCNC benchmarks show that our design not only significantly speeds up the optimization process, but also optimizes the power routing resource while the quality of the floorplanning is maintained.
{"title":"PS-FPG: Pattern selection based co-design of floorplan and Power/Ground network with wiring resource optimization","authors":"Li Li, Yuchun Ma, N. Xu, Yu Wang, Xianlong Hong","doi":"10.1109/ASPDAC.2010.5419785","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419785","url":null,"abstract":"As technology advances, the voltage (IR) drop in the Power/Ground (P/G) network becomes a serious problem in modern IC design. The P/G network co-design with floorplan can improve the power design quality. Different with traditional approaches which analyze P/G network during the floorplanning iterations, in this paper, an efficient pattern selection method is used to provide gradient information for fast signal-integrity estimation. We also propose a novel P/G aware incremental algorithm which can intelligently fix the violations during the floorplanning process. The P/G pin assignment and wire sizing method are adopted during the floorplanning process so that the power routing resource can be minimized with the constraints of IR drop and electron migration (EM) considered. Experimental results based on the MCNC benchmarks show that our design not only significantly speeds up the optimization process, but also optimizes the power routing resource while the quality of the floorplanning is maintained.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129167302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a gate delay estimation method that takes into account dynamic power supply noise. We review STA based on static IR-drop analysis and a conventional method for dynamic noise waveform, and reveal their limitations and problems that originate from circuit structures and higher delay sensitivity to voltage in advanced technologies. We then propose a gate delay computation that overcomes the problems with iterative computations and consideration of input voltage drop. Evaluation results with various circuits and noise injection timings show that the proposed method estimates path delay fluctuation well within 2% error on average.
{"title":"Gate delay estimation in STA under dynamic power supply noise","authors":"Takaaki Okumura, Fumihiro Minami, Kenji Shimazaki, Kimihiko Kuwada, Masanori Hashimoto","doi":"10.1587/TRANSFUN.E93.A.2447","DOIUrl":"https://doi.org/10.1587/TRANSFUN.E93.A.2447","url":null,"abstract":"This paper presents a gate delay estimation method that takes into account dynamic power supply noise. We review STA based on static IR-drop analysis and a conventional method for dynamic noise waveform, and reveal their limitations and problems that originate from circuit structures and higher delay sensitivity to voltage in advanced technologies. We then propose a gate delay computation that overcomes the problems with iterative computations and consideration of input voltage drop. Evaluation results with various circuits and noise injection timings show that the proposed method estimates path delay fluctuation well within 2% error on average.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116818305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419784
Hong-Zu Chou, Kai-Hui Chang, S. Kuo
Optimizing blocks in a System-on-Chip (SoC) circuit is becoming more and more important nowadays due to the use of third-party Intellectual Properties (IPs) and reused design blocks. In this paper, we propose techniques and methodologies that utilize abundant external don't-cares that exist in an SoC environment for block optimization. Our symbolic code-statement reachability analysis can extract don't-care conditions from constrained-random testbenches or other design blocks to identify unreachable conditional blocks in the design code. Those blocks can then be removed before logic synthesis is performed to produce smaller and more power-efficient final circuits. Our results show that we can optimize designs under different constraints and provide additional flexibility for SoC design flows.
{"title":"Optimizing blocks in an SoC using symbolic code-statement reachability analysis","authors":"Hong-Zu Chou, Kai-Hui Chang, S. Kuo","doi":"10.1109/ASPDAC.2010.5419784","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419784","url":null,"abstract":"Optimizing blocks in a System-on-Chip (SoC) circuit is becoming more and more important nowadays due to the use of third-party Intellectual Properties (IPs) and reused design blocks. In this paper, we propose techniques and methodologies that utilize abundant external don't-cares that exist in an SoC environment for block optimization. Our symbolic code-statement reachability analysis can extract don't-care conditions from constrained-random testbenches or other design blocks to identify unreachable conditional blocks in the design code. Those blocks can then be removed before logic synthesis is performed to produce smaller and more power-efficient final circuits. Our results show that we can optimize designs under different constraints and provide additional flexibility for SoC design flows.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"269 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116182092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419690
A. Kahng, Seokhyeong Kang, Rakesh Kumar, J. Sartori
Modern digital IC designs have a critical operating point, or “wall of slack”, that limits voltage scaling. Even with an error-tolerance mechanism, scaling voltage below a critical voltage - so-called overscaling - results in more timing errors than can be effectively detected or corrected. This limits the effectiveness of voltage scaling in trading off system reliability and power. We propose a design-level approach to trading off reliability and voltage (power) in, e.g., microprocessor designs. We increase the range of voltage values at which the (timing) error rate is acceptable; we achieve this through techniques for power-aware slack redistribution that shift the timing slack of frequently-exercised, near-critical timing paths in a power- and area-efficient manner. The resulting designs heuristically minimize the voltage at which the maximum allowable error rate is encountered, thus minimizing power consumption for a prescribed maximum error rate and allowing the design to fail more gracefully. Compared with baseline designs, we achieve a maximum of 32.8% and an average of 12.5% power reduction at an error rate of 2%. The area overhead of our techniques, as evaluated through physical implementation (synthesis, placement and routing), is no more than 2.7%.
{"title":"Slack redistribution for graceful degradation under voltage overscaling","authors":"A. Kahng, Seokhyeong Kang, Rakesh Kumar, J. Sartori","doi":"10.1109/ASPDAC.2010.5419690","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419690","url":null,"abstract":"Modern digital IC designs have a critical operating point, or “wall of slack”, that limits voltage scaling. Even with an error-tolerance mechanism, scaling voltage below a critical voltage - so-called overscaling - results in more timing errors than can be effectively detected or corrected. This limits the effectiveness of voltage scaling in trading off system reliability and power. We propose a design-level approach to trading off reliability and voltage (power) in, e.g., microprocessor designs. We increase the range of voltage values at which the (timing) error rate is acceptable; we achieve this through techniques for power-aware slack redistribution that shift the timing slack of frequently-exercised, near-critical timing paths in a power- and area-efficient manner. The resulting designs heuristically minimize the voltage at which the maximum allowable error rate is encountered, thus minimizing power consumption for a prescribed maximum error rate and allowing the design to fail more gracefully. Compared with baseline designs, we achieve a maximum of 32.8% and an average of 12.5% power reduction at an error rate of 2%. The area overhead of our techniques, as evaluated through physical implementation (synthesis, placement and routing), is no more than 2.7%.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126830272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419913
Chao Lu, V. Raghunathan, K. Roy
Harvesting electrical power from environmental energy sources is an attractive and increasingly feasible option for several micro-scale electronic systems such as biomedical implants and wireless sensor nodes that need to operate autonomously for long periods of time (months to years). However, designing highly efficient micro-scale energy harvesting systems requires an in-depth understanding of various design considerations and tradeoffs. This paper provides an overview of the area of micro-scale energy harvesting and discusses the various challenges and considerations involved from a system-design perspective.
{"title":"Micro-scale energy harvesting: A system design perspective","authors":"Chao Lu, V. Raghunathan, K. Roy","doi":"10.1109/ASPDAC.2010.5419913","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419913","url":null,"abstract":"Harvesting electrical power from environmental energy sources is an attractive and increasingly feasible option for several micro-scale electronic systems such as biomedical implants and wireless sensor nodes that need to operate autonomously for long periods of time (months to years). However, designing highly efficient micro-scale energy harvesting systems requires an in-depth understanding of various design considerations and tradeoffs. This paper provides an overview of the area of micro-scale energy harvesting and discusses the various challenges and considerations involved from a system-design perspective.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127482854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-01-18DOI: 10.1109/ASPDAC.2010.5419858
H. Arai, N. Miyamoto, K. Kotani, H. Fujisawa, Takashi Ito
A tailbiting block-interleaved pipelining (TB-BIP) is proposed for deeply-pipelined turbo decoders. Conventional sliding window block-interleaved pipelining (SW-BIP) turbo decoders suffer from many warm-up calculations when the number of pipeline stages is increased. However, by using TB-BIP, more than 50% of the warm-up calculations are reduced as compared to SW-BIP. We have implemented a TB-BIP WiMAX turbo decoder with four pipeline stages in the area of 3.8 mm2 using a 0.18 µm CMOS technology. The chip achieved 45 Mbps/iter and 3.11 nJ/b/iter at 99 MHz operation.
{"title":"A WiMAX turbo decoder with tailbiting BIP architecture","authors":"H. Arai, N. Miyamoto, K. Kotani, H. Fujisawa, Takashi Ito","doi":"10.1109/ASPDAC.2010.5419858","DOIUrl":"https://doi.org/10.1109/ASPDAC.2010.5419858","url":null,"abstract":"A tailbiting block-interleaved pipelining (TB-BIP) is proposed for deeply-pipelined turbo decoders. Conventional sliding window block-interleaved pipelining (SW-BIP) turbo decoders suffer from many warm-up calculations when the number of pipeline stages is increased. However, by using TB-BIP, more than 50% of the warm-up calculations are reduced as compared to SW-BIP. We have implemented a TB-BIP WiMAX turbo decoder with four pipeline stages in the area of 3.8 mm2 using a 0.18 µm CMOS technology. The chip achieved 45 Mbps/iter and 3.11 nJ/b/iter at 99 MHz operation.","PeriodicalId":152569,"journal":{"name":"2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126563565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}