Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722166
Chenchang Zhan, W. Ki
An output-capacitor-free adaptively biased low-dropout regulator with transient enhancement (ABTE LDR) is proposed. Techniques of Q-reduction compensation, adaptive biasing, and transient enhancement achieve low-voltage high-precision regulation with low quiescent current consumption while significantly improving the line and load transient responses and power supply rejections. The features of the ABTE LDR are experimentally verified by a 0.35-μm CMOS prototype.
{"title":"An adaptively biased low-dropout regulator with transient enhancement","authors":"Chenchang Zhan, W. Ki","doi":"10.1109/ASPDAC.2011.5722166","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722166","url":null,"abstract":"An output-capacitor-free adaptively biased low-dropout regulator with transient enhancement (ABTE LDR) is proposed. Techniques of Q-reduction compensation, adaptive biasing, and transient enhancement achieve low-voltage high-precision regulation with low quiescent current consumption while significantly improving the line and load transient responses and power supply rejections. The features of the ABTE LDR are experimentally verified by a 0.35-μm CMOS prototype.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129384694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722209
Chia-Jen Chang, Pao-Jen Huang, Tai-Chen Chen, C. Liu
The 3D IC is an emerging technology. The primary emphasis on 3D-IC routing is the interface issues across dies. To handle the interface issue of connections, the inter-die routing, which uses micro bumps and two single-layer RDLs (Re-Distribution Layers) to achieve the connection between adjacent dies, is adopted. In this paper, we present an inter-die routing algorithm for 3D ICs with a pre-defined netlist. Our algorithm is based on integer linear programming (ILP) and adopts a two-stage technique of micro-bump assignment followed by non-regular RDL routing. First, the micro-bump assignment selects suitable micro-bumps for the pre-defined netlist such that no crossing problem exists inside the bounding boxes of each net. After the micro-bump assignment, the netlist is divided into two sub-netlists, one is for the upper RDL and the other is for the lower RDL. Second, the non-regular RDL routing determines minimum and non-crossing global paths for sub-netlists in the upper and lower RDLs individually. Experimental results show that our approach can obtain optimal wirelength and achieve 100% routability under reasonable CPU times.
{"title":"ILP-based inter-die routing for 3D ICs","authors":"Chia-Jen Chang, Pao-Jen Huang, Tai-Chen Chen, C. Liu","doi":"10.1109/ASPDAC.2011.5722209","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722209","url":null,"abstract":"The 3D IC is an emerging technology. The primary emphasis on 3D-IC routing is the interface issues across dies. To handle the interface issue of connections, the inter-die routing, which uses micro bumps and two single-layer RDLs (Re-Distribution Layers) to achieve the connection between adjacent dies, is adopted. In this paper, we present an inter-die routing algorithm for 3D ICs with a pre-defined netlist. Our algorithm is based on integer linear programming (ILP) and adopts a two-stage technique of micro-bump assignment followed by non-regular RDL routing. First, the micro-bump assignment selects suitable micro-bumps for the pre-defined netlist such that no crossing problem exists inside the bounding boxes of each net. After the micro-bump assignment, the netlist is divided into two sub-netlists, one is for the upper RDL and the other is for the lower RDL. Second, the non-regular RDL routing determines minimum and non-crossing global paths for sub-netlists in the upper and lower RDLs individually. Experimental results show that our approach can obtain optimal wirelength and achieve 100% routability under reasonable CPU times.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129467837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722235
James Williamson, Yinghai Lu, L. Shang, H. Zhou, Xuan Zeng
Integrated circuit (IC) design automation has traditionally followed a hierarchical approach. Modern IC design flow is divided into sequentially-addressed design and optimization layers; each successively finer in design detail and data granularity while increasing in computational complexity. Eventual agreement across the design layers signals design closure. Obtaining design closure is a continual problem, as lack of awareness and interaction between layers often results in multiple design flow iterations. In this work, we propose parallel cross-layer optimization, in which the boundaries between design layers are broken, allowing for a more informed and efficient exploration of the design space. We leverage the heterogeneous parallel computational power in current and upcoming multi-core/many-core computation platforms to suite the heterogeneous characteristics of multiple design layers. Specifically, we unify the highlevel and physical synthesis design layers for parallel cross-layer IC design optimization. In addition, we introduce a massively-parallel GPU floorplanner with local and global convergence test as the proposed physical synthesis design layer. Our results show average performance gains of 11X speed-up over state-of-the-art.
{"title":"Parallel cross-layer optimization of high-level synthesis and physical design","authors":"James Williamson, Yinghai Lu, L. Shang, H. Zhou, Xuan Zeng","doi":"10.1109/ASPDAC.2011.5722235","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722235","url":null,"abstract":"Integrated circuit (IC) design automation has traditionally followed a hierarchical approach. Modern IC design flow is divided into sequentially-addressed design and optimization layers; each successively finer in design detail and data granularity while increasing in computational complexity. Eventual agreement across the design layers signals design closure. Obtaining design closure is a continual problem, as lack of awareness and interaction between layers often results in multiple design flow iterations. In this work, we propose parallel cross-layer optimization, in which the boundaries between design layers are broken, allowing for a more informed and efficient exploration of the design space. We leverage the heterogeneous parallel computational power in current and upcoming multi-core/many-core computation platforms to suite the heterogeneous characteristics of multiple design layers. Specifically, we unify the highlevel and physical synthesis design layers for parallel cross-layer IC design optimization. In addition, we introduce a massively-parallel GPU floorplanner with local and global convergence test as the proposed physical synthesis design layer. Our results show average performance gains of 11X speed-up over state-of-the-art.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130565881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722291
Masaru Takahashi
The System on Chip (SoC) can include Full High Definition (FHD) video processing, however the turn around time of algorithm improvement have been long. We provide the new method utilizing the behavioral synthesis. Therefore, the turn around time of the algorithm improvement and hardware implementation can be shorten.
{"title":"FPGA prototyping using behavioral synthesis for improving video processing algorithm and FHD TV SoC design","authors":"Masaru Takahashi","doi":"10.1109/ASPDAC.2011.5722291","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722291","url":null,"abstract":"The System on Chip (SoC) can include Full High Definition (FHD) video processing, however the turn around time of algorithm improvement have been long. We provide the new method utilizing the behavioral synthesis. Therefore, the turn around time of the algorithm improvement and hardware implementation can be shorten.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128117348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722163
Wen Fan, O. Choy
Robust, efficient and low complexity design methodologies for high speed multi-band orthogonal frequency division multiplexing ultra-wideband (MB-OFDM UWB) is presented. The proposed design is implemented in 0.13μm CMOS technology with the core area of 2.66mm × 0.94mm. Operating at 132MHz clock frequency, the estimated power consumption is 170mW.
{"title":"Robust and efficient baseband receiver design for MB-OFDM UWB system","authors":"Wen Fan, O. Choy","doi":"10.1109/ASPDAC.2011.5722163","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722163","url":null,"abstract":"Robust, efficient and low complexity design methodologies for high speed multi-band orthogonal frequency division multiplexing ultra-wideband (MB-OFDM UWB) is presented. The proposed design is implemented in 0.13μm CMOS technology with the core area of 2.66mm × 0.94mm. Operating at 132MHz clock frequency, the estimated power consumption is 170mW.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114315343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722191
M. Rawlins, A. Gordon-Ross
Even though much previous work explores varying instruction cache optimization techniques individually, little work explores the combined effects of these techniques (i.e., do they complement or obviate each other). In this paper we explore the interaction of three optimizations: loop caching, cache tuning, and code compression. Results show that loop caching increases energy savings by as much as 26% compared to cache tuning alone and reduces decompression energy by as much as 73%.
{"title":"On the interplay of loop caching, code compression, and cache configuration","authors":"M. Rawlins, A. Gordon-Ross","doi":"10.1109/ASPDAC.2011.5722191","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722191","url":null,"abstract":"Even though much previous work explores varying instruction cache optimization techniques individually, little work explores the combined effects of these techniques (i.e., do they complement or obviate each other). In this paper we explore the interaction of three optimizations: loop caching, cache tuning, and code compression. Results show that loop caching increases energy savings by as much as 26% compared to cache tuning alone and reduces decompression energy by as much as 73%.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123285847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722295
Jen-Yi Wuu, F. Pikus, A. Torres, M. Marek-Sadowska
Printability of layout objects becomes increasingly dependent on neighboring shapes within a larger and larger context window. In this paper, we propose a two-level hotspot pattern classification methodology that examines both central and peripheral patterns. Accuracy and runtime enhancement techniques are proposed, making our detection methodology robust and efficient as a fast physical verification tool that can be applied during early design stages to large-scale designs. We position our method as an approximate detection solution, similar to pattern matching-based tools widely adopted by the industry. In addition, our analyses of classification results reveal that the majority of non-hotspots falsely predicted as hotspots have printed CD barely over the minimum allowable CD threshold. Our method is verified on several 45 nm and 32 nm industrial designs.
{"title":"Rapid layout pattern classification","authors":"Jen-Yi Wuu, F. Pikus, A. Torres, M. Marek-Sadowska","doi":"10.1109/ASPDAC.2011.5722295","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722295","url":null,"abstract":"Printability of layout objects becomes increasingly dependent on neighboring shapes within a larger and larger context window. In this paper, we propose a two-level hotspot pattern classification methodology that examines both central and peripheral patterns. Accuracy and runtime enhancement techniques are proposed, making our detection methodology robust and efficient as a fast physical verification tool that can be applied during early design stages to large-scale designs. We position our method as an approximate detection solution, similar to pattern matching-based tools widely adopted by the industry. In addition, our analyses of classification results reveal that the majority of non-hotspots falsely predicted as hotspots have printed CD barely over the minimum allowable CD threshold. Our method is verified on several 45 nm and 32 nm industrial designs.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129276134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722175
Bin Li, Zhen Fang, R. Iyer
With the rapid progress in semiconductor technologies, more and more accelerators can be integrated onto a single SoC chip. In SoCs, accelerators often require deterministic data access. However, as more and more applications are running simultaneous, latency can vary significantly due to contention. To address this problem, we propose a template-based memory access engine (MAE) for accelerators in SoCs. The proposed MAE can handle several common memory access patterns observed for near-future accelerators. Our evaluation results show that the proposed MAE can significantly reduce memory access latency and jitter, thus very effective for accelerators in SoCs.
{"title":"Template-based memory access engine for accelerators in SoCs","authors":"Bin Li, Zhen Fang, R. Iyer","doi":"10.1109/ASPDAC.2011.5722175","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722175","url":null,"abstract":"With the rapid progress in semiconductor technologies, more and more accelerators can be integrated onto a single SoC chip. In SoCs, accelerators often require deterministic data access. However, as more and more applications are running simultaneous, latency can vary significantly due to contention. To address this problem, we propose a template-based memory access engine (MAE) for accelerators in SoCs. The proposed MAE can handle several common memory access patterns observed for near-future accelerators. Our evaluation results show that the proposed MAE can significantly reduce memory access latency and jitter, thus very effective for accelerators in SoCs.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129193506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722285
Jingqing Mu, Roman L. Lysecky
Significant research has demonstrated the performance and power benefits of runtime dynamic reconfiguration of FPGAs and microprocessor/FPGA devices. For dynamically reconfigurable systems, in which the selection of hardware coprocessors to implement within the FPGA is determined at runtime, online estimation methods are needed to evaluate the performance and power consumption impact of the hardware coprocessor selection. In this paper, we present a profile assisted online system-level performance and power estimation framework for estimating the speedup and power consumption of dynamically reconfigurable embedded systems. We evaluate the accuracy and fidelity of our online estimation framework for dynamic hardware kernel selection to maximize performance or minimize system power consumption.
{"title":"Profile assisted online system-level performance and power estimation for dynamic reconfigurable embedded systems","authors":"Jingqing Mu, Roman L. Lysecky","doi":"10.1109/ASPDAC.2011.5722285","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722285","url":null,"abstract":"Significant research has demonstrated the performance and power benefits of runtime dynamic reconfiguration of FPGAs and microprocessor/FPGA devices. For dynamically reconfigurable systems, in which the selection of hardware coprocessors to implement within the FPGA is determined at runtime, online estimation methods are needed to evaluate the performance and power consumption impact of the hardware coprocessor selection. In this paper, we present a profile assisted online system-level performance and power estimation framework for estimating the speedup and power consumption of dynamically reconfigurable embedded systems. We evaluate the accuracy and fidelity of our online estimation framework for dynamic hardware kernel selection to maximize performance or minimize system power consumption.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122669993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2011-01-25DOI: 10.1109/ASPDAC.2011.5722174
H. Kooti, D. Mishra, E. Bozorgzadeh
Due to the increase in demand for reconfigurability in embedded systems, schedulability in real-time task scheduling is challenged by non-negligible reconfiguration overheads. Reconfiguration of the system during task execution affects both deadline miss rate and deadline miss distribution. On the other hand, Quality of Service (QoS) in several embedded applications is not only determined by deadline miss rate but also the distribution of the tasks missing their deadlines (known as weakly-hard real-time systems). As a result, we propose to model QoS constraints as a set of constraints on dropout patterns (due to reconfiguration overhead) and present a novel online solution for the problem of reconfiguration-aware real-time scheduling. According to QoS constraints, we divide the ready instances of the tasks into two groups: critical and non-critical, then model each group as a network flow problem and provide an online scheduler for each group. We deployed our method on synthetic benchmarks as well as software defined radio implementation of VoIP on reconfigurable systems. Results show that our solution reduces the number of QoS violations by 19.01 times and 2.33 times (57.02%) in comparison with Bi-Modal Scheduler (BMS) [1] for synthetic benchmarks with low and high QoS constraint, respectively.
{"title":"Reconfiguration-aware real-time scheduling under QoS constraint","authors":"H. Kooti, D. Mishra, E. Bozorgzadeh","doi":"10.1109/ASPDAC.2011.5722174","DOIUrl":"https://doi.org/10.1109/ASPDAC.2011.5722174","url":null,"abstract":"Due to the increase in demand for reconfigurability in embedded systems, schedulability in real-time task scheduling is challenged by non-negligible reconfiguration overheads. Reconfiguration of the system during task execution affects both deadline miss rate and deadline miss distribution. On the other hand, Quality of Service (QoS) in several embedded applications is not only determined by deadline miss rate but also the distribution of the tasks missing their deadlines (known as weakly-hard real-time systems). As a result, we propose to model QoS constraints as a set of constraints on dropout patterns (due to reconfiguration overhead) and present a novel online solution for the problem of reconfiguration-aware real-time scheduling. According to QoS constraints, we divide the ready instances of the tasks into two groups: critical and non-critical, then model each group as a network flow problem and provide an online scheduler for each group. We deployed our method on synthetic benchmarks as well as software defined radio implementation of VoIP on reconfigurable systems. Results show that our solution reduces the number of QoS violations by 19.01 times and 2.33 times (57.02%) in comparison with Bi-Modal Scheduler (BMS) [1] for synthetic benchmarks with low and high QoS constraint, respectively.","PeriodicalId":316253,"journal":{"name":"16th Asia and South Pacific Design Automation Conference (ASP-DAC 2011)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121157006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}