Balancing cache energy efficiency and reliability is a major challenge for future multicore system design. Supply voltage reduction is an effective tool to reduce cache energy consumption, usually at the expense of an increased number of errors. To achieve substantial energy reduction without degrading reliability, we propose an adaptive fault-tolerant cache architecture, which provides appropriate error control for each cache line based on the number of faulty cells detected at reduced supply voltages. Our experiments show that, compared to the next-best solution, the proposed approach can improve energy efficiency by more than 25% and the energy-execution time product by over 10%, while improving reliability by up to 4X under the Mean-Error-To-Failure (METF) metric, at the cost of 0.08% storage overhead.
{"title":"Breaking the energy Barrier in fault-tolerant caches for multicore systems","authors":"P. Ampadu, Meilin Zhang, V. Stojanović","doi":"10.5555/2485288.2485466","DOIUrl":"https://doi.org/10.5555/2485288.2485466","url":null,"abstract":"Balancing cache energy efficiency and reliability is a major challenge for future multicore system design. Supply voltage reduction is an effective tool to minimize cache energy consumption, usually at the expense of increased number of errors. To achieve substantial energy reduction without degrading reliability, we propose an adaptive fault-tolerant cache architecture, which provides appropriate error control for each cache line based on the number of faulty cells detected at reduced supply voltages. Our experiments show that the proposed approach can improve energy efficiency by more than 25% and energy-execution time product by over 10%, while improving reliability up to 4X using Mean-Error-To-Failure (METF) metric, compared to the next-best solution at the cost of 0.08% storage overhead.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"15 1","pages":"731-736"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87042791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a new flexible quadratic and partitioning-based global placement approach which is able to optimize a wide class of objective functions, including linear, sub-quadratic, and quadratic net lengths as well as positive linear combinations of them. Based on iteratively re-weighted quadratic optimization, our algorithm extends the previous linearization techniques. If l is the length of some connection, most placement algorithms try to optimize l^1 or l^2. We show that optimizing l^p with 1 < p < 2 helps to improve even linear connection lengths. With this new objective, our new version of the flow-based partitioning placement tool BonnPlace [25] outperforms the state-of-the-art force-directed algorithms SimPL, RQL, and ComPLx and closes the gap to MAPLE in terms of (linear) HPWL.
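The re-weighting idea in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not BonnPlace's algorithm: it assumes a single movable cell on a line connected to fixed pins, and the function name `irls_position` and its parameters are hypothetical. Each pass solves a weighted quadratic problem whose weights |x - a|^(p-2) make the quadratic cost mimic the l^p cost at the current position.

```python
def irls_position(pins, p, iters=200, eps=1e-9):
    """Place one movable cell on a line so that sum(|x - a|**p) over the
    fixed pin positions `a` is (approximately) minimized, for 1 <= p <= 2.

    Toy illustration of iteratively re-weighted quadratic optimization.
    """
    # Start from the plain quadratic (p = 2) solution: the mean of the pins.
    x = sum(pins) / len(pins)
    for _ in range(iters):
        # Weight each connection so that w * (x - a)**2 matches |x - a|**p
        # at the current x. eps guards against division by zero.
        weights = [max(abs(x - a), eps) ** (p - 2) for a in pins]
        # Closed-form minimizer of the weighted quadratic objective.
        x = sum(w * a for w, a in zip(weights, pins)) / sum(weights)
    return x


if __name__ == "__main__":
    pins = [0.0, 1.0, 10.0]
    for p in (2.0, 1.5, 1.0):
        print(p, irls_position(pins, p))
```

With p = 2 all weights equal 1 and the result is the mean of the pins. As p decreases toward 1, the long connection is penalized less and the cell moves toward the median pin.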
{"title":"Sub-quadratic objectives in quadratic placement","authors":"Markus Struzyna","doi":"10.7873/DATE.2013.372","DOIUrl":"https://doi.org/10.7873/DATE.2013.372","url":null,"abstract":"This paper presents a new flexible quadratic and partitioning-based global placement approach which is able to optimize a wide class of objective functions, including linear, sub-quadratic, and quadratic net lengths as well as positive linear combinations of them. Based on iteratively re-weighted quadratic optimization, our algorithm extends the previous linearization techniques. If l is the length of some connection, most placement algorithms try to optimize l1 or l2. We show that optimizing lp with 1 < p < 2 helps to improve even linear connection lengths. With this new objective, our new version of the flow-based partitioning placement tool BonnPlace [25] is able to outperform the state-of-the-art force-directed algorithms SimPL, RQL, ComPLx and closes the gap to MAPLE in terms of (linear) HPWL.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"49 1","pages":"1867-1872"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85589447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developments in micro- and nano-electronics, biology, and neuroscience make it possible to imagine a new world where vital signs can be monitored continuously, artificial organs can be implanted in human bodies, and interfaces between the human brain and the environment can extend human capabilities, thus making the dream of Dr. Frankenstein come true. This paper surveys some of the most innovative implantable devices and offers some perspectives on the ethical issues that come with the introduction of this technology.
{"title":"Dr. Frankenstein's dream made possible: Implanted electronic devices","authors":"D. Venuto, A. Sangiovanni-Vincentelli","doi":"10.7873/DATE.2013.311","DOIUrl":"https://doi.org/10.7873/DATE.2013.311","url":null,"abstract":"The developments in micro-nano-electronics, biology and neuro-sciences make it possible to imagine a new world where vital signs can be monitored continuously, artificial organs can be implanted in human bodies and interfaces between the human brain and the environment can extend the capabilities of men thus making the dream of Dr. Frankenstein become true. This paper surveys some of the most innovative implantable devices and offers some perspectives on the ethical issues that come with the introduction of this technology.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"42 1","pages":"1531-1536"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86275079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Kondratyev, L. Lavagno, M. Meyer, Yosinori Watanabe
This paper focuses on the resource sharing problem in high-level synthesis. It argues that the conventionally accepted synthesis flow, in which resource sharing is done after scheduling, is sub-optimal because it cannot account for timing penalties from resource merging. The paper describes a competing approach in which resource sharing and scheduling are performed simultaneously. It provides a quantitative evaluation of both approaches and shows that performing sharing during scheduling wins over the conventional approach in terms of quality of results.
{"title":"Share with care: A quantitative evaluation of sharing approaches in high-level synthesis","authors":"A. Kondratyev, L. Lavagno, M. Meyer, Yosinori Watanabe","doi":"10.7873/DATE.2013.315","DOIUrl":"https://doi.org/10.7873/DATE.2013.315","url":null,"abstract":"This paper focuses on the resource sharing problem when performing high-level synthesis. It argues that the conventionally accepted synthesis flow when resource sharing is done after scheduling is sub-optimal because it cannot account for timing penalties from resource merging. The paper describes a competitive approach when resource sharing and scheduling are performed simultaneously. It provides a quantitative evaluation of both approaches and shows that performing sharing during scheduling wins over the conventional approach in terms of quality of results.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"33 1","pages":"1547-1552"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87726158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The floating random walk (FRW) algorithm is an important field-solver algorithm for capacitance extraction, which has several merits compared with boundary element method (BEM) based algorithms. In this paper, the FRW algorithm is accelerated with modern graphics processing units (GPUs). We propose an iterative GPU-based FRW algorithm flow and a technique using an inverse cumulative probability array (ICPA) to reduce the divergence among walks and the number of global-memory accesses. A variant FRW scheme is proposed to exploit the benefit of the ICPA, so that it accelerates the extraction of multi-dielectric structures. A technique for extracting multiple nets concurrently is also discussed. Numerical results show that our GPU-based FRW brings over 20X speedup over the CPU counterpart for various test cases with a 0.5% convergence criterion. For the extraction of multiple nets, our GPU-based FRW outperforms the CPU counterpart by up to 59X.
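One plausible reading of the inverse cumulative probability array mentioned above is a precomputed lookup table that maps a uniformly drawn slot directly to a sampled outcome, so that every walk performs the same single table read instead of a data-dependent search through a cumulative distribution. The sketch below is CPU-side Python written under that assumption. It is not the paper's GPU implementation, and the names `build_icpa` and `sample_index` are hypothetical.

```python
import bisect
import itertools
import random


def build_icpa(probs, size):
    """Precompute a table of `size` slots; slot k holds the outcome whose
    cumulative-probability interval contains the midpoint of slot k."""
    cdf = list(itertools.accumulate(probs))
    total = cdf[-1]  # tolerate probabilities that do not sum exactly to 1
    table = []
    for k in range(size):
        u = (k + 0.5) / size * total
        # First outcome whose cumulative probability reaches u.
        idx = bisect.bisect_left(cdf, u)
        table.append(min(idx, len(probs) - 1))
    return table


def sample_index(table, rng):
    """Draw an outcome with one uniform slot choice and one table read.
    The cost is the same for every draw, which avoids divergent branches."""
    return table[rng.randrange(len(table))]


if __name__ == "__main__":
    table = build_icpa([0.1, 0.2, 0.7], size=10)
    rng = random.Random(0)
    print(table, [sample_index(table, rng) for _ in range(5)])
```

The table only approximates the distribution to a resolution of 1/size, so a larger table trades memory for accuracy.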
{"title":"GPU-friendly floating random walk algorithm for capacitance extraction of VLSI interconnects","authors":"Kuangya Zhai, Wenjian Yu, H. Zhuang","doi":"10.7873/DATE.2013.336","DOIUrl":"https://doi.org/10.7873/DATE.2013.336","url":null,"abstract":"The floating random walk (FRW) algorithm is an important field-solver algorithm for capacitance extraction, which has several merits compared with other boundary element method (BEM) based algorithms. In this paper, the FRW algorithm is accelerated with the modern graphics processing units (GPUs). We propose an iterative GPU-based FRW algorithm flow and the technique using an inverse cumulative probability array (ICPA), to reduce the divergence among walks and the global-memory accessing. A variant FRW scheme is proposed to utilize the benefit of ICPA, so that it accelerates the extraction of multi-dielectric structures. The technique for extracting multiple nets concurrently is also discussed. Numerical results show that our GPU-based FRW brings over 20X speedup for various test cases with 0.5% convergence criterion over the CPU counterpart. For the extraction of multiple nets, our GPU-based FRW outperforms the CPU counterpart by up to 59X.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"12 1","pages":"1661-1666"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77194654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Aitken, G. Fey, Z. Kalbarczyk, F. Reichenbach, M. Reorda
In safety-related applications and in products with long lifetimes, reliability is a must. Moreover, at future integrated circuit technology nodes, device-level reliability may decrease, i.e., countermeasures have to be taken to ensure product-level reliability. But assessing the reliability of a large system is not a trivial task. This paper revisits the state of the art in reliability evaluation, starting from the physical device level, through the software system level, all the way up to the product level. Relevant standards and future trends are discussed.
{"title":"Reliability analysis reloaded: How will we survive?","authors":"R. Aitken, G. Fey, Z. Kalbarczyk, F. Reichenbach, M. Reorda","doi":"10.7873/DATE.2013.084","DOIUrl":"https://doi.org/10.7873/DATE.2013.084","url":null,"abstract":"In safety related applications and in products with long lifetimes reliability is a must. Moreover, facing future technology nodes of integrated circuit device level reliability may decrease, i.e., counter-measures have to be taken to ensure product level reliability. But assessing the reliability of a large system is not a trivial task. This paper revisits the state-of-the-art in reliability evaluation starting from the physical device level, to the software system level, all the way up to the product level. Relevant standards and future trends are discussed.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"18 1","pages":"358-367"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82428852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work we advocate the adoption of Binary Decision Diagrams (BDDs) for storing and manipulating time-series datasets. We first propose a generic BDD transformation which identifies and removes 50% of all BDD edges without any loss of information. Next, we optimize the core operation for adding samples to a dataset and characterize its complexity. We identify time-range queries as one of the core operations executed on time-series datasets, and describe explicit Boolean function constructions that aid in efficiently executing them directly on BDDs. We demonstrate significant space and performance gains when applying our algorithms to synthetic and real-life biosensor time-series datasets collected from field trials.
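To make the notion of a time-range query as a Boolean function concrete, here is a minimal sketch that assumes timestamps are encoded as fixed-width unsigned bit vectors. It does not use a BDD library and is not the construction from the paper. The helper names `to_bits`, `geq_const`, and `in_range` are hypothetical. The point is only that "lo <= t <= hi" can be written with per-bit AND, OR, NOT, and equality, which is the kind of function a BDD package could then intersect with a dataset.

```python
def to_bits(value, width):
    """Encode an unsigned integer as a list of bits, most significant first."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def geq_const(x_bits, c_bits):
    """Evaluate x >= c using only per-bit Boolean operations."""
    result = True  # with no bits left to compare, x equals c, so x >= c
    # Walk from the least significant bit up to the most significant bit.
    for x, c in zip(reversed(x_bits), reversed(c_bits)):
        # x wins at this bit, or the bits tie and the lower bits decide.
        result = (x and not c) or ((x == c) and result)
    return bool(result)


def in_range(t, lo, hi, width):
    """Characteristic function of the time range [lo, hi]."""
    t_bits = to_bits(t, width)
    return geq_const(t_bits, to_bits(lo, width)) and geq_const(
        to_bits(hi, width), t_bits
    )


if __name__ == "__main__":
    print([t for t in range(16) if in_range(t, 3, 11, width=4)])
```

Comparison against a constant is a convenient building block because its decision diagram stays small when the bits are ordered from most to least significant.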
{"title":"Optimizing BDDs for Time-Series dataset manipulation","authors":"S. Stergiou, J. Jain","doi":"10.7873/DATE.2013.212","DOIUrl":"https://doi.org/10.7873/DATE.2013.212","url":null,"abstract":"In this work we advocate the adoption of Binary Decision Diagrams (BDDs) for storing and manipulating Time-Series datasets. We first propose a generic BDD transformation which identifies and removes 50% of all BDD edges without any loss of information. Following, we optimize the core operation for adding samples to a dataset and characterize its complexity. We identify time-range queries as one of the core operations executed on time-series datasets, and describe explicit Boolean function constructions that aid in efficiently executing them directly on BDDs. We exhibit significant space and performance gains when applying our algorithms on synthetic and real-life biosensor time-series datasets collected from field trials.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"17 1","pages":"1018-1021"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80229194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Gaillardon, L. Amarù, Shashikanth Bobba, M. D. Marchi, D. Sacchetto, Y. Leblebici, G. Micheli
Vertically stacked nanowire FETs (NWFETs) with a gate-all-around structure are the natural and most advanced extension of FinFETs. At advanced technology nodes, many devices exhibit ambipolar behavior, i.e., the device shows n- and p-type characteristics simultaneously. In this paper, we show that, by engineering the contacts and constructing independent double-gate structures, the device polarity can be electrostatically programmed to be either n- or p-type. Such a device enables a compact realization of XOR-based logic functions at the cost of a denser interconnect. To mitigate the added area/routing overhead caused by the additional gate, an approach for designing an efficient regular layout, called Sea-of-Tiles, is presented. Then, specific logic synthesis techniques, supporting the higher expressive power provided by this technology, are introduced and used to showcase the performance of controllable-polarity NWFET circuits in comparison with traditional CMOS circuits.
{"title":"Vertically-stacked double-gate nanowire FETs with controllable polarity: From devices to regular ASICs","authors":"P. Gaillardon, L. Amarù, Shashikanth Bobba, M. D. Marchi, D. Sacchetto, Y. Leblebici, G. Micheli","doi":"10.7873/DATE.2013.137","DOIUrl":"https://doi.org/10.7873/DATE.2013.137","url":null,"abstract":"Vertically stacked nanowire FETs (NWFETs) with gate-all-around structure are the natural and most advanced extension of FinFETs. At advanced technology nodes, many devices exhibit ambipolar behavior, i.e., the device shows n- and p-type characteristics simultaneously. In this paper, we show that, by engineering of the contacts and by constructing independent double-gate structures, the device polarity can be electrostatically programmed to be either n- or p-type. Such a device enables a compact realization of XOR-based logic functions at the cost of a denser interconnect. To mitigate the added area/routing overhead caused by the additional gate, an approach for designing an efficient regular layout, called Sea-of-Tiles is presented. Then, specific logic synthesis techniques, supporting the higher expressive power provided by this technology, are introduced and used to showcase the performance of the controllable-polarity NWFETs circuits in comparison with traditional CMOS circuits.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"52 1","pages":"625-630"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81167321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yang Xiao, K. Irick, N. Vijaykrishnan, Donghwa Shin, N. Chang
In this paper, a bio-inspired technique for finding the regions of highest visual importance within an image is proposed for reducing power consumption in modern liquid crystal displays (LCDs) that utilize a 2D light-emitting diode (LED) backlighting system. The conspicuity map generated from this neuromorphic saliency model, along with an adaptive dimming method, is applied to the backlighting array to reduce the luminance of regions of least interest as perceived by a human viewer. Corresponding image compensation is applied to the saliency-modulated image to minimize distortion and retain the original image quality. Experimental results show that, on average, 65% of the power can be saved when the original display system is integrated with a low-overhead real-time hardware implementation of the saliency model.
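The interplay between dimming and image compensation described above can be sketched with a simplified model. The code below assumes a linear display in which perceived luminance is the product of backlight level and pixel value (both in the range 0 to 1), ignoring gamma. The saliency-to-backlight mapping, the `floor` parameter, and the function names are all hypothetical and are not taken from the paper.

```python
def zone_backlight(pixels, saliency, floor=0.5):
    """Pick a backlight level for one zone.

    A fully salient zone (saliency = 1) keeps enough light for its brightest
    pixel. A non-salient zone (saliency = 0) is dimmed to `floor` times that.
    """
    peak = max(pixels)
    return peak * (floor + (1.0 - floor) * saliency)


def compensate(pixels, backlight):
    """Boost pixel values so that backlight * pixel stays close to the
    original luminance. Values are clipped at full transmittance (1.0)."""
    if backlight <= 0.0:
        return [0.0 for _ in pixels]
    return [min(1.0, p / backlight) for p in pixels]


if __name__ == "__main__":
    zone = [0.2, 0.4]
    for s in (1.0, 0.0):
        b = zone_backlight(zone, s)
        shown = [b * p for p in compensate(zone, b)]
        print(s, b, shown)
```

In the salient case the compensated image reproduces the original luminance exactly under this model. In the non-salient case the brightest pixel is clipped, which is the distortion that is traded for the lower backlight power.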
{"title":"Saliency aware display power management","authors":"Yang Xiao, K. Irick, N. Vijaykrishnan, Donghwa Shin, N. Chang","doi":"10.7873/DATE.2013.250","DOIUrl":"https://doi.org/10.7873/DATE.2013.250","url":null,"abstract":"In this paper, a bio-inspired technique of finding the regions of highest visual importance within an image is proposed for reducing power consumption in modern liquid crystal displays (LCDs) that utilize a 2D light-emitting diode (LED) backlighting system. The conspicuity map generated from this neuromorphic saliency model, along with an adaptive dimming method, is applied to the backlighting array to reduce the luminance of regions of least interest as perceived by a human viewer. Corresponding image compensation is applied to the saliency modulated image to minimize distortion and retain the original image quality. Experimental results shows average 65% power can be saved when the original display system is integrated with a low-overhead real-time hardware implementation of the saliency model.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"15 1","pages":"1203-1208"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81764788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As the semiconductor process is scaled down, the endurance of NAND flash memory greatly deteriorates. To overcome such a poor endurance characteristic and to provide a reasonable storage lifetime, system-level endurance enhancement techniques are being rapidly adopted in recent NAND flash-based storage devices like solid-state drives (SSDs). In this paper, we propose an integrated lifetime management approach for SSDs. The proposed lifetime management technique combines several lifetime-enhancement schemes, including lossless compression, deduplication, and performance throttling, in an integrated fashion so that the lifetime of SSDs can be maximally extended. By selectively disabling less effective lifetime-enhancement schemes, the proposed technique achieves both high performance and high energy efficiency while meeting the required lifetime. Our evaluation results show that the proposed technique, compared with SSDs that use no lifetime management schemes, improves write performance by up to 55% and reduces energy consumption by up to 43% while satisfying a 5-year lifetime warranty.
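The lifetime constraint discussed above can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch using the common endurance estimate (capacity times program/erase cycles divided by write amplification), not the decision logic from the paper. The function names and all numeric values in the example are hypothetical.

```python
def daily_write_budget_gb(capacity_gb, pe_cycles, write_amplification, lifetime_days):
    """Host data (in GB) that may be written per day without exhausting the
    flash endurance before the target lifetime is reached."""
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / lifetime_days


def needs_throttling(demand_gb_per_day, budget_gb_per_day, reduction_ratio=1.0):
    """Return True if the workload exceeds the budget.

    reduction_ratio is the fraction of host data still written to flash after
    compression and deduplication (1.0 means no reduction).
    """
    return demand_gb_per_day * reduction_ratio > budget_gb_per_day


if __name__ == "__main__":
    # Hypothetical drive: 256 GB, 3000 P/E cycles, write amplification of 2,
    # and a 5-year (1825-day) warranty.
    budget = daily_write_budget_gb(256, 3000, 2.0, 5 * 365)
    print(budget)
    print(needs_throttling(300, budget))        # raw workload
    print(needs_throttling(300, budget, 0.5))   # after halving the data volume
```

In this example a workload that would require throttling on its own fits within the budget once data reduction halves the written volume, which illustrates why a scheme such as throttling might be switched off when another scheme is already sufficient.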
{"title":"An integrated approach for managing the lifetime of flash-based SSDs","authors":"Sungjin Lee, Taejin Kim, Jisung Park, Jihong Kim","doi":"10.7873/DATE.2013.309","DOIUrl":"https://doi.org/10.7873/DATE.2013.309","url":null,"abstract":"As the semiconductor process is scaled down, the endurance of NAND flash memory greatly deteriorates. To overcome such a poor endurance characteristic and to provide a reasonable storage lifetime, system-level endurance enhancement techniques are rapidly adopted in recent NAND flash-based storage devices like solid-state drives (SSDs). In this paper, we propose an integrated lifetime management approach for SSDs. The proposed lifetime management technique combines several lifetime-enhancement schemes, including lossless compression, deduplication, and performance throttling, in an integrated fashion so that the lifetime of SSDs can be maximally extended. By selectively disabling less effective lifetime-enhancement schemes, the proposed technique achieves both high performance and high energy efficiency while meeting the required lifetime. Our evaluation results show that the proposed technique, over the SSDs with no lifetime management schemes, improves write performance by up to 55% and reduces energy consumption by up to 43% while satisfying a 5-year lifetime warranty.","PeriodicalId":6310,"journal":{"name":"2013 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"33 1","pages":"1522-1525"},"PeriodicalIF":0.0,"publicationDate":"2013-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82730040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}