
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD): Latest Papers

Carbon nanotube imperfection-immune digital VLSI: Frequently asked questions updated
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105330
Hai Wei, Jie Zhang, Lan Wei, N. Patil, A. Lin, M. Shulaker, Hong-Yu Chen, H. Wong, S. Mitra
Carbon Nanotube Field-Effect Transistors (CNFETs) are excellent candidates for designing highly energy-efficient future digital systems. However, carbon nanotubes (CNTs) are inherently highly subject to imperfections that pose major obstacles to robust CNFET digital VLSI. This paper summarizes commonly raised questions and concerns about CNFET technology through a series of frequently asked questions. The specific questions addressed in this paper are motivated by recent advances in the field since the publication of our earlier paper on frequently asked questions in the Proceedings of the 2009 Design Automation Conference.
Citations: 10
Property-specific sequential invariant extraction for SAT-based unbounded model checking
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105402
Hu-Hsi Yeh, Cheng-Yin Wu, Chung-Yang Huang
In this paper, we propose a property-specific sequential invariant extraction algorithm to improve the performance of SAT-based Unbounded Model Checkers (UMCs). By analyzing the property-related predicates and their corresponding high-level design constructs, such as FSMs and counters, we can quickly identify the sequential invariants that are useful in improving property-proving capability. We use these sequential invariants to refine the inductive hypothesis in induction-based UMCs and to improve the accuracy of reachable-state approximation in interpolation-based UMCs. The experimental results show that our tool can outperform a state-of-the-art UMC in most cases, especially for difficult true properties.
Citations: 1
Low-power multiple-bit upset tolerant memory optimization
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105388
Seokjoong Kim, Matthew R. Guthaus
In this paper, we propose a framework for analyzing Soft Error Rates (SER) including Multiple-Bit Upsets (MBU). Then, using this framework, we optimize the soft error tolerant voltage (Vtol) and interleaving distance (ID) of low-power, error-tolerant memories. Experimental results show that, compared to worst-case design practices, the total power can be reduced by an average of 30.5% with Vtol optimization alone and by an average of 40.9% when Vtol and ID are optimized together.
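One way to see why the interleaving distance matters for multiple-bit upsets: with column interleaving, physically adjacent cells belong to different logical words, so a cluster of adjacent upset cells is spread across several words. The sketch below is illustrative only and is not the paper's framework. The function name `worst_bits_per_word` and the simple modulo mapping from physical column to logical word are assumptions made for this example.

```python
from collections import Counter


def worst_bits_per_word(upset_width: int, interleaving_distance: int, start: int = 0) -> int:
    """Return the largest number of flipped bits that land in one logical word.

    Assumed mapping (hypothetical, for illustration): physical column c
    belongs to logical word c % interleaving_distance.
    """
    if upset_width <= 0:
        return 0
    # Count how many of the adjacent upset columns fall into each logical word.
    hits = Counter(
        (start + offset) % interleaving_distance for offset in range(upset_width)
    )
    return max(hits.values())
```

Under this assumed mapping, a 4-cell upset with an interleaving distance of 4 leaves at most one flipped bit per word, which a single-error-correcting code could repair, while a 5-cell upset puts two flipped bits into one word.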
Citations: 7
Device-architecture co-optimization of STT-RAM based memory for low power embedded systems
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105369
Cong Xu, Dimin Niu, Xiaochun Zhu, Seung H. Kang, M. Nowak, Yuan Xie
Spin-transfer torque random access memory (STT-RAM) is a fast, scalable, durable non-volatile memory that can be embedded into a standard CMOS process. A wide range of write speeds, from 1 ns to 100 ns, has been reported for STT-RAM. The switching current of the magnetic tunnel junction (MTJ), which is the storage element of STT-RAM, is inversely proportional to the write pulse width. In this work, we propose a methodology to design STT-RAM for different optimization goals, such as read performance, write performance and write energy, by leveraging the trade-off between the write current and write time of the MTJ. We take the typical in-plane MTJ and the advanced perpendicular MTJ (PMTJ) as our optimization targets. Our study shows that reducing the write pulse width harms read latency and energy. We observe that "sweet spots" of write pulse width, which minimize the write energy or write latency of STT-RAM caches, may exist. The optimal write pulse width depends on MTJ specifications, STT-RAM capacity and I/O width. The simulation results indicate that, by utilizing PMTJ, the optimized STT-RAM can compete against SRAM and DRAM as a universal memory replacement in low power embedded systems.
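The trade-off between write current and write pulse width can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration and is not taken from the paper: the current model I(t) = I_offset + Q/t (an inverse term plus a constant offset), the resistive energy E = I²Rt, and all numeric parameter values. With such a model the write energy has a minimum at an intermediate pulse width, which is one simple way a "sweet spot" can arise.

```python
def write_current(pulse_width_s: float, i_offset_a: float = 50e-6,
                  charge_c: float = 200e-15) -> float:
    """Assumed switching current: constant offset plus a term inverse in pulse width."""
    return i_offset_a + charge_c / pulse_width_s


def write_energy(pulse_width_s: float, resistance_ohm: float = 2000.0,
                 i_offset_a: float = 50e-6, charge_c: float = 200e-15) -> float:
    """Resistive write energy E = I^2 * R * t for the assumed current model."""
    current = write_current(pulse_width_s, i_offset_a, charge_c)
    return current * current * resistance_ohm * pulse_width_s


def best_pulse_width(candidate_widths_s):
    """Return the candidate pulse width with the lowest write energy."""
    return min(candidate_widths_s, key=write_energy)


if __name__ == "__main__":
    # Sweep 1 ns to 100 ns in 0.1 ns steps (the range quoted in the abstract).
    widths = [k * 0.1e-9 for k in range(10, 1001)]
    print(best_pulse_width(widths))
```

For this assumed model the minimum can also be found analytically at t = Q / I_offset, which gives a quick check on the numerical sweep.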
Citations: 39
Improving shared cache behavior of multithreaded object-oriented applications in multicores
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105315
M. Kandemir, Shekhar Srikantaiah, S. Son
Understanding shared cache performance when executing multithreaded object-oriented applications, and optimizing these applications for multicores, have not received much attention. In this paper, we first quantify the intra-thread and inter-thread cache line (block) reuse characteristics of a set of multithreaded C++ programs when executed on shared cache based multicores. Our results show that, as far as shared on-chip caches are concerned, inter-thread cache line (block) reuse distances are much higher than intra-thread cache line reuse distances. We study the impact of these characteristics on the hit/miss behavior of the shared last-level cache on a commercial multicore machine. We then show that, by rearranging accesses to the objects shared across different threads and to the objects stored in nearby memory locations, inter-thread (temporal and spatial) object reuse distances can be reduced, which in turn helps to reduce inter-thread cache line reuse distances. The results we collected using eight multithreaded applications show that our proposed shared cache-aware code restructuring strategy can reduce misses in the last-level on-chip cache of a commercial multicore machine by 25.4% on average. These savings in cache misses translate in turn to an average execution time improvement of 11.9%.
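To make the notion of reuse distance concrete, the sketch below computes it for a small interleaved access trace. It is a minimal illustration, not the paper's measurement tooling. It assumes the common definition of reuse distance (the number of distinct cache lines touched between two consecutive accesses to the same line) and assumes that a reuse counts as "inter-thread" when the previous access to that line came from a different thread. The function name `classify_reuses` and the trace format are hypothetical.

```python
def classify_reuses(trace):
    """Classify each reuse in a trace of (thread_id, cache_line) accesses.

    Returns a list of (kind, distance) pairs, where kind is "intra" or
    "inter" and distance is the number of distinct cache lines accessed
    since the previous access to the same line.
    """
    last_access = {}  # cache line -> (index, thread_id) of its latest access
    reuses = []
    for index, (thread_id, line) in enumerate(trace):
        if line in last_access:
            prev_index, prev_thread = last_access[line]
            # Distinct lines touched strictly between the two accesses.
            between = {ln for _, ln in trace[prev_index + 1:index]}
            kind = "intra" if prev_thread == thread_id else "inter"
            reuses.append((kind, len(between)))
        last_access[line] = (index, thread_id)
    return reuses
```

For example, in the trace `[(0, "A"), (0, "B"), (0, "A"), (1, "C"), (1, "D"), (1, "B")]`, thread 0 reuses line A after touching one other line, while thread 1 reuses line B (last touched by thread 0) after three other lines were touched. Shortening that second kind of distance is the goal of the restructuring described in the abstract.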
Citations: 1
Improving dual Vt technology by simultaneous gate sizing and mechanical stress optimization
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105410
J. Gu, G. Qu, Lin Yuan, Cheng Zhuo
Process-induced mechanical stress is used to enhance carrier mobility and drive current in contemporary CMOS technologies. Stressed cells have reduced delay but larger leakage consumption. Its efficient power/delay trade-off ratio makes mechanical stress an enticing alternative to other power optimization techniques. This paper proposes an effective urgent-path-guided approach that improves the dual Vt technique by incorporating gate sizing and mechanical stress simultaneously. The introduction of mechanical stress is shown to achieve 9.8% leakage and 2.8% total power savings over the combined gate sizing and dual Vt approach.
Citations: 4
Detecting stability faults in sub-threshold SRAMs
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105301
Chen-Wei Lin, Hao-Yu Yang, Chin-Yuan Huang, Hung-Hsin Chen, M. Chao
Detecting stability faults has been a crucial task and a hot research topic for the testing of conventional super-threshold 6T SRAM in the past. When lowering the supply voltage of SRAM to the subthreshold region, the impact of stability faults may significantly change, and hence the test methods developed in the past for detecting stability faults may no longer be effective. In this paper, we first categorize the subthreshold-SRAM designs into different types according to their bit-cell structures. Based on each type, we then analyze the difference of its stability faults compared to the conventional super-threshold 6T SRAM, and discuss how the stability-fault test methods should be modified accordingly. A series of experiments are conducted to validate the effectiveness of each stability-fault test method for different types of subthreshold-SRAM designs.
Citations: 0
Co-design of channel buffers and crossbar organizations in NoCs architectures
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105329
Avinash Karanth Kodi, R. Morris, D. DiTomaso, Ashwini Sarathy, A. Louri
Networks-on-Chip (NoCs) have emerged as a scalable solution to wire delay constraints, thereby providing a high-performance communication fabric for future multicores. Research has shown that the power, area and performance of NoC architectures are tightly coupled to the design and optimization of the link and router (buffer and crossbar). Recent work has shown that adaptive channel buffers (on-link storage) can considerably reduce power consumption and area overhead by reducing or replacing the power-hungry router buffers. However, channel buffer design can lead to Head-of-Line (HoL) blocking, which eventually reduces the throughput of the network. In this paper, we explore the design space of organizing channel buffers and router crossbars to improve performance (latency, throughput) while reducing power consumption. Our proposed designs analyze the power-performance-area trade-off in designing channel buffers for NoC architectures while overcoming HoL blocking through crossbar optimizations. Our simulation and NoC design synthesis show that, for an 8 × 8 mesh architecture, we can reduce power consumption by 25–40% and improve performance by 10–25% while occupying 4–13% more area than the baseline architecture.
Citations: 5
A theoretical probabilistic simulation framework for dynamic power estimation
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105407
Lei Wang, M. Olbrich, E. Barke, Thomas Büchner, Markus Bühler, P. Panitz
As fast non-simulation-based power estimation techniques, probabilistic simulation techniques were widely researched in the 1990s. Spatial and temporal correlations are commonly known as the two fundamental challenges of these techniques. Previous work showed that spatial correlation could be handled by means of bit-parallel simulation. For temporal correlation, which has a great impact on estimating glitches, previous work only showed that it could be considered by means of a glitch-filtering scheme, which is an approximation algorithm, but did not answer the question of whether temporal correlation could be handled without any approximation. Our work extends conventional probabilistic simulation techniques and puts the essentials and extensions of probabilistic simulation into a theoretical framework. Based on the framework, this paper shows that modeling temporal correlation in probabilistic simulation without any approximation is only possible in theory. Therefore, an improved approximation of the exact method is proposed. Compared to conventional probabilistic simulation, our markedly improved results demonstrate the effectiveness of our approximation algorithm. At the end of this paper, the advantages and bottlenecks of probabilistic simulation are summarized.
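The spatial correlation problem mentioned in the abstract can be illustrated with a small example. The sketch below is not the paper's framework. It propagates signal probabilities through gates under the textbook independence assumption and compares the result with exhaustive enumeration for a circuit with reconvergent fanout. The circuit `reconvergent` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product


def p_and(pa: float, pb: float) -> float:
    """P(a AND b = 1), assuming a and b are independent."""
    return pa * pb


def p_or(pa: float, pb: float) -> float:
    """P(a OR b = 1), assuming a and b are independent."""
    return pa + pb - pa * pb


def p_not(pa: float) -> float:
    """P(NOT a = 1)."""
    return 1.0 - pa


def reconvergent(x, y):
    """Example circuit: (x AND y) OR (x AND NOT y), which simplifies to x."""
    return (x and y) or (x and not y)


def independent_estimate(px: float, py: float) -> float:
    """Gate-by-gate estimate that ignores the shared input x (spatial correlation)."""
    return p_or(p_and(px, py), p_and(px, p_not(py)))


def exact_probability(func, input_probs) -> float:
    """Exact output probability by enumerating all input combinations."""
    total = 0.0
    for bits in product([0, 1], repeat=len(input_probs)):
        weight = 1.0
        for bit, prob in zip(bits, input_probs):
            weight *= prob if bit else 1.0 - prob
        if func(*bits):
            total += weight
    return total
```

With both input probabilities set to 0.5, the gate-by-gate estimate differs from the exact value because the two AND gates share the input x. Exhaustive enumeration is exact but grows exponentially with the number of inputs, which is why practical tools rely on approximations.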
Citations: 3
Online clock skew tuning for timing speculation
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105366
Rong Ye, F. Yuan, Q. Xu
The timing performance and yield of integrated circuits can be improved by carefully assigning intentional clock skews to flip-flops. Due to the ever-increasing process, voltage, and temperature variations with technology scaling, however, traditional clock skew optimization solutions that work in a conservative manner to guarantee “always correct” computation cannot perform as well as expected. By allowing infrequent timing errors and recovering from them with minor performance impact, the concept of timing speculation has attracted lots of research attention since it enables “better than worst-case design”. In this work, we propose a novel online clock skew tuning technique for circuits equipped with timing speculation capability. By observing the occurrence of timing errors at runtime and tuning clock skews accordingly, the proposed technique is able to achieve much better timing performance when compared to existing clock skew optimization solutions. Experimental results on various benchmark circuits demonstrate the effectiveness of the proposed methodology.
Citations: 27