
2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD): Latest Papers

Carbon nanotube imperfection-immune digital VLSI: Frequently asked questions updated
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105330
Hai Wei, Jie Zhang, Lan Wei, N. Patil, A. Lin, M. Shulaker, Hong-Yu Chen, H. Wong, S. Mitra
Carbon Nanotube Field-Effect Transistors (CNFETs) are excellent candidates for designing highly energy-efficient future digital systems. However, carbon nanotubes (CNTs) are inherently highly subject to imperfections that pose major obstacles to robust CNFET digital VLSI. This paper summarizes commonly raised questions and concerns about CNFET technology through a series of frequently asked questions. The specific questions addressed in this paper are motivated by recent advances in the field since the publication of our earlier paper on frequently asked questions in the Proceedings of the 2009 Design Automation Conference.
Pages: 227-230
Citations: 10
Property-specific sequential invariant extraction for SAT-based unbounded model checking
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105402
Hu-Hsi Yeh, Cheng-Yin Wu, Chung-Yang Huang
In this paper, we propose a property-specific sequential invariant extraction algorithm to improve the performance of SAT-based Unbounded Model Checkers (UMCs). By analyzing the property-related predicates and their corresponding high-level design constructs, such as FSMs and counters, we can quickly identify the sequential invariants that are useful in improving property-proving capability. We use these sequential invariants to refine the inductive hypothesis in induction-based UMCs and to improve the accuracy of reachable-state approximation in interpolation-based UMCs. The experimental results show that our tool can outperform a state-of-the-art UMC in most cases, especially for difficult true properties.
Pages: 674-678
Citations: 1
Low-power multiple-bit upset tolerant memory optimization
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105388
Seokjoong Kim, Matthew R. Guthaus
In this paper, we propose a framework for analyzing Soft Error Rates (SER) including Multiple-Bit Upsets (MBU). Then, using this framework, we optimize the soft error tolerant voltage (Vtol) and interleaving distance (ID) of low-power, error-tolerant memories. Experimental results show that, compared to worst-case design practices, the total power can be reduced by an average of 30.5% with Vtol optimization and by an average of 40.9% when Vtol and ID are considered simultaneously.
Pages: 577-581
Citations: 7
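As a rough illustration of why interleaving distance matters for multiple-bit upsets, the sketch below assumes plain column interleaving, in which physically adjacent cells in a row belong to different logical words. Under that assumption, a burst of k adjacent upsets touches at most ceil(k / ID) bits of any single word. The function names and the layout are hypothetical and are not taken from the paper's framework.

```python
from collections import Counter


def word_of_column(col: int, interleave: int) -> int:
    """Logical word index of a physical column (assumed column-interleaved layout)."""
    return col % interleave


def max_flips_per_word(start_col: int, burst_len: int, interleave: int) -> int:
    """Largest number of bits that any single word loses to a burst of adjacent upsets."""
    hits = Counter(
        word_of_column(col, interleave)
        for col in range(start_col, start_col + burst_len)
    )
    return max(hits.values())
```

With a single-error-correcting code per word, a burst is correctable in this model whenever `max_flips_per_word(...)` is at most 1, which holds when the burst length does not exceed the interleaving distance.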
Improving dual Vt technology by simultaneous gate sizing and mechanical stress optimization
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105410
J. Gu, G. Qu, Lin Yuan, Cheng Zhuo
Process-induced mechanical stress is used to enhance carrier mobility and drive current in contemporary CMOS technologies. Stressed cells have reduced delay but larger leakage consumption. Its efficient power/delay trade-off makes mechanical stress an enticing alternative to other power optimization techniques. This paper proposes an effective urgent-path guided approach that improves the dual Vt technique by incorporating gate sizing and mechanical stress simultaneously. The introduction of mechanical stress is shown to achieve 9.8% leakage and 2.8% total power savings over a combined gate sizing and dual Vt approach.
Pages: 732-735
Citations: 4
Detecting stability faults in sub-threshold SRAMs
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105301
Chen-Wei Lin, Hao-Yu Yang, Chin-Yuan Huang, Hung-Hsin Chen, M. Chao
Detecting stability faults has been a crucial task and a hot research topic for the testing of conventional super-threshold 6T SRAM in the past. When lowering the supply voltage of SRAM to the subthreshold region, the impact of stability faults may significantly change, and hence the test methods developed in the past for detecting stability faults may no longer be effective. In this paper, we first categorize the subthreshold-SRAM designs into different types according to their bit-cell structures. Based on each type, we then analyze the difference of its stability faults compared to the conventional super-threshold 6T SRAM, and discuss how the stability-fault test methods should be modified accordingly. A series of experiments are conducted to validate the effectiveness of each stability-fault test method for different types of subthreshold-SRAM designs.
Pages: 28-33
Citations: 0
Device-architecture co-optimization of STT-RAM based memory for low power embedded systems
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105369
Cong Xu, Dimin Niu, Xiaochun Zhu, Seung H. Kang, M. Nowak, Yuan Xie
Spin-transfer torque random access memory (STT-RAM) is a fast, scalable, durable non-volatile memory which can be embedded into a standard CMOS process. A wide range of write speeds, from 1 ns to 100 ns, has been reported for STT-RAM. The switching current of the magnetic tunnel junction (MTJ), which is the storage element of STT-RAM, is inversely proportional to the write pulse width. In this work, we propose a methodology to design STT-RAM for different optimization goals, such as read performance, write performance and write energy, by leveraging the trade-off between the write current and write time of the MTJ. We take the typical in-plane MTJ and the advanced perpendicular MTJ (PMTJ) as our optimization targets. Our study shows that reducing write pulse width will harm read latency and energy. It is observed that "sweet spots" of write pulse width, which minimize the write energy or write latency of STT-RAM caches, may exist. The optimal write pulse width depends on MTJ specifications, STT-RAM capacity and I/O width. The simulation results indicate that by utilizing PMTJ, the optimized STT-RAM can compete against SRAM and DRAM as a universal memory replacement in low power embedded systems.
Pages: 463-470
Citations: 39
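To make the "sweet spot" idea in the abstract above concrete, here is a minimal cell-level sketch. It assumes a simplified switching model, I(t) = I_c0 + C / t, where the constant threshold term I_c0 is an added assumption on top of the inverse dependence on pulse width that the abstract states, and it assumes write energy E = I² · R · t. All parameter values and function names are hypothetical; the paper's own analysis is at the cache level and also accounts for peripheral circuitry.

```python
def write_energy(t_s, i_c0=50e-6, c=200e-15, r_ohm=2000.0):
    """Energy (J) of one write pulse of width t_s seconds under the assumed model.

    i_c0  : assumed threshold current in amperes (hypothetical value)
    c     : assumed charge-like constant in ampere-seconds (hypothetical value)
    r_ohm : assumed MTJ resistance in ohms (hypothetical value)
    """
    current = i_c0 + c / t_s          # shorter pulses need more current
    return current * current * r_ohm * t_s


def best_pulse_width(widths, **params):
    """Return the candidate pulse width with the lowest modeled write energy."""
    return min(widths, key=lambda t: write_energy(t, **params))


# Sweep 1 ns to 100 ns in 1 ns steps, the range of write speeds quoted in the abstract.
candidates = [n * 1e-9 for n in range(1, 101)]
sweet_spot = best_pulse_width(candidates)
```

Under this model the energy expands to R · (I_c0² · t + 2 · I_c0 · C + C² / t), which is minimized at t = C / I_c0. With the hypothetical values above that is 4 ns, which is also where the sweep lands.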
Improving shared cache behavior of multithreaded object-oriented applications in multicores
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105315
M. Kandemir, Shekhar Srikantaiah, S. Son
Understanding shared cache performance when executing multithreaded object-oriented applications and optimizing these applications for multicores have not received much attention. In this paper, we first quantify the intra-thread and inter-thread cache line (block) reuse characteristics of a set of multithreaded C++ programs when executed on shared-cache-based multicores. Our results show that, as far as shared on-chip caches are concerned, inter-thread cache line (block) reuse distances are much higher than intra-thread cache line reuse distances. We study the impact of these characteristics on the hit/miss behavior of the shared last-level cache on a commercial multicore machine. We then show that, by rearranging accesses to the objects shared across different threads and to the objects stored in nearby memory locations, inter-thread (temporal and spatial) object reuse distances can be reduced, which in turn helps to reduce inter-thread cache line reuse distances. The results we collected using eight multithreaded applications show that our proposed shared cache-aware code restructuring strategy can reduce misses in the last-level on-chip cache of a commercial multicore machine by 25.4%, on average. These savings in cache misses translate in turn to an average execution time improvement of 11.9%.
Pages: 118-125
Citations: 1
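The reuse-distance measurements described in the abstract above can be sketched on a toy access trace. This sketch assumes the common definition of reuse distance (the number of distinct cache lines touched between two consecutive accesses to the same line) and classifies a reuse as intra-thread when the previous access to that line came from the same thread, inter-thread otherwise. The trace format, the 64-byte line size, and the function name are assumptions for illustration; the paper's exact definitions may differ.

```python
def reuse_distances(trace, line_size=64):
    """Classify and measure cache line reuses in an access trace.

    trace: list of (thread_id, byte_address) tuples in global access order.
    Returns a list of (kind, distance) tuples, one per reuse, where kind is
    "intra" or "inter" and distance counts distinct lines touched in between.
    """
    lines = [addr // line_size for _, addr in trace]
    last_seen = {}  # cache line -> (trace index, thread id) of its latest access
    result = []
    for i, (tid, _) in enumerate(trace):
        line = lines[i]
        if line in last_seen:
            prev_index, prev_tid = last_seen[line]
            distance = len(set(lines[prev_index + 1:i]))
            kind = "intra" if prev_tid == tid else "inter"
            result.append((kind, distance))
        last_seen[line] = (i, tid)
    return result


# Two threads touching three cache lines.
example = [(0, 0), (0, 64), (1, 128), (1, 0), (0, 64)]
print(reuse_distances(example))
```

The slice-and-set approach is quadratic in the worst case, which is fine for small traces; a production profiler would typically use a tree-based structure to keep each query logarithmic.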
Co-design of channel buffers and crossbar organizations in NoCs architectures
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105329
Avinash Karanth Kodi, R. Morris, D. DiTomaso, Ashwini Sarathy, A. Louri
Network-on-Chips (NoCs) have emerged as a scalable solution to the wire delay constraints, thereby providing a high-performance communication fabric for future multicores. Research has shown that the power, area and performance of NoC architectures are tightly coupled to the design and optimization of the link and router (buffer and crossbar). Recent work has shown that adaptive channel buffers (on-link storage) can considerably reduce power consumption and area overhead by reducing or replacing the power-hungry router buffers. However, channel buffer design can lead to Head-of-Line (HoL) blocking, which eventually reduces the throughput of the network. In this paper, we explore the design space of organizing channel buffers and router crossbars to improve the performance (latency, throughput) while reducing the power consumption. Our proposed designs analyze the power-performance-area trade-off in designing channel buffers for NoC architectures while overcoming HoL blocking through crossbar optimizations. Our simulation and NoC design synthesis show that, for an 8 × 8 mesh architecture, we can reduce the power consumption by 25–40% and improve performance by 10–25% while occupying 4–13% more area when compared to the baseline architecture.
Pages: 219-226
Citations: 5
Unequal-error-protection codes in SRAMs for mobile multimedia applications
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105300
Xuebei Yang, K. Mohanram
In this paper, we introduce unequal-error-protection error correcting codes (UEPECCs) to improve SRAM reliability at low supply voltages for mobile multimedia applications. The fundamental premise for our work is that in multimedia applications, different bits in the same SRAM word are usually not equally significant, and hence deserve different protection levels. The key innovations in our work include (i) a novel metric, word mean squared error, to measure the reliability of an SRAM word when different bits are not equally significant and (ii) an optimization algorithm based on dynamic programming to construct the UEPECC that assigns different protection levels to bits according to their significance. The advantage of the UEPECC over the traditional equal-error-protection ECC is demonstrated using two representative multimedia applications. For the same area, power, and encoding/decoding latency, SRAMs with UEPECC increase the peak signal-to-noise ratio by 8 dB in image processing and incur 60% fewer errors on average in optical flow (motion vector) computation.
Pages: 21-27
Citations: 22
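One plausible reading of a word-level mean squared error metric, like the one named in the abstract above, can be sketched as follows. This is an assumption rather than the paper's definition: the word is treated as an unsigned integer, bit i (0 = least significant) has an independent residual flip probability p_i after decoding, and the stored bits are uniformly random, so the cross terms cancel and the expected squared error is the sum of p_i · 4^i. The function name and all probabilities are hypothetical.

```python
def word_mse(flip_probs):
    """Expected squared error of an unsigned word under the assumed model.

    flip_probs[i] is the residual flip probability of bit i, with bit 0
    the least significant bit. A flip of bit i changes the value by 2**i,
    so it contributes p_i * (2**i)**2 = p_i * 4**i to the expectation.
    """
    return sum(p * 4 ** i for i, p in enumerate(flip_probs))


# 8-bit pixel, every bit equally protected (hypothetical probabilities).
equal = word_mse([1e-3] * 8)

# Same word, stronger protection on the four most significant bits
# and weaker protection on the four least significant bits.
unequal = word_mse([2e-3] * 4 + [1e-6] * 4)
```

In this toy setting the unequal assignment gives a far lower expected squared error even though the low-order bits are less reliable than before, which is the intuition behind protecting bits according to their significance.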
Post-silicon bug diagnosis with inconsistent executions
Pub Date : 2011-11-07 DOI: 10.1109/ICCAD.2011.6105414
A. DeOrio, D. Khudia, V. Bertacco
The complexity of modern chips intensifies verification challenges, and an increasing share of this verification effort is shouldered by post-silicon validation. Focusing on the first silicon prototypes, post-silicon validation poses critical new challenges such as intermittent failures, where multiple executions of the same test do not yield a consistent outcome. These are often due to on-chip asynchronous events and electrical effects, leading to extremely time-consuming, if not unachievable, bug diagnosis and debugging processes. In this work, we propose a methodology called BPS (Bug Positioning System) to support the automatic diagnosis of these difficult bugs. During post-silicon validation, lightweight BPS hardware logs a compact encoding of observed signal activity over multiple executions of the same test: some passing, some failing. Leveraging a novel post-analysis algorithm, BPS uses the logged activity to diagnose the bug, identifying the approximate manifestation time and critical design signals. We found experimentally that BPS can localize most bugs down to the exact root signal and within about 1,000 clock cycles of their occurrence.
Pages: 755-761
Citations: 23