
Latest papers from the 2011 IEEE 29th International Conference on Computer Design (ICCD)

EM and circuit co-simulation of a reconfigurable hybrid wireless NoC on 2D ICs
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081370
A. More, B. Taskin
The feasibility of dynamically reconfiguring the network layer of a hybrid wireless network-on-chip (NoC) that uses on-chip antennas for the wireless network layer and metal interconnects for the wired network layer is studied. The reconfigurability of the NoC is analyzed using a circuit co-simulation technique with a 3D finite element method (FEM) based full-wave electromagnetic analysis of the antennas. The die and the circuits are modeled according to a typical complementary metal oxide semiconductor (CMOS) technology. It is shown that it is possible to have 1) at least two different frequency domains for the signal sources and 2) dynamic switching of the signal sinks between the two frequency domains, with minimal design and area overhead. When implemented, the proposed reconfigurable hybrid network architecture can reduce latency and increase network throughput.
Citations: 5
Analysis of reliability of flip-flops under transistor aging effects in nano-scale CMOS technology
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081439
V. G. Rao, H. Mahmoodi
The effect of aging has become an important reliability concern in modern CMOS technology. NBTI and PBTI are known to increase the threshold voltage of PMOS and NMOS transistors, respectively. This paper studies the effect of NBTI and PBTI on different flip-flop circuits in terms of key parameters such as setup time, hold time, clock-to-output delay, and data-to-output delay. Results in a predictive 32 nm technology show an increase of 0.43 to 1.23 picoseconds in data-to-output delay, depending on the flip-flop type. Moreover, we propose a method that uses dual threshold voltage assignment to mitigate the effect of transistor aging on pulse-triggered flip-flops. The proposed dual-Vth method yields lower delay as well as a 30% reduction in delay aging.
Citations: 27
A morphable phase change memory architecture considering frequent zero values
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081426
M. Arjomand, A. Jadidi, Ali Shafiee, H. Sarbazi-Azad
Phase Change Memory (PCM) is emerging as a high-density and power-efficient choice for future main memory systems. While PCM cell size is marching towards the minimum achievable feature size, recent prototypes effectively improve device scalability by storing multiple bits per cell. Unfortunately, Multi-Level Cell (MLC) PCM devices incur higher access time and energy than their Single-Level Cell (SLC) counterparts, making it difficult to incorporate MLC in main memory. To address this challenge, we propose Zero-value-based Morphable PCM, ZM-PCM for short, a novel MLC-PCM main memory architecture that tries to incorporate the benefits of both MLC and SLC devices within the same structure. ZM-PCM relies on the observation that zero values at various granularities occur frequently within main memory transactions when running PARSEC-2 programs. Motivated by this observation, ZM-PCM encodes redundant zero MLC cells into a limited number of bits that can be stored in SLC form (or, alternatively, in devices with fewer bits per cell), improving latency, energy, and lifetime with no reduction in available main memory capacity. We evaluate the microarchitecture design of the morphable PCM cell, the coding and decoding algorithms, and the details of the related circuits. We also introduce a simple area-efficient caching mechanism for fast, cost-efficient access to coding metadata. Our evaluation on a quad-core CMP with 4GB of 8-bit MLC PCM main memory shows that ZM-PCM morphs up to 93% (and 50% on average) of all memory cells into lower-density cells, which translates directly into performance, power, and lifetime improvements.
Citations: 23
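The ZM-PCM abstract rests on the observation that zero values are frequent at several granularities in memory traffic. The sketch below only illustrates how such a measurement could be made on a single memory line; it is not the paper's coding scheme. The function name `zero_fraction` and the example line are assumptions introduced here for illustration.

```python
def zero_fraction(line: bytes, granularity: int) -> float:
    """Fraction of aligned chunks of `granularity` bytes that are entirely zero.

    Hypothetical helper for illustration; not taken from the ZM-PCM paper.
    """
    if granularity <= 0 or len(line) == 0 or len(line) % granularity != 0:
        raise ValueError("line length must be a positive multiple of granularity")
    # Split the line into aligned, fixed-size chunks.
    chunks = [line[i:i + granularity] for i in range(0, len(line), granularity)]
    # A chunk counts as zero only if every byte in it is zero.
    zero_chunks = sum(1 for chunk in chunks if not any(chunk))
    return zero_chunks / len(chunks)


# Example: an 8-byte line with a single non-zero byte.
line = bytes([0, 0, 0, 0, 1, 0, 0, 0])
for g in (1, 4, 8):
    print(g, zero_fraction(line, g))  # 1 -> 0.875, 4 -> 0.5, 8 -> 0.0
```

Coarser granularities find fewer all-zero chunks, which is why the granularity at which zeros are detected matters for how many cells could be stored at lower density.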
Dynamic fine-grain body biasing of caches with latency and leakage 3T1D-based monitors
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081420
Shrikanth Ganapathy, R. Canal, Antonio González, A. Rubio
In this paper, we propose a dynamically tunable fine-grain body biasing mechanism to reduce standby leakage power in first-level data caches under process variations. Accessed physical arrays are forward body biased (FBB) to improve latency, while idle (unaccessed) arrays are reverse body biased (RBB) to reduce standby leakage power. The bias voltage to be applied is computed at design time and updated at run time to counter the negative effects of process variations. This ensures that, under all scenarios, the cache consumes the lowest leakage power for the target access latency computed at design time. A sensor-like hardware mechanism measures the variation in latency and leakage at run time, and this measurement is used to update the bias voltage. The backbone of the measurement hardware is a three-transistor one-diode (3T1D) DRAM cell embedded into a regular cache array. By measuring the access and retention time of the 3T1D cell, we show that it is possible to classify cache arrays based on run-time latency/leakage profiles. Our technique reduces the leakage energy consumption and access latency of the cache by 20% and 18% on average, respectively. Finally, we show that our technique improves parametric yield by a maximum of 38% in the worst-case scenario.
Citations: 5
Static window addition: A new paradigm for the design of variable latency adders
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081446
Kai Du, P. Varman, K. Mohanram
Speculative adders have attracted strong interest for achieving sublogarithmic delays by exploiting the tradeoffs between correctness and performance. Speculative adders also find use in the design of error-free variable latency adders, which combine speculation with error correction to achieve high performance for low area overhead over traditional adders. This paper describes static window addition (SWA), a novel function speculation technique for the design of low overhead, high performance variable latency adders. Analytical models for the error rate of SWA-based speculative adders are developed to facilitate both design exploration and convergence. We show that on average, variable latency addition using SWA-based speculative adders is 10% faster than the fastest DesignWare adder with area requirements of -5 to 40% for different adder widths.
Citations: 4
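The static window addition abstract describes variable latency adders that combine speculation with error correction. The sketch below models a generic window-based speculative adder, in which the carry into each bit is computed from only a fixed number of lower bits; it is not the paper's SWA technique, whose details the abstract does not give. The name `speculative_add`, the `window` parameter, and the one-cycle/two-cycle latency model are assumptions, and comparing against the exact sum stands in for a hardware error detector.

```python
def speculative_add(a: int, b: int, width: int = 32, window: int = 8):
    """Return (sum mod 2**width, cycles) for a windowed speculative adder model.

    Generic illustration only: the carry into bit i is derived from at most
    `window` lower bits, assuming a carry-in of 0 at the window boundary.
    """
    mask = (1 << width) - 1
    spec = 0
    for i in range(width):
        lo = max(0, i - window)          # lowest bit the window can see
        span = i - lo                    # number of bits inside the window
        wmask = (1 << span) - 1
        wa = (a >> lo) & wmask
        wb = (b >> lo) & wmask
        carry = (wa + wb) >> span        # carry out of the window (0 or 1)
        bit = ((a >> i) ^ (b >> i) ^ carry) & 1
        spec |= bit << i
    exact = (a + b) & mask
    if spec == exact:
        return spec, 1                   # speculation was correct: one cycle
    return exact, 2                      # modeled correction: one extra cycle


print(speculative_add(1, 2))                        # (3, 1)
print(speculative_add(0xFF, 1, width=32, window=4))  # (256, 2)
```

In the second call the carry has to ripple through eight bits, which is longer than the four-bit window, so the speculative result is wrong and the model charges a correction cycle. The result is always exact; only the latency varies.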
Adaptive execution assistance for multiplexed fault-tolerant chip multiprocessors
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081432
Pramod Subramanyan, Virendra Singh, K. Saluja, E. Larsson
Relentless scaling of CMOS fabrication technology has made contemporary integrated circuits increasingly susceptible to transient faults, wearout-related permanent faults, intermittent faults and process variations. Therefore, mechanisms to mitigate the effects of decreased reliability are expected to become essential components of future general-purpose microprocessors.
Citations: 1
FIMSIM: A fault injection infrastructure for microarchitectural simulators
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081435
Gulay Yalcin, O. Unsal, A. Cristal, M. Valero
Fault injection is a widely used approach for experiment-based dependability evaluation. Injecting faults into microarchitectural simulators is particularly appealing for researchers, since it can be utilized at the early design stage of the processor. As such, it enables a preliminary analysis of the correlation between the criticality of processor-structure level faults and their impact on applications. In this study, we present FIMSIM, a compact fault injection infrastructure for microarchitectural simulators which is capable of injecting transient, permanent, intermittent and multi-bit faults. FIMSIM provides the opportunity to comprehensively evaluate the vulnerability of different microarchitectural structures against different fault models.
Citations: 26
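The fault models named in the FIMSIM abstract (transient, permanent, intermittent, multi-bit) can be illustrated on a single register value. The sketch below is a minimal model under stated assumptions and does not reflect FIMSIM's actual interface: the `Fault` class, its fields, and the choice to model permanent and intermittent faults as stuck-at bits are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fault:
    """Hypothetical fault descriptor (not FIMSIM's API)."""
    kind: str                 # "transient", "permanent", or "intermittent"
    bits: Tuple[int, ...]     # affected bit positions; several = multi-bit
    start: int = 0            # first cycle in which the fault is active
    duration: int = 1         # active cycles (used by "intermittent" only)
    stuck_at: int = 0         # forced value for permanent/intermittent


def is_active(fault: Fault, cycle: int) -> bool:
    if fault.kind == "transient":
        return cycle == fault.start                  # a single-cycle upset
    if fault.kind == "permanent":
        return cycle >= fault.start                  # never goes away
    if fault.kind == "intermittent":
        return fault.start <= cycle < fault.start + fault.duration
    raise ValueError(f"unknown fault kind: {fault.kind}")


def inject(value: int, cycle: int, fault: Fault) -> int:
    """Return the register value as observed in `cycle` under `fault`."""
    if not is_active(fault, cycle):
        return value
    for bit in fault.bits:
        if fault.kind == "transient":
            value ^= 1 << bit                        # bit flip
        elif fault.stuck_at:
            value |= 1 << bit                        # stuck-at-1
        else:
            value &= ~(1 << bit)                     # stuck-at-0
    return value


print(bin(inject(0b1010, 5, Fault("transient", (0, 3), start=5))))  # 0b11
print(bin(inject(0b1010, 9, Fault("permanent", (1,), start=3))))    # 0b1000
```

Keeping the activation window separate from the bit-level effect lets the same `inject` call cover all four fault types by changing only the descriptor.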
A novel software-based defect-tolerance approach for application-specific embedded systems
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081441
Da Cheng, S. Gupta
Traditional approaches for improving yield are based on the use of hardware redundancy (HR), and their benefits are limited at high defect densities due to increasing layout complexity and diminishing returns. This research is based on the observation that completely correct operation of user programs can be guaranteed on chips with one or more unrepairable memory modules if software-level techniques satisfy two conditions: (1) defects affect only a few memory cells rather than causing the entire memory module to malfunction, and (2) either we do not use any part of the memory affected by the un-repaired defect, or we do use the affected part, but only in a manner that does not excite the un-repaired defect to cause errors. This paper proposes a software-based defect-tolerance (SBDT) approach in combination with HR to utilize defective memory chips for application-specific systems.
Citations: 7
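Condition (2) in the SBDT abstract, using a defective location only in a way that does not excite the defect, can be illustrated with a simple stuck-at model. The sketch below is an assumption-laden illustration rather than the paper's method: the stuck-at fault model, the `DefectMap` layout, and the greedy `place` routine are all introduced here for the example.

```python
from typing import Dict, List, Optional

# Hypothetical defect map: address -> {bit position: stuck-at value (0 or 1)}.
DefectMap = Dict[int, Dict[int, int]]


def excites_defect(word: int, stuck: Dict[int, int]) -> bool:
    """True if storing `word` needs some stuck bit to hold the opposite value."""
    return any(((word >> bit) & 1) != value for bit, value in stuck.items())


def place(words: List[int], num_addresses: int,
          defects: DefectMap) -> Optional[Dict[int, int]]:
    """Greedily map each word index to a free address it can be stored at safely.

    Returns None if the greedy pass fails. A failure does not prove that no
    valid placement exists, since the search does not backtrack.
    """
    free = list(range(num_addresses))
    placement: Dict[int, int] = {}
    for index, word in enumerate(words):
        for addr in free:
            if not excites_defect(word, defects.get(addr, {})):
                placement[index] = addr
                free.remove(addr)        # safe: we break right after removing
                break
        else:
            return None
    return placement


# Address 0 has bit 0 stuck at 0; address 1 is defect-free.
defects = {0: {0: 0}}
print(place([0b01, 0b10], 2, defects))  # {0: 1, 1: 0}
```

The word `0b10` can live at the defective address because its bit 0 is already 0, so the stuck-at-0 cell never produces an error, while `0b01` is steered to the clean address.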
Adaptable architectures for distributed visual target tracking
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081421
Domenic Forte, Ankur Srivastava
There is a growing number of visual tracking applications for mobile devices. However, the computer vision algorithms that process real-time video to track moving targets are demanding. Since a single mobile device has limited computational capability, energy, and other resources to fully support target tracking, some works have investigated architectures that migrate a portion of the tracking duties to another device at the cost of transmission bandwidth and energy. In this paper, we investigate resource utilization in such architectures and present an adaptable architecture that balances the tracking workload among the participating devices based on current resource availability (energy, temperature, bandwidth). Results show that the proposed solution requires low additional overhead, can improve tracking system lifetime by reducing energy consumption, and is more effective in maintaining safe operating temperatures within participants than previously investigated architectures.
Citations: 3
A machine learning approach to modeling power and performance of chip multiprocessors
Pub Date : 2011-10-09 DOI: 10.1109/ICCD.2011.6081374
Changshu Zhang, A. Ravindran, Kushal Datta, A. Mukherjee, B. Joshi
Exploring the vast microarchitectural design space of chip multiprocessors (CMPs) through the traditional approach of exhaustive simulation is impractical because simulation times are long and grow super-linearly with core scaling. Kernel-based statistical machine learning algorithms can potentially help predict multiple performance metrics that depend non-linearly on the CMP design parameters. In this paper, we describe and evaluate a machine learning framework that uses Kernel Canonical Correlation Analysis (KCCA) to predict the power dissipation and performance of CMPs. Specifically, we focus on modeling the microarchitecture of a highly multithreaded CMP targeted towards packet processing. We use a cycle-accurate CMP simulator to generate the training samples required to build the model. Despite sampling only 0.016% of the design space, we observe a median error of 6–10% in the KCCA-predicted processor power dissipation and performance.
Citations: 4