2016 International Great Lakes Symposium on VLSI (GLSVLSI): Latest Publications

Exploratory power noise models of standard cell 14, 10, and 7 nm FinFET ICs
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903035
Ravi Patel, Kan Xu, E. Friedman, P. Raghavan
The physical dimensions of standard cells constrain the dimensions of power networks, affecting the on-chip power noise. An exploratory modeling methodology is presented for estimating power noise in advanced technology nodes. The models are evaluated for 14, 10, and 7 nm technologies to assess the impact on performance. Scaled technologies are shown to be more sensitive to power noise, resulting in a potential loss of the performance enhancements achieved by scaling. Stripes between local track rails are evaluated as a means to reduce power noise, exhibiting up to a 56.5% improvement in power noise for the 7 nm technology node. A strong dependence on the width of a stripe is observed, indicating that fewer wide stripes are more favorable than many thin stripes. As a promising alternative material for power network interconnects, graphene is shown to exhibit good potential in reducing power noise. The effects of different scaling scenarios of local power rails on power noise are also discussed.
Citations: 0
Security meets nanoelectronics for Internet of things applications
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903045
G. Rose
The internet of things (IoT) is quickly emerging as the next major domain for embedded computer systems. Although the term IoT can be defined in a variety of ways, it always encompasses ordinary devices (e.g., thermostats and kitchen appliances) augmented with computational power that allows regular communication via the internet. Given the simplicity of typical IoT devices, their on-board computer systems must also be simple, in the sense that they are small and consume minimal power. However, the IoT itself presents new privacy and security concerns that must be considered when designing IoT devices. In order to provide robust security with minimal area and power overhead, it is the premise of this paper that IoT security be implemented using nanoelectronic security primitives and nano-enabled security protocols. Such nanoscale security primitives are expected to utilize a very small amount of area and consume a negligible amount of power, all while providing the required levels of security. This paper presents some examples of nanoelectronic security primitives and discusses how such circuits and systems can be used in emerging IoT devices.
Citations: 13
Dynamic real-time scheduler for large-scale MPSoCs
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903027
Marcelo Ruaro, F. Moraes
Large-scale MPSoCs require a scalable and dynamic real-time (RT) task scheduler that is able to handle non-deterministic computational behaviors. Current proposals for MPSoCs have limitations, such as a lack of scalability, complex static steps, validation with abstract models, or insufficient flexibility to change the RT constraints at runtime. This work proposes a hierarchical task scheduler with monitoring features. The scheduler is dynamic, supporting changes in RT constraints at runtime. An API enables these features, allowing the application developer to reconfigure a task's period, deadline, and execution time by annotating the task code. At runtime, according to the task execution, the scheduler handles the API calls and adjusts itself to ensure RT guarantees under the new constraints. Scalability is ensured by dividing the scheduler into two hierarchical levels: LS (Local Schedulers) and CS (Cluster Schedulers). The LS runs at the processor level, using the LST (Least Slack-Time) algorithm. The CS runs at the cluster level, i.e., over a group of processors controlled by a manager processor. The CS receives messages from the LSs reporting the processor slack time, deadline violations, and RT changes. The CS implements an RT adaptation heuristic, triggering task migrations according to RT reconfiguration or deadline misses. Results show a negligible overhead in the applications' execution time and the fulfillment of the applications' RT constraints even with a high degree of resource sharing, in both the processors and the NoC.
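The Least Slack-Time policy named in this abstract can be illustrated with a small sketch. This is a generic textbook version of LST on a single processor, not the authors' Local Scheduler: the `Task` fields, the one-tick time step, and the `simulate` helper are all assumptions made for illustration. Slack is taken as the time remaining until the deadline minus the execution time still required.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    deadline: int    # absolute deadline, in ticks (hypothetical unit)
    remaining: int   # execution time still needed, in ticks


def slack(task: Task, now: int) -> int:
    """Slack = time left until the deadline minus the work still to do."""
    return task.deadline - now - task.remaining


def pick_next(ready: List[Task], now: int) -> Optional[Task]:
    """Return the unfinished task with the least slack, or None if idle."""
    runnable = [t for t in ready if t.remaining > 0]
    if not runnable:
        return None
    return min(runnable, key=lambda t: slack(t, now))


def simulate(tasks: List[Task], horizon: int) -> List[str]:
    """Advance one tick at a time and record which task ran in each tick."""
    timeline = []
    for now in range(horizon):
        task = pick_next(tasks, now)
        if task is None:
            timeline.append("idle")
            continue
        task.remaining -= 1
        timeline.append(task.name)
    return timeline
```

Because slack is recomputed at every tick, a change to a task's deadline or execution time at runtime is picked up on the next scheduling decision, which is the property that makes this kind of policy a natural fit for reconfigurable RT constraints.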
Citations: 7
Hardware security threats and potential countermeasures in emerging 3D ICs
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903014
Jaya Dofe, Qiaoyan Yu, Hailang Wang, E. Salman
New hardware security threats are identified in emerging three-dimensional (3D) integrated circuits (ICs) and potential countermeasures are introduced. Trigger and payload mechanisms for future 3D hardware Trojans are predicted. Furthermore, a novel network-on-chip based 3D obfuscation method is proposed to block the direct communication between two commercial dies in a 3D structure, thus thwarting reverse engineering attacks on the vertical dimension. Simulation results demonstrate that the proposed method effectively obfuscates the cross-plane communication by increasing the reverse engineering time by approximately 5× as compared to using direct through-silicon via (TSV) connections. The proposed method consumes approximately one-fifth of the area and power of a typical network-on-chip designed in a 65 nm technology, exhibiting limited overhead.
Citations: 35
Secure and low-overhead circuit obfuscation technique with multiplexers
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903000
Xueyan Wang, Xiaotao Jia, Qiang Zhou, Yici Cai, Jianlei Yang, Mingze Gao, G. Qu
Circuit obfuscation techniques have been proposed to conceal a circuit's functionality in order to thwart reverse engineering (RE) attacks on integrated circuits (ICs). We believe that a good obfuscation method should have low design complexity and low performance overhead, yet impose high RE attack complexity. However, existing obfuscation techniques do not meet all of these requirements. In this paper, we propose a polynomial obfuscation scheme which leverages specially designed multiplexers (MUXs) to replace judiciously selected logic gates. Candidate to-be-obfuscated logic gates are selected based on a novel gate classification method which utilizes IC topological structure information. We show that this scheme is resilient to all the known attacks, hence it is secure. Experiments are conducted on the ISCAS 85/89 and MCNC benchmark suites to evaluate the performance overhead due to obfuscation.
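As background for why a multiplexer can stand in for a logic gate, the sketch below shows the standard observation that a 4:1 MUX whose select lines are driven by the gate inputs computes whatever 2-input function its four data inputs encode. This is a generic illustration only: the abstract does not describe the specially designed MUXs or the gate selection method, and the names `mux4`, `GATE_TABLES`, and `mux_gate` are hypothetical.

```python
def mux4(data, s1, s0):
    """4:1 multiplexer: the select bits (s1, s0) pick one of four data inputs."""
    return data[(s1 << 1) | s0]


# Truth tables listed for inputs (a, b) = 00, 01, 10, 11.
GATE_TABLES = {
    "AND":  (0, 0, 0, 1),
    "OR":   (0, 1, 1, 1),
    "NAND": (1, 1, 1, 0),
    "NOR":  (1, 0, 0, 0),
    "XOR":  (0, 1, 1, 0),
    "XNOR": (1, 0, 0, 1),
}


def mux_gate(kind, a, b):
    """Evaluate a 2-input gate as a MUX whose data inputs hold its truth table."""
    return mux4(GATE_TABLES[kind], a, b)
```

Every entry in `GATE_TABLES` shares the same MUX structure and differs only in the four constants, so an observer who sees the structure but cannot determine those constants cannot tell which of the 16 possible 2-input functions is implemented.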
Citations: 24
A sampling clock skew correction technique for time-interleaved SAR ADCs
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903008
D. Prashanth, Hae-Seung Lee
A technique for sampling clock skew correction by adjusting the delay in the input signal to each channel in a time-interleaved (TI) ADC is proposed. A proof-of-concept TI ADC employing this technique was implemented in a 65 nm CMOS process. The four-way TI ADC operates at an effective sampling rate of 150 MS/s, and achieves 60.2 dB and 58.2 dB SNDR for an input signal frequency of 2.1 MHz and 74.1 MHz, respectively. The ADC consumes 12.4 mW from a 1.2 V supply and occupies an area of 0.9 mm².
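The reported figures can be put in context with a little arithmetic. The sketch below uses the standard ideal-quantizer relation ENOB = (SNDR - 1.76) / 6.02 and the commonly used Walden figure of merit, P / (2^ENOB · fs). Neither derived quantity is stated in the abstract, so both should be read as back-of-the-envelope estimates, and the function names are hypothetical.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR (ideal-quantizer relation)."""
    return (sndr_db - 1.76) / 6.02


def per_channel_rate(total_rate_hz: float, channels: int) -> float:
    """Sampling rate each interleaved channel must sustain."""
    return total_rate_hz / channels


def walden_fom(power_w: float, sndr_db: float, rate_hz: float) -> float:
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob(sndr_db) * rate_hz)


if __name__ == "__main__":
    rate = 150e6  # effective sampling rate from the abstract, in samples/s
    print("per-channel rate (MS/s):", per_channel_rate(rate, 4) / 1e6)
    for sndr in (60.2, 58.2):
        fom_fj = walden_fom(12.4e-3, sndr, rate) * 1e15
        print(f"SNDR {sndr} dB -> ENOB {enob(sndr):.2f}, FoM {fom_fj:.0f} fJ/step")
```

With the abstract's numbers, each of the four channels runs at 37.5 MS/s, and the SNDR values correspond to roughly 9.7 and 9.4 effective bits. The 74.1 MHz input sits just below the 75 MHz Nyquist frequency, where timing skew between channels is most damaging, so the small SNDR drop is the figure most relevant to the skew correction.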
Citations: 3
Capturing true workload dependency of BTI-induced degradation in CPU components
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902992
Dimitrios Stamoulis, S. Corbetta, D. Rodopoulos, P. Weckx, P. Debacker, B. Meyer, B. Kaczer, P. Raghavan, D. Soudris, F. Catthoor, Z. Zilic
Atomistic-based approaches accurately model Bias Temperature Instability phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. That way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show that existing works that assume constant stress patterns fail to account for workload dependency leading to fundamental estimation errors. Second, we evaluate the impact of different real workloads on selected CPU sub-blocks from a commercial processor design. To the best of our knowledge, this is the first work that combines atomistic property and true workload-dependency for variability analysis.
Citations: 12
Task-resource co-allocation for hotspot minimization in heterogeneous many-core NoCs
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903003
Md Farhadur Reza, Dan Zhao, Hongyi Wu
To fully exploit the massive parallelism of many cores, this work tackles the problem of mapping large-scale applications onto heterogeneous on-chip networks (NoCs) to minimize the peak workload for energy hotspot avoidance. A task-resource co-optimization framework is proposed which configures the on-chip communication infrastructure and maps the applications simultaneously and coherently, aiming to minimize the peak load under the constraints of computation power, communication capacity, and a total cost budget of on-chip resources. The problem is first formulated as a linear programming model to search for an optimal solution. A heuristic algorithm is further developed for fast design space exploration in extremely large-scale many-core NoCs. Extensive simulations are carried out under real-world benchmarks and randomly generated task graphs to demonstrate the effectiveness and efficiency of the proposed schemes.
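To make the min-max objective concrete, the sketch below minimizes the peak per-core load with a classic longest-processing-time-first greedy rule. It is a deliberately simplified stand-in, not the paper's linear program or heuristic: it assumes identical cores and ignores communication capacity, heterogeneity, and the cost budget. The `map_tasks` name and its inputs are hypothetical.

```python
import heapq


def map_tasks(task_loads, num_cores):
    """Greedy min-max mapping: heaviest task first, onto the least-loaded core.

    task_loads maps a task name to its computational load (arbitrary units).
    Returns (mapping from task name to core index, resulting peak core load).
    """
    # Min-heap of (current load, core index), so the lightest core pops first.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    mapping = {}
    for task, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        core_load, core = heapq.heappop(heap)
        mapping[task] = core
        heapq.heappush(heap, (core_load + load, core))
    peak = max(load for load, _ in heap)
    return mapping, peak
```

For example, `map_tasks({"t0": 5, "t1": 4, "t2": 3, "t3": 3, "t4": 1}, 2)` balances a total load of 16 into two cores of 8 each. A greedy rule like this is fast but not guaranteed optimal in general, which is the usual trade-off between an exact formulation and a scalable heuristic.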
Citations: 13
A design of a non-volatile PMC-based (programmable metallization cell) register file
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903034
Salin Junsangsri, Jie Han, F. Lombardi
This paper presents the design of a non-volatile register file using cells made of an SRAM and a Programmable Metallization Cell (PMC). The proposed cell is a symmetric 8T2P (8-transistor, 2-PMC) design; it utilizes three control lines to ensure the correctness of its operations (i.e., Write, Read, Store and Restore). Simulation results using HSPICE are provided for the cell as well as the register file array (both one- and two-dimensional schemes). At the cell level, it is shown that the off-state resistance has a limited effect on the Read time, because in the proposed circuit the transistor connecting the PMCs to the SRAM is off. While it has no significant effect on the Store time, the off-state resistance determines the time of the Restore operation, i.e., an increase in off-state PMC resistance causes an increase in Restore time. A comparison between non-volatile register files utilizing either PMCs or Phase Change Memories (PCMs) is provided. The register file using PMCs has faster Store and Read times than its PCM-based counterpart; this is mostly caused by the difference in resistance values for these two non-volatile technologies. The lower delay involved in these operations confirms that the proposed PMC-based register file offers significant advantages in terms of delay performance.
Citations: 1
Performance constraint-aware task mapping to optimize lifetime reliability of manycore systems
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902996
Vijeta Rathore, Vivek Chaturvedi, T. Srikanthan
Negative bias temperature instability (NBTI) has emerged as a critical challenge to the lifetime reliability of computing systems. Traditionally, temperature-aware methodologies are used to mitigate the impact of NBTI on aging and degradation of computing systems. However, in the presence of process variation, which is the norm in manycore processors, temperature-aware techniques are inefficient in improving lifetime reliability and can result in poor performance. In this paper, we propose a novel performance constraint-aware task mapping technique to improve lifetime reliability by mitigating NBTI while considering on-chip process variation. Our approach consists of two phases, namely design-time and run-time. During design time, we generate Pareto-optimal mappings. Following this, our run-time technique judiciously intervenes to perform workload migration to save the weakest processing core. We compare our approach with performance-greedy and thermal-aware task mapping techniques. The experimental results demonstrate that our approach outperforms the other two techniques and improves the lifetime reliability of a manycore system by as much as 54% without violating the throughput constraint.
Citations: 12