ACM Great Lakes Symposium on VLSI: Latest Papers

FPGA based implementation of a genetic algorithm for ARMA model parameters identification
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591579
H. Merabti, D. Massicotte
In this paper, we propose an FPGA implementation of a genetic algorithm (GA) for parameter identification of linear and nonlinear autoregressive moving average (ARMA) models. The GA features genetic operators designed specifically for adaptive filtering applications. The design was implemented in a very short fixed-point representation, using only 6-bit wordlength arithmetic. The implementation experiments show high parameter-identification capability and a low footprint.
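As a rough software illustration of the idea in this abstract, the sketch below runs a tiny GA that searches for the two coefficients of an AR(2) model, with every gene stored as a 6-bit signed integer. The Q-format (4 fractional bits), the population size, the crossover, and the mutation operator are all assumptions made for illustration; the abstract does not describe the paper's operators or its FPGA architecture.

```python
import random

FRAC_BITS = 4        # assumed fixed-point format: 6-bit two's complement, 4 fractional bits
LO, HI = -32, 31     # range of a 6-bit signed integer code

def to_real(code):
    """Convert a 6-bit integer code to the real value it represents."""
    return code / (1 << FRAC_BITS)

def simulate(a1, a2, n=64):
    """Impulse response of y[k] = a1*y[k-1] + a2*y[k-2] + u[k] (noise-free)."""
    y = [0.0, 0.0]
    for k in range(n):
        u = 1.0 if k == 0 else 0.0
        y.append(a1 * y[-1] + a2 * y[-2] + u)
    return y[2:]

def fitness(genes, target):
    """Sum of squared output errors for a candidate (lower is better)."""
    out = simulate(to_real(genes[0]), to_real(genes[1]), len(target))
    return sum((o - t) ** 2 for o, t in zip(out, target))

def mutate(genes, rng):
    """Move one gene by one LSB, saturating at the 6-bit limits."""
    g = list(genes)
    i = rng.randrange(len(g))
    g[i] = max(LO, min(HI, g[i] + rng.choice((-1, 1))))
    return tuple(g)

def run_ga(target, pop_size=16, generations=200, seed=0):
    """Elitist GA: keep the better half, refill with mutated crossovers."""
    rng = random.Random(seed)
    pop = [(rng.randint(LO, HI), rng.randint(LO, HI)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, target))
        elite = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(elite)):
            p, q = rng.sample(elite, 2)
            children.append(mutate((p[0], q[1]), rng))  # one-point crossover
        pop = elite + children
    return min(pop, key=lambda g: fitness(g, target))
```

Calling `run_ga(simulate(0.5, -0.25))` returns a pair of 6-bit codes; because the best half of the population is always retained, the best error never gets worse from one generation to the next.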
Citations: 0
A study on the use of parallel wiring techniques for sub-20nm designs
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591588
Rickard Ewetz, Wen-Hao Liu, Kai-Yuan Chao, Ting-Chi Wang, Cheng-Kok Koh
Wire sizing can be used to reduce the delays of critical nets. However, because of the forbidden pitch issue in sub-20nm designs, wide wires may no longer be an attractive solution, given the restrictive wire spacing requirements of advanced lithography. In this work, we investigate the suitability of the parallel wiring technique, in which multiple parallel wires are used to route the same net, as an alternative to routing a net using a single wide wire. In particular, we study the trade-offs between parasitics, timing, power, and routing resources. Our study reveals that wire sizing using both parallel wires and wide wires can be advantageous. Moreover, if high layout densities are required, parallel wiring can be a viable approach to solving timing problems in sub-20nm designs.
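To make the resistance side of this trade-off concrete, here is a first-order sketch using a plain sheet-resistance model. It is not the paper's analysis: it ignores capacitance, vias, and lithography effects, and the function names, the footprint formula, and all numeric values are hypothetical.

```python
def wire_resistance(sheet_res, length, width):
    """Resistance of one rectangular wire: R = R_sheet * L / W."""
    return sheet_res * length / width

def parallel_resistance(sheet_res, length, width, n):
    """Resistance of n identical wires routing the same net in parallel."""
    return wire_resistance(sheet_res, length, width) / n

def footprint(width, n, spacing):
    """Routing width consumed by n wires, each with one spacing gap beside it."""
    return n * (width + spacing)

# Hypothetical numbers: two minimum-width wires versus one double-width wire.
r_parallel = parallel_resistance(0.5, 100.0, 0.25, n=2)
r_wide = wire_resistance(0.5, 100.0, 0.5)
```

Under this model the two options have the same resistance, so the comparison comes down to the spacing each one needs and to the parasitic, power, and routing-resource effects that the abstract says the study examines.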
Citations: 4
Optically reconfigurable gate array with an angle-multiplexed holographic memory
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591597
R. Moriwaki, H. Maekawa, A. Ogiwara, Minoru Watanabe
Optically reconfigurable gate arrays (ORGAs) have been developed to achieve a high-performance FPGA with numerous configuration contexts. The architecture introduces an optical or holographic memory technology so that it can offer numerous configuration contexts and high-speed reconfiguration. Results show that the architecture can achieve a virtual gate count much larger than those of currently available VLSIs. To date, ORGAs with a spatially multiplexed holographic memory have been reported. However, a spatially multiplexed holographic memory can hold only a small number of configuration contexts, limited to about 256. To implement more than a million configuration contexts, an angle-multiplexed holographic memory must be used. However, no ORGA with an angle-multiplexed holographic memory that can sufficiently exploit the huge storage capacity of a holographic memory has been reported. Therefore, this paper proposes a novel ORGA with an angle-multiplexed holographic memory. The architecture opens the possibility of providing a million configuration contexts for a multi-context FPGA.
Citations: 0
Minimum implant area-aware gate sizing and placement
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591542
A. Kahng, Hyein Lee
With reduction of minimum feature size, the minimum implant area (MinIA) constraint is emerging as a new challenge for the physical implementation flow in sub-22nm technology. In particular, the MinIA constraint induces a new problem formulation wherein gate sizing and V_t-swapping must now be linked closely with detailed placement changes. To solve this new problem, we propose heuristic methods that fix MinIA violations and reduce power with gate sizing while minimizing placement perturbation to avoid creating extra timing violations. Compared to recent versions of commercial P&R tools, our methodologies achieve significant reductions (up to 100%) in the number of MinIA violations under timing/power constraints.
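A simplified way to picture a MinIA check is sketched below. It assumes, purely for illustration, that an implant region is formed by each contiguous run of same-V_t cells in one placement row and that the area rule reduces to a minimum run width. The function name and the data layout are hypothetical and are not taken from the paper.

```python
from itertools import groupby

def minia_violations(row, min_width):
    """Return (start_index, vt, run_width) for each same-Vt run narrower than min_width.

    row: list of (vt_label, cell_width) tuples, left to right, assumed abutting.
    """
    violations = []
    index = 0
    for vt, run in groupby(row, key=lambda cell: cell[0]):
        cells = list(run)
        width = sum(w for _, w in cells)
        if width < min_width:
            violations.append((index, vt, width))
        index += len(cells)
    return violations

row = [("LVT", 2), ("HVT", 1), ("LVT", 3), ("LVT", 1)]
before = minia_violations(row, min_width=3)

# Swapping the narrow HVT cell to LVT merges the whole row into one run.
swapped = [("LVT", 2), ("LVT", 1), ("LVT", 3), ("LVT", 1)]
after = minia_violations(swapped, min_width=3)
```

In this toy row the V_t swap clears both violations, but it also changes the swapped cell's timing and power, which is why the abstract links V_t-swapping and gate sizing to placement changes.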
Citations: 18
Exploiting heterogeneity in MPSoCs to prevent potential trojan propagation across malicious IPs
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591595
Chen Liu, Chengmo Yang
Multiprocessor System-on-Chip (MPSoC) platforms face some of the most demanding security concerns, as they process, store, and communicate sensitive information using third-party intellectual property (3PIP) cores. The trend of outsourcing design and fabrication strongly questions the assumption of 3PIP components being trustworthy. While existing research focuses on addressing hardware trojans in individual IPs, this paper improves MPSoC security from another perspective. Specifically, our goal is to prevent trojans in malicious IPs from triggering each other and leading to severe system-wide degradation in security and reliability. We propose to impose trojan isolation constraints during static task scheduling, ensuring that all legal communications on the target MPSoC are between IPs of different types. This in turn enables the runtime system to monitor and detect undesired communication paths, if any. We furthermore pose the security-constrained MPSoC task scheduling as a multi-dimensional optimization problem, and solve it through Integer Linear Programming (ILP), thus minimizing the associated performance, power, and hardware overhead. The results show that trojan isolation can be achieved within one extra vendor and nearly no performance overhead.
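The isolation constraint described here can be pictured as a check over a task graph and a task-to-core mapping. The sketch below only validates a given mapping and does not implement the paper's ILP formulation. It assumes that tasks sharing a core do not count as inter-IP communication, and all names in it are hypothetical.

```python
def isolation_violations(task_edges, task_to_core, core_type):
    """List task-graph edges whose endpoints run on different cores of the same type."""
    bad = []
    for src, dst in task_edges:
        c1, c2 = task_to_core[src], task_to_core[dst]
        # Communication between two distinct IPs of the same type is not allowed.
        if c1 != c2 and core_type[c1] == core_type[c2]:
            bad.append((src, dst))
    return bad

edges = [("t0", "t1"), ("t1", "t2"), ("t0", "t2")]
core_type = {"c0": "A", "c1": "B", "c2": "A"}

ok_map = {"t0": "c0", "t1": "c1", "t2": "c0"}    # t0 and t2 share one core
bad_map = {"t0": "c0", "t1": "c1", "t2": "c2"}   # t0 -> t2 links two type-A cores
```

If every scheduled communication passes this check, any traffic observed at runtime between two IPs of the same type is by construction not part of the legal schedule and can be flagged.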
Citations: 6
VLSI systems for neurocomputing and health informatics
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2597168
K. Parhi
Ubiquitous access to computers, cell phones, internet, personal digital devices, cameras and TV can be attributed to advances in very-large-scale integration (VLSI) technology and advances in circuit design that allow circuits to operate at gigahertz rates. One of the mysteries we have not been able to unravel is how the brain works, viewed from different perspectives. Reverse engineering the brain has been identified as one of the grand challenge problems by the National Academies. Advances in sensor technologies and imaging modalities such as electroencephalogram (EEG), intra-cranial electroencephalogram (iEEG), magnetoencephalogram (MEG), and magnetic resonance imaging (MRI) allow us to collect data from hundreds of electrodes from the brain at sample rates ranging from 256 Hz to 15 kHz. These data can be key not only to understanding brain functioning and brain connectivity at macro and micro levels in healthy subjects but also to identifying patients with neurological and mental disorders. Extracting the appropriate biomarkers using spectral-temporal-spatial signal processing approaches and classifying states using machine learning approaches can assist clinicians in predicting and detecting seizures in epileptic patients, and in identifying patients with mental disorders such as schizophrenia, depression and personality disorder. The biomarkers can be tracked to design personalized therapy, and to assess the effectiveness of therapy, by closed-loop drug delivery or closed-loop neuromodulation, i.e., brain stimulation by invasive or non-invasive means using electrical or magnetic stimulation. High-performance VLSI system design is critical not only for increasing the battery life of VLSI chips for neuromodulation but also for reducing computation time by orders of magnitude in analyzing MRI signals. Another grand challenge problem identified by the National Academies is Advanced Health Informatics. Analysis of health data is key to monitoring biomarkers and delivering drugs as needed. VLSI system design for biomarker extraction and disease-state classification is again critical in improving the health and quality of life of human beings. In this talk, I will highlight the emerging opportunities in high-performance low-power VLSI system design for neurocomputing and health informatics at various scales. At macroscale, the goal is to design small low-power implantable or wearable devices that can be used to monitor biomarkers and trigger an alarm signal to warn of an abnormal state of the brain such as an impending seizure. At microscale, extracting thousands of connections from structural and functional MRI can require many hours or even a day for one subject and one set of parameters using parallel computers. The challenge here is to design parallel multicore computer architectures and compiler tools that can reduce the time for microscale analysis of MRI to an hour or less. I will describe research in my group in the use of signal processing and machine learning approaches to identify and track various neurological and mental disorders. I will present results on VLSI design of feature extractors such as power spectral density (PSD) and classifiers such as support vector machines (SVMs). I will illustrate opportunities in embedded system design for health informatics, using diabetic retinopathy screening by fundus image analysis and machine learning as an example. Significant research is needed in this area. My talk will hopefully stimulate further research on embedded VLSI system design for neuro-, bio-, and health informatics in this emerging and important field.
Citations: 1
WriteSmoothing: improving lifetime of non-volatile caches using intra-set wear-leveling
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591525
Sparsh Mittal, J. Vetter, Dong Li
Driven by the trends of increasing core counts and the bandwidth-wall problem, the size of last-level caches (LLCs) has greatly increased. Since SRAM consumes high leakage power, researchers have explored the use of non-volatile memories (NVMs) for designing caches, as they provide high density and consume low leakage power. However, since NVMs have low write endurance and existing cache management policies are unaware of write variation, effective wear-leveling techniques are required to achieve reasonable cache lifetimes with NVMs. We present WriteSmoothing, a technique for mitigating intra-set write variation in NVM caches. WriteSmoothing logically divides the cache sets into multiple modules. For each module, WriteSmoothing records, collectively over the module's sets, the number of writes to each way. It then periodically makes the most frequently written ways in a module unavailable, shifting the write pressure to other ways in the sets of that module. Extensive simulation results show that, on average, for single- and dual-core system configurations, WriteSmoothing improves cache lifetime by 2.17X and 2.75X, respectively. Its implementation overhead is small, and it works well for a wide range of algorithm and system parameters.
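The bookkeeping described in this abstract can be sketched as a small per-module counter structure. This is a toy model under stated assumptions: it disables exactly one way per interval, resets the counters at each interval boundary, and leaves out data migration and replacement-policy details. The class and method names are hypothetical.

```python
class WriteSmoothingModule:
    """Per-module write counters, one per way, shared by all sets in the module."""

    def __init__(self, num_ways):
        self.write_counts = [0] * num_ways
        self.unavailable = None   # way currently withheld from use, if any

    def record_write(self, way):
        """Count a write to `way` in any set of this module."""
        self.write_counts[way] += 1

    def end_interval(self):
        """Withhold the most-written way for the next interval and reset counters."""
        ways = range(len(self.write_counts))
        hottest = max(ways, key=self.write_counts.__getitem__)
        self.unavailable = hottest
        self.write_counts = [0] * len(self.write_counts)
        return hottest

    def usable_ways(self):
        """Ways that may receive writes during the current interval."""
        return [w for w in range(len(self.write_counts)) if w != self.unavailable]

module = WriteSmoothingModule(num_ways=4)
for way in (2, 2, 2, 0, 1):
    module.record_write(way)
hottest = module.end_interval()
```

Sharing one set of counters across a module's sets keeps storage small compared with per-set counters, which is in line with the abstract's claim of low implementation overhead.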
Citations: 27
A hybrid framework for application allocation and scheduling in multicore systems with energy harvesting
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591527
Yi Xiang, S. Pasricha
In this paper, we propose a novel hybrid design-time and run-time framework for allocating and scheduling applications in multi-core embedded systems with solar energy harvesting. Due to limited energy availability at run-time, our framework offloads scheduling complexity to design time by creating energy-efficient schedule templates for varying energy budget levels, which are selected at run-time in a manner that is contingent on the available harvested energy and executed with a lightweight slack reclamation scheme that extracts additional energy savings. Our experimental results show that the proposed framework produces energy-efficient and dependency-aware schedules to execute applications under varying and stringent energy constraints, with 23-40% lower miss rates than in prior works on harvesting energy-aware scheduling.
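The run-time half of this design-time/run-time split can be pictured as a lookup: pick the schedule template with the largest energy budget that the currently harvested energy can cover. The sketch below shows only that selection step. The template representation, the energy units, and the fallback of returning `None` are assumptions, and the slack reclamation scheme is not modeled.

```python
import bisect

def select_template(templates, available_energy):
    """Pick the template with the largest budget not exceeding available_energy.

    templates: list of (energy_budget, schedule) pairs sorted by ascending budget.
    Returns the chosen schedule, or None if even the smallest budget is too large.
    """
    budgets = [budget for budget, _ in templates]
    i = bisect.bisect_right(budgets, available_energy)
    if i == 0:
        return None
    return templates[i - 1][1]

# Hypothetical templates prepared at design time.
templates = [(10, "low-power"), (25, "balanced"), (40, "full-speed")]
choice = select_template(templates, available_energy=30)
```

Because the templates are built offline, the run-time cost is a single binary search, which fits the abstract's point about moving scheduling complexity to design time.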
Citations: 7
A design approach to automatically generate on-chip monitors during high-level synthesis of hardware accelerator
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591521
M. B. Hammouda, P. Coussy, Loïc Lagadec
Embedded systems often implement safety-critical applications, making security an increasingly important aspect of their design. Control-Flow Integrity (CFI) attacks modify program behavior and can expose valuable information, directly or indirectly, by perturbing a system and creating failures. Although CFI attacks are well known in computer systems, they have recently been shown to be practical and feasible on embedded systems as well. In this context, CFI checks are mainly used to detect unintended software behaviors, while very few works address monitoring of non-programmable hardware components. In this paper, we present a hardware-assisted paradigm to enhance embedded system security by detecting and preventing unintended hardware behavior. We propose a design approach that generates on-chip monitors (OCMs) during High-Level Synthesis (HLS) of hardware accelerators (HWacc). OCM synthesis is introduced as a set of steps carried out concurrently with the HLS flow of the HWacc. The automatically generated OCM checks at runtime both the input/output timing behavior and the control flow of the monitored HWacc. Experimental results demonstrate the value of the proposed approach: error coverage on the control flow ranges from 99.75% to 100%, while on average the OCM area overhead is less than 10%, the clock period overhead is at worst less than 5%, and the impact on synthesis time is negligible.
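The control-flow half of such a monitor can be modeled in software as a check of observed state transitions against the set the synthesized controller is allowed to take. The sketch below is a behavioral model only: the state names and the transition table are hypothetical, and it does not represent the paper's generated hardware or its timing checks.

```python
def check_trace(allowed, trace):
    """Return the index of the first illegal transition in trace, or None if all are legal.

    allowed: dict mapping each state to the set of states that may follow it.
    trace: sequence of observed controller states, one per step.
    """
    for i in range(1, len(trace)):
        if trace[i] not in allowed.get(trace[i - 1], ()):
            return i
    return None

# Hypothetical controller FSM for an accelerator.
allowed = {
    "IDLE": {"LOAD"},
    "LOAD": {"EXEC"},
    "EXEC": {"EXEC", "DONE"},
    "DONE": {"IDLE"},
}

good = ["IDLE", "LOAD", "EXEC", "EXEC", "DONE", "IDLE"]
faulty = ["IDLE", "EXEC", "DONE"]   # skips LOAD, e.g. after a fault injection
```

In an HLS setting the transition table would come from the controller that synthesis already builds, which is presumably why the abstract describes monitor generation as running alongside the HLS flow.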
Citations: 5
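The runtime control-flow check described in the on-chip monitor abstract above can be pictured with a small software analogue. This is only a sketch under stated assumptions: the state names, the `ALLOWED` transition table, and the function `first_illegal_transition` are hypothetical and do not come from the paper, which generates hardware monitors during HLS rather than Python code.

```python
# Hypothetical software analogue of a control-flow monitor.
# The FSM, its state names, and all identifiers are illustrative assumptions.
from typing import Dict, Iterable, Optional, Set


def first_illegal_transition(
    allowed: Dict[str, Set[str]],
    trace: Iterable[str],
) -> Optional[int]:
    """Return the index of the first state reached through a transition
    that is not listed in `allowed`, or None if the whole trace is legal."""
    previous = None
    for index, state in enumerate(trace):
        if previous is not None and state not in allowed.get(previous, set()):
            return index
        previous = state
    return None


# Assumed controller of a small accelerator: IDLE -> LOAD -> COMPUTE -> STORE -> IDLE
ALLOWED = {
    "IDLE": {"IDLE", "LOAD"},
    "LOAD": {"COMPUTE"},
    "COMPUTE": {"COMPUTE", "STORE"},
    "STORE": {"IDLE"},
}

good_trace = ["IDLE", "LOAD", "COMPUTE", "COMPUTE", "STORE", "IDLE"]
bad_trace = ["IDLE", "LOAD", "STORE", "IDLE"]  # skips COMPUTE, e.g. after a fault

print(first_illegal_transition(ALLOWED, good_trace))  # None
print(first_illegal_transition(ALLOWED, bad_trace))   # 2
```

A hardware monitor would perform the same comparison every clock cycle against the transitions known from the synthesized controller; the table lookup here only stands in for that comparison.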
Simscape design flow for memristor based programmable oscillators
Pub Date : 2014-05-20 DOI: 10.1145/2591513.2591545
E. Agu, S. Mohanty, E. Kougianos, M. Gautam
This paper proposes a design optimization flow for memristor-based oscillators using the Gravitational Search Algorithm. It presents, for the first time, a memristor behavioral model in the Simscape physical modeling language. Using this model, a memristor-based Wien oscillator is characterized within the Simscape framework, and its oscillation frequency and power consumption are explored for different configurations.
Citations: 5
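To give a feel for why a memristor makes the Wien oscillator in the abstract above programmable, the sketch below evaluates the textbook ideal Wien-bridge frequency, f = 1 / (2π·sqrt(R1·R2·C1·C2)), with the resistors replaced by a memristance value. Everything specific here is an assumption: the linear-drift memristance formula, the `r_on` and `r_off` values, the 10 nF capacitors, and the function names are illustrative and are not taken from the paper, whose Simscape model is not described in the abstract.

```python
import math


def memristance(x, r_on=100.0, r_off=16e3):
    """Memristance in ohms for a normalized state x in [0, 1].

    Assumed linear-drift model: M = r_on * x + r_off * (1 - x).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return r_on * x + r_off * (1.0 - x)


def wien_frequency(r1, r2, c1, c2):
    """Ideal Wien-bridge oscillation frequency in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(r1 * r2 * c1 * c2))


C = 10e-9  # assumed capacitor value (10 nF)
for x in (0.0, 0.5, 1.0):
    m = memristance(x)
    f = wien_frequency(m, m, C, C)
    print(f"x={x:.1f}  M={m:8.1f} ohm  f={f:10.1f} Hz")
```

Lower memristance gives a higher oscillation frequency, so programming the memristor state tunes the oscillator. A design flow like the one the abstract describes would search over such parameters while also accounting for power consumption, which this sketch ignores.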
Journal: ACM Great Lakes Symposium on VLSI