2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC): Latest Papers

Hardware Trojan detection through golden chip-free statistical side-channel fingerprinting
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593147
Yu Liu, K. Huang, Y. Makris
Statistical side channel fingerprinting is a popular hardware Trojan detection method, wherein a parametric signature of a chip is collected and compared to a trusted region in a multi-dimensional space. This trusted region is statistically established so that, despite the uncertainty incurred by process variations, the fingerprint of Trojan-free chips is expected to fall within this region while the fingerprint of Trojan-infested chips is expected to fall outside. Learning this trusted region, however, assumes availability of a small set of trusted (i.e. “golden”) chips. Herein, we rescind this assumption and we demonstrate that an almost equally effective trusted region can be learned through a combination of a trusted simulation model, measurements from process control monitors (PCMs) which are typically present either on die or on wafer kerf, and advanced statistical tail modeling techniques. Effectiveness of this method is evaluated using silicon measurements from two hardware Trojan-infested versions of a wireless cryptographic integrated circuit.
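The idea of a statistically learned trusted region can be illustrated with a minimal sketch. It assumes two-dimensional fingerprints and swaps the paper's simulation model, PCM measurements, and tail modeling for a much simpler stand-in: an elliptical region fitted to synthetic Trojan-free samples with a squared Mahalanobis distance threshold. The function names, the synthetic generator, and the 99% coverage value are all hypothetical and are not taken from the paper.

```python
import random


def squared_distance(point, region):
    """Squared Mahalanobis distance of a 2-D point from the region centre."""
    dx = point[0] - region["mean"][0]
    dy = point[1] - region["mean"][1]
    a, b, c = region["inv_cov"]  # entries of the symmetric inverse covariance
    return a * dx * dx + 2.0 * b * dx * dy + c * dy * dy


def fit_trusted_region(fingerprints, coverage=0.99):
    """Fit an elliptical trusted region to 2-D Trojan-free fingerprints."""
    n = len(fingerprints)
    mean_x = sum(x for x, _ in fingerprints) / n
    mean_y = sum(y for _, y in fingerprints) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in fingerprints) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in fingerprints) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in fingerprints) / (n - 1)
    det = var_x * var_y - cov_xy ** 2
    region = {
        "mean": (mean_x, mean_y),
        "inv_cov": (var_y / det, -cov_xy / det, var_x / det),
    }
    # Empirical threshold: the distance that covers the requested share of samples.
    distances = sorted(squared_distance(p, region) for p in fingerprints)
    region["threshold"] = distances[min(n - 1, int(coverage * n))]
    return region


def is_trusted(point, region):
    """True if the fingerprint falls inside the trusted region."""
    return squared_distance(point, region) <= region["threshold"]


def simulate_fingerprints(count, seed=0):
    """Hypothetical stand-in for a trusted simulation model (synthetic data)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        shift = rng.gauss(0.0, 1.0)  # shared process-variation component
        samples.append((10.0 + shift + rng.gauss(0.0, 0.2),
                        5.0 + 0.5 * shift + rng.gauss(0.0, 0.2)))
    return samples
```

A fingerprint near the centre of the simulated population is accepted, while one that breaks the correlation between the two parameters is rejected even if each value alone looks plausible.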
Citations: 97
Software only, extremely compact, Keccak-based secure PRNG on ARM Cortex-M
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593218
A. V. Herrewege, I. Verbauwhede
The ability to generate secure random numbers is fundamental to the security of cryptographic protocols. Random number generators (RNGs) have started to appear in recent Intel CPUs used in desktops and servers. Solutions for embedded devices, such as sensor nodes and wireless routers, are still severely lacking, however. In this paper we present the implementation of a secure pseudo-random number generator (PRNG) for the ARM Cortex-M microcontroller family, one of the most popular embedded platforms at this moment. For compactness and compatibility reasons, our implementation is software only. It uses the start-up values of on-chip SRAM as a random seed and uses the KECCAK hash function for both entropy extraction and pseudo-random number generation. Getting KECCAK very compact in terms of memory requirements is therefore essential. KECCAK is a tunable algorithm: in this paper we discuss the minimum security requirements and the storage costs as a function of the KECCAK variant. The KECCAK permutation of our choice, KECCAK-f[200], is implemented in only 400 bytes. To the best of our knowledge, this is the smallest KECCAK implementation published so far. With the addition of initialization, hashing, padding and output generation functions, our complete solution fits within 496 bytes of ROM and requires 52 bytes of RAM. One byte of pseudo-random data, with a security level of at least 128 bits, can be generated in 3337 cycles on an ARM Cortex-M3/4, i.e. 50 KiB/s on a development board, plenty fast for a cryptographic PRNG in an embedded setting.
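The relation between the quoted cycle count and the quoted throughput can be checked with a few lines of arithmetic. The abstract does not state the clock frequency of the development board, so the 168 MHz value used below is purely an assumed example; only the 3337 cycles per byte and the 50 KiB/s figure come from the text.

```python
CYCLES_PER_BYTE = 3337  # figure quoted in the abstract


def throughput_kib_per_s(clock_hz, cycles_per_byte=CYCLES_PER_BYTE):
    """Output rate in KiB/s for a core running at clock_hz."""
    return clock_hz / cycles_per_byte / 1024


def clock_for_throughput(kib_per_s, cycles_per_byte=CYCLES_PER_BYTE):
    """Clock frequency in Hz needed to sustain the given output rate."""
    return kib_per_s * 1024 * cycles_per_byte


# Assumed example clock; the abstract does not name the board or its frequency.
ASSUMED_CLOCK_HZ = 168_000_000
rate = throughput_kib_per_s(ASSUMED_CLOCK_HZ)
required_clock = clock_for_throughput(50)
```

Under that assumption the rate comes out slightly above 49 KiB/s, and sustaining exactly 50 KiB/s would need a clock of roughly 171 MHz, so the quoted throughput is consistent with a Cortex-M4 class board clocked in that range.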
Citations: 12
Content-centric display energy management for mobile devices
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593113
Dongwon Kim, Nohyun Jung, H. Cha
While active studies have been conducted to reduce the power consumption of display-related components of mobile devices, previous work has rarely addressed the issue without degrading graphical quality. In this paper, we propose an effective scheme to reduce display energy consumption without compromising user experience. We first define a metric called the content rate, from which an appropriate refresh rate is determined for displaying content. The proposed system then sets an optimal refresh rate based on the content rate. Extensive experiments demonstrate that our system effectively reduces total power consumption in commercial smartphones while display quality is satisfactorily maintained.
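A minimal sketch of the content-rate idea follows. The abstract does not define how the content rate is measured or how it maps to a refresh rate, so both functions are assumptions: the content rate is taken to be the number of visible content changes per second, and the refresh rate is the lowest panel-supported rate that is not below it. The list of supported rates is hypothetical.

```python
def content_rate(frame_signatures, duration_s):
    """Content changes per second, counting frames that differ from their predecessor."""
    changes = sum(
        1 for prev, cur in zip(frame_signatures, frame_signatures[1:]) if prev != cur
    )
    return changes / duration_s


def select_refresh_rate(content_rate_fps, supported_rates_hz):
    """Lowest supported refresh rate that still covers the content rate."""
    for rate in sorted(supported_rates_hz):
        if rate >= content_rate_fps:
            return rate
    # Content changes faster than the panel can refresh: use the maximum.
    return max(supported_rates_hz)


# Hypothetical panel capabilities, for illustration only.
SUPPORTED_RATES_HZ = [20, 30, 40, 60]
```

With these assumed rates, content that changes 24 times per second would be shown at 30 Hz instead of 60 Hz, which is where the energy saving would come from.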
Citations: 26
Circuit camouflage integration for hardware IP protection
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2602554
Ron Cocchi, J. Baukus, Lap-Wai Chow, B. Wang
Circuit camouflage technologies can be integrated into standard logic cell developments using traditional CAD tools. Camouflaged logic cells are integrated into a typical design flow using standard front end and back end models. Camouflaged logic cells obfuscate a circuit's function by introducing subtle cell design changes at the GDS level. The logic function of a camouflaged logic cell is extremely difficult to determine through silicon imaging analysis, which prevents netlist extraction, cloning, and counterfeiting. The application of circuit camouflage as part of a customer's design flow can protect hardware IP from reverse engineering. Camouflage fill techniques further inhibit Trojan circuit insertion by completely filling the design with realistic circuitry that does not affect the primary design function. All unused silicon appears to be functional circuitry, so an attacker cannot find space to insert a Trojan circuit. The integration of circuit camouflage techniques is compatible with standard chip design flows and EDA tools, and ICs using such techniques have been successfully employed in high-attack commercial and government segments. Protected under issued and pending patents.
Citations: 120
Functional ECO using metal-configurable gate-array spare cells
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593145
Hua-Yu Chang, I. Jiang, Yao-Wen Chang
Metal-configurable gate-array spare cells, which have versatile functionality, are developed to overcome the inflexibility of standard spare cells used in conventional metal-only engineering change order (ECO). In this paper, we focus on functional ECO optimization using the new type of spare cells to fully exploit their strengths. We observe that this functional ECO problem involves dynamic logical and physical costs for selecting spare gate arrays. Unlike existing functional ECO work, which performs technology mapping based on ECO patches, we perform reverse mapping from spare gate arrays to handle these dynamic costs. We devise a spare array relation graph to record geometrical adjacency among spare gate arrays and interleave it with the and-inverter network of ECO patches. To avoid redundant traversal and monitor the dynamic costs, we adopt A* search to simultaneously traverse and map between the logical ECO network and the physical spare array relation graph.
Citations: 1
OD3P: On-Demand Page Paired PCM
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593166
Marjan Asadinia, M. Arjomand, H. Sarbazi-Azad
With current memory scalability challenges, Phase Change Memory (PCM) is viewed as an attractive replacement for DRAM. The primary concern for PCM applicability is its limited write endurance, which is highly affected by process variation in the nanometer regime. This increases the variation in cell lifetime, resulting in an early and sudden reduction in main memory capacity due to wear-out of a few cells. When some memory pages reach their endurance limits, other pages may be far from their limits even with perfect wear-leveling. Recent studies have proposed redirection or correction schemes to alleviate this problem, but all suffer from poor throughput or latency. In contrast, we present On-Demand Page Paired PCM (OD3P), a technique that mitigates the problem of fast failure of pages by redirecting them onto other healthy pages, leading to gradual capacity degradation. Compared to a state-of-the-art error correction scheme for PCM, our experiments indicate that OD3P can improve PCM time-to-failure and system performance (IPC) by 12% and 14%, respectively, under multi-threaded and multi-programmed workloads.
Citations: 9
State-restrict MLC STT-RAM designs for high-reliable high-performance memory system
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593220
Wujie Wen, Yaojun Zhang, Mengjie Mao, Yiran Chen
Multi-level Cell Spin-Transfer Torque Random Access Memory (MLC STT-RAM) is a promising nonvolatile memory technology for high-capacity and high-performance applications. However, reliability concerns and the complicated access mechanism greatly hinder the application of MLC STT-RAM. In this work, we develop a holistic solution set, namely state-restrict MLC STT-RAM (SR-MLC STT-RAM), to improve the data integrity and performance of MLC STT-RAM with minimal information density degradation. Three techniques, state restriction (StatRes), error pattern removal (ErrPR), and ternary coding (TerCode), are proposed at circuit level to reduce the read and write errors of MLC STT-RAM cells. A state pre-recovery (PreREC) technique is also developed at architecture level to improve the access performance of SR-MLC STT-RAM by eliminating unnecessary two-step write operations. Our simulations show that, compared to conventional MLC STT-RAM, SR-MLC STT-RAM can enhance the write and read reliability of memory cells by 10× to 10,000×, allowing the application of simple error correction code schemes. Compared to single-level-cell (SLC) STT-RAM, an SR-MLC STT-RAM based cache design can boost system performance by 6.2% on average by leveraging the increased cache capacity at the same area and the improved write latency.
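The density cost of restricting a cell to three states can be estimated with a short sketch. The abstract does not describe the actual TerCode mapping, so the packing below is only an assumed illustration: a three-state cell can hold at most log2(3) ≈ 1.585 bits, and a simple scheme that stores 3 bits in 2 ternary cells reaches 1.5 bits per cell, compared with 2 bits for an unrestricted four-state MLC and 1 bit for SLC. The function names are hypothetical.

```python
import math

# Upper bound on information per cell when only three states are allowed.
TERNARY_BITS_PER_CELL = math.log2(3)


def encode_3bits(value):
    """Map a 3-bit value (0..7) onto two ternary cells as base-3 digits."""
    if not 0 <= value < 8:
        raise ValueError("value must fit in 3 bits")
    return divmod(value, 3)  # (high digit, low digit), each in 0..2


def decode_cells(cells):
    """Recover the 3-bit value from two ternary cells."""
    high, low = cells
    value = high * 3 + low
    if value >= 8:
        raise ValueError("cell pair (2, 2) is unused in this sketch")
    return value


# Bits stored per cell by this illustrative packing: 3 bits over 2 cells.
PACKED_BITS_PER_CELL = 3 / 2
```

Nine cell-pair combinations exist but only eight are used, which is why this packing stays slightly under the log2(3) bound.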
Citations: 41
Reducing latency in an SRAM/DRAM cache hierarchy via a novel Tag-Cache architecture
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593197
F. Hameed, L. Bauer, J. Henkel
Memory speed has become a major performance bottleneck as more and more cores are integrated on a multi-core chip. The widening latency gap between high speed cores and memory has led to the evolution of multi-level SRAM/DRAM cache hierarchies that exploit the latency benefits of smaller caches (e.g. private L1 and L2 SRAM caches) and the capacity benefits of larger caches (e.g. shared L3 SRAM and shared L4 DRAM cache). The main problem of employing large L3/L4 caches is their high tag lookup latency. To solve this problem, we introduce the novel concept of small and low latency SRAM/DRAM Tag-Cache structures that can quickly determine whether an access to the large L3/L4 caches will be a hit or a miss. The performance of the proposed Tag-Cache architecture depends upon the Tag-Cache hit rate, and to improve it we propose a novel Tag-Cache insertion policy and a DRAM row buffer mapping policy that reduce the latency of memory requests. For a 16-core system, this improves the average harmonic mean instructions-per-cycle throughput of latency-sensitive applications by 13.3% compared to the state of the art.
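The role of a small tag store in front of a large cache can be sketched as follows. This is an assumed, simplified model and not the paper's design: it keeps the tags of a few recently used cache sets and evicts with plain LRU, whereas the paper proposes its own insertion policy and a DRAM row buffer mapping that are not described in the abstract. The class and method names are hypothetical.

```python
from collections import OrderedDict


class TagCache:
    """Small LRU store holding the resident tags of recently used cache sets."""

    def __init__(self, capacity_sets):
        self.capacity_sets = capacity_sets
        self.entries = OrderedDict()  # set index -> frozenset of resident tags

    def fill(self, set_index, resident_tags):
        """Record the tags of one set of the large cache, evicting the LRU set."""
        self.entries[set_index] = frozenset(resident_tags)
        self.entries.move_to_end(set_index)
        if len(self.entries) > self.capacity_sets:
            self.entries.popitem(last=False)

    def probe(self, set_index, tag):
        """Return True (hit), False (miss), or None if the set is not tracked.

        None means the slow tag lookup in the large cache is still required.
        """
        tags = self.entries.get(set_index)
        if tags is None:
            return None
        self.entries.move_to_end(set_index)  # mark the set as recently used
        return tag in tags
```

A probe that returns True or False settles the hit-or-miss question without touching the large cache's tag array, so the benefit of such a structure depends directly on how often the probed set is tracked.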
Citations: 19
Throughput optimization for SADP and e-beam based manufacturing of 1D layout
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593233
Yixiao Ding, C. Chu, Wai-Kei Mak
Due to the resolution limitations of optical lithography equipment, 1D gridded layout design is gaining steam. Self-aligned double patterning (SADP) is a mature technology for printing 1D layouts. However, for 20nm and beyond, SADP using a single trim mask becomes insufficient for printing all 1D layouts. A viable solution is to complement SADP with e-beam lithography. In this paper, in order to increase the throughput of printing a 1D layout, we consider the problem of e-beam shot count minimization subject to bounded line end extension constraints. Two different approaches to using the trim mask and e-beam to print a layout are considered. The first approach assumes that the trim mask and e-beam are used for end cutting. The second assumes that the trim mask and e-beam are used to remove all unnecessary portions. We propose elegant ILP formulations for both approaches. Experimental results show that both ILP formulations can be solved efficiently. The pros and cons of the two approaches for manufacturing 1D layouts are discussed.
Citations: 39
Secure memristor-based main memory
Pub Date : 2014-06-01 DOI: 10.1145/2593069.2593212
Sachhidh Kannan, Naghmeh Karimi, O. Sinanoglu
Non-volatile memory devices such as phase change memories and memristors are promising alternatives to SRAM and DRAM main memories, as they provide higher density and improved energy efficiency. However, non-volatile main memories (NVMM) introduce security vulnerabilities. Sensitive data such as passwords and keys residing in the NVMM will persist and can be probed after power down. We propose sneak-path encryption (SPE) for memristor-based NVMM. SPE exploits the physical parameters, multi-level cell (MLC) capability and the sneak paths in cross-bar memories to encrypt the data stored in memristor-based NVMM. We investigate three attacks on NVMMs and show the resilience of SPE against them. We use a cycle-accurate simulator to evaluate the security and performance impact of SPE-based NVMM. SPE can secure the NVMM with a latency of 16 cycles and ~1.5% performance overhead.
Citations: 11