
2016 International Great Lakes Symposium on VLSI (GLSVLSI): Latest Papers

ASIC implementation of an all-digital self-adaptive PVTA variation-aware clock generation system
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903006
J. Perez-Puigdemont, F. Moll
An all-digital self-adaptive clock generation system capable of autonomously adapting the clock frequency to compensate for the effects of static spatially heterogeneous (SSHet) PVTA (process, voltage, temperature, and aging) variations is presented. The design uses time-to-digital converters (TDCs) as delay sensors and a variable-length ring oscillator (VLRO) as the clock generator. The VLRO naturally adapts its frequency to the PVTA variations suffered by its own logic gates, while the TDCs track these variations across the chip and modify the VLRO length to accommodate them. The proposed system has been implemented in a silicon chip using a 65 nm process. The fabricated chip has been used to test the system's adaptive capabilities under SSHet voltage variations. Measurement results show that it effectively adapts the VLRO length, and hence the clock frequency, to the supply voltage variations.
Citations: 2
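The link between ring length and clock frequency in the abstract above can be sketched with the textbook inverter-ring model, f = 1/(2·N·t_d), where N is the number of stages and t_d the per-stage delay. The controller below is a hypothetical illustration, not the authors' circuit: `choose_ring_length`, its `worst_path_delay_s` argument (standing in for the slowest delay reported by the TDC sensors), and all numeric values are assumptions made for this sketch.

```python
def ring_frequency_hz(num_stages: int, stage_delay_s: float) -> float:
    """Textbook inverter-ring model: one period is two trips around the ring."""
    if num_stages < 3 or num_stages % 2 == 0:
        raise ValueError("this sketch assumes an odd ring of at least 3 stages")
    return 1.0 / (2.0 * num_stages * stage_delay_s)


def choose_ring_length(worst_path_delay_s: float, stage_delay_s: float,
                       max_stages: int = 101) -> int:
    """Return the shortest odd ring whose period covers the worst sensed delay.

    worst_path_delay_s is a hypothetical stand-in for the slowest delay
    reported by the TDC sensors; it is not an interface from the paper.
    """
    for n in range(3, max_stages + 1, 2):
        if 2.0 * n * stage_delay_s >= worst_path_delay_s:
            return n
    raise ValueError("no ring length up to max_stages is slow enough")


# Illustrative numbers only: 20 ps per stage, sensed delays of 1.1 ns and 1.3 ns.
nominal_len = choose_ring_length(1.1e-9, 20e-12)
slow_len = choose_ring_length(1.3e-9, 20e-12)
print(nominal_len, ring_frequency_hz(nominal_len, 20e-12))
print(slow_len, ring_frequency_hz(slow_len, 20e-12))
```

In this toy model a slower region elsewhere on the chip (a larger sensed delay) leads to a longer ring and therefore a lower clock frequency, which is the qualitative behavior the abstract describes.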
Optimizing the operating voltage of tunnel FET-based SRAM arrays equipped with read/write assist circuitry
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903031
H. Afzali-Kusha, A. Shafaei, Massoud Pedram
This paper deals with obtaining the minimum operating voltage of memory arrays based on TFET SRAM cells. First, we compare the I-V characteristics of two TFETs and one FDSOI device using SPICE simulations. The results reveal that TFET devices exhibit high ON/OFF current ratios at different power supply voltage levels. This observation suggests a higher stability for SRAM cells based on these devices. Next, the characteristics of 6T SRAM cells implemented using minimum-sized transistors based on these three device structures are compared. The comparison, which considers two TFET cell structures, i.e., inward and outward SRAMs, is performed at different supply voltages. The results for the hold static noise margin show that at low supply voltages (i.e., below 300 mV), the FDSOI SRAM cell cannot hold data, whereas both the inward and outward TFET structures have acceptable noise margins at all supply voltages. Of the two TFET structures, the outward cell is selected because of its higher speed, especially for the write operation. TFET SRAMs suffer from long read access latency at ultra-low supply voltages (e.g., 150 mV). The problem, however, may be overcome by using the negative GND read-assist technique. The results show that for a 32×32 TFET outward SRAM array, the minimum energy consumption may be achieved at a supply voltage of 200 mV with a read access frequency of 1.32 GHz, and the minimum energy-delay product at 300 mV with a read access frequency of 4.55 GHz.
Citations: 0
Leakage power minimization in deep sub-micron technology by exploiting positive slacks of dependent paths
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902991
T. Chakraborty, Santanu Kundu, D. Agrawal, Sanjay Shinde, Jacob Mathews, R. K. James
Leakage power minimization is one of the key aspects of modern multi-million-gate low-power system-on-chip (SoC) design. In the post-timing-closure phase, leakage-in-place-optimization (LIPO) is generally adopted to reduce leakage power by replacing high-leakage cells on the timing data paths with low-leakage cells of the same footprint. Traditional LIPO does not touch the clock network for leakage recovery. This paper investigates the opportunity to further reduce the leakage power of a timing-closed design that has already been leakage-minimized by LIPO, by minimally altering the balanced clock tree. The proposed method, Opportunistic LIPO, borrows unused positive slack from downstream (and/or upstream) paths, which may or may not be in the immediate neighborhood, and provides a “positive skew” (and/or “negative skew”) at the capture (and/or launch) clock edge of the current path. In this way, the proposed scheme creates an opportunity to increase the share of low-leakage cells in the current path. Experimental results on an industry-standard design in 28 nm technology with around 50 million gates, computed within a practical duration (less than 48 hours), show that Opportunistic LIPO achieves a 10-30% improvement in leakage power over traditional LIPO without increasing the number of timing violations and with no significant impact on overall area.
Citations: 1
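The slack-borrowing idea in the abstract above can be illustrated with the standard setup-slack relation, slack = T + t_capture − t_launch − d − t_setup. The sketch below is a simplified model under stated assumptions and is not the Opportunistic LIPO algorithm itself: `TimingPath`, `setup_slack_ns`, `borrow_from_downstream`, and all numbers are hypothetical, and hold-time checks are ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class TimingPath:
    delay_ns: float               # combinational delay from launch flop to capture flop
    launch_clk_ns: float = 0.0    # clock arrival time at the launch flop
    capture_clk_ns: float = 0.0   # clock arrival time at the capture flop


def setup_slack_ns(path: TimingPath, period_ns: float, setup_ns: float = 0.0) -> float:
    """Standard setup slack: time available minus time required."""
    return (period_ns + path.capture_clk_ns - path.launch_clk_ns
            - path.delay_ns - setup_ns)


def borrow_from_downstream(current: TimingPath, downstream: TimingPath,
                           period_ns: float, amount_ns: float) -> None:
    """Delay the clock of the flop shared by both paths by amount_ns.

    That flop captures the current path and launches the downstream path,
    so the current path gains slack and the downstream path loses the same amount.
    """
    if amount_ns > setup_slack_ns(downstream, period_ns):
        raise ValueError("downstream path does not have that much positive slack")
    current.capture_clk_ns += amount_ns
    downstream.launch_clk_ns += amount_ns


# Illustrative numbers only: 1 ns clock, a tight path followed by a relaxed one.
period = 1.0
current = TimingPath(delay_ns=0.95)
downstream = TimingPath(delay_ns=0.60)
borrow_from_downstream(current, downstream, period, 0.20)
print(setup_slack_ns(current, period), setup_slack_ns(downstream, period))
```

In this example the current path's slack grows from 0.05 ns to about 0.25 ns while the downstream path keeps about 0.20 ns. The extra margin on the current path is what would allow a slower, lower-leakage cell to be swapped in without creating a setup violation.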
A novel on-chip impedance calibration method for LPDDR4 interface between DRAM and AP/SoC
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902982
Yongsuk Choi, Yong-Bin Kim
In this paper, a novel on-chip impedance calibration methodology for an LPDDR4 (low power double data rate) application is proposed. The background calibration compensates for mismatches and variations of the output NMOS drivers caused by process and temperature variations. The impedance matching concept uses process and temperature monitoring sensors located close to the DQ pins to detect output driver transistor mismatches due to process and temperature variations. In addition, digitized sensor outputs from ADCs are used as inputs to look-up tables, which control the calibration codes of the transmitter driver. The proposed circuitry is designed with a DRAM bidirectional transceiver and implemented in a standard 180 nm CMOS technology, and the impedance calibration technique is demonstrated with external termination resistances of 40, 48, 60, 80, 120, and 240 Ω. At the receiver end, a PMOS input sense amplifier is designed considering the required common-mode range for the LVSTL (low voltage swing termination logic) signal interface, and an adaptive gain control scheme is also applied to the receiver design. The process sensor is used to control the gain factor of the receiver. The active area of the transmitter, including the power ring, is 14.4 mm², with the proposed calibration circuit adding an overhead of only 0.48 mm².
Citations: 4
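The termination values listed in the abstract above (40, 48, 60, 80, 120, and 240 Ω) are all integer fractions of 240 Ω. A small sketch of that arithmetic follows, assuming a driver built from identical parallel 240 Ω legs; this leg-based topology and the names `RZQ_OHMS` and `legs_for_target` are assumptions made for illustration and are not stated in the abstract.

```python
RZQ_OHMS = 240.0  # assumed per-leg reference resistance for this sketch


def legs_for_target(target_ohms: float, leg_ohms: float = RZQ_OHMS) -> int:
    """Number of identical parallel legs whose combined resistance equals the target."""
    legs = leg_ohms / target_ohms
    if abs(legs - round(legs)) > 1e-9:
        raise ValueError("target is not an integer fraction of the leg resistance")
    return round(legs)


# Termination values quoted in the abstract.
targets = [40, 48, 60, 80, 120, 240]
legs = {t: legs_for_target(t) for t in targets}
print(legs)
```

Under this assumption, calibration amounts to trimming each leg toward the reference value so that enabling n legs yields 240/n Ω across process and temperature corners.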
Computing complex functions using factorization in unipolar stochastic logic
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902999
Yin Liu, K. Parhi
This paper addresses computing complex functions using unipolar stochastic logic. Stochastic computing requires only simple logic gates and is inherently fault-tolerant. Thus, these structures are well suited for nanoscale CMOS technologies. Stochastic implementations of complex functions have extremely low hardware complexity compared to traditional two's complement implementations. In this paper, an approach based on polynomial factorization is proposed to compute functions in unipolar stochastic logic. In this approach, functions are expressed using polynomials, which are derived from Taylor expansion or Lagrange interpolation. Polynomials are implemented in stochastic logic by using factorization. Experimental results in terms of accuracy and hardware complexity are presented to compare the proposed designs of complex functions with previous implementations using Bernstein polynomials.
Citations: 6
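To make the unipolar encoding in the abstract above concrete: a value x in [0, 1] is represented by a bitstream whose fraction of ones is x, an AND gate multiplies two independent streams, and a NOT gate computes 1 − x. The software simulation below evaluates a polynomial already written in factored form, p(x) = ∏(1 − aᵢ·x), using one NAND per factor. It is a minimal sketch, not necessarily the authors' circuit; the function names, the example coefficients, and the use of freshly generated independent streams for each factor are assumptions.

```python
import random


def bitstream(p: float, length: int, rng: random.Random) -> list:
    """Unipolar stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def estimate(stream: list) -> float:
    """Decode a unipolar stream as its fraction of ones."""
    return sum(stream) / len(stream)


def factor_stream(a: float, x: float, length: int, rng: random.Random) -> list:
    """Stream for (1 - a*x): a NAND of independent streams for a and x."""
    sa = bitstream(a, length, rng)
    sx = bitstream(x, length, rng)
    return [1 - (ba & bx) for ba, bx in zip(sa, sx)]


def factored_polynomial(coeffs: list, x: float,
                        length: int = 100_000, seed: int = 0) -> float:
    """Estimate prod(1 - a*x for a in coeffs) by ANDing the factor streams."""
    rng = random.Random(seed)
    out = [1] * length
    for a in coeffs:
        out = [o & f for o, f in zip(out, factor_stream(a, x, length, rng))]
    return estimate(out)


# Hypothetical example: p(x) = (1 - 0.5x)(1 - 0.25x) evaluated at x = 0.6.
approx = factored_polynomial([0.5, 0.25], 0.6)
exact = (1 - 0.5 * 0.6) * (1 - 0.25 * 0.6)
print(approx, exact)
```

The estimate converges to the exact value (0.595 here) as the stream length grows, with an error that shrinks roughly as the inverse square root of the length. Each factor needs its own decorrelated copy of the x stream, since ANDing correlated streams does not compute a product.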
High-speed polynomial multiplier architecture for ring-LWE based public key cryptosystems
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902969
Chaohui Du, Guoqiang Bai, Xingjun Wu
Many lattice-based cryptosystems are based on the security of the ring learning with errors (Ring-LWE) problem. The most critical and computationally intensive operation of these Ring-LWE based cryptosystems is polynomial multiplication. In this paper, we exploit the number theoretic transform to build a high-speed polynomial multiplier for Ring-LWE based public key cryptosystems. We present a versatile pipelined polynomial multiplication architecture that calculates the product of two n-degree polynomials in about ((n lg n)/4 + n/2) clock cycles. In addition, we introduce several optimization techniques to reduce the required ROM storage. Experimental results on a Spartan-6 FPGA show that the proposed hardware architecture achieves an average speedup of 2.25× over the state-of-the-art high-speed design. Meanwhile, our design saves up to 47.06% of the memory blocks.
Citations: 22
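The number theoretic transform (NTT) approach named in the abstract above can be sketched in software as multiplication in Z_q[x]/(xⁿ + 1): pre-scale the inputs by powers of a primitive 2n-th root of unity ψ, transform, multiply pointwise, transform back, and undo the scaling. This is a textbook formulation with toy parameters (n = 4, q = 17, ψ = 2), not the paper's pipelined hardware architecture; the function names and parameters are assumptions. The helper `approx_cycles` simply evaluates the cycle estimate quoted in the abstract.

```python
def ntt(a, omega, q):
    """Recursive radix-2 NTT: returns [sum_j a[j] * omega^(j*k) mod q] for each k."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out


def negacyclic_multiply(a, b, psi, q):
    """Multiply in Z_q[x]/(x^n + 1); psi is a primitive 2n-th root of unity, q prime."""
    n = len(a)
    omega = psi * psi % q
    psi_inv = pow(psi, q - 2, q)   # modular inverses via Fermat's little theorem
    n_inv = pow(n, q - 2, q)
    psi_pows = [pow(psi, i, q) for i in range(n)]
    a_hat = ntt([x * p % q for x, p in zip(a, psi_pows)], omega, q)
    b_hat = ntt([x * p % q for x, p in zip(b, psi_pows)], omega, q)
    c_hat = [x * y % q for x, y in zip(a_hat, b_hat)]
    c_scaled = ntt(c_hat, pow(omega, q - 2, q), q)  # inverse transform, still times n
    return [c * n_inv % q * pow(psi_inv, i, q) % q for i, c in enumerate(c_scaled)]


def schoolbook_negacyclic(a, b, q):
    """O(n^2) reference: terms that wrap past x^n pick up a minus sign."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sign = 1 if i + j < n else -1
            c[(i + j) % n] = (c[(i + j) % n] + sign * a[i] * b[j]) % q
    return c


def approx_cycles(n):
    """Cycle estimate quoted in the abstract, (n lg n)/4 + n/2, for n a power of two."""
    lg_n = n.bit_length() - 1
    return n * lg_n // 4 + n // 2


# Toy parameters: n = 4, q = 17, psi = 2 (2^4 = 16 = -1 mod 17).
product = negacyclic_multiply([1, 2, 3, 4], [5, 6, 7, 8], psi=2, q=17)
print(product, approx_cycles(256))
```

For n = 256, a size often used in Ring-LWE parameter sets, the quoted formula evaluates to 256·8/4 + 128 = 640 cycles.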
VLSI design methods for low power embedded encryption
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2902963
I. Verbauwhede
Intelligent things, medical devices, vehicles and factories, all part of cyber-physical systems, will only be secure if we can build devices that can perform the mathematically demanding cryptographic operations in an efficient way. Unfortunately, many of these devices operate under extremely limited power, energy and area constraints. Yet we expect that they can execute, often in real time, the symmetric key, public key and/or hash functions needed for the application. At the same time, we require that the implementations also be secure against a wide range of physical attacks. This presentation will focus on the design methods to realize cryptographic operations on resource-constrained devices. To reach the extremely low power, energy and area budgets, we need to consider in an integrated way the protocols, the algorithms, the architectures and the circuit aspects of the application. These concepts will be illustrated with the design of several cryptographic co-processors suitable for implementation in embedded contexts.
Citations: 5
Modeling and study of two-BDT-nanostructure based sequential logic circuits
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903001
P. Marthi, Sheikh Rufsan Reza, N. Hossain, J. Millithaler, M. Margala, I. Íñiguez-de-la-Torre, J. Mateos, T. González
In this paper, a study of different digital logic circuits developed using a two-BDT ballistic nanostructure is presented. A new D flip-flop (DFF) based on the same nanostructure is also proposed. The logic structure comprises two ballistic deflection transistors (BDTs), which are experimentally proven to operate at terahertz frequencies. The non-linear behavior of the BDT's transfer characteristic has been reproduced by means of Monte Carlo (MC) simulations, with specific attention devoted to surface charges. An analytical model built on the results of advanced MC simulations has been integrated into a behavioral Verilog-AMS module to confirm the functionality of the circuit design. The module is used to analyze the operating conditions of different combinational circuits and to investigate the feasibility of a DFF design using the BDT nanostructure. The simulation results indicate successful operation of both combinational and sequential circuits developed using the two-BDT logic structure under proper biasing of the gate and source terminals. The operating voltages of the proposed DFF are estimated to be ±225 mV.
Citations: 2
A clockless sequential PUF with autonomous majority voting
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903029
Xiaolin Xu, Daniel E. Holcomb
Physical unclonable functions (PUFs) leverage minute silicon process variations to produce device-tied secret keys. The energy and area costs of creating keys from PUFs can far exceed the costs of the basic PUF circuits alone. Minimizing the end-to-end cost of reliable key generation is critical to enable broader adoption of PUFs. In this work, we introduce a new style of PUF that employs autonomous majority voting to improve reliability. The novelty of this design, and the source of its efficiency, is that the inherently sequential majority voting procedure is carried out by a self-timed circuit without orchestration by a global clock. We use circuit simulation to evaluate the energy versus reliability tradeoffs achieved by different parameterizations of the design, to show that the design performs well across a range of supply voltages, and to quantify the robustness of the design across a broad range of operating temperatures.
Citations: 11
Security primitive design with nanoscale devices: A case study with resistive RAM
Pub Date : 2016-05-18 DOI: 10.1145/2902961.2903042
Robert Karam, Rui Liu, Pai-Yu Chen, Shimeng Yu, S. Bhunia
Inherent stochastic physical mechanisms in emerging nonvolatile memories (NVMs), such as resistive random-access memory (RRAM), have recently been explored for hardware security applications. Unlike conventional silicon Physical Unclonable Functions (PUFs), which are based solely on manufacturing process variation, RRAM has intrinsic randomness in its physical mechanisms that can be used as entropy sources: for instance, resistance variation, random telegraph noise, and probabilistic switching behavior. This paper reviews the challenges and opportunities in building security primitives with emerging devices. In particular, it presents research progress on RRAM-based hardware security primitives, including PUFs and True Random Number Generators (TRNGs).
Citations: 11