
2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI): Latest Articles

Adaptive Transceiver for Wireless NoC to Enhance Multicast/Unicast Communication Scenarios
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00111
Joel Ortiz Sosa, O. Sentieys, C. Roland
Wireless Network-on-Chip (WiNoC) is a viable solution to overcome critical bottlenecks in the on-chip communication backbone. However, standard WiNoC approaches are vulnerable to multi-path interference introduced by on-chip physical structures. To overcome this parasitic phenomenon, this paper presents an adaptive digital transceiver that enhances communication reliability under different wireless channel configurations. Based on a semi-realistic wireless channel model, we investigate the impact of several channel correction techniques. Experimental results show that our approach significantly improves Bit Error Rate (BER) under different wireless channel configurations. Moreover, our adaptive transceiver allows wireless communication links to be established in conditions where this would not be possible with standard transceiver architectures. The proposed architecture, designed in a 28-nm FDSOI technology, consumes only 3.27 mW at a data rate of 10 Gbit/s and has a very small area footprint.
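The power and data-rate figures in this abstract can be turned into an energy-per-bit number, which is a common way to compare transceivers. The following is a minimal sketch of that arithmetic only; the function name `energy_per_bit_pj` is hypothetical and does not come from the paper.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy per transmitted bit in picojoules.

    mW / (Gbit/s) = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = pJ/bit.
    """
    return power_mw / data_rate_gbps


if __name__ == "__main__":
    # Figures quoted in the abstract: 3.27 mW at 10 Gbit/s.
    print(f"{energy_per_bit_pj(3.27, 10.0):.3f} pJ/bit")  # prints "0.327 pJ/bit"
```

This counts only the reported transceiver power at the stated data rate; any other contribution to link energy would have to be added separately.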
Citations: 6
Micro-electrode-dot Array Based Biochips: Advantages of Using Different Shaped CMAs
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00061
Pampa Howladar, P. Roy, H. Rahaman
The recent emergence of micro-electrode-dot array (MEDA) based digital microfluidic biochips has facilitated major improvements in microfluidic operations for conventional lab-on-chip devices. MEDA based digital microfluidic biochips typically consist of a 2D planar array of cells of square microelectrodes. One of the most critical issues in biochip layout design is droplet transportation within the 2D layout. MEDA allows dynamic routing of droplets of variable shape and size. Microelectrode cells are dynamically grouped together to form a configured microelectrode array (CMA) for droplets of variable shape and size. The existing square CMA is highly suitable for droplets that need an exactly rectangular or square area, but it may not be as effective for rhombus or hexagonal areas occupied by droplets, because the additional area occupied could otherwise be used by other droplets to accommodate their shortest paths. In this paper, we present MEDA layout designs of a 2D planar array of cells that use triangular microelectrodes arranged in a regular pattern. This allows different droplet shapes to be formed more efficiently, specifically rhombus or hexagonal droplets. MEDA routing operations have been conducted on the newly proposed MEDA layout (using variable shaped electrodes), followed by analysis and comparison with the existing MEDA layout design. Finally, droplet transportation with standard benchmarks is mapped to demonstrate how well the proposed designs can improve the performance of bioassay execution in a MEDA based DMFB layout.
Citations: 2
Message from the Technical Program Chairs
Pub Date : 2019-07-01 DOI: 10.1109/SCORED.2010.5703958
M. Ismail, Zaini Jamaludin
The 14th IEEE International Conference on Autonomic Computing (ICAC 2017) will take place in Columbus, Ohio, USA. ICAC is the premier forum for research on autonomic computing techniques, foundations and applications. In 2017, ICAC will feature 4 keynote speakers, one for each day of the conference. On July 17, Dr. Nelly Bencomo from Aston University will champion the role of autonomic methods to create models of software interactions at runtime. Mr. Mark Patton from the Columbus Smart Cities Initiative will discuss integrated data exchanges, data APIs and city-scale applications in America’s Smart City on July 18. Dr. Ragunathan Rajkumar from Carnegie Mellon will talk about his breakthrough work on autonomous vehicles, especially self-driving cars, on July 19. On July 20, Dr. John Wilkes will talk about building Google’s warehouse-scale computers that manage themselves.
Citations: 0
SRAM On-Chip Monitoring Methodology for Energy Efficient Memory Operation at Near Threshold Voltage
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00035
Taehwan Kim, Kwangok Jeong, Taewhan Kim, Kyumyung Choi
Low-power design using near-threshold voltage (NTV) operation is very attractive because it considerably mitigates the sharp increase in power dissipation. However, one key barrier to NTV operation is the significant increase in SRAM failures. In this work, we propose an on-chip SRAM monitoring methodology that accurately predicts the minimum voltage, Vddmin, on each die that does not cause SRAM failure under a target confidence level. Specifically, we propose an SRAM monitor on which we measure the maximum voltage, Vfail, that causes functional failure of that monitor. Then, we propose a novel methodology for inferring the SRAM Vddmin of each die from the measured Vfail of the SRAM monitor on the same die: a Vfail-Vddmin correlation table is built during the design infrastructure development phase, and Vddmin is derived directly from the measured Vfail by referencing the correlation table during the silicon production phase. Through experiments, we confirm that our proposed methodology saves 7.43% leakage power, 3.98% read energy, and 4.06% write energy in the SRAM bitcell array compared with applying a uniform minimum voltage to all dies, while meeting the same yield constraint.
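The two-phase flow described in this abstract (build a Vfail-Vddmin correlation table once, then look up Vddmin per die from its measured Vfail) can be sketched as a simple table lookup. This is an illustrative sketch only: the table values below are made-up placeholders, not data from the paper, and both the assumption that Vddmin rises with Vfail and the conservative "round up to the next row" policy are assumptions of this sketch rather than the authors' method.

```python
import bisect
from typing import List, Tuple

# Hypothetical correlation table: (Vfail in volts, Vddmin in volts), sorted by Vfail.
# The numbers are placeholders for illustration, not measured data.
EXAMPLE_TABLE: List[Tuple[float, float]] = [
    (0.40, 0.50),
    (0.45, 0.54),
    (0.50, 0.58),
]


def lookup_vddmin(table: List[Tuple[float, float]], vfail: float) -> float:
    """Return the Vddmin of the first table row whose Vfail is >= the measured value.

    Rounding up to the next characterized row is a conservative choice made
    for this sketch; it assumes Vddmin does not decrease as Vfail increases.
    """
    vfails = [row[0] for row in table]
    idx = bisect.bisect_left(vfails, vfail)
    if idx == len(table):
        raise ValueError("measured Vfail is above the characterized range")
    return table[idx][1]


if __name__ == "__main__":
    # A die whose monitor fails at 0.43 V maps to the 0.45 V row.
    print(lookup_vddmin(EXAMPLE_TABLE, 0.43))  # prints 0.54
```

A real flow would also have to fold the target confidence level into how the table is built, which this sketch does not model.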
Citations: 1
Impact of Autocorrelation on Stochastic Circuit Accuracy
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00057
T. Baker, J. Hayes
Stochastic computing (SC) with pseudo-random numbers offers the prospect of significant chip area and energy savings for large-scale applications such as neural networks. Because of SC's inherent stochasticity, all phenomena affecting accuracy must be carefully analyzed and controlled. This work addresses a fundamental error source, autocorrelation, which, although recognized, has largely been neglected in the SC context. We observe that autocorrelation occurs in all types of stochastic circuits and has a major impact on the accuracy of sequential stochastic circuits. We present a methodology for analyzing autocorrelation and apply it to two broad SC circuit types: counter-based and shift-register-based. We demonstrate the use of Markov chain theory to estimate autocorrelation errors in stochastic circuits. We also present an algorithm, SANG, for efficiently generating stochastic numbers that have prescribed autocorrelation and numerical values. SANG greatly aids the simulation of autocorrelation effects in SC.
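To make the notion of autocorrelation in a stochastic number concrete, the sketch below computes the value and the lag-1 sample autocorrelation of two hand-written bitstreams that encode the same value. It is a generic textbook-style calculation, not the paper's analysis methodology or its SANG algorithm; the unipolar encoding, the circular shift, and the function names are assumptions of this sketch.

```python
from typing import Sequence


def sn_value(bits: Sequence[int]) -> float:
    """Unipolar value of a stochastic number: the fraction of 1s in the stream."""
    return sum(bits) / len(bits)


def autocorrelation(bits: Sequence[int], lag: int = 1) -> float:
    """Sample autocorrelation of a bitstream at the given lag (circular shift)."""
    n = len(bits)
    mean = sn_value(bits)
    var = sum((b - mean) ** 2 for b in bits) / n
    if var == 0:
        return 0.0  # constant stream: autocorrelation is undefined, report 0
    cov = sum((bits[i] - mean) * (bits[(i + lag) % n] - mean) for i in range(n)) / n
    return cov / var


if __name__ == "__main__":
    clustered = [1, 1, 1, 1, 0, 0, 0, 0]    # 1s grouped together
    alternating = [1, 0, 1, 0, 1, 0, 1, 0]  # 1s and 0s strictly alternate
    print(sn_value(clustered), autocorrelation(clustered))      # prints 0.5 0.5
    print(sn_value(alternating), autocorrelation(alternating))  # prints 0.5 -1.0
```

Both streams represent 0.5, yet their lag-1 autocorrelations differ in sign and magnitude, which is the kind of difference a sequential circuit with internal state can be sensitive to.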
Citations: 3
Persistently-Secure Processors: Challenges and Opportunities for Securing Non-Volatile Memories
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00114
Amro Awad, S. Suboh, Mao Ye, Kazi Abu Zubair, Mazen Al-Wadi
Emerging Non-Volatile Memories (NVMs) are getting close to their mass production stage. The persistence feature of NVMs enables many interesting applications and capabilities such as fast restoration, staging and direct access of persistent files. On the other hand, data persistence enlarges the attack surface due to data remanence. Additionally, since the memory data is expected to be restored, any accompanying security metadata must be recovered and restored correctly. While the main concepts of secure processors have existed for decades, designing persistently secure processors that are able to maintain security across system crashes and reboots is particularly challenging due to the trade-offs between write endurance, resilience, performance and security. In this paper, we discuss recent advances in this domain, its challenges, and future research opportunities.
Citations: 12
T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00012
Yao Chen, Kaili Zhang, Cheng Gong, Cong Hao, Xiaofan Zhang, Tao Li, Deming Chen
Deep Neural Networks (DNNs) have become promising solutions for data analysis, especially for processing raw sensor data. However, DNN-based approaches can easily introduce huge computation and memory demands, which may make direct deployment onto Internet of Things (IoT) devices infeasible, since these devices have strict constraints on hardware resources, power budgets, response latency, and manufacturing cost. To bring DNNs to IoT devices, embedded FPGAs are among the most suitable candidates, providing better energy efficiency than GPU and CPU based solutions and higher flexibility than ASICs. In this paper, we propose a systematic solution for deploying DNNs on embedded FPGAs, which includes a ternarized hardware Deep Learning Accelerator (T-DLA) and a framework for ternary neural network (TNN) training. T-DLA is a highly optimized hardware unit in the FPGA that specializes in accelerating TNNs, while the proposed framework can compress the DNN parameters down to two bits with little accuracy drop. Results show that our training framework can compress the DNN by up to 14.14x while maintaining nearly the same accuracy as the floating-point version. Using our proposed design techniques, the T-DLA can deliver up to 0.4 TOPS at 2.576 W power consumption, showing 873.6x and 5.1x higher energy efficiency (fps/W) on ImageNet with the ResNet-18 model compared to a Xeon E5-2630 CPU and an Nvidia 1080 Ti GPU, respectively. To the best of our knowledge, this is the first instruction-based, highly efficient ternary DLA design reported in the literature.
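The idea of compressing parameters "down to two bits" can be illustrated with a generic ternarization and packing sketch. This is not the T-DLA training framework: the magnitude threshold (a fixed `ratio` times the mean absolute weight), the 2-bit code assignment, and all function names are assumptions chosen for illustration.

```python
from typing import List


def ternarize(weights: List[float], ratio: float = 0.7) -> List[int]:
    """Map each weight to -1, 0, or +1 using a magnitude threshold.

    The threshold is ratio * mean(|w|); this heuristic is an assumption of
    the sketch, not necessarily what the paper's framework uses.
    """
    delta = ratio * sum(abs(w) for w in weights) / len(weights)
    return [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]


def pack_2bit(ternary: List[int]) -> bytes:
    """Pack ternary values into 2-bit codes, four per byte (0 -> 00, +1 -> 01, -1 -> 10)."""
    codes = {0: 0b00, 1: 0b01, -1: 0b10}
    out = bytearray((len(ternary) + 3) // 4)
    for i, t in enumerate(ternary):
        out[i // 4] |= codes[t] << (2 * (i % 4))
    return bytes(out)


def unpack_2bit(data: bytes, count: int) -> List[int]:
    """Inverse of pack_2bit for the first `count` values."""
    values = {0b00: 0, 0b01: 1, 0b10: -1}
    return [values[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


if __name__ == "__main__":
    w = [0.9, -0.05, -0.8, 0.1, 0.02]
    t = ternarize(w)
    print(t)  # prints [1, 0, -1, 0, 0]
    packed = pack_2bit(t)
    print(len(packed), unpack_2bit(packed, len(t)) == t)  # prints 2 True
```

Replacing 32-bit floating-point weights with 2-bit codes gives at most a 16x reduction in weight storage, which is the same order as the 14.14x compression the abstract reports.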
Citations: 28
Deep State Encryption for Sequential Logic Circuits
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00068
Yasaswy Kasarabada, Sudheer Ram Thulasi Raman, R. Vemuri
Logic encryption has been proposed as a potential solution to the hardware IP piracy problem. Naive logic encryption methods were shown to be susceptible to Boolean satisfiability (SAT) based attacks. In addition, the recently proposed Sequential SAT attack is able to decrypt many encrypted sequential logic circuits. This paper introduces a new logic encryption scheme that encrypts a sequential circuit on the occurrence of a chosen deep state. Two novel techniques to select a suitable deep state from the gate-level netlist of the design have been introduced. The attack resiliency of the proposed encryption technique against the sequential SAT attack is demonstrated using several standard benchmark circuits.
Citations: 7
In-memory AES Implementation for Emerging Non-Volatile Main Memory
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00027
Mimi Xie, Yawen Wu, Zhenge Jia, J. Hu
Non-volatile memories are very promising candidates for next-generation non-volatile main memory (NVMM) because of their advantages over traditional DRAM main memory, such as non-volatility, high density, and low leakage power. However, NVMM suffers from a new security vulnerability, because non-volatility allows data to be retained for a long time after power is off. An attacker with physical access to the system can readily scan the main memory content and extract all valuable information from it. To protect the data in the NVMM, the whole memory should be provided with a security mechanism that offers a security level comparable to DRAM. Take mobile devices (e.g., smartphones or laptops) as an example: once attackers have physical access to the NVMM, they will be able to read out the sensitive information. Therefore, memory encryption is required only when the device is shut down or put into sleep/screen-lock mode. One-time encryption, which encrypts all or most of the memory only when necessary, is an efficient solution in such mobile scenarios. However, the one-time memory encryption approach faces two challenges. First, it should be fast enough to maintain a low vulnerability window when locked and provide an instant response when unlocked. Second, it should be energy-efficient, considering the limited battery life.
Citations: 1
Design of Switched-Current Based Low-Power PIM Vision System for IoT Applications
Pub Date : 2019-07-01 DOI: 10.1109/ISVLSI.2019.00041
Zheyu Liu, Zichen Fan, Qi Wei, Xing Wu, F. Qiao, Ping Jin, Xinjun Liu, Chengliang Liu, Huazhong Yang
Neural networks (NNs) are becoming dominant in the machine learning field because of their excellent performance in classification, recognition, and similar tasks. However, their huge computation and memory overhead makes it hard to implement NN algorithms on existing platforms with real-time, energy-efficient performance. In this work, a low-power processing-in-memory (PIM) vision system for accelerating binary-weight networks is proposed. The architecture uses PIM and features an energy-efficient switched-current (SI) neuron, employing a network with binary weights and 9-bit activations. Simulation results show that the design occupies 5.82 mm² in SMIC 180 nm CMOS technology and consumes 1.45 mW from a 1.8 V supply. Our system outperforms state-of-the-art designs in terms of power consumption and achieves an energy efficiency of up to 28.25 TOPS/W.
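As a rough illustration of why binary weights suit a low-power neuron, the sketch below shows the arithmetic only: with weights restricted to +1/-1, each multiply-accumulate reduces to an add or a subtract. It is not the paper's switched-current circuit. The function names, the ReLU-style clamp back to 9 bits, and the assumption that the peak efficiency and the reported power refer to the same operating point are all hypothetical.

```python
def binary_weight_neuron(activations, weight_signs, bits=9):
    """Sum unsigned `bits`-bit activations under +1/-1 weights.

    Hypothetical sketch: the clamp to [0, 2**bits - 1] is an assumed
    ReLU-style output stage, not something stated in the abstract.
    """
    if len(activations) != len(weight_signs):
        raise ValueError("activations and weight_signs must have equal length")
    max_val = (1 << bits) - 1  # 511 for 9-bit activations
    acc = 0
    for a, w in zip(activations, weight_signs):
        if not 0 <= a <= max_val:
            raise ValueError(f"activation {a} does not fit in {bits} bits")
        if w not in (1, -1):
            raise ValueError("weights must be +1 or -1")
        # A binary weight turns the multiplication into an add or a subtract.
        acc += a if w == 1 else -a
    return min(max(acc, 0), max_val)


def implied_throughput_gops(tops_per_watt, power_mw):
    """TOPS/W times mW gives GOPS (1e12 * 1e-3 = 1e9).

    Assumes both figures describe the same operating point.
    """
    return tops_per_watt * power_mw


if __name__ == "__main__":
    print(binary_weight_neuron([10, 20, 5], [1, 1, -1]))   # 25
    print(binary_weight_neuron([300, 300, 10], [1, 1, -1]))  # clamped to 511
    print(implied_throughput_gops(28.25, 1.45))
```

Under that same-operating-point assumption, the abstract's 28.25 TOPS/W and 1.45 mW would imply a throughput of roughly 41 GOPS; the abstract itself does not state a throughput figure.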
Citations: 1