Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00111
Joel Ortiz Sosa, O. Sentieys, C. Roland
Wireless Network-on-Chip (WiNoC) is a viable solution for overcoming critical bottlenecks in the on-chip communication backbone. However, standard WiNoC approaches are vulnerable to multi-path interference introduced by on-chip physical structures. To overcome this parasitic phenomenon, this paper presents an adaptive digital transceiver that enhances communication reliability under different wireless channel configurations. Based on a semi-realistic wireless channel model, we investigate the impact of several channel correction techniques. Experimental results show that our approach significantly improves the Bit Error Rate (BER) under different wireless channel configurations. Moreover, our adaptive transceiver allows wireless communication links to be established in conditions where this would not be possible with standard transceiver architectures. The proposed architecture, designed in a 28-nm FDSOI technology, consumes only 3.27 mW at a data rate of 10 Gbit/s and has a very small area footprint.
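The power and data-rate figures quoted above imply an energy cost per transmitted bit. The abstract does not state this number itself, so the following is only a derived back-of-the-envelope sketch; the function name `energy_per_bit_pj` is a hypothetical helper, and the result covers only the power reported for the digital transceiver.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy per bit in picojoules from power (mW) and data rate (Gbit/s)."""
    # mW / (Gbit/s) = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit, i.e. 1 pJ/bit
    return power_mw / data_rate_gbps


# Figures taken from the abstract: 3.27 mW at 10 Gbit/s
print(f"{energy_per_bit_pj(3.27, 10.0):.3f} pJ/bit")  # prints "0.327 pJ/bit"
```

Under these figures the transceiver spends roughly a third of a picojoule per bit.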
{"title":"Adaptive Transceiver for Wireless NoC to Enhance Multicast/Unicast Communication Scenarios","authors":"Joel Ortiz Sosa, O. Sentieys, C. Roland","doi":"10.1109/ISVLSI.2019.00111","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00111","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"58 1","pages":"592-597"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"74845370"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00061
Pampa Howladar, P. Roy, H. Rahaman
The recent emergence of micro-electrode-dot array (MEDA) based digital microfluidic biochips has brought major improvements in microfluidic operations for conventional lab-on-chip devices. MEDA-based digital microfluidic biochips typically consist of a 2D planar array of cells with square microelectrodes. One of the most critical issues in biochip layout design is droplet transportation within the 2D layout. MEDA allows dynamic routing of droplets with variable shapes and sizes: microelectrode cells are dynamically grouped together to form a configured microelectrode array (CMA) for each droplet. The existing square-shaped CMA is well suited to droplets that need an exactly rectangular or square area, but may be less effective for droplets that occupy rhombus-shaped or hexagonal areas. This is because the additional area occupied could otherwise be used by other droplets to follow their shortest paths. In this paper, we present MEDA layout designs for a 2D planar array of cells that use triangular microelectrodes arranged in a regular pattern. This makes it easier to form different droplet shapes efficiently, specifically rhombus-shaped or hexagonal droplets. MEDA routing operations have been carried out on the newly proposed MEDA layout (using variable-shaped electrodes), followed by analysis and comparison with the existing MEDA layout design. Finally, droplet transportation for standard benchmarks is mapped to demonstrate how well the proposed designs improve the performance of bioassay execution in a MEDA-based DMFB layout.
{"title":"Micro-electrode-dot Array Based Biochips : Advantages of Using Different Shaped CMAs","authors":"Pampa Howladar, P. Roy, H. Rahaman","doi":"10.1109/ISVLSI.2019.00061","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00061","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"180 1","pages":"296-301"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"85190324"}
Pub Date: 2019-07-01 DOI: 10.1109/SCORED.2010.5703958
M. Ismail, Zaini Jamaludin
The 14th IEEE International Conference on Autonomic Computing (ICAC 2017) will take place in Columbus, Ohio, USA. ICAC is the premier forum for research on autonomic computing techniques, foundations and applications. In 2017, ICAC will feature 4 keynote speakers, one for each day of the conference. On July 17, Dr. Nelly Bencomo from Aston University will champion the role of autonomic methods to create models of software interactions at runtime. Mr. Mark Patton from the Columbus Smart Cities Initiative will discuss integrated data exchanges, data APIs and city-scale applications in America’s Smart City on July 18. Dr. Ragunathan Rajkumar from Carnegie Mellon will talk about his breakthrough work on autonomous vehicles, especially self-driving cars, on July 19. On July 20, Dr. John Wilkes will talk about building Google’s warehouse-scale computers that manage themselves.
{"title":"Message from the Technical Program Chairs","authors":"M. Ismail, Zaini Jamaludin","doi":"10.1109/SCORED.2010.5703958","DOIUrl":"https://doi.org/10.1109/SCORED.2010.5703958","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"56 1","pages":""},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"87059071"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00035
Taehwan Kim, Kwangok Jeong, Taewhan Kim, Kyumyung Choi
Low-power design using near-threshold voltage (NTV) operation is very attractive because it considerably mitigates the sharp increase in power dissipation. However, one key barrier to NTV operation is the significant increase in SRAM failures. In this work, we propose an on-chip SRAM monitoring methodology that accurately predicts the minimum voltage, Vddmin, on each die that does not cause SRAM failure under a target confidence level. Specifically, we propose an SRAM monitor from which we measure the maximum voltage, Vfail, that causes functional failure on that monitor. We then propose a novel methodology for inferring the SRAM Vddmin on each die from the Vfail measured on the SRAM monitor on the same die: a Vfail-Vddmin correlation table is built during the design infrastructure development phase, and Vddmin is derived directly from the measured Vfail by referencing the correlation table during the silicon production phase. Experiments confirm that the proposed methodology saves 7.43% of leakage power, 3.98% of read energy, and 4.06% of write energy in the SRAM bitcell array compared with applying a uniform minimum voltage to all dies, while meeting the same yield constraint.
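The production-phase step described above, deriving Vddmin from a measured Vfail through a pre-built correlation table, can be sketched as a simple lookup. This is a minimal illustration and not the authors' implementation: the table values are invented placeholders, the name `vddmin_from_vfail` is hypothetical, and the sketch assumes that Vddmin increases monotonically with Vfail and that rounding up to the next table row is the conservative choice.

```python
from bisect import bisect_left

# Hypothetical correlation table built at design time, sorted by Vfail.
# Each row: (upper bound on monitor Vfail in volts, Vddmin to apply in volts).
# These numbers are placeholders for illustration, not measured data.
CORRELATION_TABLE = [
    (0.40, 0.50),
    (0.45, 0.55),
    (0.50, 0.60),
    (0.55, 0.65),
]


def vddmin_from_vfail(vfail, table=CORRELATION_TABLE):
    """Return the Vddmin of the first row whose Vfail bound is >= the measurement."""
    bounds = [bound for bound, _ in table]
    index = bisect_left(bounds, vfail)
    if index == len(table):
        # The die lies outside the characterized range; refuse to extrapolate.
        raise ValueError("measured Vfail exceeds the characterized range")
    return table[index][1]


print(vddmin_from_vfail(0.46))  # falls in the (0.45, 0.50] row
```

Rounding up rather than interpolating keeps the sketch on the safe side: a die is never assigned a lower voltage than the row that covers its measurement.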
{"title":"SRAM On-Chip Monitoring Methodology for Energy Efficient Memory Operation at Near Threshold Voltage","authors":"Taehwan Kim, Kwangok Jeong, Taewhan Kim, Kyumyung Choi","doi":"10.1109/ISVLSI.2019.00035","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00035","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"27 1","pages":"146-151"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"82511491"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00057
T. Baker, J. Hayes
Stochastic computing (SC) with pseudo-random numbers offers the prospect of significant chip area and energy savings for large-scale applications such as neural networks. Because of SC's inherent stochasticity, all phenomena affecting accuracy must be carefully analyzed and controlled. This work addresses a fundamental error source, autocorrelation, which, although recognized, has largely been neglected in the SC context. We observe that autocorrelation occurs in all types of stochastic circuits and has a major impact on the accuracy of sequential stochastic circuits. We present a methodology for analyzing autocorrelation and apply it to two broad SC circuit types: counter-based and shift-register-based. We demonstrate the use of Markov chain theory to estimate autocorrelation errors in stochastic circuits. We also present an algorithm, SANG, for efficiently generating stochastic numbers that have prescribed autocorrelation and numerical values. SANG greatly aids the simulation of autocorrelation effects in SC.
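To make the effect concrete, the sketch below simulates a textbook sequential SC circuit, a squarer that ANDs a bitstream with a one-cycle-delayed copy of itself, and feeds it streams with different lag-1 autocorrelation. It is an illustration under stated assumptions, not the paper's method: `correlated_stream` is a simple stand-in generator (it is not SANG), and all function names are hypothetical. For a stream with value p and lag-1 autocorrelation rho, the squarer output is p^2 + rho*p*(1-p), so any nonzero rho biases the result away from p^2.

```python
import random


def correlated_stream(p, rho, n, rng):
    """Bitstream with P(bit = 1) = p and lag-1 autocorrelation rho (0 <= rho < 1)."""
    bits = [1 if rng.random() < p else 0]
    for _ in range(n - 1):
        if rng.random() < rho:
            bits.append(bits[-1])  # repeat the previous bit
        else:
            bits.append(1 if rng.random() < p else 0)  # draw a fresh bit
    return bits


def delay_and_squarer(bits):
    """Sequential squarer: z[t] = x[t] AND x[t-1]; the output value is mean(z)."""
    z = [a & b for a, b in zip(bits[1:], bits[:-1])]
    return sum(z) / len(z)


def expected_squarer_output(p, rho):
    """Closed form for the squarer output: p^2 + rho * p * (1 - p)."""
    return p * p + rho * p * (1 - p)


rng = random.Random(2019)
for rho in (0.0, 0.4, 0.8):
    estimate = delay_and_squarer(correlated_stream(0.5, rho, 100_000, rng))
    print(f"rho={rho}: simulated {estimate:.3f}, "
          f"predicted {expected_squarer_output(0.5, rho):.3f}, ideal 0.250")
```

With p = 0.5 the ideal square is 0.25, while a stream with rho = 0.8 drives the output toward 0.45, which shows why a sequential circuit cannot ignore autocorrelation in its inputs.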
{"title":"Impact of Autocorrelation on Stochastic Circuit Accuracy","authors":"T. Baker, J. Hayes","doi":"10.1109/ISVLSI.2019.00057","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00057","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"5 1","pages":"271-277"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"90982435"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00114
Amro Awad, S. Suboh, Mao Ye, Kazi Abu Zubair, Mazen Al-Wadi
Emerging Non-Volatile Memories (NVMs) are getting close to mass production. The persistence of NVMs enables many interesting applications and capabilities, such as fast restoration, staging, and direct access to persistent files. On the other hand, data persistence enlarges the attack surface because of data remanence. Additionally, since the memory data is expected to be restored, any accompanying security metadata must be recovered and restored correctly. While the main concepts of secure processors have been around for decades, designing persistently secure processors that maintain security across system crashes and reboots is particularly challenging because of the trade-offs between write endurance, resilience, performance, and security. In this paper, we discuss recent advances in this domain, along with challenges and future research opportunities.
{"title":"Persistently-Secure Processors: Challenges and Opportunities for Securing Non-Volatile Memories","authors":"Amro Awad, S. Suboh, Mao Ye, Kazi Abu Zubair, Mazen Al-Wadi","doi":"10.1109/ISVLSI.2019.00114","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00114","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"101 1","pages":"610-614"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"84196620"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00012
Yao Chen, Kaili Zhang, Cheng Gong, Cong Hao, Xiaofan Zhang, Tao Li, Deming Chen
Deep Neural Networks (DNNs) have become promising solutions for data analysis, especially for processing raw sensor data. However, DNN-based approaches can easily introduce huge computation and memory demands, which may make direct deployment on Internet of Things (IoT) devices infeasible, since these devices have strict constraints on hardware resources, power budgets, response latency, and manufacturing cost. To bring DNNs to IoT devices, embedded FPGAs are among the most suitable candidates, providing better energy efficiency than GPU- and CPU-based solutions and higher flexibility than ASICs. In this paper, we propose a systematic solution for deploying DNNs on embedded FPGAs, which includes a ternarized hardware Deep Learning Accelerator (T-DLA) and a framework for ternary neural network (TNN) training. T-DLA is a highly optimized hardware unit on the FPGA that specializes in accelerating TNNs, while the proposed framework can compress the DNN parameters down to two bits with little accuracy drop. Results show that our training framework can compress the DNN by up to 14.14x while maintaining nearly the same accuracy as the floating-point version. With our proposed design techniques, the T-DLA delivers up to 0.4 TOPS at 2.576 W power consumption, showing 873.6x and 5.1x higher energy efficiency (fps/W) on ImageNet with the ResNet-18 model compared with a Xeon E5-2630 CPU and an Nvidia 1080 Ti GPU, respectively. To the best of our knowledge, this is the first instruction-based, highly efficient ternary DLA design reported in the literature.
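The idea of compressing parameters "down to two bits" can be illustrated with a small ternarization and packing sketch. It is not the paper's training framework: the threshold rule (0.7 times the mean absolute weight) is borrowed from the separate Ternary Weight Networks line of work and is an assumption here, and the names `ternarize`, `pack_2bit`, and `unpack_2bit` are hypothetical.

```python
def ternarize(weights, ratio=0.7):
    """Map each weight to -1, 0 or +1 using a magnitude threshold (assumed rule)."""
    delta = ratio * sum(abs(w) for w in weights) / len(weights)
    return [1 if w > delta else -1 if w < -delta else 0 for w in weights]


def pack_2bit(ternary):
    """Pack ternary values four per byte, two bits each (0 -> 00, +1 -> 01, -1 -> 10)."""
    encode = {0: 0b00, 1: 0b01, -1: 0b10}
    packed = bytearray((len(ternary) + 3) // 4)
    for i, value in enumerate(ternary):
        packed[i // 4] |= encode[value] << (2 * (i % 4))
    return bytes(packed)


def unpack_2bit(packed, count):
    """Inverse of pack_2bit for the first `count` values."""
    decode = {0b00: 0, 0b01: 1, 0b10: -1}
    return [decode[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


weights = [0.9, -0.05, -0.8, 0.1, 0.4, -0.6, 0.02, 0.0]
ternary = ternarize(weights)
packed = pack_2bit(ternary)
print(ternary)                           # [1, 0, -1, 0, 1, -1, 0, 0]
print(len(weights) * 4 / len(packed))    # 16.0: 32-bit floats vs 2-bit codes
```

Two bits per weight against 32-bit floats gives an ideal 16x reduction for the weights alone; the abstract reports up to 14.14x for the whole model, and it does not detail where the difference comes from.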
{"title":"T-DLA: An Open-source Deep Learning Accelerator for Ternarized DNN Models on Embedded FPGA","authors":"Yao Chen, Kaili Zhang, Cheng Gong, Cong Hao, Xiaofan Zhang, Tao Li, Deming Chen","doi":"10.1109/ISVLSI.2019.00012","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00012","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"13-18"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"81635472"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00068
Yasaswy Kasarabada, Sudheer Ram Thulasi Raman, R. Vemuri
Logic encryption has been proposed as a potential solution to the hardware IP piracy problem. Naive logic encryption methods were shown to be susceptible to Boolean satisfiability (SAT) based attacks. In addition, the recently proposed Sequential SAT attack is able to decrypt many encrypted sequential logic circuits. This paper introduces a new logic encryption scheme that encrypts a sequential circuit on the occurrence of a chosen deep state. Two novel techniques to select a suitable deep state from the gate-level netlist of the design have been introduced. The attack resiliency of the proposed encryption technique against the sequential SAT attack is demonstrated using several standard benchmark circuits.
{"title":"Deep State Encryption for Sequential Logic Circuits","authors":"Yasaswy Kasarabada, Sudheer Ram Thulasi Raman, R. Vemuri","doi":"10.1109/ISVLSI.2019.00068","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00068","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"18 1","pages":"338-343"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"84435170"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00027
Mimi Xie, Yawen Wu, Zhenge Jia, J. Hu
Non-volatile memories are very promising candidates for next-generation non-volatile main memory (NVMM) because of their advantages over traditional DRAM main memory, such as non-volatility, high density, and low leakage power. However, NVMM introduces a new security vulnerability: non-volatility allows data to be retained long after power is off. An attacker with physical access to the system can readily scan the main memory and extract all valuable information from it. To protect NVMM data, the whole memory should be given a security mechanism that provides a security level comparable to that of DRAM. Take mobile devices (e.g., a smartphone or laptop) as an example: attackers can read out sensitive information only once they have physical access to the NVMM. Therefore, memory encryption is required only when the device is shut down or put into sleep or screen-lock mode. One-time encryption, which encrypts all or most of the memory only when necessary, is an efficient solution in such mobile scenarios. However, the one-time memory encryption approach faces two challenges. First, it should be fast enough to keep the vulnerability window short when the device is locked and to respond instantly when it is unlocked. Second, it should be energy-efficient given the limited battery life.
{"title":"In-memory AES Implementation for Emerging Non-Volatile Main Memory","authors":"Mimi Xie, Yawen Wu, Zhenge Jia, J. Hu","doi":"10.1109/ISVLSI.2019.00027","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00027","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"56 1","pages":"103-103"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"81569196"}
Pub Date: 2019-07-01 DOI: 10.1109/ISVLSI.2019.00041
Zheyu Liu, Zichen Fan, Qi Wei, Xing Wu, F. Qiao, Ping Jin, Xinjun Liu, Chengliang Liu, Huazhong Yang
Neural networks (NNs) are becoming dominant in the machine learning field because of their excellent performance in classification, recognition, and other tasks. However, their huge computation and memory overhead makes it hard to implement NN algorithms on existing platforms with real-time, energy-efficient performance. In this work, a low-power processing-in-memory (PIM) vision system for accelerating binary-weight networks is proposed. The architecture uses PIM and features an energy-efficient switched-current (SI) neuron, employing a network with binary weights and 9-bit activations. Simulation results show that the design occupies 5.82 mm² in SMIC 180 nm CMOS technology and consumes 1.45 mW from a 1.8 V supply. Our system outperforms state-of-the-art designs in power consumption and achieves an energy efficiency of up to 28.25 TOPS/W.
{"title":"Design of Switched-Current Based Low-Power PIM Vision System for IoT Applications","authors":"Zheyu Liu, Zichen Fan, Qi Wei, Xing Wu, F. Qiao, Ping Jin, Xinjun Liu, Chengliang Liu, Huazhong Yang","doi":"10.1109/ISVLSI.2019.00041","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00041","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"89 1","pages":"181-186"},"publicationDate":"2019-07-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"83871331"}