Proceedings of the Great Lakes Symposium on VLSI 2022: Latest Papers

Flexible and Personalized Learning for Wearable Health Applications using HyperDimensional Computing
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530373
Sina Shahhosseini, Yang Ni, Emad Kasaeyan Naeini, M. Imani, A. Rahmani, N. Dutt
Health and wellness applications increasingly rely on machine learning techniques to learn end-user physiological and behavioral patterns in everyday settings. This poses two key challenges: resource-constrained wearables cannot perform on-device online learning, and learning algorithms must support privacy-preserving personalization. We exploit a hyperdimensional computing (HDC) solution for wearable devices that offers flexibility, high efficiency, and performance while enabling on-device personalization and privacy protection. We evaluate the efficacy of our approach using three case studies and show that our system improves training performance by up to 35.8x compared with the state of the art while offering comparable accuracy.
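The abstract does not describe the encoding or training procedure, so the following is only a generic sketch of how an HDC classifier is commonly built: random bipolar hypervectors, binding by element-wise multiplication, bundling by summation, and cosine similarity against per-class prototypes. The dimensionality `DIM`, the number of quantization levels `LEVELS`, the use of independent (uncorrelated) level hypervectors, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import random

DIM = 2048    # hypervector dimensionality (assumed for this sketch)
LEVELS = 4    # quantization levels per feature (assumed for this sketch)


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def make_item_memory(num_features, seed=0):
    """Create one ID hypervector per feature and one per quantization level."""
    rng = random.Random(seed)
    ids = [random_hv(rng) for _ in range(num_features)]
    levels = [random_hv(rng) for _ in range(LEVELS)]
    return ids, levels


def encode(sample, ids, levels):
    """Bind each feature ID with its level hypervector, then bundle by summing."""
    encoded = [0] * DIM
    for feature_id, level in zip(ids, sample):
        level_hv = levels[level]
        for d in range(DIM):
            encoded[d] += feature_id[d] * level_hv[d]
    return encoded


def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm if norm else 0.0


def train(samples, labels, ids, levels):
    """Accumulate the encoded samples of each class into one prototype."""
    prototypes = {}
    for sample, label in zip(samples, labels):
        acc = prototypes.setdefault(label, [0] * DIM)
        for d, value in enumerate(encode(sample, ids, levels)):
            acc[d] += value
    return prototypes


def classify(sample, prototypes, ids, levels):
    """Return the label whose prototype is most similar to the encoded sample."""
    query = encode(sample, ids, levels)
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


ids, levels = make_item_memory(num_features=4)
prototypes = train([[0, 0, 0, 0], [3, 3, 3, 3]], ["rest", "active"], ids, levels)
print(classify([0, 0, 0, 1], prototypes, ids, levels))
```

Because training only adds vectors into a class accumulator, a device could in principle keep updating its prototypes with new user data, which is one reason HDC is often considered for on-device learning.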
Citations: 1
Adapt-Flow: A Flexible DNN Accelerator Architecture for Heterogeneous Dataflow Implementation
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530311
Jiaqi Yang, Hao Zheng, A. Louri
Deep neural networks (DNNs) have been widely applied to various application domains. DNN computation is memory- and compute-intensive, requiring excessive memory access and a large number of computations. To efficiently implement these applications, several data reuse and parallelism exploitation strategies, called dataflows, have been proposed. Studies have shown that many DNN applications benefit from a heterogeneous dataflow strategy where the dataflow type changes from layer to layer. Unfortunately, very few existing DNN architectures can simultaneously accommodate multiple dataflows due to their limited hardware flexibility. In this paper, we propose a flexible DNN accelerator architecture, called Adapt-Flow, which can support multiple dataflow selections for each DNN layer at runtime. Specifically, the proposed Adapt-Flow architecture consists of (1) a flexible interconnect, (2) a dataflow selection algorithm, and (3) a dataflow mapping technique. The flexible interconnect provides dynamic support for the various traffic patterns required by different dataflows. The proposed dataflow selection algorithm selects the optimal dataflow strategy for a given DNN layer with the aim of much improved performance. The dataflow mapping technique efficiently maps the selected dataflow onto the flexible interconnect. Simulation studies show that the proposed Adapt-Flow architecture reduces execution time by 46%, 78%, and 26% and energy consumption by 45%, 80%, and 25% compared with NVDLA, ShiDianNao, and Eyeriss, respectively.
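The idea of choosing a dataflow per layer can be illustrated with a toy selector. The abstract does not give the selection algorithm or its cost model, so everything here is hypothetical: the three dataflow names, the `REFETCH` factor, and the rule that the stationary tensor is fetched once while the other two are charged `REFETCH` times their size. The only point of the sketch is that an argmin over a per-layer cost can yield different dataflows for different layers.

```python
from dataclasses import dataclass

# Hypothetical penalty for tensors that are not kept stationary.
REFETCH = 4

DATAFLOWS = ("weight_stationary", "input_stationary", "output_stationary")


@dataclass(frozen=True)
class Layer:
    name: str
    weights: int   # number of weight elements
    inputs: int    # number of input activation elements
    outputs: int   # number of output activation elements


def cost(layer, dataflow):
    """Toy cost: stationary tensor fetched once, the others REFETCH times."""
    sizes = {
        "weight_stationary": layer.weights,
        "input_stationary": layer.inputs,
        "output_stationary": layer.outputs,
    }
    stationary = sizes[dataflow]
    moving = sum(sizes.values()) - stationary
    return stationary + REFETCH * moving


def select_dataflow(layer):
    """Pick the dataflow with the lowest toy cost for this layer."""
    return min(DATAFLOWS, key=lambda dataflow: cost(layer, dataflow))


# Example layer shapes (illustrative): an early convolution and a classifier layer.
conv1 = Layer("conv1", weights=3 * 3 * 3 * 64, inputs=224 * 224 * 3, outputs=224 * 224 * 64)
fc = Layer("fc", weights=4096 * 1000, inputs=4096, outputs=1000)

for layer in (conv1, fc):
    print(layer.name, select_dataflow(layer))
```

Under this toy model the activation-heavy convolution and the weight-heavy fully connected layer prefer different dataflows, which is the layer-to-layer heterogeneity the abstract refers to.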
Citations: 3
Enhancing Information Security Courses With a Remotely Accessible Side-Channel Analysis Setup
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530347
Abubakr Abdulgadir, J. Kaps, A. Salman
The ever-increasing security threats to our digital infrastructure require training a sufficient number of engineers on real-world equipment and attacks. Teaching hardware security often requires a significant investment in equipment. Additionally, the global COVID-19 pandemic demonstrated that online-accessible educational systems are crucial to the continuity of teaching. In this work, we describe our experiment with teaching hardware security using a centralized shared setup that students can access remotely. Our setup reduces cost and makes teaching such advanced topics more accessible while keeping the benefits of using real hardware to gain practical experience.
Citations: 1
ENTANGLE: An Enhanced Logic-locking Technique for Thwarting SAT and Structural Attacks
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530371
A. Darjani, N. Kavand, Shubham Rai, M. Wijtvliet, Akash Kumar
Among SAT-resilient logic locking techniques, Stripped-Functionality Logic Locking (SFLL) is the most promising solution: it can guard intellectual property against approximate, sensitization, SAT, and structural attacks that target point-function techniques. However, even SFLL has been shown to be vulnerable to a recent class of structural attacks that identify the perturbation logic. In this paper, we first categorize all possible classes of attacks on SFLL. We then propose ENTANGLE, a novel logic locking technique built upon SFLL that can resist all of these attacks, including emerging ML-based attacks. We test our technique against publicly available SFLL attacks. The implementation results show that ENTANGLE can secure large industrial circuits with an average overhead of 11.6 percent for area and 9.1 percent for power.
Citations: 5
MOCCA: A Process Variation Tolerant Systolic DNN Accelerator using CNFETs in Monolithic 3D
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530380
Samuel J. Engers, Cheng Chu, Dawen Xu, Ying Wang, Fan Chen
Hardware accelerators based on systolic arrays have become the dominant method for efficient processing of deep neural networks (DNNs). Although such designs provide significant performance improvements over contemporary CPUs and GPUs, their power efficiency and area efficiency are greatly limited by the large computing array and on-chip memory. In this work, we demonstrate that the efficiency of systolic accelerators can be further improved using emerging carbon nanotube field-effect transistors (CNFETs), by stacking the computing logic and on-chip memory on multiple layers and utilizing monolithic 3D (M3D) vias for low-latency communication. We comprehensively explore the design space and present MOCCA, the first process-variation-tolerant CNFET-based systolic DNN accelerator. We validate MOCCA against previous 2D accelerators on state-of-the-art DNN models. On average, MOCCA achieves the same throughput with 6.12× and 2.12× improvements in performance and power efficiency, respectively, in a 2× smaller chip footprint.
Citations: 1
A Memristor-based Secure Scan Design against the Scan-based Side-Channel Attacks
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530345
Mengqiang Lu, Aijiao Cui, Yan Shao, G. Qu
Scan chain design improves the testability of a circuit, but it can also be used as a side channel to access sensitive information inside a cryptographic chip and crack the cipher key. To secure the scan design while maintaining its testability, this paper proposes a memristor-based secure scan design built on a lock-and-key scheme. A physical unclonable function (PUF) generates a unique test key for each chip. When an input test key matches the PUF-based key, the scan chain can be used normally for testing. Otherwise, the data in some scan cells are obfuscated by random bits, which are generated by reading the status of a memristor. Because the random bits are unrelated to the original test data, an adversary cannot extract useful information from the scan chain to deduce the cipher key. The experimental results show that the proposed secure scan design can resist all existing attacks while incurring low overhead, without affecting the testability of the original design.
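The lock-and-key behavior described in this abstract can be modeled in software as a rough behavioral sketch. Several details are assumptions, since the abstract does not specify them: keys are modeled as byte strings, obfuscation is modeled as overwriting the protected cells with a random bit, the set of protected cells is passed in as `obfuscated_cells`, and a software callable `random_bit` stands in for reading the memristor state.

```python
import random


def scan_out(scan_cells, test_key, puf_key, obfuscated_cells, random_bit):
    """Return the bits shifted out of the scan chain for a given test key.

    scan_cells: list of 0/1 values currently held in the chain.
    test_key, puf_key: bytes; puf_key stands in for the PUF-derived key.
    obfuscated_cells: indices of the protected cells (assumed to be fixed).
    random_bit: callable returning 0 or 1; stands in for a memristor read.
    """
    if test_key == puf_key:
        # Authorized tester: the chain behaves like an ordinary scan chain.
        return list(scan_cells)
    output = list(scan_cells)
    for index in obfuscated_cells:
        # The emitted bit is unrelated to the value stored in the cell.
        output[index] = random_bit()
    return output


rng = random.Random(0)
cells = [1, 0, 1, 1, 0, 0, 1, 0]
chip_key = b"\x5a\xc3"

# Correct key: the real cell contents are visible.
print(scan_out(cells, chip_key, chip_key, [1, 4, 6], lambda: rng.randint(0, 1)))
# Wrong key: cells 1, 4, and 6 carry random bits instead of data.
print(scan_out(cells, b"\x00\x00", chip_key, [1, 4, 6], lambda: rng.randint(0, 1)))
```

Passing the bit source as a parameter keeps the model deterministic under test and makes clear that the randomness comes from outside the scan data.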
Citations: 0
iMAD: An In-Memory Accelerator for AdderNet with Efficient 8-bit Addition and Subtraction Operations
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530313
Shien Zhu, Shiqing Li, Weichen Liu
Adder Neural Network (AdderNet) is a new type of Convolutional Neural Network (CNN) that replaces the computation-intensive multiplications in convolution layers with lightweight additions and subtractions. As a result, AdderNet preserves high accuracy with adder convolution kernels and achieves high speed and power efficiency. In-Memory Computing (IMC) is known as the next-generation artificial-intelligence computing paradigm and has been widely adopted for accelerating binary and ternary CNNs. As AdderNet has much higher accuracy than binary and ternary CNNs, accelerating AdderNet using IMC can obtain both performance and accuracy benefits. However, existing IMC devices have no dedicated subtraction function, and adding subtraction logic may bring larger area, higher power, and degraded addition performance. In this paper, we propose iMAD, an in-memory accelerator for AdderNet with efficient addition and subtraction operations. First, we propose an efficient in-memory subtraction operator at the circuit level and co-optimize the addition performance to reduce the latency and power. Second, we propose an accelerator architecture for AdderNet with high parallelism based on the optimized operators. Third, we propose an IMC-friendly computation pipeline for AdderNet convolution at the algorithm level to further boost the performance. Evaluation results show that iMAD achieves a 3.25× speedup and 3.55× the energy efficiency of a state-of-the-art in-memory accelerator.
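To make the "additions and subtractions instead of multiplications" point concrete, the sketch below contrasts a conventional 1D convolution with an adder convolution in which each output is the negative L1 distance between the kernel and the input window, as in the original AdderNet formulation. It is a plain software illustration with made-up function names and integer inputs; it says nothing about how iMAD implements these operations in memory.

```python
def mult_conv1d(signal, kernel):
    """Conventional sliding-window convolution: sum of products."""
    k = len(kernel)
    return [
        sum(x * w for x, w in zip(signal[i:i + k], kernel))
        for i in range(len(signal) - k + 1)
    ]


def adder_conv1d(signal, kernel):
    """Adder convolution: each output is -sum(|x - w|), with no multiplications."""
    k = len(kernel)
    return [
        -sum(abs(x - w) for x, w in zip(signal[i:i + k], kernel))
        for i in range(len(signal) - k + 1)
    ]


signal = [1, 2, 3, 4]
kernel = [1, 2]
print(mult_conv1d(signal, kernel))   # [5, 8, 11]
print(adder_conv1d(signal, kernel))  # [0, -2, -4]
```

The adder version needs only subtraction, absolute value, and accumulation per element, which is why a dedicated subtraction operator matters for an accelerator targeting this network type.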
Citations: 2
Efficient Method for Timing-based Information Flow Verification in Hardware Designs
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530363
Khitam M. Alatoun, R. Vemuri
Timing side channels are a serious threat to the security of hardware designs. By analyzing the execution times of a design, an attacker can expose secret information. This paper proposes an approach to verify and monitor timing-based information flow properties. In addition, the method can highlight the path that is vulnerable to leakage, making it easier to trace the leaking channel. The method can be used during formal verification, dynamic verification during simulation, post-fabrication validation, and, if necessary, run-time monitoring. The method reduces the overhead of the security model, which helps speed up the verification process and create an efficient run-time hardware monitor. Various timing-based information flow properties from five different hardware designs were verified. The results show that our approach can accurately detect hardware timing channels with lower overhead.
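A timing-based information flow property can be read as: for a fixed public input, the latency of the design must not depend on the secret. The sketch below checks that property by brute force on two toy cycle-count models. Both models, the 4-bit width, and the function names are hypothetical and only illustrate the property; the paper's verification method itself is not described in the abstract and is not reproduced here.

```python
def leaky_cycles(secret, public, width=4):
    """Toy model: one base cycle per bit plus an extra cycle for each set secret bit.

    The public input is unused in this toy model.
    """
    return sum(1 + ((secret >> i) & 1) for i in range(width))


def constant_cycles(secret, public, width=4):
    """Toy model with a fixed latency regardless of the secret."""
    return 2 * width


def find_timing_leak(cycles, width=4):
    """Search for a public input and two secrets that produce different latencies.

    Returns (public, secret_a, secret_b) as a witness, or None if the latency
    is independent of the secret for every public input.
    """
    for public in range(1 << width):
        baseline = cycles(0, public, width)
        for secret in range(1, 1 << width):
            if cycles(secret, public, width) != baseline:
                return (public, 0, secret)
    return None


print(find_timing_leak(leaky_cycles))
print(find_timing_leak(constant_cycles))
```

Exhaustive enumeration only works for tiny input spaces; it is used here purely to state the property precisely, and the witness tuple plays a role loosely analogous to highlighting which input exposes the leak.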
Citations: 0
A Highly Robust, Low Delay and DNU-Recovery Latch Design for Nanoscale CMOS Technology
Pub Date : 2022-06-06 DOI: 10.1145/3526241.3530321
Aibin Yan, Zhen Zhou, Shaojie Wei, Jie Cui, Yong Zhou, Tianming Ni, P. Girard, X. Wen
With the advancement of semiconductor technologies, nanoscale CMOS circuits have become more vulnerable to soft errors, such as single-node upsets (SNUs) and double-node upsets (DNUs). To effectively tolerate DNUs caused by radiation while reducing the delay and area of latches, this paper proposes a DNU-resilient latch in nanoscale CMOS technology. The latch mainly comprises four input-split inverters and four 2-input C-elements. Since all internal nodes are interlocked, the latch can recover from all possible DNUs. Simulation results show that, compared with state-of-the-art DNU self-recovery latch designs, the proposed latch reduces transmission delay by 64.51% and delay-area-power product (DAPP) by 56.88% on average.
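The role of the 2-input C-elements can be illustrated with a behavioral model. The sketch uses the textbook non-inverting Muller C-element, whose output follows the inputs only when they agree and otherwise holds its previous value. Whether the proposed latch uses inverting or non-inverting C-elements, and how they are interlocked, is not stated in the abstract, so this is an assumption made for illustration only.

```python
def c_element(a, b, previous):
    """Non-inverting 2-input C-element: update only when both inputs agree."""
    return a if a == b else previous


def simulate(input_pairs, initial=0):
    """Apply a sequence of (a, b) input pairs and record the output after each."""
    output = initial
    trace = []
    for a, b in input_pairs:
        output = c_element(a, b, output)
        trace.append(output)
    return trace


# Both inputs go high, one input is briefly upset to 0, then recovers,
# and finally both inputs go low.
print(simulate([(1, 1), (0, 1), (1, 1), (0, 0)]))  # [1, 1, 1, 0]
```

The second step shows why this cell is useful against upsets: a transient flip on a single input does not propagate, because the output holds until both inputs agree again.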
Citations: 1
Session details: Session 6A: Special Session -1: Machine Learning and Hardware Attacks
Pub Date : 2022-06-06 DOI: 10.1145/3542692
Qiaoyan Yu
Citations: 0