ACM Journal on Emerging Technologies in Computing Systems: Latest Articles (Page 5)

Introduction to the Special Issue on CAD for Security: Pre-silicon Security Sign-off Solutions Through Design Cycle

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-03-17 DOI: https://dl.acm.org/doi/10.1145/3584317

Farimah Farahmandi, Ankur Srivastava, Giorgio Di Natale, Mark Tehranipoor

This introduction welcomes all readers to this ACM JETC special issue on CAD for Security: Pre-silicon Security Sign-off Solutions Through Design Cycle. The articles published in this special issue reflect how computer-aided design (CAD) tools are developed to expand the notion of automated security verification throughout the system-on-chip (SoC) design cycle. This special issue aims to demonstrate how the semiconductor industry must look for security-oriented metrics and evaluation as part of automatic CAD solution development to aid in analyzing, identifying, root-causing, and mitigating SoC security problems. In this introductory note, we first present the need for such a security-oriented sign-off solution for the ASIC design flow, and then provide an overview of the articles published in this special issue and how they address these requirements.

Citations: 0

Characterization of Timing-based Software Side-channel Attacks and Mitigations on Network-on-Chip Hardware

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-03-02 DOI: 10.1145/3585519

U. Ali, Sheikh Abdul Rasheed Sahni, O. Khan

Modern network-on-chip (NoC) hardware is an emerging target for side-channel security attacks. Recent work implemented and characterized timing-based software side-channel attacks that target NoC hardware on a real multicore machine. This article studies the impact of system noise on prior attack setups and shows that high noise is sufficient to defeat the attacker. We propose an information theory-based attack setup that uses repetition codes and differential signaling techniques to remove unwanted noise from the NoC channel and successfully implement a practical covert-communication attack on a real multicore machine. The evaluation demonstrates an attack efficacy of 97%, 88%, and 78% under low, medium, and high external noise, respectively. Our attack characterization reveals that noise-based mitigation schemes are inadequate to prevent practical covert communication, and thus isolation-based mitigation schemes must be considered to ensure strong security. Isolation-based schemes have been shown to mitigate timing-based side-channel attacks. However, their impact on the performance of real-world security-critical workloads is not well understood in the literature. This article evaluates the performance implications of state-of-the-art spatial and temporal isolation schemes. The performance impact is shown to range from 2% to 3% for a set of graph and machine learning workloads, thus making isolation-based mitigations practical.
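
The abstract names two classic noise-handling techniques, repetition codes and differential signaling. The following is a minimal sketch of how the two combine in general, run against a purely simulated channel. It is not the authors' attack setup: the function names, the latency model (`base`, `delta`, `sigma`), and the repetition count are all assumptions chosen for illustration.

```python
import random

def encode(bits, reps=5):
    """Map each bit to `reps` copies of a differential pair: 1 -> (1, 0), 0 -> (0, 1)."""
    symbols = []
    for b in bits:
        pair = (1, 0) if b else (0, 1)
        symbols.extend([pair] * reps)
    return symbols

def noisy_channel(symbols, base=100.0, delta=20.0, sigma=25.0, rng=None):
    """Hypothetical channel model: a '1' half-symbol adds `delta` to a noisy baseline reading."""
    rng = rng or random.Random(0)
    return [(base + delta * a + rng.gauss(0, sigma),
             base + delta * b + rng.gauss(0, sigma)) for a, b in symbols]

def decode(readings, reps=5):
    """Compare the two halves of each pair (differential), then majority-vote per bit (repetition)."""
    bits = []
    for i in range(0, len(readings), reps):
        group = readings[i:i + reps]
        votes = sum(1 for first, second in group if first > second)
        bits.append(1 if votes * 2 > len(group) else 0)
    return bits

message = [1, 0, 1, 1, 0]
received = decode(noisy_channel(encode(message)))
```

Because the decoder compares the two readings of a pair against each other rather than against a fixed threshold, any offset common to both readings cancels out. The majority vote then corrects isolated wrong comparisons, at the cost of a `reps`-fold lower bit rate. An odd `reps` avoids ties.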

Citations: 1

A Mapping Method Tolerating SAF and Variation for Memristor Crossbar Array Based Neural Network Inference on Edge Devices

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-02-25 DOI: 10.1145/3585518

Yu Ma, Linfeng Zheng, Pingqiang Zhou

There is an increasing demand for running neural network inference on edge devices. Memristor crossbar array (MCA) based accelerators can be used to accelerate neural networks on edge devices. However, reliability issues in memristors, such as stuck-at faults (SAF) and variations, lead to weight deviation of neural networks and therefore have a severe influence on inference accuracy. In this work, we focus on the reliability issues in memristors for edge devices. We formulate the reliability problem as a 0–1 programming problem, based on the analysis of sum weight variation (SWV). To solve it, we simplify the problem with an approximation, based on our observation of the weight distribution, that different columns have the same weights. Then we propose an effective mapping method to solve the simplified problem. We evaluate our proposed method with two neural network applications on two datasets. The experimental results on the classification application show that our proposed method can recover 95% accuracy in the presence of SAF defects and can increase accuracy by up to 60% with variation σ = 0.4. The results of the neural rendering application show that our proposed method can prevent a reduction in render quality.
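
To make the idea of fault-aware mapping as a 0–1 problem concrete, here is a toy sketch. It is not the paper's SWV formulation or its mapping method: the cost function (total absolute deviation on faulty cells), the fault encoding (`None` for a healthy cell, otherwise the stuck value), and the brute-force search are all assumptions for illustration. Choosing which logical weight row goes on which physical crossbar row is an assignment, which can be written with 0–1 variables x[i][r].

```python
from itertools import permutations

def deviation(weights, faults, perm):
    """Total |ideal - stuck| over faulty cells when logical row i sits on physical row perm[i]."""
    total = 0.0
    for i, row in enumerate(weights):
        for c, w in enumerate(row):
            stuck = faults[perm[i]][c]
            if stuck is not None:  # healthy cells store the ideal weight exactly
                total += abs(w - stuck)
    return total

def best_mapping(weights, faults):
    """Exhaustive search over row assignments; only feasible for very small arrays."""
    n = len(weights)
    return min(permutations(range(n)), key=lambda p: deviation(weights, faults, p))

# Hypothetical 3x2 example: physical cell (0, 0) is stuck at 0.0, cell (2, 1) is stuck at 1.0.
weights = [[0.9, 0.1], [0.0, 0.8], [0.5, 0.5]]
faults = [[0.0, None], [None, None], [None, 1.0]]
mapping = best_mapping(weights, faults)
```

In this example the search places the row whose first weight is already 0.0 on the physical row with the stuck-at-0 cell, so that fault costs nothing. Exhaustive search grows factorially with the number of rows, which is one reason a practical method needs an approximation such as the one the abstract describes.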

Citations: 0

A Hybrid Optical-Electrical Analog Deep Learning Accelerator Using Incoherent Optical Signals

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-02-17 DOI: 10.1145/3584183

Mingdai Yang, Qiuwen Lou, R. Rajaei, M. Jokar, Junyi Qiu, Yuming Liu, Aditi Udupa, F. Chong, J. Dallesasse, Milton Feng, L. Goddard, X. S. Hu, Yanjing Li

Optical deep learning (DL) accelerators have attracted significant interest due to their latency and power advantages. In this article, we focus on incoherent optical designs. A significant challenge is that there is no known solution to perform single-wavelength accumulation (a key operation required for DL workloads) using incoherent optical signals efficiently. Therefore, we devise a hybrid approach, where accumulation is done in the electrical domain, and multiplication is performed in the optical domain. The key technology enabler of our design is the transistor laser, which performs electrical-to-optical and optical-to-electrical conversions efficiently. Through detailed design and evaluation, along with a comprehensive benchmarking study against state-of-the-art RRAM-based designs, we derive the following key results: (1) For a four-layer multilayer perceptron network, our design achieves 115× and 17.11× improvements in latency and energy, respectively, compared to the RRAM-based design. We can take full advantage of the speed and energy benefits of the optical technology because the inference task can be entirely mapped onto our design. (2) For a complex workload (ResNet50), weight reprogramming is needed, and intermediate results need to be stored/re-fetched to/from memories. In this case, for the same area, our design still outperforms the RRAM-based design by 15.92× in inference latency, and 8.99× in energy.

Citations: 0

Introduction to the Special Issue on CAD for Security: Pre-silicon Security Sign-off Solutions Through Design Cycle

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-01-31 DOI: 10.1145/3584317

Farimah Farahmandi, Ankur Srivastava, Giorgio Di Natale, M. Tehranipoor

This introduction welcomes all readers to this ACM JETC special issue on CAD for Security: Pre-silicon Security Sign-off Solutions Through Design Cycle. The articles published in this special issue reflect how computer-aided design (CAD) tools are developed to expand the notion of automated security verification throughout the system-on-chip (SoC) design cycle. This special issue aims to demonstrate how the semiconductor industry must look for security-oriented metrics and evaluation as part of automatic CAD solution development to aid in analyzing, identifying, root-causing, and mitigating SoC security problems. In this introductory note, we first present the need for such a security-oriented sign-off solution for the ASIC design flow, and then provide an overview of the articles published in this special issue and how they address these requirements.

Citations: 0

Survey of Approaches and Techniques for Security Verification of Computer Systems

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-01-19 DOI: https://dl.acm.org/doi/10.1145/3564785

Ferhat Erata, Shuwen Deng, Faisal Zaghloul, Wenjie Xiong, Onur Demir, Jakub Szefer

This article surveys the landscape of security verification approaches and techniques for computer systems at various levels: from a software-application level all the way to the physical hardware level. Different existing projects are compared, based on the tools used and security aspects being examined. Since many systems require both hardware and software components to work together to provide the system’s promised security protections, it is not sufficient to verify just the software levels or just the hardware levels in a mutually exclusive fashion. This survey especially highlights system levels that are verified by the different existing projects and presents to the readers the state of the art in hardware and software system security verification. Few approaches come close to providing full-system verification, and there is still much room for improvement.

Citations: 0

Automated Generation of Security Assertions for RTL Models

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-01-19 DOI: https://dl.acm.org/doi/10.1145/3565801

Hasini Witharana, Aruna Jayasena, Andrew Whigham, Prabhat Mishra

System-on-Chip (SoC) security is vital in designing trustworthy systems. Detecting and fixing a vulnerability in the early stages is easier and more cost-effective. Assertion-based verification is widely used for functional validation of Register-Transfer Level (RTL) designs. Assertions can improve controllability and observability, which can lead to faster error detection and localization. Although assertions are widely used for functional validation of RTL models, there has been limited effort to apply assertions to detect SoC security vulnerabilities. Specifically, a fundamental challenge in SoC security and trust validation is how to develop high-quality security assertions. In this article, we perform automated vulnerability analysis of RTL models to generate security assertions for six classes of vulnerabilities. Experimental results show that the generated security assertions can detect a wide variety of vulnerabilities. Our automated framework can drastically reduce the overall security validation effort compared to the manual development of security assertions. Automated generation of security assertions will enable assertion-based verification to be one of the most promising pre-silicon security sign-off solutions.

Citations: 0

Low-Rank Gradient Descent for Memory-Efficient Training of Deep In-Memory Arrays

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-01-13 DOI: 10.1145/3577214

Siyuan Huang, B. Hoskins, M. Daniels, M. Stiles, G. Adam

The movement of large quantities of data during the training of a deep neural network presents immense challenges for machine learning workloads, especially those based on future functional memories deployed to store network models. As the size of network models begins to vastly outstrip traditional silicon computing resources, functional memories based on flash, resistive switches, magnetic tunnel junctions, and other technologies can store these new ultra-large models. However, new approaches are then needed to minimize hardware overhead, especially on the movement and calculation of gradient information that cannot be efficiently contained in these new memory resources. To do this, we introduce streaming batch principal component analysis (SBPCA) as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic rank-k approximation of the network gradient. We demonstrate that the low-rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini-batch gradient descent. Our approximation is made in an expanded vector form that can efficiently be applied to the rows and columns of crossbars for array-level updates. These results promise improvements in the design of application-specific integrated circuits based around large vector-matrix multiplier memories.
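
The following sketch illustrates the general building block the abstract refers to, power iteration for a low-rank gradient approximation. It is a plain, non-streaming, rank-1 version and should not be read as the SBPCA algorithm itself. It assumes (as an illustration, not from the paper) that a dense layer's batch gradient has the form G = Σᵢ outer(deltas[i], inputs[i]), and it finds the dominant singular triple of G without ever storing the full matrix. All names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Return (unit vector, original norm). Fails on a zero vector, e.g. an all-zero gradient."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v], n

def rank1_gradient(deltas, inputs, iters=50):
    """Top singular triple (sigma, u, v) of G = sum_i outer(deltas[i], inputs[i]).

    G is never materialized: G v and G^T u are accumulated sample by sample.
    """
    rows, cols = len(deltas[0]), len(inputs[0])
    v, _ = normalize([1.0] * cols)  # fixed start vector; assumes it is not orthogonal to the answer
    sigma = 0.0
    for _ in range(iters):
        # u <- G v = sum_i deltas[i] * (inputs[i] . v)
        coeffs = [dot(x, v) for x in inputs]
        u = [sum(c * d[j] for c, d in zip(coeffs, deltas)) for j in range(rows)]
        u, _ = normalize(u)
        # v <- G^T u = sum_i inputs[i] * (deltas[i] . u)
        coeffs = [dot(d, u) for d in deltas]
        v = [sum(c * x[k] for c, x in zip(coeffs, inputs)) for k in range(cols)]
        v, sigma = normalize(v)
    return sigma, u, v

sigma, u, v = rank1_gradient([[1.0, 2.0]], [[3.0, 0.0, 4.0]])
```

The rank-1 update sigma · outer(u, v) needs only one vector per crossbar dimension, so it can be applied along the rows and columns of an array rather than cell by cell. A rank-k version would repeat this with orthogonalization between components.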

Citations: 0

AroMa: Evaluating Deep Learning Systems for Stealthy Integrity Attacks on Multi-tenant Accelerators

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-01-06 DOI: 10.1145/3579033

Xiangru Chen, Maneesh Merugu, Jiaqi Zhang, Sandip Ray

Multi-tenant applications have been proliferating in recent years, supported by the emergence of computing-as-service paradigms. Unfortunately, multi-tenancy induces new security vulnerabilities due to spatial or temporal co-location of applications with possibly malicious intent. In this article, we consider a special class of stealthy integrity attacks on multi-tenant deep learning accelerators. One interesting conclusion is that standard RowHammer attacks can be used to perform targeted integrity attacks on the kernel weights of a deep learning system, changing only 0.0009% of the total weights, such that the system remains functional but mislabels specific categories of input data. We develop an automated framework, AroMa, to evaluate the impact of multi-tenancy on the security of deep learning accelerators against integrity attacks on memory systems. We present extensive evaluations of AroMa to demonstrate its effectiveness.
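
To see why changing a very small fraction of weight bits can matter, the sketch below simulates the numeric effect of a single bit flip on a weight stored as an IEEE 754 float32. It only manipulates a value in software and does not model RowHammer or the AroMa framework. The function name and the choice of float32 storage are assumptions for illustration.

```python
import struct

def flip_bit(weight, bit):
    """Return `weight`, stored as float32, with one bit flipped.

    Bit 31 is the sign, bits 30-23 the exponent, bits 22-0 the mantissa.
    """
    (raw,) = struct.unpack("<I", struct.pack("<f", weight))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped

low_bit = flip_bit(0.5, 0)    # lowest mantissa bit: a tiny change
sign = flip_bit(0.5, 31)      # sign bit: 0.5 becomes -0.5
high_exp = flip_bit(0.5, 30)  # top exponent bit: 0.5 becomes 2**127
```

A flip in a low mantissa bit barely moves the weight, while a flip in the top exponent bit changes it by dozens of orders of magnitude. This asymmetry is consistent with the abstract's observation that a very small number of modified weights can redirect a model's output.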

Citations: 0

On Securing Cryptographic ICs against Scan-based Attacks: A Hamming Weight Distribution Perspective

IF 2.2, Zone 4 (Computer Science), Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2022-12-22 DOI: 10.1145/3577215

Dipojjwal Ray, Yogendra Sao, S. Biswas, Sk Subidh Ali

Scan chain-based Design for Testability is the industry standard for testing manufacturing defects in the semiconductor industry to ensure the structural and functional correctness of chips. Fault coverage is significantly enhanced due to the higher observability and controllability of the internal latches. These benefits to testing, if misused, expose vulnerabilities that can be detrimental to security, especially in the context of crypto-chips that contain a secret key. Hence, it remains of paramount importance for a chip designer to secure crypto-chips against various scan attacks. A countermeasure is proposed in this article that preserves the secrecy of an embedded key in a cryptographic integrated circuit running an Advanced Encryption Standard (AES) implementation. A novel design involving a hardware unit is illustrated that circumvents differential scan attacks by essentially performing bit flips deterministically, using a pre-computed mask value. This helps secure the chip while retaining full testability. The controller logic directly depends on a mask determination algorithm that can defend against any scan attack with 𝒪 theoretical complexity. Security analysis of our proposed defense procedure is performed in the framework of Discrete Event Systems (DES). The sequential scan circuit of an AES cryptosystem is modeled as a DES using Finite State Automata. A security notion, Opacity, is used to quantify and formally verify the security aspects of our controlled system, which shows that the entropy of the secret key is preserved. A case study shows that the countermeasure successfully mitigates state-of-the-art differential scan attacks at a nominal extra overhead of 1.78%.

Citations: 1