ACM Journal on Emerging Technologies in Computing Systems: Latest Articles (Page 3)

Securing Network-on-chips Against Fault-injection and Crypto-analysis Attacks via Stochastic Anonymous Routing

IF 2.2 · Zone 4 (Computer Science) · Q3 · COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3592798

Ahmad Patooghy, Mahdi Hasanzadeh, Amin Sarihi, Mostafa Abdelrehim, Abdel-Hameed A. Badawy

Network-on-chip (NoC) is widely used as an efficient communication architecture in multi-core and many-core System-on-chips (SoCs). However, the shared communication resources in an NoC platform, e.g., channels, buffers, and routers, might be used to conduct attacks compromising the security of NoC-based SoCs. Most of the proposed encryption-based protection methods in the literature require leaving some parts of the packet unencrypted to allow the routers to process/forward packets accordingly. This reveals the source/destination information of the packet to malicious routers, which can be exploited in various attacks. For the first time, we propose the idea of secure, anonymous routing with minimal hardware overhead to encrypt the entire packet while exchanging secure information over the network. We have designed and implemented a new NoC architecture that works with encrypted addresses. The proposed method can manage malicious and benign failures at NoC channels and buffers by bypassing failed components with a situation-driven stochastic path diversification approach. Hardware evaluations show that the proposed security solution combats the security threats at the affordable cost of 1.5% area and 20% power overheads chip-wide.
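
The stochastic path diversification idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' router design and does not model encrypted addresses; the 2D mesh, the `stochastic_route` function, and the `failed_links` set are all assumptions made for illustration. At each hop the sketch picks at random among usable neighbors that bring the packet closer to its destination, and falls back to any usable neighbor when failed links block every productive direction.

```python
import random


def neighbors(node, width, height):
    """Return the mesh neighbors of `node` that lie inside the grid."""
    x, y = node
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(nx, ny) for nx, ny in candidates if 0 <= nx < width and 0 <= ny < height]


def stochastic_route(src, dst, width, height, failed_links, rng, max_hops=64):
    """Hypothetical hop-by-hop routing that randomizes among healthy links.

    `failed_links` is a set of directed (from_node, to_node) pairs.
    Returns the list of visited nodes, or None if no route is found.
    """
    def dist(node):
        # Manhattan distance to the destination.
        return abs(node[0] - dst[0]) + abs(node[1] - dst[1])

    path = [src]
    cur = src
    while cur != dst:
        if len(path) > max_hops:
            return None  # give up instead of wandering forever
        usable = [n for n in neighbors(cur, width, height)
                  if (cur, n) not in failed_links]
        if not usable:
            return None  # node is completely cut off
        # Prefer hops that reduce the distance; otherwise misroute around the fault.
        productive = [n for n in usable if dist(n) < dist(cur)]
        cur = rng.choice(productive or usable)
        path.append(cur)
    return path
```

For example, `stochastic_route((0, 0), (2, 2), 3, 3, {((0, 0), (1, 0))}, random.Random(1))` must leave the source through `(0, 1)` because the eastbound link is marked as failed, while the remaining hops vary with the random seed.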

Citations: 0

2DMAC: A Sustainable and Efficient Medium Access Control Mechanism for Future Wireless NoCs

Pub Date : 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3570727

Sidhartha Sankar Rout, Mitali Sinha, Sujay Deb

Wireless Network-on-Chip (WNoC) requires a Medium Access Control (MAC) mechanism for interference-free sharing of the wireless channel. In traditional MAC, a token is circulated among the Wireless Interfaces (WIs) in a round-robin manner. The WI with the token holds the channel for a fixed number of cycles. However, the channel requirement of individual WIs changes dynamically over time because traffic density varies across the WNoC. Moreover, conventional WNoCs give equal importance to all traffic taking the wireless path and transmit it in an oldest-first manner. Yet critical data, if not served promptly, can substantially degrade system performance by prolonging application runtime. We propose 2DMAC, which can change the token arbitration pattern and tune the channel hold time of each WI based on its runtime traffic density and criticality status. Moreover, 2DMAC prioritizes critical traffic over non-critical traffic during wireless data transfer. The proposed mechanism improves wireless channel utilization by 15.67% and network throughput by 29.83%, and reduces critical data latency by 29.77% over the traditional MAC.
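
The contrast with fixed-hold round-robin arbitration can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 2DMAC design itself: the `WirelessInterface` class, the `hold_time` formula and its constants, and the arbitration order in `next_token_holder` are all hypothetical. The sketch hands the token to the WI with critical traffic first, scales the hold time with the backlog, and drains critical packets before non-critical ones.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WirelessInterface:
    name: str
    critical: deque = field(default_factory=deque)  # pending critical packets
    normal: deque = field(default_factory=deque)    # pending non-critical packets

    def backlog(self) -> int:
        return len(self.critical) + len(self.normal)


def hold_time(wi, base_cycles=4, max_cycles=16):
    """Hypothetical policy: hold time grows with backlog, with a bonus for critical packets."""
    cycles = base_cycles + wi.backlog() + 2 * len(wi.critical)
    return min(cycles, max_cycles)


def next_token_holder(wis):
    """Hypothetical arbitration: skip idle WIs, favor critical traffic, then larger backlog."""
    busy = [wi for wi in wis if wi.backlog() > 0]
    if not busy:
        return None
    return max(busy, key=lambda wi: (len(wi.critical) > 0, wi.backlog()))


def transmit(wi, cycles):
    """Send at most one packet per cycle, critical packets before non-critical ones."""
    sent = []
    for _ in range(cycles):
        if wi.critical:
            sent.append(wi.critical.popleft())
        elif wi.normal:
            sent.append(wi.normal.popleft())
        else:
            break  # nothing left to send, release the channel early
    return sent
```

A fixed round-robin scheme would visit every WI in turn for the same number of cycles, including idle ones; in this sketch an idle WI never receives the token and a WI holding critical packets is served first.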

Citations: 0

EVHA: Explainable Vision System for Hardware Testing and Assurance—An Overview

Pub Date : 2023-06-21 DOI: https://dl.acm.org/doi/10.1145/3590772

Md. Mahfuz Al Hasan, Mohammad Tahsin Mostafiz, Thomas An Le, Jake Julia, Nidish Vashistha, Shayan Taheri, Navid Asadizanjani

Due to the ever-growing demand for electronic chips in different sectors, semiconductor companies have been compelled to offshore their manufacturing processes. This has raised concerns about the security and trustworthiness of the fabricated chips and has opened the door to hardware attacks. Under these conditions, different entities in the semiconductor supply chain can act maliciously and attack the design computing layers, from devices to systems. The attack we consider is a hardware Trojan inserted during mask generation/fabrication in an untrusted foundry. The Trojan leaves a footprint in the fabricated chip through the addition, deletion, or change of design cells. To tackle this problem, we propose EVHA (Explainable Vision System for Hardware Testing and Assurance), which can detect the smallest possible change to a design in a low-cost, accurate, and fast manner. The inputs to this system are scanning electron microscopy images acquired from the integrated circuits under examination. The system output is a determination, at the cell level, of whether the integrated circuit contains any defect and/or hardware Trojan introduced through addition, deletion, or change of design cells. This article provides an overview of the design, development, implementation, and analysis of our defense system.

Citations: 0

Fusing In-Storage and Near-Storage Acceleration of Convolutional Neural Networks

Pub Date : 2023-06-17 DOI: 10.1145/3597496

Ikenna Okafor, Akshay Krishna Ramanathan, Nagadastagiri Reddy Challapalle, Zheyu Li, Vijaykrishnan Narayanan

Video analytics has a wide range of applications and has attracted much interest over the years. While it can be both computationally and energy intensive, video analytics can benefit greatly from in/near-memory compute. Moving compute closer to memory continues to improve performance and energy consumption and is seeing increasing adoption. Recent solid-state drives (SSDs) incorporate near-memory Field Programmable Gate Arrays (FPGAs) with shared access to the drive's storage cells. These near-memory FPGAs can run operations required by video analytics pipelines, such as object detection and template matching. These operations are typically executed using Convolutional Neural Networks (CNNs). A CNN is composed of multiple individually processed layers that perform various image processing tasks. Due to limited resources, a layer may be partitioned into more manageable sub-layers. These sub-layers are then processed sequentially, although some can be processed simultaneously. Moreover, the storage cells within FPGA-equipped SSDs can be augmented with in-storage compute to accelerate CNN workloads and exploit the parallelism within a CNN layer. To this end, we present our work, which leverages heterogeneous architectures to create an in/near-storage acceleration solution for video analytics. We designed a NAND flash accelerator and an FPGA accelerator, then mapped and evaluated several CNN benchmarks. We show how to use FPGAs, local DRAMs, and in-memory SSD compute to accelerate CNN workloads. Our work also demonstrates how to remove unnecessary memory transfers to save latency and energy.

Citations: 0

A Noninvasive Technique to Detect Authentic/Counterfeit SRAM Chips

Pub Date : 2023-05-30 DOI: https://dl.acm.org/doi/10.1145/3597024

B. M. S. Bahar Talukder, Farah Ferdaus, Md Tauhidur Rahman

Many commercially available memory chips are fabricated worldwide in untrusted facilities. Therefore, a counterfeit memory chip can easily enter into the supply chain in different formats. Deploying these counterfeit memory chips into an electronic system can severely affect security and reliability domains because of their substandard quality, poor performance, and shorter lifespan. Therefore, a proper solution is required to identify counterfeit memory chips before deploying them in mission-, safety-, and security-critical systems. However, a single solution to prevent counterfeiting is challenging due to the diversity of counterfeit types, sources, and refinement techniques. Besides, the chips can pass initial testing and still fail while being used in the system. Furthermore, existing solutions focus on detecting a single counterfeit type (e.g., detecting recycled memory chips). This work proposes a framework that detects major counterfeit static random-access memory (SRAM) types by attesting/identifying the origin of the manufacturer. The proposed technique generates a single signature for a manufacturer and does not require any exhaustive registration/authentication process. We validate our proposed technique using 345 SRAM chips produced by major manufacturers. The silicon results show that the test scores (F₁ score) of our proposed technique of identifying memory manufacturer and part-number are 93% and 71%, respectively.
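
For readers unfamiliar with the metric, the F₁ score is the harmonic mean of precision and recall. The sketch below computes a per-class F₁ and a macro average over manufacturer labels. The abstract does not say which averaging the authors used, so macro averaging, the function names, and the toy labels in the usage note are assumptions for illustration only.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)
```

With true labels `["A", "A", "B", "B"]` and predictions `["A", "B", "B", "B"]`, class A has precision 1 and recall 1/2 (F₁ = 2/3), class B has precision 2/3 and recall 1 (F₁ = 4/5), and the macro average is 11/15.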

Citations: 0

Towards the Generation of Test Vectors for the Detection of Hardware Trojan Targeting Effective Switching Activity

Pub Date : 2023-05-19 DOI: https://dl.acm.org/doi/10.1145/3597497

Anindan Mondal, Debasish Kalita, Archisman Ghosh, Suchismita Roy, Bibhash Sen

Hardware Trojans (HTs) are small circuits intentionally designed by an adversary for harmful purposes. These circuits are extremely difficult to detect. An HT often requires specific signals to activate, and these are almost impossible to discover. For this reason, test generation for side-channel analysis, which does not require HT activation, has gained significant attention in recent times. Such test generation techniques aim to generate a large amount of switching activity inside the HT circuit, increasing the measured transient current. However, such methods suffer from either long runtimes or unreliable results. In this work, a test generation technique based on the relative switching activity of the circuit is proposed to overcome the limitations of existing works. Initially, the proposed technique measures the impact of each input on rare nets individually using random vector simulation. Potent inputs are then selected to obtain a new set of test vectors that provides high relative switching inside a circuit. The proposed method is applied to 11 different ISCAS and 3 ITC 99 benchmark circuits. Experimental results endorse the efficacy of the proposed method, which outperforms traditional Hamming distance-based re-ordering techniques (by up to 20x) while requiring a small runtime.
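
The first two steps named in the abstract, finding rare nets by random vector simulation and ranking inputs by their impact on those nets, can be sketched roughly as follows. Everything here is an assumption for illustration: the four-input `toy_circuit`, the rarity threshold, and the toggle-counting impact measure are hypothetical and are not taken from the paper.

```python
import random


def toy_circuit(bits):
    """Hypothetical 4-input circuit; returns the value of each internal net."""
    a, b, c, d = bits
    n1 = a & b
    n2 = n1 & c
    n3 = c | d
    n4 = n2 & d
    return {"n1": n1, "n2": n2, "n3": n3, "n4": n4}


def random_vector(n_inputs, rng):
    return [rng.randint(0, 1) for _ in range(n_inputs)]


def find_rare_nets(circuit, n_inputs, trials, threshold, rng):
    """Nets whose less likely logic value occurs with probability below `threshold`."""
    ones = {}
    for _ in range(trials):
        for net, value in circuit(random_vector(n_inputs, rng)).items():
            ones[net] = ones.get(net, 0) + value
    rare = set()
    for net, count in ones.items():
        p_one = count / trials
        if min(p_one, 1 - p_one) < threshold:
            rare.add(net)
    return rare


def input_impact(circuit, n_inputs, rare, trials, rng):
    """Count how often flipping each single input toggles a rare net."""
    impact = [0] * n_inputs
    for _ in range(trials):
        vec = random_vector(n_inputs, rng)
        base = circuit(vec)
        for i in range(n_inputs):
            flipped = vec.copy()
            flipped[i] ^= 1
            out = circuit(flipped)
            impact[i] += sum(base[net] != out[net] for net in rare)
    return impact


def potent_inputs(impact, k):
    """Indices of the k inputs with the highest impact on rare nets."""
    return sorted(range(len(impact)), key=lambda i: impact[i], reverse=True)[:k]
```

In this toy circuit the nets `n2` and `n4` are rare, and input `d` reaches only `n4`, so it ranks below `a`, `b`, and `c`. A test generator could then concentrate its toggling on the higher-ranked inputs.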

Citations: 0

Towards the Generation of Test Vectors for the Detection of Hardware Trojan Targeting Effective Switching Activity

IF 2.2 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-05-19 DOI: 10.1145/3597497

Anindan Mondal, Debasish Kalita, A. Ghosh, Suchismita Roy, Bibhash Sen

Hardware Trojans (HTs) are small circuits intentionally designed by an adversary for harmful purposes, and they are extremely difficult to detect. An HT often requires specific activation signals that are almost impossible to discover. For this reason, test generation for side-channel analysis, which does not require HT activation, has gained significant traction in recent times. Such test generation techniques aim to generate a large amount of switching activity inside the HT circuit, thereby increasing the measured transient current. However, such methods suffer from either long runtimes or unreliable results. In this work, a test generation technique based on the relative switching activity of the circuit is proposed to overcome the limitations of existing works. Initially, the proposed technique measures the impact of each input on rare nets individually using random vector simulation. Potent inputs are then selected to obtain a new set of test vectors that provides high relative switching inside a circuit. The proposed method is applied to 11 different ISCAS and 3 ITC 99 benchmark circuits. Experimental results confirm the efficacy of the proposed method, which outperforms traditional Hamming distance-based re-ordering techniques (by up to 20x) while requiring a small run-time.
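
The first step named in the abstract, measuring the impact of each input on a rare net with random vector simulation, can be illustrated with a toy sketch. This is not the authors' algorithm: the circuit `rare_net`, the function `input_impact`, the trial count, and the "impact greater than zero" selection rule are all hypothetical choices made for illustration only.

```python
import random

def rare_net(bits):
    # Hypothetical rare net: a 3-input AND. Input 3 is deliberately unconnected.
    return bits[0] & bits[1] & bits[2]

def input_impact(net, n_inputs, trials=2000, seed=0):
    """Fraction of random vectors for which flipping input i toggles the net."""
    rng = random.Random(seed)
    toggles = [0] * n_inputs
    for _ in range(trials):
        vec = [rng.randint(0, 1) for _ in range(n_inputs)]
        base = net(vec)
        for i in range(n_inputs):
            flipped = vec.copy()
            flipped[i] ^= 1  # flip exactly one input bit
            if net(flipped) != base:
                toggles[i] += 1
    return [t / trials for t in toggles]

impact = input_impact(rare_net, n_inputs=4)
# Assumed selection rule: keep inputs that ever toggle the rare net.
potent = [i for i, p in enumerate(impact) if p > 0]
```

In this toy circuit, flipping one of the three connected inputs toggles the net only when the other two are both 1, so each has an impact near 0.25, while the unconnected input has an impact of exactly 0 and is dropped from `potent`.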

Citations: 0

Low-Rank Gradient Descent for Memory-Efficient Training of Deep In-Memory Arrays

IF 2.2 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-05-18 DOI: https://dl.acm.org/doi/10.1145/3577214

Siyuan Huang, Brian D. Hoskins, Matthew W. Daniels, Mark D. Stiles, Gina C. Adam

The movement of large quantities of data during the training of a deep neural network presents immense challenges for machine learning workloads, especially those based on future functional memories deployed to store network models. As the size of network models begins to vastly outstrip traditional silicon computing resources, functional memories based on flash, resistive switches, magnetic tunnel junctions, and other technologies can store these new ultra-large models. However, new approaches are then needed to minimize hardware overhead, especially on the movement and calculation of gradient information that cannot be efficiently contained in these new memory resources. To do this, we introduce streaming batch principal component analysis (SBPCA) as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic rank-k approximation of the network gradient. We demonstrate that the low-rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini-batch gradient descent. Our approximation is made in an expanded vector form that can efficiently be applied to the rows and columns of crossbars for array-level updates. These results promise improvements in the design of application-specific integrated circuits based around large vector-matrix multiplier memories.
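
The idea of replacing a full gradient matrix with a low-rank approximation obtained by power iterations can be sketched as follows. This is not SBPCA itself: it is a plain, deterministic rank-1 power iteration on a small hypothetical matrix `G`, and the names `rank1_approx`, `matvec`, `transpose`, and `normalize` are illustrative assumptions, not taken from the paper.

```python
import math
import random

def matvec(M, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rank1_approx(G, iters=50, seed=0):
    """Return (sigma, u, v) such that sigma * outer(u, v) approximates G."""
    rng = random.Random(seed)
    Gt = transpose(G)
    v = normalize([rng.gauss(0, 1) for _ in G[0]])  # random starting vector
    for _ in range(iters):
        u = normalize(matvec(G, v))   # update the column-side factor
        v = normalize(matvec(Gt, u))  # update the row-side factor
    sigma = sum(ui * gi for ui, gi in zip(u, matvec(G, v)))
    return sigma, u, v

# Hypothetical 2x3 "gradient" that is exactly rank 1: outer([1, 2], [3, 4, 5]).
G = [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]
sigma, u, v = rank1_approx(G)
approx = [[sigma * ui * vj for vj in v] for ui in u]
```

The two factors `u` and `v` hold m + n numbers instead of the m * n entries of `G`, and an outer-product form like this one maps naturally onto row and column updates of the kind the abstract describes for crossbars.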

Citations: 0

FPIC: A Novel Semantic Dataset for Optical PCB Assurance

IF 2.2 Zone 4 (Computer Science) Q3 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

ACM Journal on Emerging Technologies in Computing Systems

Pub Date : 2023-05-18 DOI: https://dl.acm.org/doi/10.1145/3588032

Nathan Jessurun, Olivia P. Dizon-Paradis, Jacob Harrison, Shajib Ghosh, Mark M. Tehranipoor, Damon L. Woodard, Navid Asadizanjani

Outsourced printed circuit board (PCB) fabrication necessitates increased hardware assurance capabilities. Several assurance techniques based on automated optical inspection (AOI) have been proposed that leverage PCB images acquired using digital cameras. We review state-of-the-art AOI techniques and observe a strong, rapid trend toward machine learning (ML) solutions. These require significant amounts of labeled ground truth data, which is lacking in the publicly available PCB data space. We contribute the FPIC dataset to address this need. Additionally, we outline new hardware security methodologies enabled by our dataset.

Citations: 0