Error-Detection Schemes for Analog Content-Addressable Memories

Impact factor: 3.8 | CAS Zone 2 (Computer Science) | JCR Q2 (Computer Science, Hardware & Architecture) | IEEE Transactions on Computers | Pub Date: 2024-04-08 | DOI: 10.1109/TC.2024.3386065

Ron M. Roth

IEEE Transactions on Computers, vol. 73, no. 7, pp. 1795-1808. Journal article. Full text: https://ieeexplore.ieee.org/document/10494682/

Citations: 0

Abstract

Analog content-addressable memories (in short, a-CAMs) have been recently introduced as accelerators for machine-learning tasks, such as tree-based inference or implementation of nonlinear activation functions. The cells in these memories contain nanoscale memristive devices, which may be susceptible to various types of errors, such as manufacturing defects, inaccurate programming of the cells, or drifts in their contents over time. The objective of this work is to develop techniques for overcoming the reliability issues that are caused by such error events. To this end, several coding schemes are presented for the detection of errors in a-CAMs. These schemes consist of an encoding stage, a detection cycle (which is performed periodically), and some minor additions to the hardware. During encoding, redundancy symbols are programmed into a portion of the a-CAM (or, alternatively, are written into an external memory). During each detection cycle, a certain set of input vectors is applied to the a-CAM. The schemes differ in several ways, e.g., in the range of alphabet sizes that they are most suitable for, in the tradeoff that each provides between redundancy and hardware additions, or in the type of errors that they handle (Hamming metric versus $L_{1}$ metric).
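To make the distinction between the two error metrics concrete, here is a minimal Python sketch. It is a textbook illustration only, not one of the schemes from the paper: the function names, the alphabet size `q = 8`, and the single sum-check symbol are all assumptions chosen for the example. The Hamming metric counts how many cells changed, while the $L_{1}$ metric adds up how far each cell moved, so a small drift and a gross programming error in one cell look identical under the first and very different under the second.

```python
def hamming_distance(x, y):
    """Number of cells whose contents differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return sum(a != b for a, b in zip(x, y))


def l1_distance(x, y):
    """Sum of the absolute per-cell differences."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return sum(abs(a - b) for a, b in zip(x, y))


def encode(row, q):
    """Hypothetical encoder: append one redundancy symbol so that
    all symbols sum to 0 modulo q (a plain sum check)."""
    return list(row) + [(-sum(row)) % q]


def detect(word, q):
    """Hypothetical detection step: flag the word if the sum check fails."""
    return sum(word) % q != 0


q = 8                                # assumed alphabet size {0, ..., 7}
stored = encode([3, 5, 0, 7], q)     # redundancy symbol appended

drifted = list(stored)
drifted[1] += 1                      # small drift in one cell

misprogrammed = list(stored)
misprogrammed[1] = 0                 # large error in the same cell

print(hamming_distance(stored, drifted), l1_distance(stored, drifted))
print(hamming_distance(stored, misprogrammed), l1_distance(stored, misprogrammed))
print(detect(stored, q), detect(drifted, q), detect(misprogrammed, q))
```

A single sum check of this kind flags any error confined to one cell, because the change in that cell is nonzero and smaller than `q` in magnitude. It can miss errors spread over several cells whose changes cancel modulo `q`, which is one reason practical schemes use more redundancy.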


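Source journal

IEEE Transactions on Computers (Engineering & Technology: Electrical & Electronic Engineering)

CiteScore: 6.60
Self-citation rate: 5.40%
Articles published: 199
Review time: 6.0 months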

Journal description: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.
