A Case for In-Memory Random Scatter-Gather for Fast Graph Processing

IF 1.4 · CAS Tier 3 (Computer Science) · Q4 Computer Science, Hardware & Architecture · IEEE Computer Architecture Letters · Pub Date: 2024-03-13 · DOI: 10.1109/LCA.2024.3376680

Changmin Shin;Taehee Kwon;Jaeyong Song;Jae Hyung Ju;Frank Liu;Yeonkyu Choi;Jinho Lee

Citations: 0

Abstract

Because of the widely recognized memory wall issue, modern DRAMs are increasingly being assigned innovative functionalities beyond basic read and write operations. Often referred to as "function-in-memory", these techniques are designed to exploit the abundant internal bandwidth available within the DRAM. However, they face several challenges, including the large area required for arithmetic units and the need to split a single word into multiple pieces. These challenges severely limit the practical application of function-in-memory techniques. In this paper, we present Piccolo, an efficient design of random scatter-gather memory. Our method achieves significant improvements with minimal overhead. By demonstrating our technique on a graph processing accelerator, we show that Piccolo and the proposed accelerator achieve a $1.2$–$3.1\times$ speedup compared to the prior art.
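The abstract does not spell out what "random scatter-gather" means for graph workloads, so the following is only a minimal software sketch of the access pattern, not Piccolo's hardware design. The function names (`gather`, `scatter_add`, `push_step`) and the small CSR graph are hypothetical and chosen purely for illustration.

```python
def gather(values, indices):
    """Read values at arbitrary (random) positions into a dense list."""
    return [values[i] for i in indices]


def scatter_add(values, indices, updates):
    """Accumulate each update into an arbitrary position of values, in place."""
    for i, u in zip(indices, updates):
        values[i] += u


def push_step(row_ptr, col_idx, prop):
    """One push-style pass: every vertex adds its property to its out-neighbors.

    The graph is in CSR form (an assumption for this sketch): the neighbors of
    vertex v are col_idx[row_ptr[v]:row_ptr[v + 1]].
    """
    out = [0.0] * len(prop)
    for v in range(len(prop)):
        nbrs = col_idx[row_ptr[v]:row_ptr[v + 1]]
        # The destination indices are data-dependent, hence "random" accesses.
        scatter_add(out, nbrs, [prop[v]] * len(nbrs))
    return out


# Hypothetical 4-vertex graph: 0 -> {1, 2}, 1 -> {3}, 2 -> {0, 3}, 3 -> {}
row_ptr = [0, 2, 3, 5, 5]
col_idx = [1, 2, 3, 0, 3]
prop = [1.0, 2.0, 3.0, 4.0]

result = push_step(row_ptr, col_idx, prop)
picked = gather(prop, [3, 0, 3])
```

Each neighbor update touches a single word at an unpredictable address, which is the kind of fine-grained, irregular traffic that a scatter-gather memory would aim to serve more efficiently than whole-burst reads and writes.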


Source journal

IEEE Computer Architecture Letters (Computer Science, Hardware & Architecture)

CiteScore: 4.60

Self-citation rate: 4.30%

Journal description: IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.
