Optimizing the performance of in-memory file system by thread scheduling and file migration under NUMA multiprocessor systems

Impact Factor: 4.1 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Journal of Systems Architecture | Pub Date: 2025-02-01 | Epub Date: 2025-01-16 | DOI: 10.1016/j.sysarc.2025.103344

Ting Wu, Jingting He, Ying Qian, Weichen Liu

Journal of Systems Architecture, Volume 159, Article 103344. https://www.sciencedirect.com/science/article/pii/S1383762125000165

Citations: 0

Abstract

Internet and IoT applications generate large amounts of data that require efficient storage and processing. Emerging Compute Express Link (CXL) and Non-Volatile Memory (NVM) technologies bring new opportunities for in-memory computing by reducing the latency of data access and processing. Many in-memory file systems based on hybrid DRAM/NVM are designed for high performance. However, achieving high performance on Non-Uniform Memory Access (NUMA) multiprocessor systems poses significant challenges. In particular, the performance of file requests on NUMA systems varies over a disturbingly wide range, depending on the affinity of threads to file data. Moreover, congestion at memory controllers and on interconnect links adds excessive latency and causes performance loss on file accesses. Therefore, the placement of files and threads and the balance of load are both critical for data-intensive applications on NUMA systems. In this paper, we optimize the performance of multiple threads requesting in-memory files on NUMA systems by considering both memory congestion and data locality. First, we present the system model and formulate the problem as latency minimization on NUMA nodes. Then, we present a two-layer design that optimizes performance by properly migrating threads and dynamically adjusting the file distribution. Further, based on the design, we implement a functional NUMA-aware in-memory file system, Hydrafs-RFCT, in the Linux kernel. Experimental results show that Hydrafs-RFCT improves the performance of multi-threaded applications on NUMA systems. The average aggregated performance of Hydrafs-RFCT is 100.14%, 112.7%, 39.4%, and 6.4% higher than that of Ext4-DAX, PMFS, SIMFS, and Hydrafs, respectively.
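The trade-off the abstract describes, between data locality and memory congestion, can be illustrated with a toy cost model. The sketch below is not the paper's system model or the Hydrafs-RFCT algorithm; the cost constants, the function names `total_latency` and `best_placement`, and the exhaustive search are all assumptions made for illustration. It charges each access a local or remote base cost, scales it by the share of traffic hitting the target node's memory controller, and then searches thread and file placements jointly.

```python
from itertools import product

# Assumed, illustrative constants (not taken from the paper).
LOCAL, REMOTE = 1.0, 2.0   # relative cost of a local vs. remote access
CONGESTION = 1.0           # penalty weight for a busy memory controller


def total_latency(thread_node, file_node, accesses):
    """Toy cost: each access pays a base cost scaled by controller load share."""
    total = sum(sum(per_file.values()) for per_file in accesses)
    # Fraction of all accesses served by each node's memory controller.
    share = {}
    for per_file in accesses:
        for name, count in per_file.items():
            node = file_node[name]
            share[node] = share.get(node, 0.0) + count / total
    latency = 0.0
    for tid, per_file in enumerate(accesses):
        for name, count in per_file.items():
            node = file_node[name]
            base = LOCAL if thread_node[tid] == node else REMOTE
            latency += count * base * (1.0 + CONGESTION * share[node])
    return latency


def best_placement(accesses, num_nodes):
    """Exhaustively search thread and file placements (tiny inputs only)."""
    files = sorted({name for per_file in accesses for name in per_file})
    best = None
    for t_nodes in product(range(num_nodes), repeat=len(accesses)):
        for f_nodes in product(range(num_nodes), repeat=len(files)):
            file_node = dict(zip(files, f_nodes))
            cost = total_latency(t_nodes, file_node, accesses)
            if best is None or cost < best[0]:
                best = (cost, t_nodes, file_node)
    return best


# Thread 0 reads file "a" 100 times; thread 1 reads file "b" 100 times.
accesses = [{"a": 100}, {"b": 100}]
naive = total_latency((0, 1), {"a": 0, "b": 0}, accesses)  # both files on node 0
cost, thread_node, file_node = best_placement(accesses, num_nodes=2)
print(naive, cost, thread_node, file_node)
```

Under these assumed constants, keeping both files on one node while the threads run on different nodes costs 600.0, whereas co-locating each thread with its own file on separate nodes costs 300.0. Brute force is only workable for a handful of threads and files; it stands in here for whatever heuristic a real scheduler would use.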




Source journal

Journal of Systems Architecture (Engineering & Technology: Computer Science, Hardware)

CiteScore: 8.70

Self-citation rate: 15.60%

Articles published: 226

Review time: 46 days

About the journal: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques, and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components, with an emphasis on software, are also relevant here.