Pufferfish: Container-driven Elastic Memory Management for Data-intensive Applications

Proceedings of the ... ACM Symposium on Cloud Computing [electronic resource] : SOCC ... ... SoCC (Conference). Pub Date: 2019-11-20. DOI: 10.1145/3357223.3362730

Wei Chen, Aidi Pi, Shaoqi Wang, Xiaobo Zhou


Citations: 10

Abstract

Data-intensive applications often suffer from significant memory pressure, which causes excessive garbage collection (GC) and out-of-memory (OOM) errors and harms system performance and reliability. In this paper, we demonstrate how lightweight virtualization via OS containers opens up opportunities to address memory pressure and realize memory elasticity: 1) tasks running in a container can be given a large heap size to avoid OOM errors, and 2) tasks that are under memory pressure and incur significant swapping activity can be temporarily "suspended" by depriving their hosting containers of resources, and "resumed" when resources become available. We propose and develop Pufferfish, an elastic memory manager that leverages containers to flexibly allocate memory for tasks. The memory elasticity achieved by Pufferfish can be exploited by a cluster scheduler to improve cluster utilization and task parallelism. We implement Pufferfish on the cluster scheduler Apache YARN. Experiments with Spark and MapReduce on real-world traces show that Pufferfish avoids OOM errors and improves cluster memory utilization by 2.7x and the median job runtime by 5.5x compared to a memory over-provisioning solution.
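The suspend/resume idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Pufferfish implementation: the class names, the admission-order policy, and the `floor_mb` residual limit are all hypothetical, and the code only tracks numbers rather than touching real containers (in a real system the limits would presumably be applied through the container runtime's memory controls).

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    demand_mb: int           # memory the task currently wants
    limit_mb: int = 0        # memory limit granted by the manager
    suspended: bool = False  # True when squeezed down to the residual floor


@dataclass
class ElasticMemoryManager:
    """Toy model: grant full demand while capacity lasts, suspend the rest."""
    capacity_mb: int
    floor_mb: int = 64       # hypothetical residual limit for suspended containers
    containers: list = field(default_factory=list)

    def granted_mb(self) -> int:
        return sum(c.limit_mb for c in self.containers)

    def admit(self, container: Container) -> None:
        self.containers.append(container)
        self.rebalance()

    def finish(self, name: str) -> None:
        self.containers = [c for c in self.containers if c.name != name]
        self.rebalance()

    def rebalance(self) -> None:
        # Reserve the floor for every container, then hand out the remainder
        # in admission order (an assumed policy, chosen for simplicity).
        free = self.capacity_mb - self.floor_mb * len(self.containers)
        for c in self.containers:
            extra = max(0, c.demand_mb - self.floor_mb)
            if extra <= free:
                # Enough memory: run (or resume) with the full demand.
                c.limit_mb = self.floor_mb + extra
                c.suspended = False
                free -= extra
            else:
                # Not enough memory: "suspend" by shrinking to the floor.
                c.limit_mb = self.floor_mb
                c.suspended = True


if __name__ == "__main__":
    manager = ElasticMemoryManager(capacity_mb=1000, floor_mb=50)
    for name, demand in [("a", 600), ("b", 500), ("c", 200)]:
        manager.admit(Container(name, demand))
    print({c.name: (c.limit_mb, c.suspended) for c in manager.containers})
    manager.finish("a")  # freed memory lets the suspended task resume
    print({c.name: (c.limit_mb, c.suspended) for c in manager.containers})
```

In this toy model a container is never killed for lack of memory; it is shrunk to a small limit and restored to its full demand once another task finishes, which mirrors the "suspended" and "resumed" behavior described above.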
