反馈控制的可预测eScience资源共享

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis Pub Date : 2008-11-15 DOI:10.1145/1413370.1413384

Sang-Min Park, M. Humphrey

{"title":"反馈控制的可预测eScience资源共享","authors":"Sang-Min Park, M. Humphrey","doi":"10.1145/1413370.1413384","DOIUrl":null,"url":null,"abstract":"The emerging class of adaptive, real-time, data-driven applications is a significant problem for today's HPC systems. In general, it is extremely difficult for queuing-system-controlled HPC resources to make and guarantee a tightly-bounded prediction regarding the time at which a newly-submitted application will execute. While a reservation-based approach partially addresses the problem, it can create severe resource under-utilization (unused reservations, necessary scheduled idle slots, underutilized reservations, etc.) that resource providers are eager to avoid. In contrast, this paper presents a fundamentally different approach to guarantee predictable execution. By creating a virtualized application layer called the performance container, and opportunistically multiplexing concurrent performance containers through the application of formal feedback control theory, we regulate the job's progress such that the job meets its deadline without requiring exclusive access to resources even in the presence of a wide class of unexpected disturbances. Our evaluation using two widely-used applications, WRF and BLAST, on an 8-core server show our approach is predictable and meets deadlines with 3.4 % of errors on average while achieving high overall utilization.","PeriodicalId":230761,"journal":{"name":"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":"{\"title\":\"Feedback-controlled resource sharing for predictable eScience\",\"authors\":\"Sang-Min Park, M. Humphrey\",\"doi\":\"10.1145/1413370.1413384\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The emerging class of adaptive, real-time, data-driven applications is a significant problem for today's HPC systems. In general, it is extremely difficult for queuing-system-controlled HPC resources to make and guarantee a tightly-bounded prediction regarding the time at which a newly-submitted application will execute. While a reservation-based approach partially addresses the problem, it can create severe resource under-utilization (unused reservations, necessary scheduled idle slots, underutilized reservations, etc.) that resource providers are eager to avoid. In contrast, this paper presents a fundamentally different approach to guarantee predictable execution. By creating a virtualized application layer called the performance container, and opportunistically multiplexing concurrent performance containers through the application of formal feedback control theory, we regulate the job's progress such that the job meets its deadline without requiring exclusive access to resources even in the presence of a wide class of unexpected disturbances. Our evaluation using two widely-used applications, WRF and BLAST, on an 8-core server show our approach is predictable and meets deadlines with 3.4 % of errors on average while achieving high overall utilization.\",\"PeriodicalId\":230761,\"journal\":{\"name\":\"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis\",\"volume\":\"56 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-11-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"28\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1413370.1413384\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1413370.1413384","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 28

摘要

新兴的自适应、实时、数据驱动的应用是当今高性能计算系统面临的一个重大问题。通常，队列系统控制的HPC资源很难对新提交的应用程序的执行时间做出严格的预测。虽然基于预留的方法部分地解决了这个问题，但它可能会造成严重的资源利用不足(未使用的预留、必要的计划空闲插槽、未充分利用的预留等)，而资源提供者希望避免这些情况。相比之下，本文提出了一种完全不同的方法来保证可预测的执行。通过创建一个称为性能容器的虚拟化应用层，并通过应用形式反馈控制理论对并发性能容器进行机会多路复用，我们调节了作业的进度，这样即使在存在大量意外干扰的情况下，作业也能满足其最后期限，而不需要独占访问资源。我们在一台8核服务器上使用两个广泛使用的应用程序WRF和BLAST进行评估，结果表明我们的方法是可预测的，在实现高总体利用率的同时，平均错误率为3.4%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Feedback-controlled resource sharing for predictable eScience

The emerging class of adaptive, real-time, data-driven applications is a significant problem for today's HPC systems. In general, it is extremely difficult for queuing-system-controlled HPC resources to make and guarantee a tightly-bounded prediction regarding the time at which a newly-submitted application will execute. While a reservation-based approach partially addresses the problem, it can create severe resource under-utilization (unused reservations, necessary scheduled idle slots, underutilized reservations, etc.) that resource providers are eager to avoid. In contrast, this paper presents a fundamentally different approach to guarantee predictable execution. By creating a virtualized application layer called the performance container, and opportunistically multiplexing concurrent performance containers through the application of formal feedback control theory, we regulate the job's progress such that the job meets its deadline without requiring exclusive access to resources even in the presence of a wide class of unexpected disturbances. Our evaluation using two widely-used applications, WRF and BLAST, on an 8-core server show our approach is predictable and meets deadlines with 3.4 % of errors on average while achieving high overall utilization.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

自引率

0.00%

发文量