Mapping of RAID Controller Performance Data to the Job History on Large Computing Systems

2014 International Workshop on Data Intensive Scalable Computing Systems. Pub Date: 2014-11-16. DOI: 10.1109/DISCS.2014.7

Marc Hartung, Michael Kluge

Citations: 3

Abstract

When a system runs a mixture of data-intensive applications in parallel, the question always arises of how much each application affects the storage subsystem. From the perspective of storage, I/O is typically anonymous because it carries no user identifiers or similar information. This paper focuses on analyzing performance data that is collected on shared system components, such as global file systems, and cannot be mapped back to user activities directly. Our approach groups user jobs into classes based on their properties and correlates these classes with global timelines. We describe the clustering algorithm in detail, depict our measurement environment, and present first results. The results are valuable for tuning HPC storage systems to achieve optimized behavior at the global system level, or for separating users into classes with different I/O demands.
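The idea of correlating job classes with a global timeline can be illustrated with a minimal sketch. It is not the paper's clustering algorithm: the class rule (a node-count threshold), the job records, the bandwidth series, and all function names are assumptions made for illustration only. The sketch counts the running jobs of each class per time step and computes the Pearson correlation of each class timeline with one global storage metric.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical job records: (job_id, nodes, start, end), times in integer steps.
JOBS = [
    ("a", 4, 0, 4),
    ("b", 128, 2, 6),
    ("c", 2, 5, 8),
    ("d", 256, 6, 8),
]

# Hypothetical global timeline, e.g. a RAID controller bandwidth value per step.
BANDWIDTH = [1.0, 1.0, 5.0, 5.0, 4.0, 4.0, 5.0, 5.0]


def job_class(nodes):
    """Assign a class from one job property (assumed rule: node count)."""
    return "large" if nodes >= 64 else "small"


def class_timelines(jobs, steps):
    """Count the running jobs of each class at every time step."""
    timelines = defaultdict(lambda: [0] * steps)
    for _job_id, nodes, start, end in jobs:
        for t in range(start, min(end, steps)):
            timelines[job_class(nodes)][t] += 1
    return dict(timelines)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlate(jobs, global_timeline):
    """Correlate each class's activity timeline with the global timeline."""
    timelines = class_timelines(jobs, len(global_timeline))
    return {cls: pearson(tl, global_timeline) for cls, tl in timelines.items()}


if __name__ == "__main__":
    for cls, r in sorted(correlate(JOBS, BANDWIDTH).items()):
        print(f"{cls}: r = {r:.3f}")
```

In this toy data the bandwidth rises whenever a "large" job is running, so that class correlates strongly with the global timeline while the "small" class does not. A real analysis would derive classes from several job properties and would have to account for overlapping jobs and measurement noise.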
