Mapping of RAID Controller Performance Data to the Job History on Large Computing Systems
Marc Hartung, Michael Kluge
2014 International Workshop on Data Intensive Scalable Computing Systems, 2014-11-16
DOI: 10.1109/DISCS.2014.7 (https://doi.org/10.1109/DISCS.2014.7)
Citations: 3
Abstract
For systems that run a mixture of data-intensive applications in parallel, there is always the question of how much impact each application has on the storage subsystem. From the perspective of storage, I/O is typically anonymous, as it does not contain user identifiers or similar information. This paper focuses on the analysis of performance data collected on shared system components, such as global file systems, that cannot be mapped back to user activities immediately. Our approach classifies user jobs into classes based on their properties and correlates these classes with global timelines. In the paper we show details of the clustering algorithm, describe our measurement environment, and present first results. The results are valuable for tuning HPC storage systems to achieve optimized behavior at a global system level or for separating users into classes with different I/O demands.
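The idea of grouping jobs into classes and correlating those classes with a global timeline can be illustrated with a small sketch. This is not the clustering algorithm from the paper, which the abstract does not specify. It swaps in a fixed, hypothetical rule (`classify`) based on node count and runtime, and uses a plain Pearson correlation. The job fields (`nodes`, `start`, `end`), the thresholds, and the `storage_load` series are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def classify(job):
    """Hypothetical rule: bucket a job by node count and runtime (in time steps)."""
    size = "large" if job["nodes"] >= 16 else "small"
    length = "long" if job["end"] - job["start"] >= 4 else "short"
    return f"{size}-{length}"


def class_timelines(jobs, n_steps):
    """For each class, count how many of its jobs are running in each time step."""
    timelines = defaultdict(lambda: [0] * n_steps)
    for job in jobs:
        cls = classify(job)
        for t in range(job["start"], min(job["end"], n_steps)):
            timelines[cls][t] += 1
    return dict(timelines)


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlate(jobs, storage_load):
    """Correlate each job class's activity timeline with the storage timeline."""
    timelines = class_timelines(jobs, len(storage_load))
    return {cls: pearson(tl, storage_load) for cls, tl in timelines.items()}


# Made-up job history: start/end are time-step indices, end is exclusive.
jobs = [
    {"nodes": 32, "start": 0, "end": 4},
    {"nodes": 2, "start": 4, "end": 6},
    {"nodes": 64, "start": 6, "end": 10},
]
# Made-up anonymous storage metric, one sample per time step.
storage_load = [9, 9, 9, 9, 1, 1, 9, 9, 9, 9]

result = correlate(jobs, storage_load)
```

In this toy data the storage load is high exactly when jobs of the "large-long" class are running, so that class correlates strongly and positively with the storage timeline, while "small-short" correlates negatively. Real job histories overlap far more, so such correlations would be weaker and would need careful interpretation.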