{"title":"Analysis of application heartbeats: Learning structural and temporal features in time series data for identification of performance problems","authors":"Emma S. Buneci, D. Reed","doi":"10.1109/SC.2008.5219753","DOIUrl":null,"url":null,"abstract":"Grids promote new modes of scientific collaboration and discovery by connecting distributed instruments, data and computing facilities. Because many resources are shared, application performance can vary widely and unexpectedly. We describe a novel performance analysis framework that reasons temporally and qualitatively about performance data from multiple monitoring levels and sources. The framework periodically analyzes application performance states by generating and interpreting signatures containing structural and temporal features from time-series data. Signatures are compared to expected behaviors and in case of mismatches, the framework hints at causes of degraded performance, based on unexpected behavior characteristics previously learned by application exposure to known performance stress factors. Experiments with two scientific applications reveal signatures that have distinct characteristics during well-performing versus poor-performing executions. The ability to automatically and compactly generate signatures capturing fundamental differences between good and poor application performance states is essential to improving the quality of service for Grid applications.","PeriodicalId":230761,"journal":{"name":"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC.2008.5219753","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 24
Abstract
Grids promote new modes of scientific collaboration and discovery by connecting distributed instruments, data and computing facilities. Because many resources are shared, application performance can vary widely and unexpectedly. We describe a novel performance analysis framework that reasons temporally and qualitatively about performance data from multiple monitoring levels and sources. The framework periodically analyzes application performance states by generating and interpreting signatures containing structural and temporal features from time-series data. Signatures are compared to expected behaviors and in case of mismatches, the framework hints at causes of degraded performance, based on unexpected behavior characteristics previously learned by application exposure to known performance stress factors. Experiments with two scientific applications reveal signatures that have distinct characteristics during well-performing versus poor-performing executions. The ability to automatically and compactly generate signatures capturing fundamental differences between good and poor application performance states is essential to improving the quality of service for Grid applications.