"AI Based Performance Benchmarking & Analysis of Big Data and Cloud Powered Applications: An in Depth View"
Jayanti Vemulapati, Anuruddha S. Khastgir, Chethana Savalgi. DOI: 10.1145/3297663.3309676
Cloud-based big data analytics platforms are becoming mainstream technology, enabling cost-effective, rapid deployment of customers' big data applications and delivering quicker insights from their data. It is therefore all the more important that platform infrastructure and applications perform well at a reasonable cost. This requires moving away from the traditional approach to executing and measuring performance and adopting AI techniques, such as machine learning (ML) and predictive analytics, for performance benchmarking in every application domain. This paper proposes a high-level conceptual model for automated performance benchmarking, including an execution engine designed to support a self-service model that covers automated benchmarking in every application domain. The engine is supported by performance-scaling recommendations derived via prescriptive analytics from real performance data sets. We further extend the recommendation capabilities of the self-service engine with predictive analytics, making it flexible enough to address 'what-if' scenarios and to predict the 'Right Scale' using a "Performance Cost Ratio" (PCR) measure. Finally, we present real-world industry examples in which applications saw performance benefits from the recommendations produced by the proposed model.
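The abstract does not define the Performance Cost Ratio, so the sketch below is only one plausible reading: PCR taken as measured throughput per unit of hourly cost, with the 'Right Scale' chosen as the benchmarked cluster size that maximizes it. The BenchmarkRun type, the field names, and the numbers are illustrative assumptions, not the paper's formulation.

```python
# Illustrative sketch only; the paper does not define PCR here. We assume
# PCR = measured throughput / hourly cost, and that benchmark runs at
# several candidate cluster sizes are already available.
from dataclasses import dataclass

@dataclass
class BenchmarkRun:          # hypothetical record of one benchmark run
    nodes: int               # cluster size used in the run
    throughput: float        # e.g., records processed per second
    hourly_cost: float       # e.g., USD per hour at this cluster size

def performance_cost_ratio(run: BenchmarkRun) -> float:
    """Assumed PCR: performance delivered per unit of cost."""
    return run.throughput / run.hourly_cost

def right_scale(runs: list[BenchmarkRun]) -> BenchmarkRun:
    """Pick the benchmarked scale that maximizes the assumed PCR."""
    return max(runs, key=performance_cost_ratio)

runs = [
    BenchmarkRun(nodes=4, throughput=12_000, hourly_cost=8.0),
    BenchmarkRun(nodes=8, throughput=21_000, hourly_cost=16.0),   # sub-linear scaling
    BenchmarkRun(nodes=16, throughput=30_000, hourly_cost=32.0),
]
best = right_scale(runs)
print(f"right scale under this PCR definition: {best.nodes} nodes")
```

Under this assumed definition, diminishing returns from scaling out show up directly as a falling PCR, which is what makes the largest cluster not necessarily the 'right' one.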
{"title":"AI Based Performance Benchmarking & Analysis of Big Data and Cloud Powered Applications: An in Depth View","authors":"Jayanti Vemulapati, Anuruddha S. Khastgir, Chethana Savalgi","doi":"10.1145/3297663.3309676","DOIUrl":"https://doi.org/10.1145/3297663.3309676","url":null,"abstract":"Big data analytics platforms on cloud are becoming mainstream technology enabling cost-effective rapid deployment of customer's Big Data applications delivering quicker insights from their data. It is, therefore, even more imperative that we have high performant platform infrastructure and application at a reasonable cost. This is only possible if we make a transition from traditional approach to execute and measure performance by adopting new AI techniques such as Machine Learning (ML) & predictive approach to performance benchmarking for every application domain. This paper proposes a high-level conceptual model for automated performance benchmarking which includes execution engine that has been designed to support a self-service model covering automated benchmarking in every application domain. The automated engine is supported by performance scaling recommendations via prescriptive analytics from real performance data set. We furthermore extended the recommendation capabilities of our self-service automated engine by introducing predictive analytics for making it more flexible in addressing 'what-if' scenarios to predict 'Right Scale' with measurement of \"Performance Cost Ratio\" (PCR). Finally, we also present some real-world industry examples which have seen the performance benefits in their applications with the recommendations given by our proposed model.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126403170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Simulation Based Job Scheduling Optimization for Batch Workloads"
Dheeraj Chahal, Benny Mathew, M. Nambiar. DOI: 10.1145/3297663.3310312

We present a simulation-based approach for scheduling jobs that are part of a batch workflow. Our objective is to minimize the makespan, defined as the completion time of the last job to leave the system in a batch workflow with dependencies. Existing job schedulers make scheduling decisions based on available cores, memory size, priority, or the execution time of jobs. This does not guarantee a minimum makespan, since contention for resources among concurrently running jobs is ignored. In our approach, prior to scheduling batch jobs on physical servers, we simulate the execution of the jobs using a discrete event simulator. The simulator takes into account the available cores and the available memory bandwidth on distributed systems, using resource contention models to accurately simulate the execution of jobs in a concurrent run. We also propose simulation-based job scheduling algorithms that use the underlying contention models and minimize the makespan by optimally mapping jobs onto the available nodes. Our approach ensures that job dependencies are adhered to during the simulation. We assess the efficacy of our scheduling algorithms and contention models through experiments on a real cluster. Our experimental results show that the simulation-based approach improves the makespan by 15% to 35%, depending on the nature of the workload.
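As an illustration of the core idea, simulating candidate placements before committing to one, the toy sketch below (not the authors' discrete event simulator) inflates job runtimes with a simple per-node contention factor, replays a dependency-aware schedule for each candidate job-to-node mapping, and keeps the mapping with the smallest simulated makespan. All names, the contention model, and the exhaustive search are illustrative assumptions.

```python
# Toy sketch, not the authors' simulator: a fixed fractional slowdown per
# extra job sharing a node stands in for memory-bandwidth contention.
from itertools import product

def simulate(jobs, deps, mapping, cores_per_node=2, contention=0.2):
    """Simulated makespan of `mapping` (job -> node) under a toy model.

    jobs: {job: base_runtime}; deps: {job: set of prerequisite jobs};
    contention: fractional slowdown per extra job sharing a node.
    """
    load = {}
    for node in mapping.values():            # jobs mapped per node
        load[node] = load.get(node, 0) + 1
    runtime = {j: jobs[j] * (1 + contention * (load[mapping[j]] - 1))
               for j in jobs}

    finish = {}                               # job -> simulated finish time
    cores = {n: [0.0] * cores_per_node for n in load}  # core free times
    remaining = set(jobs)
    while remaining:
        # jobs whose prerequisites have all finished (assumes a DAG)
        ready = [j for j in remaining if deps.get(j, set()).issubset(finish)]

        def earliest_start(j):
            dep_done = max((finish[d] for d in deps.get(j, set())), default=0.0)
            return max(min(cores[mapping[j]]), dep_done)

        j = min(ready, key=earliest_start)    # greedy list scheduling
        node_cores = cores[mapping[j]]
        k = node_cores.index(min(node_cores))  # earliest-free core
        finish[j] = earliest_start(j) + runtime[j]
        node_cores[k] = finish[j]
        remaining.remove(j)
    return max(finish.values())

def best_mapping(jobs, deps, nodes, **kw):
    """Exhaustively simulate every job->node mapping (fine for tiny cases)."""
    best, best_ms = None, float("inf")
    for assignment in product(nodes, repeat=len(jobs)):
        mapping = dict(zip(jobs, assignment))
        ms = simulate(jobs, deps, mapping, **kw)
        if ms < best_ms:
            best, best_ms = mapping, ms
    return best, best_ms

jobs = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 5.0}  # base runtimes
deps = {"C": {"A"}, "D": {"A", "B"}}             # C after A; D after A and B
mapping, makespan = best_mapping(jobs, deps, nodes=["n1", "n2"])
print(f"best simulated makespan {makespan:.1f} with mapping {mapping}")
```

The brute-force search is only for illustration; the paper proposes scheduling algorithms rather than enumeration, and its contention models are calibrated rather than a fixed slowdown factor.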
{"title":"Simulation Based Job Scheduling Optimization for Batch Workloads","authors":"Dheeraj Chahal, Benny Mathew, M. Nambiar","doi":"10.1145/3297663.3310312","DOIUrl":"https://doi.org/10.1145/3297663.3310312","url":null,"abstract":"We present a simulation based approach for scheduling jobs that are part of a batch workflow. Our objective is to minimize the makespan, defined as completion time of the last job to leave the system in a batch workflow with dependencies. The existing job schedulers make scheduling decisions based on available cores, memory size, priority or execution time of jobs. This does not guarantee minimum makespan since contention for resources among concurrently running jobs are ignored. In our approach, prior to scheduling batch jobs on physical servers, we simulate the execution of jobs using a discrete event simulator. The simulator considers available cores and available memory bandwidth on distributed systems to accurately simulate the execution of jobs using resource contention models in a concurrent run. We also propose simulation based job scheduling algorithms that use underlying contention models and minimize the makespan by optimally mapping jobs onto the available nodes. Our approach ensures that job dependencies are adhered to during the simulation. We assess the efficacy of our job scheduling algorithms and contention models by performing experiments on a real cluster. Our experimental results show that simulation based approach improves the makespan by 15% to 35% depending on the nature of workload.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132634793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Characterization of a Big Data Storage Workload in the Cloud"
Sacheendra Talluri, Alicja Luszczak, Cristina L. Abad, A. Iosup. DOI: 10.1145/3297663.3310302
The proliferation of big data processing platforms has led to radically different system designs, such as MapReduce and the newer Spark. Understanding the workloads of such systems facilitates tuning and could foster new designs. However, whereas MapReduce workloads have been characterized extensively, relatively little public knowledge exists about the characteristics of Spark workloads in representative environments. To address this problem, we collect and analyze a 6-month Spark workload from a major provider of big data processing services, Databricks. Our analysis focuses on a number of key features, such as the long-term trends of reads and modifications, the statistical properties of reads, and the popularity of clusters and of file formats. Overall, we present numerous findings that could form the basis of new systems studies and designs. Our quantitative evidence and its analysis suggest the existence of daily and weekly load imbalances, heavy-tailed and bursty behaviour, a relative rarity of modifications, and a proliferation of big-data-specific file formats.
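The Databricks trace itself is not public, so the sketch below instead uses synthetic data to show two checks of the kind such a characterization rests on: a log-log CCDF whose straight-line slope indicates a heavy tail, and the coefficient of variation (CV) of interarrival times as a simple burstiness indicator. The distributions and parameters are assumptions for illustration only.

```python
# Synthetic illustration; not the Databricks trace.
import numpy as np

rng = np.random.default_rng(0)

# per-file read counts drawn from a classical Pareto law (heavy-tailed)
reads = rng.pareto(a=1.5, size=100_000) + 1.0

# CCDF: P(X > x); for a Pareto with shape a, the log-log slope is -a
x = np.sort(reads)
ccdf = 1.0 - np.arange(1, x.size + 1) / x.size
slope = np.polyfit(np.log(x[:-1]), np.log(ccdf[:-1]), 1)[0]  # drop last point (CCDF = 0)
print(f"log-log CCDF slope ~ {slope:.2f} (near -1.5 for this Pareto)")

# burstiness: CV of interarrival times; Poisson arrivals would give CV = 1
interarrivals = rng.exponential(1.0, 100_000) * (rng.pareto(2.5, 100_000) + 1.0)
cv = interarrivals.std() / interarrivals.mean()
print(f"interarrival CV = {cv:.2f} (CV >> 1 suggests bursty arrivals)")
```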
{"title":"Characterization of a Big Data Storage Workload in the Cloud","authors":"Sacheendra Talluri, Alicja Luszczak, Cristina L. Abad, A. Iosup","doi":"10.1145/3297663.3310302","DOIUrl":"https://doi.org/10.1145/3297663.3310302","url":null,"abstract":"The proliferation of big data processing platforms has led to radically different system designs, such as MapReduce and the newer Spark. Understanding the workloads of such systems facilitates tuning and could foster new designs. However, whereas MapReduce workloads have been characterized extensively, relatively little public knowledge exists about the characteristics of Spark workloads in representative environments. To address this problem, in this work we collect and analyze a 6-month Spark workload from a major provider of big data processing services, Databricks. Our analysis focuses on a number of key features, such as the long-term trends of reads and modifications, the statistical properties of reads, and the popularity of clusters and of file formats. Overall, we present numerous findings that could form the basis of new systems studies and designs. Our quantitative evidence and its analysis suggest the existence of daily and weekly load imbalances, of heavy-tailed and bursty behaviour, of the relative rarity of modifications, and of proliferation of big data specific formats.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133514125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"How is Performance Addressed in DevOps?"
C. Bezemer, Simon Eismann, Vincenzo Ferme, Johannes Grohmann, R. Heinrich, Pooyan Jamshidi, Weiyi Shang, A. Hoorn, M. Villavicencio, J. Walter, Felix Willnecker. DOI: 10.1145/3297663.3309672
DevOps is a modern software engineering paradigm that is gaining widespread adoption in industry. The goal of DevOps is to bring software changes into production at high frequency and with fast feedback cycles. This conflicts with software quality assurance activities, particularly with respect to performance. For instance, performance evaluation activities, such as load testing, require a considerable amount of time to obtain statistically significant results. We conducted an industrial survey to gain insights into how performance is addressed in industrial DevOps settings. In particular, we were interested in the frequency of performance evaluations, the tools being used, the granularity of the obtained performance data, and the use of model-based techniques. The survey responses, which come from a wide variety of participants across industry sectors, indicate that the complexity of performance engineering approaches and tools is a barrier to the widespread adoption of performance analysis in DevOps. The implication of our results is that performance analysis tools need a short learning curve and should be easy to integrate into the DevOps pipeline in order to be adopted by practitioners.
{"title":"How is Performance Addressed in DevOps?","authors":"C. Bezemer, Simon Eismann, Vincenzo Ferme, Johannes Grohmann, R. Heinrich, Pooyan Jamshidi, Weiyi Shang, A. Hoorn, M. Villavicencio, J. Walter, Felix Willnecker","doi":"10.1145/3297663.3309672","DOIUrl":"https://doi.org/10.1145/3297663.3309672","url":null,"abstract":"DevOps is a modern software engineering paradigm that is gaining widespread adoption in industry. The goal of DevOps is to bring software changes into production with a high frequency and fast feedback cycles. This conflicts with software quality assurance activities, particularly with respect to performance. For instance, performance evaluation activities --- such as load testing --- require a considerable amount of time to get statistically significant results. We conducted an industrial survey to get insights into how performance is addressed in industrial DevOps settings. In particular, we were interested in the frequency of executing performance evaluations, the tools being used, the granularity of the obtained performance data, and the use of model-based techniques. The survey responses, which come from a wide variety of participants from different industry sectors, indicate that the complexity of performance engineering approaches and tools is a barrier for wide-spread adoption of performance analysis in DevOps. The implication of our results is that performance analysis tools need to have a short learning curve, and should be easy to integrate into the DevOps pipeline in order to be adopted by practitioners.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"60 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131451544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Simultaneous Solving of Batched Linear Programs on a GPU"
Amit Gurung, Rajarshi Ray. DOI: 10.1145/3297663.3310308

Linear programs (LPs) appear in a large number of applications, and offloading LP solving to a GPU is a viable way to accelerate an application's performance. Existing work on offloading and solving LPs on a GPU shows that performance can be accelerated only for large LPs (typically 500 constraints and 500 variables or more). This paper is motivated by applications that must solve many small LPs; existing techniques fail to accelerate such applications on a GPU. We propose a batched LP solver in CUDA to accelerate such applications and demonstrate its utility in a use case: state-space exploration of models in control systems design. We also compare the performance of the batched LP solver against sequential solving on a CPU with the open-source GLPK (GNU Linear Programming Kit) solver and IBM's CPLEX solver. An evaluation on selected LP benchmarks from the Netlib repository shows maximum speed-ups of 95x and 5x over CPLEX and GLPK, respectively, for a batch of 1e5 LPs.
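The paper's contribution is a CUDA solver that handles all LPs of a batch simultaneously; the sketch below only frames the batched workload and the kind of sequential CPU baseline it is measured against (GLPK/CPLEX in the paper, SciPy's linprog here as a stand-in). The problem sizes and random data are illustrative assumptions.

```python
# Sketch of the problem setting, not the paper's CUDA solver: a batch of
# many small LPs solved one by one on the CPU.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n_lps, n_vars, n_cons = 1_000, 5, 5      # many small LPs, as in the use case

# batch i: maximize c_i . x  subject to  A_i x <= b_i,  x >= 0
A = rng.uniform(0.1, 1.0, size=(n_lps, n_cons, n_vars))
b = rng.uniform(1.0, 2.0, size=(n_lps, n_cons))
c = rng.uniform(0.1, 1.0, size=(n_lps, n_vars))

optima = np.empty(n_lps)
for i in range(n_lps):                   # a batched GPU solver replaces this loop
    res = linprog(-c[i], A_ub=A[i], b_ub=b[i], bounds=(0, None), method="highs")
    optima[i] = -res.fun                 # linprog minimizes, so negate to maximize
print(f"solved {n_lps} LPs; first optima: {np.round(optima[:3], 3)}")
```

The per-LP loop overhead and the repeated small solves are precisely what makes this workload a poor fit for existing GPU offloading, which pays off only for single large LPs, and what batching across LPs is designed to amortize.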
{"title":"Simultaneous Solving of Batched Linear Programs on a GPU","authors":"Amit Gurung, Rajarshi Ray","doi":"10.1145/3297663.3310308","DOIUrl":"https://doi.org/10.1145/3297663.3310308","url":null,"abstract":"Linear Programs (LPs) appear in a large number of applications. Offloading the LP solving tasks to a GPU is viable to accelerate an application's performance. Existing work on offloading and solving an LP on a GPU shows that performance can be accelerated only for large LPs (typically 500 constraints, 500 variables and above). This paper is motivated from applications having to solve small LPs but many of them. Existing techniques fail to accelerate such applications using GPU. We propose a batched LP solver in CUDA to accelerate such applications and demonstrate its utility in a use case - state-space exploration of models of control systems design. A performance comparison of The batched LP solver against sequential solving in CPU using the open source solver GLPK (GNU Linear Programming Kit) and the CPLEX solver from IBM is also shown. The evaluation on selected LP benchmarks from the Netlib repository displays a maximum speed-up of 95x and 5x with respect to CPLEX and GLPK solver respectively, for a batch of 1e5 LPs.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128843688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering"
Seetharami R. Seelam, P. Tůma, G. Casale, T. Field, J. N. Amaral. DOI: 10.1145/3297663
We are delighted to bring you an outstanding technical program for the 2013 International Conference on Performance Engineering (ICPE'13) in Prague. The main Research track of the conference attracted 42 submissions. Thanks to the diligent efforts of the members of the Program Committee, each paper received a minimum of four reviews. After extensive deliberation, the Program Committee decided to accept 20 submissions as regular papers and two as short papers.

The Industry and Experience track focuses on the application of research results to industrial performance engineering problems; it addresses in particular innovative implementations, novel applications of performance-related technologies, and reports of insightful performance results. This track received 22 submissions, of which 8 were selected for presentation at the conference.

The papers accepted to the Research track and to the Industry and Experience track cover several topics, such as software development and various flavors of modeling, including performance, survivability, and scalability modeling. The development of representative workloads and benchmarks is also well represented. A number of papers focus on performance aspects of cloud-related systems and on more general aspects of scheduling and load balancing.

The Vision/Work-in-Progress track is a feature of ICPE that allows researchers to present and discuss ideas they are still working on or plan to work on in the near future. It is a great forum for learning about the direction of research in the area. This year we received 18 submissions to this track and were able to accommodate 10 short presentations in the conference program. The topics covered by this track are similar to those in the main Research track, which suggests that they are likely to feature again at ICPE in the near future.

In summary, there were 81 submissions in total across the three tracks, of which 38 were selected for presentation. We now look forward to several days of great presentations and stimulating discussions at ICPE 2013 in beautiful Prague. It has been a privilege and a pleasure for us to be involved.
{"title":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","authors":"Seetharami R. Seelam, P. Tůma, G. Casale, T. Field, J. N. Amaral","doi":"10.1145/3297663","DOIUrl":"https://doi.org/10.1145/3297663","url":null,"abstract":"We are delighted to bring you an outstanding technical program to 2013 International Conference on Performance Engineering -- ICPE'13 in Prague. The main Research track for the conference attracted 42 submissions. Thanks to the diligent efforts of the members of the Program Committee each paper received a minimum of four reviews. After extensive deliberation the Program Committee decided to accept 20 submissions as regular papers and two as short papers. \u0000 \u0000The Industry and Experience track focuses on the application of research results to industrial performance engineering problems and addresses in particular innovative implementations, the novel application of performance-related technologies and the reporting of insightful performance results. This track received 22 submissions of which 8 were selected for presentation at the conference. \u0000 \u0000The papers accepted to the Research track and to the Industry and Experience track cover several topics such as software development and various flavors of modeling, including performance, survivability and scalability modeling. The development of representative workloads and benchmarks is also well represented. There are then a number of papers that focus on performance aspects of cloud-related systems and more general aspects of scheduling and load balancing. \u0000 \u0000The Vision/Work-in-Progress track is a feature of ICPE that allows researchers to present and discuss ideas that they are still working on or that they are planning to work on in the near future. It is a great forum for learning about the direction of research in the area. This year we received 18 submissions to this track and were able to accommodate 10 short presentations in the conference program. The topics covered by this track are similar to those in the main Research track, which suggests that they are likely to feature again at ICPE in the near future. \u0000 \u0000In summary, there were 81 submissions in total across the three tracks, of which 38 were selected for presentation. We are now looking forward to several days of great presentations and stimulating discussions at ICPE 2013 in beautiful Prague. It has been a privilege and a pleasure forus to be involved.","PeriodicalId":273447,"journal":{"name":"Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125957247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}