
Companion of the 2018 ACM/SPEC International Conference on Performance Engineering: Latest Publications

Adaptive Dispatch: A Pattern for Performance-Aware Software Self-Adaptation
Petr Kubát, L. Bulej, T. Bures, Vojtech Horký, P. Tůma
Modern software systems often employ dynamic adaptation to runtime conditions in some parts of their functionality -- well-known examples range from autotuning of computing kernels through adaptive battery saving strategies of mobile applications to dynamic load balancing and failover functionality in computing clouds. Typically, the implementation of these features is problem-specific -- a particular autotuner, a particular load balancer, and so on -- and enjoys little support from the implementation environment beyond standard programming constructs. In this work, we propose Adaptive Dispatch as a generic coding pattern for implementing dynamic adaptation. We believe that such a pattern can improve the implementation of dynamic adaptation features in multiple aspects -- an explicit adaptation construct makes the presence of adaptation easily visible to programmers, lends itself to manipulation with development tools, and facilitates coordination of adaptation behavior at runtime. We present an implementation of the Adaptive Dispatch pattern as an internal DSL in Scala.
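The abstract does not include the authors' Scala DSL, so the following is only a minimal sketch of the general idea of an explicit adaptation construct: a dispatch point that routes calls to whichever registered implementation has performed best so far. All names here (AdaptiveDispatch, register, the explore-then-exploit policy) are invented for illustration and are not the paper's API.

```scala
// Hypothetical sketch of an "adaptive dispatch" point (not the paper's DSL):
// calls are routed to the registered variant with the lowest observed mean latency.
object AdaptiveDispatchSketch {

  final class AdaptiveDispatch[A, B] {
    private case class Variant(label: String, f: A => B,
                               var calls: Long = 0, var totalNs: Long = 0) {
      def meanNs: Double = if (calls == 0) 0.0 else totalNs.toDouble / calls
    }
    private var variants: List[Variant] = Nil

    def register(label: String)(f: A => B): this.type = {
      variants ::= Variant(label, f); this
    }

    // Try every variant a few times (exploration), then keep using the fastest one seen.
    def apply(a: A): B = {
      val v =
        if (variants.exists(_.calls < 3)) variants.minBy(_.calls)
        else variants.minBy(_.meanNs)
      val t0  = System.nanoTime()
      val out = v.f(a)
      v.totalNs += System.nanoTime() - t0
      v.calls   += 1
      out
    }
  }

  def main(args: Array[String]): Unit = {
    // Two interchangeable sort implementations behind one adaptive dispatch point.
    val sort = new AdaptiveDispatch[Array[Int], Array[Int]]
    sort.register("builtin")(xs => xs.sorted)
    sort.register("insertion") { xs =>
      val a = xs.clone()
      for (i <- 1 until a.length) {
        var j = i
        while (j > 0 && a(j - 1) > a(j)) {
          val t = a(j); a(j) = a(j - 1); a(j - 1) = t; j -= 1
        }
      }
      a
    }
    val rnd = new scala.util.Random(42)
    for (_ <- 1 to 20) sort(Array.fill(5000)(rnd.nextInt()))
    println("After the exploration phase, calls favour the faster sort variant.")
  }
}
```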
{"title":"Adaptive Dispatch: A Pattern for Performance-Aware Software Self-Adaptation","authors":"Petr Kubát, L. Bulej, T. Bures, Vojtech Horký, P. Tůma","doi":"10.1145/3185768.3186406","DOIUrl":"https://doi.org/10.1145/3185768.3186406","url":null,"abstract":"Modern software systems often employ dynamic adaptation to runtime conditions in some parts of their functionality -- well known examples range from autotuning of computing kernels through adaptive battery saving strategies of mobile applications to dynamic load balancing and failover functionality in computing clouds. Typically, the implementation of these features is problem-specific -- a particular autotuner, a particular load balancer, and so on -- and enjoys little support from the implementation environment beyond standard programming constructs. In this work, we propose Adaptive Dispatch as a generic coding pattern for implementing dynamic adaptation. We believe that such pattern can make the implementation of dynamic adaptation features better in multiple aspects -- an explicit adaptation construct makes the presence of adaptation easily visible to programmers, lends itself to manipulation with development tools, and facilitates coordination of adaptation behavior at runtime. We present an implementation of the Adaptive Dispatch pattern as an internal DSL in Scala.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"109 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89956190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Application Speedup Characterization: Modeling Parallelization Overhead and Variations of Problem Size and Number of Cores.
Victor H. F. Oliveira, Alex F. A. Furtunato, L. Silveira, Kyriakos Georgiou, K. Eder, S. X. D. Souza
To make efficient use of multi-core processors, it is important to understand the performance behavior of parallel applications. Modeling this behavior enables the use of online approaches to optimize throughput or energy, or even to guarantee a minimum QoS. Accurate models avoid the need to probe different runtime configurations, which causes overhead. Throughout the years, many speedup models have been proposed, most of them based on Amdahl's or Gustafson's laws. However, many of them rely on assumptions such as a fixed parallel fraction, a parallel fraction that varies linearly with problem size, or nonexistent parallelization overhead. Although such models aid the theoretical understanding, these assumptions do not hold in real environments, which makes them unsuitable for accurate characterization of parallel applications. The proposed model estimates the speedup taking into account the variation of the parallel fraction with problem size, the number of cores used, and the parallelization overhead. Using four applications from the PARSEC benchmark suite, the proposed model was able to estimate speedups more accurately than other models in recent literature.
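For context, the fixed-fraction Amdahl bound that the abstract argues against is recalled below, together with an illustrative generalization in which the parallel fraction depends on problem size and an overhead term is added. The second formula is only a sketch in the spirit of the abstract, not the paper's actual model.

```latex
% Classical Amdahl speedup for a fixed parallel fraction f on p cores, and an
% illustrative variant with a size-dependent fraction f(n) and overhead o(n,p)
% (the latter is a sketch, not the model proposed in the paper):
\begin{align}
  S_{\mathrm{Amdahl}}(p) &= \frac{1}{(1 - f) + \frac{f}{p}} \\[4pt]
  S(n, p) &= \frac{T(n, 1)}
                 {\bigl(1 - f(n)\bigr)\, T(n, 1) + \frac{f(n)\, T(n, 1)}{p} + o(n, p)}
\end{align}
```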
{"title":"Application Speedup Characterization: Modeling Parallelization Overhead and Variations of Problem Size and Number of Cores.","authors":"Victor H. F. Oliveira, Alex F. A. Furtunato, L. Silveira, Kyriakos Georgiou, K. Eder, S. X. D. Souza","doi":"10.1145/3185768.3185770","DOIUrl":"https://doi.org/10.1145/3185768.3185770","url":null,"abstract":"To make efficient use of multi-core processors, it is important to understand the performance behavior of parallel applications. Modeling this can enable the use of online approaches to optimize throughput or energy, or even guarantee a minimum QoS. Accurate models would avoid probe different runtime configurations, which causes overhead. Throughout the years, many speedup models were proposed. Most of them based on Amdahl's or Gustafson's laws. However, many of those make considerations such as a fixed parallel fraction, or a parallel fraction that varies linearly with problem size, and inexistent parallelization overhead. Although such models aid in the theoretical understanding, these considerations do not hold in real environments, which makes the modeling unsuitable for accurate characterization of parallel applications. The model proposed estimates the speedup taking into account the variation of its parallel fraction according to problem size, number of cores used and overhead. Using four applications from the PARSEC benchmark suite, the proposed model was able to estimate speedups more accurately than other models in recent literature.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78312116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Towards Scalability Guidelines for Semantic Data Container Management
Gunnar Brataas, B. Neumayr, C. G. Schütz, A. Vennesland
Semantic container management is a promising approach to organize data. However, the scalability of this approach is challenging. By scalability in this paper, we mean the expressivity and size of the semantic data containers we can handle, given a suitable quality threshold. In this paper, we derive scalability characteristics of the semantic container approach in a structured way. We also describe actual experiments where we vary the number of available CPU cores and quality thresholds. We conclude this work-in-progress paper by describing how more measurements could be performed so that the missing guidelines could be provided.
{"title":"Towards Scalability Guidelines for Semantic Data Container Management","authors":"Gunnar Brataas, B. Neumayr, C. G. Schütz, A. Vennesland","doi":"10.1145/3185768.3186302","DOIUrl":"https://doi.org/10.1145/3185768.3186302","url":null,"abstract":"Semantic container management is a promising approach to organize data. However, the scalability of this approach is challenging. By scalability in this paper, we mean the expressivity and size of the semantic data containers we can handle, given a suitable quality threshold. In this paper, we derive scalability characteristics of the semantic container approach in a structured way. We also describe actual experiments where we vary the number of available CPU cores and quality thresholds. We conclude this work-in-progress paper by describing how more measurements could be performed so that the missing guidelines could be provided.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90681842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
SPARK Job Performance Analysis and Prediction Tool
Rekha Singhal, Chetan Phalak, P. Singh
Spark is one of the most widely deployed in-memory big data technologies for parallel data processing across clusters of machines. The availability of these big data platforms on commodity machines has raised the challenge of assuring application performance as data size increases. We have built a tool to assist application developers and testers in estimating an application's execution time for a larger data size before deployment. Conversely, the tool may also be used to estimate the cluster size required for the desired application performance in a production environment. The tool can be used for detailed post-execution profiling of a Spark job to understand performance bottlenecks. It incorporates different configurations of the Spark cluster to estimate application performance and can therefore also be combined with optimization techniques to obtain tuned values of Spark parameters for optimal performance. The tool's key innovations are support for different configurations of the Spark platform in performance prediction, and a simulator that estimates Spark stage execution time, including task execution variability due to HDFS, data skew, and cluster node heterogeneity. Using the model of [3], the tool has been shown to predict within a 20% error bound for Wordcount, Terasort, Kmeans, and a few SQL workloads.
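The abstract summarizes the tool at a high level; as a loose illustration of the underlying "measure at small scale, predict at large scale" idea, the sketch below fits a least-squares line to a handful of made-up (input size, runtime) measurements and extrapolates it. The actual tool uses a far more detailed stage-level simulator and the model of [3]; every name and number here is invented.

```scala
// Illustrative only: extrapolating job runtime to a larger input size with a
// simple least-squares fit. The real tool models stages, tasks, skew, etc.; this
// sketch only shows the general "measure small, predict large" idea.
object RuntimeExtrapolationSketch {

  // Fit t ~ a + b * size by ordinary least squares over (size, time) points.
  def fitLine(points: Seq[(Double, Double)]): (Double, Double) = {
    val n   = points.size.toDouble
    val sx  = points.map(_._1).sum
    val sy  = points.map(_._2).sum
    val sxx = points.map(p => p._1 * p._1).sum
    val sxy = points.map(p => p._1 * p._2).sum
    val b   = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    val a   = (sy - b * sx) / n
    (a, b)
  }

  def main(args: Array[String]): Unit = {
    // Made-up measurements from small-scale runs: (input size in GB, runtime in seconds).
    val measured = Seq((1.0, 42.0), (2.0, 80.0), (4.0, 161.0), (8.0, 318.0))
    val (a, b)   = fitLine(measured)
    val target   = 100.0 // GB
    println(f"t(size) ~ $a%.1f + $b%.1f * size  =>  predicted t($target%.0f GB) ~ ${a + b * target}%.0f s")
  }
}
```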
{"title":"SPARK Job Performance Analysis and Prediction Tool","authors":"Rekha Singhal, Chetan Phalak, P. Singh","doi":"10.1145/3185768.3185772","DOIUrl":"https://doi.org/10.1145/3185768.3185772","url":null,"abstract":"Spark is one of most widely deployed in-memory big data technology for parallel data processing across cluster of machines. The availability of these big data platforms on commodity machines has raised the challenge of assuring performance of applications with increase in data size. We have build a tool to assist application developer and tester to estimate an application execution time for larger data size before deployment. Conversely, the tool may also be used to estimate the competent cluster size for desired application performance in production environment. The tool can be used for detailed profiling of Spark job, post execution, to understand performance bottleneck. This tool incorporates different configurations of Spark cluster to estimate application performance. Therefore, it can also be used with optimization techniques to get tuned value of Spark parameters for an optimal performance. The tool's key innovations are support for different configurations of Spark platform for performance prediction and simulator to estimate Spark stage execution time which includes task execution variability due to HDFS, data skew and cluster nodes heterogeneity. The tool using model [3] has been shown to predict within 20% error bound for Wordcount, Terasort,Kmeans and few SQL workloads.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80676391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Performance Study of Big Data Workloads in Cloud Datacenters with Network Variability
Alexandru Uta, Harry Obaseki
Public cloud computing platforms are a cost-effective solution for individuals and organizations to deploy various types of workloads, ranging from scientific applications, business-critical workloads, and e-governance to big data applications. Co-locating all such different types of workloads in a single datacenter leads not only to performance degradation, but also to large degrees of performance variability, which is the result of virtualization, resource sharing, and congestion. Many studies have already assessed and characterized the degree of resource variability in public clouds. However, we are missing a clear picture of how resource variability impacts big data workloads. In this work, we take a step towards characterizing the behavior of big data workloads under network bandwidth variability. Emulating the bandwidth distribution of real-world clouds, we characterize the performance achieved by running real-world big data applications. We find that most big data workloads are slowed down under network variability scenarios, even those that are not network-bound. Moreover, the maximum average slowdown for the cloud setup with the highest variability is 1.48 for CPU-bound workloads and 1.79 for network-bound workloads.
{"title":"A Performance Study of Big Data Workloads in Cloud Datacenters with Network Variability","authors":"Alexandru Uta, Harry Obaseki","doi":"10.1145/3185768.3186299","DOIUrl":"https://doi.org/10.1145/3185768.3186299","url":null,"abstract":"Public cloud computing platforms are a cost-effective solution for individuals and organizations to deploy various types of workloads, ranging from scientific applications, business-critical workloads, e-governance to big data applications. Co-locating all such different types of workloads in a single datacenter leads not only to performance degradation, but also to large degrees of performance variability, which is the result of virtualization, resource sharing and congestion. Many studies have already assessed and characterized the degree of resource variability in public clouds. However, we are missing a clear picture on how resource variability impacts big data workloads. In this work, we take a step towards characterizing the behavior of big data workloads under network bandwidth variability. Emulating real-world clouds» bandwidth distribution, we characterize the performance achieved by running real-world big data applications. We find that most big data workloads are slowed down under network variability scenarios, even those that are not network-bound. Moreover, the maximum average slowdown for the cloud setup with highest variability is 1.48 for CPU-bound workloads, and 1.79 for network-bound workloads.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74942011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Combining Energy Saving Techniques in Data Centres using Model-Based Analysis
Björn F. Postema, T. V. Damme, C. D. Persis, P. Tesi, B. Haverkort
Advanced power management and cooling techniques for data centres often co-exist as separate entities in current-day operation of data centres. This paper proposes to combine these techniques to achieve greater power savings. To this end, an existing theoretical thermal-aware model is integrated into an extensive simulation framework for data centres using power and performance models, which allows for a detailed study of power, performance, and thermal metrics. The paper compares four distinct cases for studying the effect on these metrics: a data centre with (i) basic functionality; (ii) advanced cooling; (iii) advanced power management; and (iv) a combination thereof. The combined case shows a significant reduction in energy consumption compared to the other cases, while performance and thermal demands are kept intact. The combination of these techniques improves energy savings and shows that it is meaningful to investigate smart combined energy-saving techniques further.
{"title":"Combining Energy Saving Techniques in Data Centres using Model-Based Analysis","authors":"Björn F. Postema, T. V. Damme, C. D. Persis, P. Tesi, B. Haverkort","doi":"10.1145/3185768.3186310","DOIUrl":"https://doi.org/10.1145/3185768.3186310","url":null,"abstract":"Advanced power management and cooling techniques for data centres often co-exist as separate entities in current-day operation of data centres. This paper proposes to combine these techniques to achieve greater power savings. To this end, an existing theoretical thermal-aware model is integrated in an extensive simulation framework for data centres using power and performance models, which allows for a detailed study in power, performance and thermal metrics. The paper compares four distinct cases for studying the effect on these metrics: a data centre with (i) basic functionality; (ii) advanced cooling; (iii) advanced power management; and (iv) a combination thereof. The combined case shows a significant reduction in the energy consumption compared to the other cases while performance and thermal demands are kept intact. The combination of these techniques shows improvements in energy savings and shows it is meaningful to investigate further into smart combined energy saving techniques.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84310537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Exploratory Analysis of Spark Structured Streaming
Todor Ivanov, Jason Taafe
In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has led to many advances in the development and adaptation of large-scale streaming systems. Spark and Flink have become a popular choice for many developers as they combine both batch and streaming capabilities in a single system. However, the introduction of Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are otherwise only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.
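The paper's micro-benchmark queries are not included in the abstract, so the following is a minimal, self-contained Structured Streaming example that only illustrates the programming model being evaluated: it reads from the built-in rate source, applies a windowed SparkSQL aggregation, and writes to the console sink. The specific query, rates, and window sizes are arbitrary and are not taken from the paper.

```scala
// Minimal Spark Structured Streaming example (illustrative; not one of the paper's
// benchmark queries). Requires a Spark 2.2+ dependency on the classpath.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object StructuredStreamingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("structured-streaming-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Built-in "rate" source: emits (timestamp, value) rows at a configurable rate.
    val stream = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "100")
      .load()

    // A simple windowed aggregation expressed with ordinary DataFrame/SparkSQL operations.
    val counts = stream
      .withWatermark("timestamp", "10 seconds")
      .groupBy(window($"timestamp", "5 seconds"), ($"value" % 10).as("bucket"))
      .count()

    // Continuously print updated counts to the console for roughly 30 seconds.
    val query = counts.writeStream
      .outputMode("update")
      .format("console")
      .option("truncate", "false")
      .start()

    query.awaitTermination(30000)
    spark.stop()
  }
}
```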
{"title":"Exploratory Analysis of Spark Structured Streaming","authors":"Todor Ivanov, Jason Taafe","doi":"10.1145/3185768.3186360","DOIUrl":"https://doi.org/10.1145/3185768.3186360","url":null,"abstract":"In the Big Data era, stream processing has become a common requirement for many data-intensive applications. This has lead to many advances in the development and adaption of large scale streaming systems. Spark and Flink have become a popular choice for many developers as they combine both batch and streaming capabilities in a single system. However, introducing the Spark Structured Streaming in version 2.0 opened up completely new features for SparkSQL, which are alternatively only available in Apache Calcite. This work focuses on the new Spark Structured Streaming and analyses it by diving into its internal functionalities. With the help of a micro-benchmark consisting of streaming queries, we perform initial experiments evaluating the technology. Our results show that Spark Structured Streaming is able to run multiple queries successfully in parallel on data with changing velocity and volume sizes.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"35 16","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91439008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Performance Prediction for Families of Data-Intensive Software Applications
J. Verriet, R. Dankers, L. Somers
Performance is a critical property of any system, in particular of data-intensive systems such as image processing systems. We describe a performance engineering method for families of data-intensive systems that is both simple and accurate: the performance of new family members is predicted using models of existing family members. The predictive models are calibrated using static code analysis and regression. Code analysis is used to extract performance profiles, which are combined with regression to derive predictive performance models. A case study presents the application of the method to an industrial image processing system; its benefits include easy application and the identification of code performance optimization points.
{"title":"Performance Prediction for Families of Data-Intensive Software Applications","authors":"J. Verriet, R. Dankers, L. Somers","doi":"10.1145/3185768.3186405","DOIUrl":"https://doi.org/10.1145/3185768.3186405","url":null,"abstract":"Performance is a critical system property of any system, in particular of data-intensive systems, such as image processing systems. We describe a performance engineering method for families of data-intensive systems that is both simple and accurate; the performance of new family members is predicted using models of existing family members. The predictive models are calibrated using static code analysis and regression. Code analysis is used to extract performance profiles, which are used in combination with regression to derive predictive performance models. A case study presents the application for an industrial image processing case, which revealed as benefits the easy application and identification of code performance optimization points.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"98 5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82238611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Workload-Dependent Performance Analysis of an In-Memory Database in a Multi-Tenant Configuration
Dominik Paluch, Harald Kienegger, H. Krcmar
Modern in-memory database systems have begun to provide multi-tenancy features. In contrast to the traditional operation of one large database appliance per system, the use of multi-tenancy features allows multiple database containers to run on one system. Consequently, the database tenants share the same system resources, which influences their performance. Understanding the performance of database tenants in different setups with varying workloads is a challenging task. However, knowledge of the performance behavior is crucial in order to benefit from multi-tenancy. In this paper, we provide fine-grained performance insights into the in-memory database SAP HANA in a multi-tenant configuration. We perform multiple benchmark runs using an online analytical processing benchmark in order to retrieve information about the performance behavior of the multi-tenant database containers. Furthermore, we analyze the collected results and show a more efficient usage of threads in an environment with fewer active tenants under specific workload conditions.
{"title":"A Workload-Dependent Performance Analysis of an In-Memory Database in a Multi-Tenant Configuration","authors":"Dominik Paluch, Harald Kienegger, H. Krcmar","doi":"10.1145/3185768.3186290","DOIUrl":"https://doi.org/10.1145/3185768.3186290","url":null,"abstract":"Modern in-memory database systems begin to provide multi-tenancy features. In contrast to the traditional operation of one large database appliance per system, the utilization of the multi-tenancy features allows for multiple database containers running on one system. Consequently, the database tenants share the same system resources, which has an influence on their performance. Understanding the performance of database tenants in different setups with varying workloads is a challenging task. However, knowledge of the performance behavior is crucial in order to benefit from multi-tenancy. In this paper, we provide fine-grained performance insights of the in-memory database SAP HANA in a multi-tenant configuration. We perform multiple benchmark runs utilizing an online analytical processing benchmark in order to retrieve information about the performance behavior of the multi-tenant database containers. Furthermore, we provide an analysis of the collected results and show a more efficient usage of threads in an environment with less active tenants under specific workload conditions.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"239 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89756283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
ABench: Big Data Architecture Stack Benchmark
Todor Ivanov, Rekha Singhal
Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa, or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may perform best for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to a Spark with Kafka stack as well? What is the impact of different programming environments and/or other technologies used alongside Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.
{"title":"ABench: Big Data Architecture Stack Benchmark","authors":"Todor Ivanov, Rekha Singhal","doi":"10.1145/3185768.3186300","DOIUrl":"https://doi.org/10.1145/3185768.3186300","url":null,"abstract":"Distributed big data processing and analytics applications demand a comprehensive end-to-end architecture stack consisting of big data technologies. However, there are many possible architecture patterns (e.g. Lambda, Kappa or Pipeline architectures) to choose from when implementing the application requirements. A big data technology in isolation may be best performing for a particular application, but its performance in connection with other technologies depends on the connectors and the environment. Similarly, existing big data benchmarks evaluate the performance of different technologies in isolation, but no work has been done on benchmarking big data architecture stacks as a whole. For example, BigBench (TPCx-BB) may be used to evaluate the performance of Spark, but is it applicable to PySpark or to Spark with Kafka stack as well? What is the impact of having different programming environments and/or any other technology like Spark? This vision paper proposes a new category of benchmark, called ABench, to fill this gap and discusses key aspects necessary for the performance evaluation of different big data architecture stacks.","PeriodicalId":10596,"journal":{"name":"Companion of the 2018 ACM/SPEC International Conference on Performance Engineering","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76737163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7