A general algorithm to compute the steady-state solution of product-form cooperating Markov chains
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366744
A. Marin, S. R. Bulò
In the last few years, several new results about product-form solutions of stochastic models have been formulated. In particular, the Reversed Compound Agent Theorem (RCAT) and its extensions play a pivotal role in the characterization of cooperating stochastic models in product-form. Although these results have been used to prove several well-known theorems (e.g., the Jackson queueing network and G-network solutions) as well as novel ones, to the best of our knowledge an automatic tool that derives the product-form solution (when one exists) of a generic cooperation among a set of stochastic processes has not yet been developed. In this paper we address the problem of solving the non-linear system of equations that arises from the application of RCAT. We present an iterative algorithm that forms the basis of a software tool currently under development. We illustrate the algorithm, discuss its convergence and complexity, and compare it with previous algorithms defined for the analysis of Jackson networks and G-networks. Several tests have been conducted involving the solution, in product-form by RCAT, of an arbitrarily large number of cooperating processes.
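The abstract does not spell out the iteration itself, so as a rough illustration only, the sketch below solves Jackson-style traffic equations by successive substitution, the same iterate-until-fixed-point structure that a solver for the non-linear RCAT rate equations would use. The function names, the update rule, and the example network are our own assumptions, not the authors' algorithm.

```python
import numpy as np

# Illustrative sketch only: successive substitution for traffic equations of
# the form x_i = gamma_i + sum_j x_j * P[j, i]. The systems produced by RCAT
# are non-linear, but the iterate-until-fixed-point structure is the same.
def fixed_point(gamma, P, tol=1e-12, max_iter=100_000):
    x = np.array(gamma, dtype=float)          # initial guess: external arrivals only
    for _ in range(max_iter):
        x_new = gamma + x @ P                 # apply the traffic equations once
        if np.max(np.abs(x_new - x)) < tol:   # converged to a fixed point
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

# Example: 3-queue tandem network, external arrival rate 1.0 at queue 0.
gamma = np.array([1.0, 0.0, 0.0])
P = np.array([[0.0, 1.0, 0.0],                # routing probabilities (row -> column)
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
print(fixed_point(gamma, P))                  # all arrival rates converge to 1.0
```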
{"title":"A general algorithm to compute the steady-state solution of product-form cooperating Markov chains","authors":"A. Marin, S. R. Bulò","doi":"10.1109/MASCOT.2009.5366744","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366744","url":null,"abstract":"In the last few years several new results about product-form solutions of stochastic models have been formulated. In particular, the Reversed Compound Agent Theorem (RCAT) and its extensions play a pivotal role in the characterization of cooperating stochastic models in product-form. Although these results have been used to prove several well-known theorems (e.g., Jackson queueing network and G-network solutions) as well as novel ones, to the best of our knowledge, an automatic tool to derive the product-form solution (if present) of a generic cooperation among a set of stochastic processes, is not yet developed. In this paper we address the problem of solving the non-linear system of equations that arises from the application of RCAT. We present an iterative algorithm that is the base of a software tool currently under development. We illustrate the algorithm, discuss the convergence and the complexity, compare it with previous algorithms defined for the analysis of the Jackson networks and the G-networks. Several tests have been conducted involving the solutions of a (arbitrary) large number of cooperating processes in product-form by RCAT.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128119306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning based address mapping for improving the performance of memory subsystems
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366234
Pratyush Kumar, M. Desai
Interleaved address mapping has been used effectively to improve the performance of parallel-accessible memory subsystems. We propose a generalization of such mappings and study it in the framework of application-specific MPSoCs. In this generalization, a section of the address bits is used to map each address to a memory bank, and to a row within that bank, using a Look-Up Table (LUT). We model the problem of address-mapping optimization as a Markov Decision Process (MDP). To solve the MDP, we propose a reinforcement-learning-based algorithm which learns an optimized mapping within the generalized class for a specific application mapped to an MPSoC system. Through cycle-accurate simulations on a framework developed specifically for this study, we demonstrate that a system using an address mapping generated in this manner exhibits substantially higher performance than the same system using interleaved address mappings. These results indicate that application and architecture visibility can be leveraged to obtain better mappings than generic interleaved solutions, and that an automated reinforcement-learning approach can identify such mappings using only the run-time behaviour of the system.
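As a hedged illustration of the generalized mapping class, the sketch below shows a toy LUT-based translation from a physical address to a (bank, row-group, offset) triple; the bit-field widths and identity initialisation are our assumptions, not the paper's configuration. Plain interleaving is recovered as the special case of an identity LUT, and a learning agent would search over the LUT contents.

```python
# Toy LUT-generalized address mapping (our own bit layout, not the paper's).
# A slice of the address bits indexes a LUT whose entries name a
# (bank, row-group); the low-order bits remain the within-row offset.
NUM_BANKS   = 8
LUT_BITS    = 10              # address bits used to index the LUT
OFFSET_BITS = 6               # low-order bits kept as the offset

def identity_lut():
    # Reproduces classic bank interleaving: consecutive entries cycle over banks.
    return [(i % NUM_BANKS, i // NUM_BANKS) for i in range(1 << LUT_BITS)]

def map_address(addr, lut):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index  = (addr >> OFFSET_BITS) & ((1 << LUT_BITS) - 1)
    bank, row_group = lut[index]
    return bank, row_group, offset

lut = identity_lut()
print(map_address(0x1A40, lut))   # (1, 13, 0) under the identity LUT
# A learning agent would permute LUT entries (the MDP actions) and keep the
# permutation that minimizes observed bank conflicts for the application.
```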
{"title":"Learning based address mapping for improving the performance of memory subsystems","authors":"Pratyush Kumar, M. Desai","doi":"10.1109/MASCOT.2009.5366234","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366234","url":null,"abstract":"Interleaved address mapping has been effectively used to improve the performance of a parallely accessible memory subsystem. We propose a generalization of such mappings and study them in the framework of application specific MPSoCs. In this generalization, a section of the address bits is used to map each address to a memory bank and a row within that bank, using a Look-Up Table(LUT). We model the problem of address mapping optimization as a Markov Decision Process (MDP). To solve the MDP, we propose a reinforcement learning based algorithm which learns an optimized mapping within the generalized class, for a specific application mapped to an MPSoC system. Through cycle-accurate simulations on a simulation framework specifically developed for such a study, we demonstrate that a system using an address mapping generated in this manner exhibits substantially higher performance when compared to the same system using interleaved address mappings. These results indicate that application and architecture visibility can be leveraged to obtain better mappings than generic interleaved solutions, and that an automated reinforcement learning approach can identify such mappings using only the run-time behaviour of the system.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114666713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Price war with migrating customers
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5362674
P. Maillé, M. Naldi, B. Tuffin
In the telecommunication world, competition among providers to attract and keep customers is fierce. Customers, in turn, churn between providers in search of better prices, better reputation, or better services. In this paper we study the price war between two providers in the case where users' decisions are modeled by a Markov chain with price-dependent transition rates. Each provider is assumed to seek a maximal revenue, which depends on the strategy of its competitor. Using the framework of non-cooperative game theory, we show how the price war can be analyzed and illustrate the influence of various parameters.
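To make the setup concrete, here is a toy numerical rendering of such a model; the linear price-dependent rate function is an assumption of ours, not the paper's. The stationary distribution of the two-state chain gives each provider's market share, from which revenues, and hence best responses in the price game, follow.

```python
# Toy two-provider churn model (rate function is our own assumption):
# customers move between providers 1 and 2 as a two-state Markov chain
# whose transition rates grow with the price difference.
def churn_rate(p_own, p_other, base=0.1, sensitivity=0.5):
    # rate of leaving the provider charging p_own for the one charging p_other
    return base + sensitivity * max(p_own - p_other, 0.0)

def market_shares(p1, p2):
    r12 = churn_rate(p1, p2)        # rate provider 1 -> provider 2
    r21 = churn_rate(p2, p1)        # rate provider 2 -> provider 1
    pi1 = r21 / (r12 + r21)         # stationary distribution of the 2-state chain
    return pi1, 1.0 - pi1

def revenues(p1, p2, customers=1000):
    s1, s2 = market_shares(p1, p2)
    return customers * s1 * p1, customers * s2 * p2

# Each provider picks its price as a best response to the other's; a Nash
# equilibrium of the price war is a pair of mutual best responses.
print(revenues(10.0, 12.0))
```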
{"title":"Price war with migrating customers","authors":"P. Maillé, M. Naldi, B. Tuffin","doi":"10.1109/MASCOT.2009.5362674","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5362674","url":null,"abstract":"In the telecommunication world, competition among providers to attract and keep customers is fierce. On the other hand customers churn between providers due to better prices, better reputation or better services. We propose in this paper to study the price war between two providers in the case where users' decisions are modeled by a Markov chain, with price-dependent transition rates. Each provider is assumed to look for a maximized revenue, which depends on the strategy of the competitor. Therefore, using the framework of non-cooperative game theory, we show how the price war can be analyzed and show the influence of various parameters.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125573086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extreme Binning: Scalable, parallel deduplication for chunk-based file backup
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366623
Deepavali Bhagwat, K. Eshghi, D. Long, Mark Lillibridge
Data deduplication is an essential and critical component of backup systems: essential because it reduces storage space requirements, and critical because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques require in order to provide reasonable throughput. We present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads made up of individual files with no locality among consecutive files in a given window of time. Due to this lack of locality, existing techniques perform poorly on such workloads. Extreme Binning exploits file similarity instead of locality, and makes only one disk access for chunk lookup per file, which yields reasonable throughput. Multi-node backup systems built with Extreme Binning scale gracefully with the amount of input data; more backup nodes can be added to boost throughput. Each file is allocated to exactly one node by a stateless routing algorithm, allowing maximum parallelization, and each backup node is autonomous, with no dependencies across nodes, which makes data management tasks robust and low-overhead.
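A minimal sketch of the routing idea follows, using fixed-size chunking and SHA-1 chunk IDs for brevity (production systems use content-defined chunking, and the paper's exact choice of representative and bin layout may differ). Because similar files are likely to share their minimum chunk ID, they are routed to the same node and bin without any shared state.

```python
import hashlib

# Sketch of similarity-based routing in the spirit of Extreme Binning.
# Fixed-size chunking is used here for brevity; real deduplication systems
# use content-defined chunking so chunk boundaries survive insertions.
def chunk_ids(data, chunk_size=4096):
    return [hashlib.sha1(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]

def backup_node_for(data, num_nodes):
    ids = chunk_ids(data)
    representative = min(ids)   # min-hash style choice: similar files tend to
                                # share their minimum chunk ID, so they land on
                                # the same node without any shared routing state
    return int.from_bytes(representative[:8], "big") % num_nodes

print(backup_node_for(b"example file contents" * 1000, num_nodes=4))
```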
{"title":"Extreme Binning: Scalable, parallel deduplication for chunk-based file backup","authors":"Deepavali Bhagwat, K. Eshghi, D. Long, Mark Lillibridge","doi":"10.1109/MASCOT.2009.5366623","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366623","url":null,"abstract":"Data deduplication is an essential and critical component of backup systems. Essential, because it reduces storage space requirements, and critical, because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques require to provide reasonable throughput. We present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time. Due to lack of locality, existing techniques perform poorly on these workloads. Extreme Binning exploits file similarity instead of locality, and makes only one disk access for chunk lookup per file, which gives reasonable throughput. Multi-node backup systems built with Extreme Binning scale gracefully with the amount of input data; more backup nodes can be added to boost throughput. Each file is allocated using a stateless routing algorithm to only one node, allowing for maximum parallelization, and each backup node is autonomous with no dependency across nodes, making data management tasks robust with low overhead.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117235741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A simulation approach to evaluating design decisions in MapReduce setups
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366973
Guanying Wang, A. Butt, P. Pandey, Karan Gupta
MapReduce has emerged as a model of choice for supporting modern data-intensive applications. The model is easy to use and promising in reducing time-to-solution. It is also a key enabler for cloud computing, which provides transparent and flexible access to a large number of compute, storage, and networking resources. Setting up and operating a large MapReduce cluster entails careful evaluation of various design choices and run-time parameters to achieve high efficiency. However, this design space has not been explored in detail. In this paper, we adopt a simulation approach to systematically understand the performance of MapReduce setups. The resulting simulator, MRPerf, captures such aspects of these setups as node, rack, and network configurations, disk parameters and performance, data layout, and application I/O characteristics, among others, and uses this information to predict expected application performance. Specifically, we use MRPerf to explore the effect of several interconnect topologies, data locality, and software and hardware failures on overall application performance. MRPerf allows us to quantify the effect of these factors, and thus can serve as a tool for optimizing existing MapReduce setups as well as designing new ones.
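MRPerf itself performs detailed simulation; purely to illustrate the kind of configuration it consumes (node counts, disk and network speeds, data layout), here is a crude back-of-the-envelope phase model. Every parameter and formula below is a placeholder of ours, not MRPerf's model.

```python
# Crude back-of-the-envelope model of a MapReduce job, only to illustrate the
# kind of inputs a simulator like MRPerf takes. The formulas are placeholders,
# not MRPerf's actual packet- and disk-level simulation.
def estimate_job_time(input_gb, nodes, map_slots_per_node,
                      disk_mb_s=80.0, net_mb_s=100.0, reduce_fraction=0.3):
    slots = nodes * map_slots_per_node
    map_waves = -(-input_gb * 1024 // (64 * slots))   # 64 MB splits, ceil division
    map_time = map_waves * 64 / disk_mb_s             # disk-bound map phase
    shuffle_mb = input_gb * 1024 * reduce_fraction
    shuffle_time = shuffle_mb / (net_mb_s * nodes)    # network-bound shuffle
    reduce_time = shuffle_mb / (disk_mb_s * nodes)    # disk-bound reduce phase
    return map_time + shuffle_time + reduce_time      # seconds

print(f"{estimate_job_time(input_gb=100, nodes=20, map_slots_per_node=2):.0f} s")
```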
{"title":"A simulation approach to evaluating design decisions in MapReduce setups","authors":"Guanying Wang, A. Butt, P. Pandey, Karan Gupta","doi":"10.1109/MASCOT.2009.5366973","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366973","url":null,"abstract":"MapReduce has emerged as a model of choice for supporting modern data-intensive applications. The model is easy-to-use and promising in reducing time-to-solution. It is also a key enabler for cloud computing, which provides transparent and flexible access to a large number of compute, storage and networking resources. Setting up and operating a large MapReduce cluster entails careful evaluation of various design choices and run-time parameters to achieve high efficiency. However, this design space has not been explored in detail. In this paper, we adopt a simulation approach to systematically understanding the performance of MapReduce setups. The resulting simulator, MRPerf, captures such aspects of these setups as node, rack and network configurations, disk parameters and performance, data layout and application I/O characteristics, among others, and uses this information to predict expected application performance. Specifically, we use MRPerf to explore the effect of several component inter-connect topologies, data locality, and software and hardware failures on overall application performance. MR-Perf allows us to quantify the effect of these factors, and thus can serve as a tool for optimizing existing MapReduce setups as well as designing new ones.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125970446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mining for statistical models of availability in large-scale distributed systems: An empirical study of SETI@home
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5367061
B. Javadi, Derrick Kondo, J. Vincent, David P. Anderson
In the age of cloud, Grid, P2P, and volunteer distributed computing, large-scale systems with tens of thousands of unreliable hosts are increasingly common. Invariably, these systems are composed of heterogeneous hosts whose individual availability often exhibits different statistical properties (for example, stationary versus non-stationary behavior) and fits different models (for example, Exponential, Weibull, or Pareto probability distributions). In this paper, we describe an effective method for discovering subsets of hosts whose availability has similar statistical properties and can be modelled with similar probability distributions. We apply this method to about 230,000 host availability traces obtained from a real large-scale Internet-distributed system, namely SETI@home. We find that about 34% of hosts exhibit availability that is a truly random process, and that these hosts can often be modelled accurately with a few distinct distributions from different families. We believe that this characterization is fundamental to the design of stochastic scheduling algorithms for large-scale systems where host availability is uncertain.
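A simplified sketch of the per-host model-selection step is given below: fit each candidate family by maximum likelihood and keep the best-scoring one. The candidate set matches the families named above, but the likelihood-based selection rule is our assumption; the paper also tests whether a host's availability is truly random before any fitting is attempted.

```python
import numpy as np
from scipy import stats

# Sketch of per-host model selection (our simplification): fit each candidate
# family to one host's availability durations by maximum likelihood and keep
# the family with the highest log-likelihood.
CANDIDATES = {
    "exponential": stats.expon,
    "weibull":     stats.weibull_min,
    "pareto":      stats.pareto,
}

def best_family(durations):
    best_name, best_ll, best_params = None, -np.inf, None
    for name, dist in CANDIDATES.items():
        params = dist.fit(durations)                   # MLE fit
        ll = np.sum(dist.logpdf(durations, *params))   # goodness score
        if ll > best_ll:
            best_name, best_ll, best_params = name, ll, params
    return best_name, best_params

sample = stats.weibull_min.rvs(0.8, scale=10.0, size=500, random_state=1)
print(best_family(sample)[0])   # usually "weibull" on this synthetic sample
```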
{"title":"Mining for statistical models of availability in large-scale distributed systems: An empirical study of SETI@home","authors":"B. Javadi, Derrick Kondo, J. Vincent, David P. Anderson","doi":"10.1109/MASCOT.2009.5367061","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5367061","url":null,"abstract":"In the age of cloud, Grid, P2P, and volunteer distributed computing, large-scale systems with tens of thousands of unreliable hosts are increasingly common. Invariably, these systems are composed of heterogeneous hosts whose individual availability often exhibit different statistical properties (for example stationary versus non-stationary behavior) and fit different models (for example Exponential, Weibull, or Pareto probability distributions). In this paper, we describe an effective method for discovering subsets of hosts whose availability have similar statistical properties and can be modelled with similar probability distributions. We apply this method with about 230,000 host availability traces obtained from a real large-scale Internet-distributed system, namely SETI@home. We find that about 34% of hosts exhibit availability that is a truly random process, and that these hosts can often be modelled accurately with a few distinct distributions from different families. We believe that this characterization is fundamental in the design of stochastic scheduling algorithms across large-scale systems where host availability is uncertain.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130379319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing and optimizing a data protection solution
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5367043
L. Cherkasova, R. Lau, Harald Burose, Bernhard Kappler
Analyzing and managing large amounts of unstructured information is a high-priority task for many companies. To implement content management solutions, companies need a comprehensive view of their unstructured data. In order to provide a new level of intelligence and control over data resident within the enterprise, one needs to build a chain of tools and automated processes that enable the evaluation, analysis, and visibility of information assets and their dynamics during the information life-cycle. We propose a novel framework that utilizes the existing backup infrastructure by integrating additional content analysis routines and extracting already-available filesystem metadata over time. This information is used for data analysis and trending, adding performance optimization and self-management capabilities to backup and information management tasks. Backup management faces serious challenges of its own: processing an ever-increasing amount of data while meeting the timing constraints of backup windows may require adaptive changes to backup scheduling routines. We revisit traditional backup job scheduling and demonstrate that random job scheduling may lead to inefficient backup processing and increased backup time. In this work, we use historical information about object backup processing times and propose an alternative job scheduling with automated parameter tuning that can significantly reduce the overall backup time. Under this scheduling, called LBF, the longest backups (the objects with the longest backup times) are scheduled first. We evaluate the performance benefits of the proposed scheduling using a realistic workload collected from seven backup servers at HP Labs. A significant reduction of the backup time (up to 30%) and improved quality of service can be achieved under the proposed job assignment policy.
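LBF as described is essentially a longest-processing-time-first heuristic; the sketch below is our rendering of that idea, with historical durations standing in for the paper's measured object backup times and a least-loaded assignment over a fixed number of concurrent backup sessions (the paper's scheduler and tuned parameters may differ in detail).

```python
import heapq

# Sketch of the LBF idea: dispatch the objects with the longest historical
# backup times first, each to the earliest-available concurrent session.
def lbf_assign(durations, num_sessions):
    sessions = [(0.0, s) for s in range(num_sessions)]   # (current load, id)
    heapq.heapify(sessions)
    plan = []
    for obj, dur in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        load, s = heapq.heappop(sessions)                # least-loaded session
        plan.append((obj, s, load))                      # start time = load
        heapq.heappush(sessions, (load + dur, s))
    makespan = max(load for load, _ in sessions)
    return plan, makespan

durations = {"db": 240, "mail": 180, "wiki": 60, "home": 300, "logs": 90}
plan, makespan = lbf_assign(durations, num_sessions=2)
print(makespan)   # 450 with LBF vs. up to 570 with an unlucky random order
```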
{"title":"Enhancing and optimizing a data protection solution","authors":"L. Cherkasova, R. Lau, Harald Burose, Bernhard Kappler","doi":"10.1109/MASCOT.2009.5367043","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5367043","url":null,"abstract":"Analyzing and managing large amounts of unstructured information is a high priority task for many companies. For implementing content management solutions, companies need a comprehensive view of their unstructured data. In order to provide a new level of intelligence and control over data resident within the enterprise, one needs to build a chain of tools and automated processes that enable the evaluation, analysis, and visibility into information assets and their dynamics during the information life-cycle. We propose a novel framework to utilize the existing backup infrastructure by integrating additional content analysis routines and extracting already available filesystem metadata over time. This is used to perform data analysis and trending to add performance optimization and self-management capabilities to backup and information management tasks. Backup management faces serious challenges on its own: processing ever increasing amount of data while meeting the timing constraints of backup windows could require adaptive changes in backup scheduling routines. We revisit a traditional backup job scheduling and demonstrate that random job scheduling may lead to inefficient backup processing and an increased backup time. In this work, we use a historic information about the object backup processing time and suggest an additional job scheduling, and automated parameter tuning which may significantly optimize the overall backup time. Under this scheduling, called LBF, the longest backups (the objects with longest backup time) are scheduled first. We evaluate the performance benefits of the introduced scheduling using a realistic workload collected from the seven backup servers at HP Labs. Significant reduction of the backup time (up to 30%) and improved quality of service can be achieved under the proposed job assignment policy.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126516272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TransPlant: A parameterized methodology for generating transactional memory workloads
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366659
James Poe, C. Hughes, Tao Li
Transactional memory provides a means to bridge the discrepancy between programmer productivity and the difficulty of exploiting the thread-level parallelism offered by emerging chip multiprocessors. Because the hardware has outpaced the software, there are very few modern multithreaded benchmarks available, and even fewer for transactional memory researchers. This hurdle must be overcome for transactional memory research to mature and gain widespread acceptance. Currently, for performance evaluations, most researchers rely on manually converted lock-based multithreaded workloads or on the small group of programs written explicitly for transactional memory. Using converted benchmarks is problematic because they have been tuned so well that they may not be representative of how a programmer will actually use transactional memory. Hand-coding stressor benchmarks is unattractive because it is tedious and time-consuming. A parameterized methodology that can automatically generate a program based on desired high-level program characteristics therefore benefits the transactional memory community. In this work, we propose techniques to generate parameterized transactional memory benchmarks based on a feature set, decoupled from the underlying transactional model. Using principal component analysis, clustering, and raw transactional performance metrics, we show that TransPlant can generate benchmarks with features that lie outside the boundary occupied by traditional benchmarks. We also show how TransPlant can mimic the behavior of SPLASH-2 and STAMP transactional memory workloads. The program generation methods proposed here will help transactional memory architects select a robust set of programs for quick design evaluations.
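For illustration, the snippet below shows a workload-space analysis of the kind the abstract describes: standardize benchmark feature vectors, project them with principal component analysis, and cluster the projection to see which regions of the space existing suites cover. The library choice, feature dimensionality, and cluster count are our assumptions, not TransPlant's pipeline.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Stand-in for measured benchmark features (e.g., transaction length,
# read/write-set size, contention); 30 benchmarks x 6 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 6))

X = StandardScaler().fit_transform(features)   # zero mean, unit variance
Z = PCA(n_components=2).fit_transform(X)       # 2-D view of the workload space
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(Z)
for cluster in range(4):
    print(cluster, np.where(labels == cluster)[0])   # members of each cluster
```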
{"title":"TransPlant: A parameterized methodology for generating transactional memory workloads","authors":"James Poe, C. Hughes, Tao Li","doi":"10.1109/MASCOT.2009.5366659","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366659","url":null,"abstract":"Transactional memory provides a means to bridge the discrepancy between programmer productivity and the difficulty in exploiting thread-level parallelism gains offered by emerging chip multiprocessors. Because the hardware has outpaced the software, there are very few modern multithreaded benchmarks available and even fewer for transactional memory researchers. This hurdle must be overcome for transactional memory research to mature and to gain widespread acceptance. Currently, for performance evaluations, most researchers rely on manually converted lock-based multithreaded workloads or the small group of programs written explicitly for transactional memory. Using converted benchmarks is problematic because they have been tuned so well that they may not be representative of how a programmer will actually use transactional memory. Hand coding stressor benchmarks is unattractive because it is tedious and time consuming. A new parameterized methodology that can automatically generate a program based on the desired high-level program characteristics benefits the transactional memory community. In this work, we propose techniques to generate parameterized transactional memory benchmarks based on a feature set, decoupled from the underlying transactional model. Using principle component analysis, clustering, and raw transactional performance metrics, we show that TransPlant can generate benchmarks with features that lie outside the boundary occupied by these traditional benchmarks. We also show how TransPlant can mimic the behavior of SPLASH-2 and STAMP transactional memory workloads. The program generation methods proposed here will help transactional memory architects select a robust set of programs for quick design evaluations.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131123440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Performance evaluation of scheduling policies in symmetric multiprocessing environments
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366656
J. Happe, Henning Groenda, Ralf H. Reussner
The shift of hardware architecture towards parallel execution has led to broad usage of multi-core processors in desktop and server systems. The benefit of additional processor cores for software performance depends on the software's parallelism as well as on the operating system scheduler's capabilities. In particular, the load on the available processors (or cores) strongly influences the response times and throughput of software applications. Hence, a sound understanding of the mutual influence of software behaviour and operating system schedulers is essential for accurate performance evaluations. Multi-core systems pose new challenges for performance analysis and for developers of operating systems. For example, an optimal scheduling policy for multi-server systems, analogous to shortest remaining processing time (SRPT) for single-server systems, is not yet known in queueing theory. In this paper, we present a detailed experimental evaluation of general purpose operating system (GPOS) schedulers in symmetric multiprocessing (SMP) environments. In particular, we are interested in the influence of multiprocessor load balancing on software performance. Additionally, the evaluation includes effects of GPOS schedulers that can also occur in single-processor environments, such as the I/O-boundedness of tasks and different prioritisation strategies. The results presented in this paper provide the basis for the future development of more accurate performance models of today's software systems.
{"title":"Performance evaluation of scheduling policies in symmetric multiprocessing environments","authors":"J. Happe, Henning Groenda, Ralf H. Reussner","doi":"10.1109/MASCOT.2009.5366656","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366656","url":null,"abstract":"The shift of hardware architecture towards parallel execution led to a broad usage of multi-core processors in desktop systems and in server systems. The benefit of additional processor cores for software performance depends on the software's parallelism as well as the operating system scheduler's capabilities. Especially, the load on the available processors (or cores) strongly influences response times and throughput of software applications. Hence, a sophisticated understanding of the mutual influence of software behaviour and operating system schedulers is essential for accurate performance evaluations. Multi-core systems pose new challenges for performance analysis and developers of operating systems. For example, an optimal scheduling policy for multi-server systems, such as shortest remaining processing time (SRPT) for single-server systems, is not yet known in queueing theory. In this paper, we present a detailed experimental evaluation of general purpose operating system (GPOS) schedulers in symmetric multiprocessing (SMP) environments. In particular, we are interested in the influence of multiprocessor load balancing on software performance. Additionally, the evaluation includes effects of GPOS schedulers that can also occur in single-processor environments, such as I/O-boundedness of tasks and different prioritisation strategies. The results presented in this paper provide the basis for the future development of more accurate performance models of today's software systems.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116196183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large Block CLOCK (LB-CLOCK): A write caching algorithm for solid state disks
Pub Date: 2009-12-28. DOI: 10.1109/MASCOT.2009.5366737
Biplob K. Debnath, S. Subramanya, D. Du, D. Lilja
Solid State Disks (SSDs) using NAND flash memory are increasingly being adopted in the high-end servers of datacenters to improve the performance of I/O-intensive applications. Compared to traditional enterprise-class hard disks, SSDs provide faster read performance, lower cooling cost, and higher power efficiency. However, the write performance of a flash-based SSD can be up to an order of magnitude slower than its read performance. Furthermore, frequent write operations degrade the lifetime of flash memory. A nonvolatile cache can greatly help to solve these problems. Although a RAM cache is relatively high in cost, it has successfully eliminated the performance gap between fast CPUs and slow magnetic disks. Similarly, a nonvolatile cache in an SSD can alleviate the disparity between the flash memory's read and write performance. A small write cache that reduces the number of flash block erase operations can lead to substantial performance gains for write-intensive applications and can extend the overall lifetime of flash-based SSDs. This paper presents a novel write caching algorithm, the Large Block CLOCK (LB-CLOCK) algorithm, which considers 'recency' and 'block space utilization' metrics to make cache management decisions. LB-CLOCK dynamically varies the priority between these two metrics to adapt to changes in workload characteristics. Our simulation-based experimental results show that LB-CLOCK outperforms the best known existing flash caching algorithms for a wide range of workloads.
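The paper's dynamic prioritisation between the two metrics is not reproduced here; the sketch below captures only the core victim-selection idea under our simplifications: a CLOCK-style sweep gives recently referenced blocks a second chance, and among the rest the fullest block is destaged, freeing the most cache pages per block write-back.

```python
# Simplified sketch of the LB-CLOCK idea (the dynamic priority between recency
# and utilisation is omitted): the cache is managed per flash block; a sweep
# clears the reference bit of recently used blocks, and among unreferenced
# blocks the one with the highest space utilisation is destaged.
class LBClockSketch:
    def __init__(self, capacity_pages, pages_per_block):
        self.capacity = capacity_pages
        self.ppb = pages_per_block
        self.blocks = {}            # block id -> {"ref": bool, "pages": set}
        self.size = 0

    def write(self, page):
        blk = page // self.ppb
        entry = self.blocks.get(blk)
        if entry is None or page not in entry["pages"]:
            while self.size >= self.capacity:
                self._evict()
            entry = self.blocks.setdefault(blk, {"ref": False, "pages": set()})
            entry["pages"].add(page)
            self.size += 1
        entry["ref"] = True         # recency bit, as in CLOCK

    def _evict(self):
        victim, best_util = None, -1.0
        for blk, e in self.blocks.items():
            if e["ref"]:
                e["ref"] = False    # second chance for recently used blocks
                continue
            util = len(e["pages"]) / self.ppb
            if util > best_util:    # prefer the fullest unreferenced block
                victim, best_util = blk, util
        if victim is None:          # everything was referenced this sweep
            victim = next(iter(self.blocks))
        self.size -= len(self.blocks.pop(victim)["pages"])  # destage to flash

cache = LBClockSketch(capacity_pages=8, pages_per_block=4)
for p in [0, 1, 2, 3, 8, 9, 16, 24, 25, 32, 33, 40, 41]:
    cache.write(p)
print(sorted(cache.blocks))   # [4, 6, 8, 10]: block 2 beat the emptier block 4
```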
{"title":"Large Block CLOCK (LB-CLOCK): A write caching algorithm for solid state disks","authors":"Biplob K. Debnath, S. Subramanya, D. Du, D. Lilja","doi":"10.1109/MASCOT.2009.5366737","DOIUrl":"https://doi.org/10.1109/MASCOT.2009.5366737","url":null,"abstract":"Solid State Disks (SSDs) using NAND flash memory are increasingly being adopted in the high-end servers of datacenters to improve performance of the I/O-intensive applications. Compared to the traditional enterprise class hard disks, SSDs provide faster read performance, lower cooling cost, and higher power efficiency. However, write performance of a flash based SSD can be up to an order of magnitude slower than its read performance. Furthermore, frequent write operations degrade the lifetime of flash memory. A nonvolatile cache can greatly help to solve these problems. Although a RAM cache is relative high in cost, it has successfully eliminated the performance gap between fast CPU and slow magnetic disk. Similarly, a nonvolatile cache in an SSD can alleviate the disparity between the flash memory's read and write performance. A small write cache that reduces the number of flash block erase operations, can lead to substantial performance gain for write-intensive applications and can extend the overall lifetime of flash based SSDs. This paper presents a novel write caching algorithm, the Large Block CLOCK (LB-CLOCK) algorithm, which considers ‘recency’ and ‘block space utilization’ metrics to make cache management decisions. LB-CLOCK dynamically varies the priority between these two metrics to adapt to changes in workload characteristics. Our simulation based experimental results show that LB-CLOCK outperforms the best known existing flash caching algorithms for a wide range of workloads.","PeriodicalId":275737,"journal":{"name":"2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122168409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}