Tahir Diop, Steven Gurfinkel, J. Anderson, Natalie D. Enright Jerger
GPUs are used to speed up many scientific computations; however, to use several networked GPUs concurrently, the programmer must explicitly partition work and transmit data between devices. We propose DistCL, a novel framework that distributes the execution of OpenCL kernels across a GPU cluster. DistCL makes multiple distributed compute devices appear to be a single compute device. DistCL abstracts and manages many of the challenges associated with distributing a kernel across multiple devices, including: (1) partitioning work into smaller parts, (2) scheduling these parts across the network, (3) partitioning memory so that each part of memory is written to by at most one device, and (4) tracking and transferring these parts of memory. Converting an OpenCL application to DistCL is straightforward and requires little programmer effort, which makes it a powerful and valuable tool for exploring the distributed execution of OpenCL kernels. We compare DistCL to SnuCL, which also facilitates the distribution of OpenCL kernels. We also offer some insights: distributed execution favours compute-bound problems and large contiguous memory accesses. DistCL achieves a maximum speedup of 29.1 and an average speedup of 7.3 when distributing kernels among 32 peers over an InfiniBand cluster.
{"title":"DistCL: A Framework for the Distributed Execution of OpenCL Kernels","authors":"Tahir Diop, Steven Gurfinkel, J. Anderson, Natalie D. Enright Jerger","doi":"10.1109/MASCOTS.2013.77","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.77","url":null,"abstract":"GPUs are used to speed up many scientific computations, however, to use several networked GPUs concurrently, the programmer must explicitly partition work and transmit data between devices. We propose DistCL, a novel framework that distributes the execution of penCL kernels across a GPU cluster. DistCL makes multiple distributed compute devices appear to be a single compute device. DistCL abstracts and manages many of the challenges associated with distributing a kernel across multiple devices including: (1) partitioning work into smaller parts, (2) scheduling these parts across the network, (3) partitioning memory so that each part of memory is written to by at most one device, and (4) tracking and transferring these parts of memory. Converting an OpenCL application to DistCL is straightforward and requires little programmer effort. This makes it a powerful and valuable tool for exploring the distributed execution of OpenCL kernels. We compare DistCL to SnuCL, which also facilitates the distribution of OpenCL kernels. We also give some insights: distributed tasks favor more compute bound problems and favour large contiguous memory accesses. DistCL achieves a maximum speedup of 29.1 and average speedups of 7.3 when distributing kernels among 32 peers over an Infiniband cluster.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116025896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reliability is an important factor to consider when designing and deploying SSDs in storage systems. Both the endurance and the retention time of flash memory are affected by the history of low-level stress and recovery patterns in flash cells, which are determined by the workload characteristics, the time over which the workload utilizes the SSD, and the FTL algorithms. Accurately assessing SSD reliability requires simulating several years of workload behavior, which is time-consuming. This paper presents a methodology that uses snapshot-based sampling and clustering techniques to reduce the simulation time while maintaining high accuracy. The methodology leverages the key insight that most of the large changes in retention time occur early in the lifetime of the SSD, whereas most of the simulation time is spent in its later stages. This allows simulation acceleration to focus on the later stages without significant loss of accuracy. We show that our approach provides an average speed-up of 12X relative to detailed simulation, with an error of 3.21% in the estimated mean and 6.42% in the estimated standard deviation of the retention times of the blocks in the SSD.
{"title":"A Novel Simulation Methodology for Accelerating Reliability Assessment of SSDs","authors":"Luyao Jiang, S. Gurumurthi","doi":"10.1109/MASCOTS.2013.46","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.46","url":null,"abstract":"Reliability is an important factor to consider when designing and deploying SSDs in storage systems. Both the endurance and the retention time of flash memory are affected by the history of low-level stress and recovery patterns in flash cells, which are determined by the workload characteristics, the time during which the workload utilizes the SSD, and the FTL algorithms. Accurately assessing SSD reliability requires simulating several years' of workload behavior, which is time consuming. This paper presents a methodology that uses snapshot-based sampling and clustering techniques to help reduce the simulation time while maintaining high accuracy. The methodology leverages the key insight that most of the large changes in retention time occur early in the lifetime of the SSD, whereas most of the simulation time is spent in its later stages. This allows simulation acceleration to focus on the later stages without significant loss of accuracy. We show that our approach provides an average speed-up of 12X relative to detailed simulation with an error of 3.21% in the estimated mean and 6.42% in the estimated standard deviation of the retention times of the blocks in the SSD.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122413355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the steady-state distribution of networks of order-independent queues with negative signals that delete customers. An order-independent queue is defined by a service rate that does not depend on the order of the customers in the queue. Such an abstract discipline may be used to model complex blocking mechanisms (for instance, the Multiserver Station with Concurrent Classes of Customers). Order-independent queues are, in general, neither symmetric nor reversible. We prove that, under the usual assumptions on the arrivals, the services, and the routing of customers, such a network of queues with signals has a product-form steady-state distribution. The proof is based on the quasi-reversibility of the queues. We also present some example applications of this new analytical result.
{"title":"Networks of Order Independent Queues with Signals","authors":"Thu-Ha Dao-Thi, J. Fourneau, Minh-Anh Tran","doi":"10.1109/MASCOTS.2013.21","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.21","url":null,"abstract":"We study the steady-state distribution of networks of order independent queues with negative signals which delete customers. An Order Independent queue is defined by a service rate which is independent on the order of the customers in the queue. Such an abstract discipline may be used to model complex blocking mechanism (for instance the Multiserver Station with Concurrent Classes of Customers). Order independent queues are in general neither symmetric nor reversible. We prove that, under usual assumptions on the arrivals, the services and the routing of customers, such a network of queues with signals has a steady-state distribution with product form solution. The proof is based on the quasi-reversibility of the queues. We also present some examples of application for this new analytical result.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128520171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network virtualization can potentially overcome Internet ossification. This technology lets multiple virtual networks run on a shared physical infrastructure. A key step lies in mapping a virtual network request to a resource allocation in the network substrate. Previous approaches to this network embedding problem assumed that a request asks for specific resources, such as network capacity or computing power. However, the end user is more interested in performance. This paper therefore considers a different request format: a request asks for a certain quality of service (QoS). The infrastructure provider must then determine the resource allocation necessary for this QoS. In particular, the provider must take into account user reaction to perceived performance and adjust the allocation dynamically. To this end, we propose an estimation mechanism based on analyzing the interaction between user behavior and network performance. This approach can dynamically adjust resource estimates when QoS requirements change. Our simulation-based experiments demonstrate that the proposed approach can satisfy user performance requirements through appropriate resource estimation. Moreover, our approach can adjust resource estimates efficiently and accurately.
{"title":"Resource Estimation for Network Virtualization through Users and Network Interaction Analysis","authors":"Bo-Chun Wang, Y. Tay, L. Golubchik","doi":"10.1109/MASCOTS.2013.65","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.65","url":null,"abstract":"Network virtualization can potentially overcome Internet ossification. This technology lets multiple virtual networks run on a shared physical infrastructure. A key step lies in mapping a virtual network request to a resource allocation in the network substrate. Previous approaches to this network embedding problem assumed the request will ask for specific resources, such as network capacity or computing power. However, the end-user is more interested in performance. This paper therefore considers a different request format, namely a request will ask for a certain quality of service (QoS). The infrastructure provider must then determine the resource allocation necessary for this QoS. In particular, the provider must take into account user reaction to perceived performance and adjust the allocation dynamically. To this end, we propose an estimation mechanism that is based on analyzing the interaction between user behavior and network performance. This approach can dynamically adjust resource estimations when QoS requirements change. Our simulation-based experiments demonstrate that the proposed approach can satisfy user performance requirements through appropriate resource estimation. Moreover, our approach can adjust resource estimations efficiently and accurately.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123965515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fault-tolerant disk arrays rely on replication or erasure coding to reconstruct lost data after a disk failure. As disk capacity increases, so does the risk of encountering irrecoverable read errors that would prevent the full recovery of the lost data. We propose a three-dimensional erasure-coding technique that reduces this risk by guaranteeing full recovery in the presence of all triple and nearly all quadruple disk failures. Our solution performs better than existing approaches, such as sets of disk arrays that use Reed-Solomon codes to protect against triple failures within each individual array. Given its very high reliability, it is especially suited to the needs of very large data sets that must be preserved over long periods of time.
{"title":"Three-Dimensional Redundancy Codes for Archival Storage","authors":"Jehan-Francois Pâris, D. Long, W. Litwin","doi":"10.1109/MASCOTS.2013.45","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.45","url":null,"abstract":"Fault-tolerant disk arrays rely on replication or erasure-coding to reconstruct lost data after a disk failure. As disk capacity increases, so does the risk of encountering irrecoverable read errors that would prevent the full recovery of the lost data. We propose a three-dimensional erasure-coding technique that reduces that risk by guaranteeing full recovery in the presence of all triple and nearly all quadruple disk failures. Our solution performs better than existing solutions, such as sets of disk arrays using Reed-Solomon codes against triple failures in each individual array. Given its very high reliability, it is especially suited to the needs of very large data sets that must be preserved over long periods of time.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"146 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123315778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Among other statistical features, the analysis of fine-grained GPS traces from different outdoor scenarios has shown that human mobility statistically resembles Lévy walks, and this finding led to the design of the Self-similar Least-Action Walk (SLAW) mobility model. It was concluded that human mobility is scale-free and that this feature is invariant irrespective of any geographic constraints. These constraints were considered too scenario-specific and were omitted in SLAW. However, we argue that geographic constraints should not be considered an unnecessary detail, but rather an important feature of a realistic mobility model for the simulation-based performance evaluation of mobile networks. Therefore, we introduce geographic restrictions to SLAW in the form of maps. Our evaluation of the extended model (called MSLAW) shows that the introduced restrictions have a significant impact on several performance metrics relevant for opportunistic networks.
{"title":"Introducing Geographic Restrictions to the SLAW Human Mobility Model","authors":"Matthias Schwamborn, N. Aschenbruck","doi":"10.1109/MASCOTS.2013.34","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.34","url":null,"abstract":"Among other statistical features, the analysis of fine-grained GPS traces from different outdoor scenarios has shown that human mobility statistically resembles Lévy Walks and led to the design of the Self-similar Least-Action Walk (SLAW) mobility model. It was concluded that human mobility is scale-free and that this feature is invariant irrespective of any geographic constraints. These constraints were considered too scenario-specific and were omitted in SLAW. However, we argue that geographic constraints should not be considered as an unnecessary detail, but as an important feature of a realistic mobility model for the simulative performance evaluation of mobile networks. Therefore, we introduce geographic restrictions to SLAW in the form of maps. Our evaluation of the extended model (called MSLAW) shows that the introduced restrictions have a significant impact on several performance metrics relevant for opportunistic networks.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121779691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
I. Adams, M. Storer, Avani Wildani, E. L. Miller, B. A. Madden
A large body of work, such as system administration and intrusion detection, relies upon storage system logs and snapshots. These solutions rely on accurate system records; however, little effort has been made to verify the correctness of logging instrumentation and log reliability. We present a solution, called ExDiff, that uses expectation differencing to validate storage system logs. Our solution can identify development errors, such as the omission of a logging point, and runtime errors, such as log crashes. ExDiff uses metadata snapshots and activity logs to predict the expected state of the system and compares that with the system's actual state. Mismatches between the expected and actual metadata states can then be used to highlight gaps in log coverage, as well as to aid in identifying specific types of missing entries. We show that ExDiff provides valuable insight to system designers, administrators, and researchers by accurately identifying gaps in log coverage, providing clues useful in isolating specific types of missing log entries, and highlighting potential misunderstandings of logged actions.
{"title":"Validating Storage System Instrumentation","authors":"I. Adams, M. Storer, Avani Wildani, E. L. Miller, B. A. Madden","doi":"10.1109/MASCOTS.2013.73","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.73","url":null,"abstract":"There is a large body of work-such as system administration and intrusion detection-that relies upon storage system logs and snapshots. These solutions rely on accurate system records, however, little effort has been made to verify the correctness of logging instrumentation and log reliability. We present a solution, called ExDiff, that uses expectation differencing to validate storage system logs. Our solution can identify development errors such as the omission of a logging point and runtime errors such as log crashes. ExDiff uses metadata snapshots and activity logs to predict the expected state of the system and compares that with the system's actual state. Mismatches between the expected and actual metadata states can then be used to highlight gaps in log coverage, as well as aid in identifying specific types of missing entries. We show that ExDiff provides valuable insight to system designers, administrators and researchers by accurately identifying gaps in log coverage, providing clues useful in isolating specific types of missing log entries, and highlighting potential misunderstandings in logged action.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134550575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comprehensive analyses that aim to better understand the topology of real-world networks have been an important research challenge. Internet topology measurement studies provide samples of the underlying network at various levels. Although router-level Internet topology measurement systems target low-level Internet infrastructure, they primarily focus on Layer-3 connectivity and ignore the underlying multi-access links. In this paper, in addition to the thoroughly studied degree distribution, we analyze the subnet and interface distributions of major Internet topology datasets. We also investigate the impact of higher-granularity modeling at the link level versus router-level modeling. Our analysis establishes a foundation for Layer-2 Internet topology generation and introduces link-layer characteristics into network modeling.
{"title":"Impact of Multi-access Links on the Internet Topology Modeling","authors":"M. Akgun, M. H. Gunes","doi":"10.1109/MASCOTS.2013.60","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.60","url":null,"abstract":"Comprehensive analyses that aim to better understand the topology of real world networks have been an important research challenge. Internet topology measurement studies provide samples of the underlying network at various levels. Although router-level Internet topology measurement systems target low level Internet infrastructure, they primarily focus on the Layer-3 connectivity and ignore the underlying multi-access links. In this paper, in addition to the thoroughly studied degree distribution, we analyze the subnet and interface distributions of major Internet topology datasets. We also investigate the impact of the higher granularity modeling at link level versus router level modeling. Our analysis establishes a foundation for the Layer-2 Internet topology generation and introduces the link layer characteristics into the network modeling.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131029895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To efficiently manage resources and provide guaranteed services, today's computing systems monitor and collect a large number of resource usage statistics, for example the average and time series of CPU utilization. However, little is known about the analytical distribution of resource usage, which is a crucial input for inferring performance metrics defined in service level agreements (SLAs), such as response times and throughputs. In this paper, we aim to characterize the entire distribution of CPU utilization via stochastic reward models. In particular, we first study and derive the probability density function of the utilization of widely known and applied queueing systems, namely Poisson processes, Markov-modulated Poisson processes, and time-varying Poisson processes. Second, we apply the proposed analysis to characterize the CPU usage of live production systems and simulated queueing systems. Evaluation results show that analytical characterization of the selected queueing models captures the utilization distribution of a wide range of real-life systems well, and we argue for the robustness of our methodology in further inferring system performance metrics.
{"title":"Characterization Analysis of Resource Utilization Distribution","authors":"R. Birke, L. Chen, M. Gribaudo, P. Piazzolla","doi":"10.1109/MASCOTS.2013.54","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.54","url":null,"abstract":"To efficiently manage resources and provide guaranteed services, today's computing systems monitor and collect a large number of resource usages, for example the average and time series of CPU utilization. However, little is known about the analytical distribution of resource usages, which are the crucial parameters to infer performance metrics defined in service level agreements (SLAs), such as response times and throughputs. In this paper, we aim to characterize the entire distribution of CPU utilization via stochastic reward models. In particular, we first study and derive the probability density function of the utilization of widely known and applied queuing systems, namely Poisson processes, Markov modulated Poisson processes and time-varying Poisson processes. Secondly, we apply our proposed analysis on characterizing the CPU usage of live production systems, and simulated queuing systems. Evaluation results show that analytical characterization of the selected queueing models can capture the utilization distribution of a wide range of real-life systems well, and we argue the robustness of our methodology to further infer system performance metrics.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133091804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As businesses move their critical IT operations to multi-tenant cloud data centers, it is becoming increasingly important to provide network performance guarantees to individual tenants. Due to the impact of network congestion on the performance of many common cloud applications, recent work has focused on enabling network reservations for individual tenants. Current network reservation methods, however, do not degrade gracefully in the presence of the network oversubscription that may frequently occur in a cloud environment. In this context, for a shared data center network, we introduce the Network Satisfaction Ratio (NSR) as a measure of the satisfaction derived by a tenant from a given network reservation. NSR is defined as the ratio of the actual reserved bandwidth to the desired bandwidth of the tenant. Based on NSR, we present a novel network reservation mechanism that can admit time-varying tenant requests and can fairly distribute any degradation in NSR among the tenants in the presence of network oversubscription. We evaluate the proposed method using both a synthetic network traffic trace and a representative data center traffic trace generated by running a reduced data center job trace on a small test bed. The evaluation shows that our method adapts to changes in network reservations and provides significant and fair improvement in NSR when the data center network is oversubscribed.
{"title":"Managing Network Reservation for Tenants in Oversubscribed Clouds","authors":"Mayank Mishra, P. Dutta, Praveen Kumar, V. Mann","doi":"10.1109/MASCOTS.2013.13","DOIUrl":"https://doi.org/10.1109/MASCOTS.2013.13","url":null,"abstract":"As businesses move their critical IT operations to multi-tenant cloud data centers, it is becoming increasingly important to provide network performance guarantees to individual tenants. Due to the impact of network congestion on the performance of many common cloud applications, recent work has focused on enabling network reservation for individual tenants. Current network reservation methods, however, do not gracefully degrade in the presence of network over subscriptions that may frequently occur in a cloud environment. In this context, for a shared data center network, we introduce Network Satisfaction Ratio (NSR) as a measure of the satisfaction derived by a tenant from a given network reservation. NSR is defined as the ratio of the actual reserved bandwidth to the desired bandwidth of the tenant. Based on NSR, we present a novel network reservation mechanism that can admit time-varying tenant requests and can fairly distribute any degradation in the NSR among the tenants in presence of network over subscription. We evaluate the proposed method using both synthetic network traffic trace and representative data center traffic trace generated by running a reduced data center job trace in a small test bed. The evaluation shows that our method adapts to changes in network reservations, and it provides significant and fair improvement in NSR when the data center network is oversubscribed.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129744149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}