A VoD System for Massively Scaled, Heterogeneous Environments: Design and Implementation

2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems. Pub date: 2013-08-14. DOI: 10.1109/MASCOTS.2013.8

Kangwook Lee, Lisa Yan, Abhay K. Parekh, K. Ramchandran


Citations: 8

Abstract

We propose, analyze and implement a general architecture for massively parallel VoD content distribution. We allow for devices that have a wide range of reliability, storage and bandwidth constraints. Each device can act as a cache for other devices and can also communicate with a central server. Some devices may be dedicated caches with no co-located users. Our goal is to let each user device stream any movie from a large catalog while minimizing the load on the central server. First, we architect and formulate a static optimization problem that accounts for various network bandwidth and storage capacity constraints, as well as the maximum number of network connections for each device. Not surprisingly, this formulation is NP-hard. We then use a Markov approximation technique in a primal-dual framework to devise a highly distributed algorithm that is provably close to optimal. Next, we test the practical effectiveness of the distributed algorithm in several ways. Using a packet-level simulation of the system, we demonstrate remarkable robustness to system scale, changes in demand, user churn, network failures and node failures. Finally, we describe our results from numerous experiments on a full implementation of the system with 60 caches and 120 users on 20 Amazon EC2 instances. In addition to corroborating our analytical and simulation-based findings, the implementation allows us to examine various system-level tradeoffs. Examples include: (i) the split between server-to-cache and cache-to-device traffic, (ii) the tradeoff between cache update intervals and the time taken for the system to adjust to changes in demand, and (iii) the tradeoff between the rate of virtual topology updates and convergence. These insights give us the confidence to claim that a much larger system, on the scale of hundreds of thousands of highly heterogeneous nodes, would perform as well as our current implementation.
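To make the idea of a Markov approximation over cache placements more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not give the formulation, so the toy model (one storage and one upload limit per cache, a greedy stream assignment, no per-device connection limit, no primal-dual updates) and every name in it (`server_load`, `markov_approximation`, `beta`, `steps`) are assumptions made purely for illustration. The sampler is a generic Metropolis-style chain whose stationary distribution favors placements with low server load, which is the general flavor of Markov approximation.

```python
import math
import random


def server_load(placement, demand, upload_cap):
    """Streams the central server must carry for a given placement.

    Toy model (an assumption, not the paper's formulation):
    placement[c] is the set of movies stored on cache c, demand[m] is the
    number of concurrent streams requested for movie m, and upload_cap[c]
    is the number of streams cache c can serve. Streams are assigned
    greedily, which is simple but not necessarily optimal.
    """
    remaining = dict(demand)
    for c, movies in enumerate(placement):
        budget = upload_cap[c]
        for m in sorted(movies):
            served = min(budget, remaining[m])
            remaining[m] -= served
            budget -= served
    return sum(remaining.values())


def markov_approximation(demand, storage, upload_cap,
                         beta=2.0, steps=2000, seed=0):
    """Metropolis-style search over placements.

    Each step picks one cache and proposes swapping one stored movie for
    one it does not store. The proposal is symmetric, so accepting with
    probability min(1, exp(-beta * increase_in_load)) yields a chain whose
    stationary distribution is proportional to exp(-beta * load).
    """
    rng = random.Random(seed)
    movies = sorted(demand)
    # Random initial placement that respects each cache's storage limit.
    placement = [set(rng.sample(movies, storage[c]))
                 for c in range(len(storage))]
    load = server_load(placement, demand, upload_cap)
    best, best_load = [set(p) for p in placement], load

    for _ in range(steps):
        c = rng.randrange(len(placement))
        absent = [m for m in movies if m not in placement[c]]
        if not absent or not placement[c]:
            continue  # nothing to swap on this cache
        old = rng.choice(sorted(placement[c]))
        new = rng.choice(absent)
        placement[c].remove(old)
        placement[c].add(new)
        new_load = server_load(placement, demand, upload_cap)
        if rng.random() < math.exp(-beta * max(0, new_load - load)):
            load = new_load  # accept the move
            if load < best_load:
                best, best_load = [set(p) for p in placement], load
        else:
            placement[c].remove(new)  # reject: undo the swap
            placement[c].add(old)
    return best, best_load


if __name__ == "__main__":
    demand = {"a": 3, "b": 2, "c": 1}   # streams requested per movie
    storage = [1, 1]                    # movies each cache can hold
    upload_cap = [2, 2]                 # streams each cache can serve
    placement, load = markov_approximation(demand, storage, upload_cap)
    print(placement, load)
```

In this toy instance the caches can serve at most 4 of the 6 requested streams, so no placement can bring the server load below 2. A larger `beta` concentrates the chain more tightly on low-load placements at the cost of slower exploration; in a distributed setting, each cache would run its own local update instead of a central loop.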


