Kangwook Lee, Lisa Yan, Abhay K. Parekh, K. Ramchandran
{"title":"A VoD System for Massively Scaled, Heterogeneous Environments: Design and Implementation","authors":"Kangwook Lee, Lisa Yan, Abhay K. Parekh, K. Ramchandran","doi":"10.1109/MASCOTS.2013.8","DOIUrl":null,"url":null,"abstract":"We propose, analyze and implement a general architecture for massively parallel VoD content distribution. We allow for devices that have a wide range of reliability, storage and bandwidth constraints. Each device can act as a cache for other devices and can also communicate with a central server. Some devices may be dedicated caches with no co-located users. Our goal is to allow each user device to be able to stream any movie from a large catalog, while minimizing the load of the central server. First, we architect and formulate a static optimization problem that accounts for various network bandwidth and storage capacity constraints, as well as the maximum number of network connections for each device. Not surprisingly this formulation is NP-hard. We then use a Markov approximation technique in a primal-dual framework to devise a highly distributed algorithm which is provably close to the optimal. Next we test the practical effectiveness of the distributed algorithm in several ways. We demonstrate remarkable robustness to system scale and changes in demand, user churn, network failure and node failures via a packet level simulation of the system. Finally, we describe our results from numerous experiments on a full implementation of the system with 60 caches and 120 users on 20 Amazon EC2 instances. In addition to corroborating our analytical and simulation-based findings, the implementation allows us to examine various system-level tradeoffs. Examples of this include: (i) the split between server to cache and cache to device traffic, (ii) the tradeoff between cache update intervals and the time taken for the system to adjust to changes in demand, and (iii) the tradeoff between the rate of virtual topology updates and convergence. These insights give us the confidence to claim that a much larger system on the scale of hundreds of thousands of highly heterogeneous nodes would perform as well as our current implementation.","PeriodicalId":385538,"journal":{"name":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"348 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MASCOTS.2013.8","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
We propose, analyze and implement a general architecture for massively parallel VoD content distribution. We allow for devices that have a wide range of reliability, storage and bandwidth constraints. Each device can act as a cache for other devices and can also communicate with a central server. Some devices may be dedicated caches with no co-located users. Our goal is to allow each user device to be able to stream any movie from a large catalog, while minimizing the load of the central server. First, we architect and formulate a static optimization problem that accounts for various network bandwidth and storage capacity constraints, as well as the maximum number of network connections for each device. Not surprisingly this formulation is NP-hard. We then use a Markov approximation technique in a primal-dual framework to devise a highly distributed algorithm which is provably close to the optimal. Next we test the practical effectiveness of the distributed algorithm in several ways. We demonstrate remarkable robustness to system scale and changes in demand, user churn, network failure and node failures via a packet level simulation of the system. Finally, we describe our results from numerous experiments on a full implementation of the system with 60 caches and 120 users on 20 Amazon EC2 instances. In addition to corroborating our analytical and simulation-based findings, the implementation allows us to examine various system-level tradeoffs. Examples of this include: (i) the split between server to cache and cache to device traffic, (ii) the tradeoff between cache update intervals and the time taken for the system to adjust to changes in demand, and (iii) the tradeoff between the rate of virtual topology updates and convergence. These insights give us the confidence to claim that a much larger system on the scale of hundreds of thousands of highly heterogeneous nodes would perform as well as our current implementation.