{"title":"DieCast:用精确的比例模型测试分布式系统","authors":"Diwaker Gupta, K. Vishwanath, Amin Vahdat","doi":"10.1145/1963559.1963560","DOIUrl":null,"url":null,"abstract":"Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors remains a difficult problem. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior at such scales.\n Testing large services should ideally be done at the same scale and configuration as the target deployment, which can be technically and economically infeasible. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines across a much smaller number of physical machines in a test harness. We show how to accurately scale CPU, network, and disk to provide the illusion that each VM matches a machine in the original service in terms of both available computing resources and communication behavior. We present the architecture and evaluation of a system we built to support such experimentation and discuss its limitations. We show that for a variety of services---including a commercial high-performance cluster-based file system---and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"1 1","pages":"4:1-4:48"},"PeriodicalIF":2.0000,"publicationDate":"2008-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"127","resultStr":"{\"title\":\"DieCast: Testing Distributed Systems with an Accurate Scale Model\",\"authors\":\"Diwaker Gupta, K. Vishwanath, Amin Vahdat\",\"doi\":\"10.1145/1963559.1963560\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors remains a difficult problem. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior at such scales.\\n Testing large services should ideally be done at the same scale and configuration as the target deployment, which can be technically and economically infeasible. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines across a much smaller number of physical machines in a test harness. We show how to accurately scale CPU, network, and disk to provide the illusion that each VM matches a machine in the original service in terms of both available computing resources and communication behavior. We present the architecture and evaluation of a system we built to support such experimentation and discuss its limitations. We show that for a variety of services---including a commercial high-performance cluster-based file system---and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.\",\"PeriodicalId\":50918,\"journal\":{\"name\":\"ACM Transactions on Computer Systems\",\"volume\":\"1 1\",\"pages\":\"4:1-4:48\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2008-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"127\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Computer Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/1963559.1963560\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Computer Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/1963559.1963560","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
DieCast: Testing Distributed Systems with an Accurate Scale Model
Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors remains a difficult problem. Existing testing techniques, such as simulation or running smaller instances of a service, have limitations in predicting overall service behavior at such scales.
Testing large services should ideally be done at the same scale and configuration as the target deployment, which can be technically and economically infeasible. We present DieCast, an approach to scaling network services in which we multiplex all of the nodes in a given service configuration as virtual machines across a much smaller number of physical machines in a test harness. We show how to accurately scale CPU, network, and disk to provide the illusion that each VM matches a machine in the original service in terms of both available computing resources and communication behavior. We present the architecture and evaluation of a system we built to support such experimentation and discuss its limitations. We show that for a variety of services---including a commercial high-performance cluster-based file system---and resource utilization levels, DieCast matches the behavior of the original service while using a fraction of the physical resources.
期刊介绍:
ACM Transactions on Computer Systems (TOCS) presents research and development results on the design, implementation, analysis, evaluation, and use of computer systems and systems software. The term "computer systems" is interpreted broadly and includes operating systems, systems architecture and hardware, distributed systems, optimizing compilers, and the interaction between systems and computer networks. Articles appearing in TOCS will tend either to present new techniques and concepts, or to report on experiences and experiments with actual systems. Insights useful to system designers, builders, and users will be emphasized.
TOCS publishes research and technical papers, both short and long. It includes technical correspondence to permit commentary on technical topics and on previously published papers.