Towards Reproducible Evaluation of Large-Scale Distributed Systems

Proceedings of the 2018 Workshop on Advanced Tools, Programming Languages, and PLatforms for Implementing and Evaluating Algorithms for Distributed systems. Pub Date: 2018-07-23. DOI: 10.1145/3231104.3231113

M. Matos


Citations: 5

Abstract

The difficulty of reproducing experimental results is now seen as one of the greatest impediments to the progress of science in general and of distributed systems research in particular. It stems from the increasing complexity of the systems under study and from the inherent difficulty of capturing and controlling every variable that can affect experimental results. We argue that this can only be addressed with a systematic approach to all stages of the evaluation process. This raises the following challenges: i) precisely describing the environment and the variables affecting the experiment, ii) minimizing the number of uncontrollable variables affecting the experiment, and iii) being able to subject the system under evaluation to controlled fault patterns. In the following, we highlight the research directions we are currently pursuing to address these challenges. Our overarching goal is to build an open-source evaluation platform, Angainor, able to deploy an experiment, control the network topology, inject faults, monitor the whole experiment, and automatically derive summary statistics from the experimental data.
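
The three challenges can be illustrated with a minimal sketch. It is not Angainor's interface, which the abstract does not describe; the names `ExperimentSpec`, `fault_schedule`, and `summarize`, and all field names, are hypothetical. The sketch assumes that an experiment is described by an explicit record, that all randomness comes from a single recorded seed, and that faults are node crashes scheduled up front.

```python
import json
import random
import statistics
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentSpec:
    """Hypothetical experiment description (not Angainor's actual format)."""
    name: str
    nodes: int        # number of nodes deployed
    seed: int         # single source of randomness, recorded with the results
    duration_s: int   # length of the run in seconds
    crashes: int      # number of nodes to crash during the run


def fault_schedule(spec):
    """Derive a deterministic list of (time_s, node_id) crash events.

    The schedule depends only on the spec, so the same spec always
    yields the same fault pattern.
    """
    rng = random.Random(spec.seed)
    victims = rng.sample(range(spec.nodes), spec.crashes)
    return sorted((rng.randrange(spec.duration_s), node) for node in victims)


def summarize(samples):
    """Compute basic summary statistics over a list of measurements."""
    return {
        "n": len(samples),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
    }


spec = ExperimentSpec(name="crash-test", nodes=100, seed=42,
                      duration_s=600, crashes=10)

# Challenge i): the full description can be stored next to the results.
print(json.dumps(asdict(spec), sort_keys=True))

# Challenges ii) and iii): the fault pattern is fixed by the spec alone.
print(fault_schedule(spec) == fault_schedule(spec))

# Placeholder measurements, only to show the summary step.
print(summarize([1.0, 2.0, 3.0]))
```

Storing the serialized spec alongside the measurements lets a later run regenerate the same fault schedule from the seed, rather than relying on a prose description of what was injected.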

