{"title":"Towards Reproducible Evaluation of Large-Scale Distributed Systems","authors":"M. Matos","doi":"10.1145/3231104.3231113","DOIUrl":null,"url":null,"abstract":"Reproducing experimental results is nowadays seen as one of the greatest impairments for the progress of science in general and distributed systems in particular. This stems from the increasing complexity of the systems under study and the inherent complexity of capturing and controlling all variables that can potentially affect experimental results. We argue that this can only be addressed with a systematic approach to all the stages of the evaluation process. This raises the following challenges: i) precisely describe the environment and variables affecting the experiment, ii) minimize the number of (uncontrollable) variables affecting the experiment and iii) have the ability to subject the system under evaluation to controlled fault patterns. In the following, we highlight the research directions we are currently pursuing to address these goals. 
Our overarching goal is to build an open-source evaluation platform, Angainor, able to deploy an experiment, control the network topology, inject faults, monitor the whole experiment and automatically derive summary statistics of the experimental data.","PeriodicalId":164914,"journal":{"name":"Proceedings of the 2018 Workshop on Advanced Tools, Programming Languages, and PLatforms for Implementing and Evaluating Algorithms for Distributed systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2018 Workshop on Advanced Tools, Programming Languages, and PLatforms for Implementing and Evaluating Algorithms for Distributed systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3231104.3231113","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
The difficulty of reproducing experimental results is nowadays seen as one of the greatest impediments to the progress of science in general and of distributed systems research in particular. This stems from the increasing complexity of the systems under study and the inherent difficulty of capturing and controlling all the variables that can potentially affect experimental results. We argue that this can only be addressed with a systematic approach to all stages of the evaluation process. This raises the following challenges: i) precisely describing the environment and variables affecting the experiment, ii) minimizing the number of (uncontrollable) variables affecting the experiment, and iii) being able to subject the system under evaluation to controlled fault patterns. In the following, we highlight the research directions we are currently pursuing to address these challenges. Our overarching goal is to build an open-source evaluation platform, Angainor, able to deploy an experiment, control the network topology, inject faults, monitor the whole experiment, and automatically derive summary statistics from the experimental data.
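To make the three challenges concrete, the following is a minimal sketch, not Angainor's actual interface, which the abstract does not describe. It assumes an experiment can be captured in a single description (challenge i), that all pseudo-random choices can be tied to one recorded seed (challenge ii), and that a crash pattern can be derived from that seed (challenge iii). Every name here (`ExperimentSpec`, `fault_schedule`, `summarize`, and all field names) is hypothetical and chosen only for illustration.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSpec:
    """Hypothetical experiment description; not Angainor's actual format."""
    name: str
    nodes: int         # number of nodes deployed
    latency_ms: dict   # (src, dst) -> one-way link latency; part of the topology
    seed: int          # fixes every pseudo-random choice made for the run
    crash_count: int   # how many distinct nodes the fault pattern crashes
    duration_s: int    # length of the run in seconds


def fault_schedule(spec):
    """Derive (time_s, node) crash events from the spec alone.

    A private Random instance seeded from the spec means a rerun with the
    same spec sees exactly the same faults.
    """
    rng = random.Random(spec.seed)
    victims = rng.sample(range(spec.nodes), spec.crash_count)
    return sorted((rng.randrange(spec.duration_s), node) for node in victims)


def summarize(samples):
    """Summary statistics for one measured metric (assumed numeric samples)."""
    return {
        "n": len(samples),
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


# Illustrative values only; they are not taken from the paper.
spec = ExperimentSpec(
    name="gossip-under-churn",
    nodes=10,
    latency_ms={(0, 1): 20, (1, 2): 35},
    seed=42,
    crash_count=3,
    duration_s=60,
)
schedule = fault_schedule(spec)
```

Because the schedule depends only on the recorded description, publishing the description alongside the results would let another group replay the same fault pattern. A real platform would additionally have to control the deployment and the network, which this sketch does not attempt.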