C. H. Bryan Liu, Ângelo Cardoso, Paul Couturier, Emma J. McCoy
Datasets for Online Controlled Experiments
arXiv - CS - General Literature, published 2021-11-19. DOI: arxiv-2111.10198 (https://doi.org/arxiv-2111.10198)
Online Controlled Experiments (OCE) are the gold standard for measuring impact
and guiding decisions for digital products and services. Despite many
methodological advances in this area, the scarcity of public datasets and the
lack of a systematic review and categorization hinder its development. We
present the first survey and taxonomy for OCE datasets, which highlight the
lack of a public dataset to support the design and running of experiments with
adaptive stopping, an increasingly popular approach for quickly deploying
improvements or rolling back degrading changes. We release the first such
dataset, containing daily checkpoints of decision metrics from multiple, real
experiments run on a global e-commerce platform. The dataset design is guided
by a broader discussion on data requirements for common statistical tests used
in digital experimentation. We demonstrate how to use the dataset in the
adaptive stopping scenario using sequential and Bayesian hypothesis tests and
learn the relevant parameters for each approach.
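
To make the adaptive stopping scenario concrete, here is a minimal sketch of one sequential test that can be evaluated at each daily checkpoint: a mixture sequential probability ratio test with a normal mixing prior on the treatment effect. It is an illustration under stated assumptions, not necessarily the procedure used in the paper. The field names, the prior variance tau_sq, and the example numbers are all hypothetical and do not come from the released dataset. The sketch also assumes a large-sample normal approximation and plugs in sample variances as if they were known.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Checkpoint:
    """Cumulative summary statistics at one checkpoint (hypothetical field names)."""
    n_c: int        # samples in control so far
    mean_c: float   # control sample mean of the decision metric
    var_c: float    # control sample variance
    n_t: int        # samples in treatment so far
    mean_t: float   # treatment sample mean
    var_t: float    # treatment sample variance


def msprt_statistic(cp: Checkpoint, tau_sq: float) -> float:
    """Mixture likelihood ratio for H0: no difference in means.

    Under H0 the observed difference is approximately N(0, v); under the
    mixture alternative (effect ~ N(0, tau_sq)) it is N(0, v + tau_sq).
    The statistic is the ratio of those two densities at the observed difference.
    """
    diff = cp.mean_t - cp.mean_c
    v = cp.var_c / cp.n_c + cp.var_t / cp.n_t  # variance of the difference
    scale = math.sqrt(v / (v + tau_sq))
    return scale * math.exp(diff ** 2 * tau_sq / (2.0 * v * (v + tau_sq)))


def first_stopping_day(
    checkpoints: Sequence[Checkpoint], tau_sq: float, alpha: float = 0.05
) -> Optional[int]:
    """Return the first 1-based checkpoint where the statistic reaches 1/alpha."""
    for day, cp in enumerate(checkpoints, start=1):
        if msprt_statistic(cp, tau_sq) >= 1.0 / alpha:
            return day
    return None  # never crossed the threshold: do not reject H0


# Made-up checkpoints purely for illustration (not real experiment data).
example = [
    Checkpoint(n_c=100, mean_c=1.00, var_c=1.0, n_t=100, mean_t=1.05, var_t=1.0),
    Checkpoint(n_c=1000, mean_c=1.00, var_c=1.0, n_t=1000, mean_t=1.20, var_t=1.0),
]
stop_day = first_stopping_day(example, tau_sq=0.01)
```

The prior variance tau_sq is the kind of parameter that would be learned from historical experiments rather than fixed by hand; the value 0.01 above is an arbitrary placeholder.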