Experimentation Pitfalls to Avoid in A/B Testing for Online Personalization
Maria Esteller-Cucala, Vicenc Fernandez, Diego Villuendas
DOI: 10.1145/3314183.3323853 · Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization · June 6, 2019
Online controlled experiments (also called A/B tests, bucket tests, or randomized experiments) have become a habitual practice in numerous companies for measuring the impact of new features and changes deployed to software products. In theory, these experiments are one of the simplest methods for evaluating the potential effects that new features have on user behavior. In practice, however, there are many pitfalls that can obscure the interpretation of results or lead to invalid conclusions. There is no shortage of prior work on online controlled experiments addressing these pitfalls and misinterpretations of conclusions, but the topic has not been tackled for the specific case of testing personalization features. In this paper, we present some of the experimentation pitfalls that are particularly important for personalization features. To better illustrate each pitfall, we combine theoretical argumentation with examples from real company experiments. While there is clear value in evaluating personalized features by means of online controlled experiments, there are pitfalls to bear in mind while testing. With this paper, we aim to increase experimenters' awareness of these pitfalls, leading to improved quality and reliability of results.
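To make the setup concrete, the sketch below illustrates the kind of online controlled experiment the abstract describes: users are deterministically bucketed into variants, and the difference in a binary conversion metric is assessed with a two-proportion z-test. This is not code from the paper; it is a minimal, self-contained illustration, and all function names, user identifiers, and conversion numbers are hypothetical.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, n_variants: int = 2) -> int:
    """Deterministically bucket a user into a variant by hashing.

    Hashing (experiment, user_id) keeps a user's assignment stable
    across sessions, a common requirement in bucket testing.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_variants


def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided two-proportion z-test on conversion counts.

    Returns (z statistic, p-value). Appropriate for large samples
    with binary outcomes such as click or purchase conversion.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Illustrative numbers only: 5,000 users per arm with observed conversions.
z, p = two_proportion_ztest(conv_a=410, n_a=5000, conv_b=465, n_b=5000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Even a setup this simple is subject to the pitfalls the paper discusses: if the personalization feature itself changes who converts, or assignment is not independent of the personalization logic, the naive comparison above can mislead.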