Click fraud prediction by stacking algorithm
N. Sahllal, E. M. Souidi
Intelligenza Artificiale, vol. 17, no. 1, pp. 131-141. Published 2023-06-07. Journal Article.
DOI: 10.3233/IA-221069 (https://doi.org/10.3233/IA-221069)
Impact factor: 1.9. JCR: Q3 (Computer Science, Artificial Intelligence).
Citations: 0
Abstract
Click fraud is a form of deception in which traffic figures for online ads are intentionally inflated. Businesses that advertise online may encounter click fraud often, resulting in erroneous click statistics and lost funds. As a result, many businesses are hesitant to advertise their products on websites and mobile apps. To market their products safely, businesses need a reliable technique for detecting click fraud. In this paper we present a stacking algorithm as a solution to this problem. The premise of the proposed method is to combine multiple learners to achieve a better result than any single learner. The Synthetic Minority Oversampling Technique (SMOTE), combined with undersampling, is chosen to handle the unbalanced dataset. The first level consists of four supervised Machine Learning algorithms: AdaBoost, Random Forest, Decision Tree and Logistic Regression. Logistic Regression is used again as the second-level learner. To verify the efficacy of the suggested approach, comparative tests are carried out on the public dataset available on Kaggle from TalkingData, China’s largest independent big data service platform. Multiple indicators, namely Accuracy, F1 Score, ROC curve, Log Loss and AUC Score, are used to analyze the prediction outcomes. The findings reveal that the stacking method improves prediction accuracy while also maintaining a high level of stability.
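The evaluation indicators named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold and the six-sample toy data are assumptions made only for illustration, and AUC is computed through its rank interpretation (the probability that a randomly chosen fraudulent click scores higher than a randomly chosen legitimate one) rather than by integrating the ROC curve.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fraudulent click, 0 = legitimate click.
y_true = [0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]       # predicted fraud probabilities
y_pred = [int(p >= 0.5) for p in y_prob]       # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))   # 5 of 6 correct
print("F1      :", f1_score(y_true, y_pred))   # 0.8
print("log loss:", log_loss(y_true, y_prob))
print("AUC     :", roc_auc(y_true, y_prob))    # 0.875
```

On a heavily unbalanced dataset, a classifier that always predicts the majority class can still reach high accuracy, which is one reason to read accuracy alongside F1, Log Loss and AUC rather than on its own.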