Detecting Anomalies in Advertising Web Traffic with the Use of the Variational Autoencoder
Marcin Gabryel, Dawid Lada, Z. Filutowicz, Zofia Patora-Wysocka, Marek Kisiel-Dorohinicki, Guangxing Chen
Journal of Artificial Intelligence and Soft Computing Research, vol. 12, no. 1, pp. 255-256, October 2022
DOI: 10.2478/jaiscr-2022-0017 (https://doi.org/10.2478/jaiscr-2022-0017)
Citations: 2
Abstract
This paper presents a neural network model for identifying non-human traffic to a website, i.e., visits that differ significantly from those made by regular users. Such visits are undesirable from the website owner's point of view: they are not human activity and therefore bring no value, and they most often incur costs connected with the handling of advertising. They are usually generated by dishonest publishers who use special software (bots) to make a profit. Bots are also used for scraping, the automatic scanning and downloading of website content, which likewise runs against the interests of website authors. The proposed model is trained on data extracted directly from the web browser during website visits; this data is acquired by a specially prepared JavaScript that monitors the behavior of the user or bot. The appearance of a bot on a website produces parameter values that differ significantly from those collected during typical visits by human users. Since it is not possible to learn the details of the software controlling the bots, or to know all the data they can generate, this paper proposes a modified variational autoencoder (VAE) neural network model that detects abnormal parameter values deviating from the data obtained from human users' Internet traffic. The algorithm builds on a popular autoencoder method for anomaly detection, but a number of original improvements have been implemented. The study uses authentic data extracted from several large online stores.
About the journal:
Journal of Artificial Intelligence and Soft Computing Research (also available at Sciendo (De Gruyter)) is a dynamically developing international journal focused on the latest scientific results and methods in traditional artificial intelligence and soft computing. Its goal is to bring together scientists representing both approaches and various research communities.