Title: User behavior analysis of automobile websites based on distributed computing and sequential pattern mining
Authors: Yuanying Peng, K. Yu
Published in: 2016 IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC), 2016-09-01
DOI: 10.1109/ICNIDC.2016.7974540
URL: https://doi.org/10.1109/ICNIDC.2016.7974540
Citations: 3
Abstract
Internet user behavior is becoming increasingly complex as applications diversify. Analyzing user behavior on specific kinds of websites, such as e-commerce, education, and healthcare, is important for personalized recommendation and targeted advertising. In this paper, using large-scale traffic flow data from a real network together with data crawled from websites, we focus on analyzing user browsing behavior on automobile websites. First, we design and implement data pre-processing and statistical analysis on the MapReduce framework, mainly to transform the flow data into a sequential dataset. By improving the regular expression matching method used in distributed computing, the running time is reduced from O(N) to O(1). Second, we apply the sequential pattern mining algorithm AprioriAll to the sequential dataset. The results reflect users' preferences when they browse automobile websites to find the information they want.
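The two stages described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the record layout `(user_id, timestamp, page_type)`, the function names, and the sample page types are all assumptions made for illustration, the grouping step only mimics the map and reduce phases in plain Python, and the mining step is a simplified Apriori-style level-wise search over single-item sequences rather than the full AprioriAll algorithm.

```python
from collections import defaultdict


def build_sequences(records):
    """Turn flow records into one time-ordered page sequence per user.

    Each record is a hypothetical (user_id, timestamp, page_type) tuple.
    """
    by_user = defaultdict(list)
    # "Map" phase: key every record by its user.
    for user_id, timestamp, page_type in records:
        by_user[user_id].append((timestamp, page_type))
    # "Reduce" phase: sort each user's visits by time, keep the page types.
    return {
        user: [page for _, page in sorted(visits)]
        for user, visits in by_user.items()
    }


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps are allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def frequent_sequences(sequences, min_support):
    """Level-wise search for sequential patterns with support >= min_support.

    Support is the fraction of user sequences that contain the pattern.
    """
    total = len(sequences)

    def support(pattern):
        return sum(is_subsequence(pattern, s) for s in sequences) / total

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: every distinct page type is a candidate pattern.
    level = keep_frequent({(page,) for s in sequences for page in s})
    result = {}
    while level:
        result.update(level)
        # Join step: a and b combine when a without its first item equals
        # b without its last item. This covers every possible longer
        # frequent pattern, because all subsequences of a frequent
        # pattern are themselves frequent.
        candidates = {
            a + (b[-1],) for a in level for b in level if a[1:] == b[:-1]
        }
        level = keep_frequent(candidates)
    return result


if __name__ == "__main__":
    # Hypothetical flow records, deliberately out of time order.
    records = [
        ("u1", 3, "price"), ("u1", 1, "home"), ("u1", 2, "model"),
        ("u2", 1, "home"), ("u2", 2, "price"),
        ("u3", 1, "model"), ("u3", 2, "price"),
    ]
    sequences = list(build_sequences(records).values())
    for pattern, sup in sorted(frequent_sequences(sequences, 0.5).items()):
        print(" -> ".join(pattern), round(sup, 2))
```

A pattern such as `home -> price` found this way would indicate that at least the chosen fraction of users visited a price page at some point after a home page, which is the kind of browsing preference the abstract refers to.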