{"title":"Web Behavior Analysis Using Sparse Non-Negative Matrix Factorization","authors":"Akihiro Demachi, Shin Matsushima, K. Yamanishi","doi":"10.1109/DSAA.2016.85","DOIUrl":null,"url":null,"abstract":"We are concerned with the issue of discovering behavioral patterns on the web. When a large amount of web access logs are given, we are interested in how they are categorized and how they are related to activities in real life. In order to conduct that analysis, we develop a novel algorithm for sparse non-negative matrix factorization (SNMF), which can discover patterns of web behaviors. Although there exist a number of variants of SNMFs, our algorithm is novel in that it updates parameters in a multiplicative way with performance guaranteed, thereby works more robustly than existing ones, even when the rank of factorized matrices is large. We demonstrate the effectiveness of our algorithm using artificial data sets. We then apply our algorithm into a large scale web log data obtained from 70,000 monitors to discover meaningful relations among web behavioral patterns and real life activities. We employ the information-theoretic measure to demonstrate that our algorithm is able to extract more significant relations among web behavior patterns and real life activities than competitive methods.","PeriodicalId":193885,"journal":{"name":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSAA.2016.85","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
We are concerned with the issue of discovering behavioral patterns on the web. When a large amount of web access logs are given, we are interested in how they are categorized and how they are related to activities in real life. In order to conduct that analysis, we develop a novel algorithm for sparse non-negative matrix factorization (SNMF), which can discover patterns of web behaviors. Although there exist a number of variants of SNMFs, our algorithm is novel in that it updates parameters in a multiplicative way with performance guaranteed, thereby works more robustly than existing ones, even when the rank of factorized matrices is large. We demonstrate the effectiveness of our algorithm using artificial data sets. We then apply our algorithm into a large scale web log data obtained from 70,000 monitors to discover meaningful relations among web behavioral patterns and real life activities. We employ the information-theoretic measure to demonstrate that our algorithm is able to extract more significant relations among web behavior patterns and real life activities than competitive methods.