Chowdhury Farhan Ahmed, S. Tanbeer, Byeong-Soo Jeong
{"title":"动态Web日志数据中高效Web访问序列的挖掘","authors":"Chowdhury Farhan Ahmed, S. Tanbeer, Byeong-Soo Jeong","doi":"10.1109/SNPD.2010.21","DOIUrl":null,"url":null,"abstract":"Mining web access sequences can discover very useful knowledge from web logs with broad applications. By considering non-binary occurrences of web pages as internal utilities in web access sequences, e.g., time spent by each user in a web page, more realistic information can be extracted. However, the existing utility-based approach has many limitations such as considering only forward references of web access sequences, not applicable for incremental mining, suffers in the level-wise candidate generation-and-test methodology, needs several database scans and does not show how to mine web traversal sequences with external utility, i.e., different impacts/significances for different web pages. In this paper, we propose a new approach to solve these problems. Moreover, we propose two novel tree structures, called UWAS-tree (utility-based web access sequence tree), and IUWAS-tree (incremental UWAS tree), for mining web access sequences in static and dynamic databases respectively. Our approach can handle both forward and backward references, static and dynamic data, avoids the level-wise candidate generation-and-test methodology, does not scan databases several times and considers both internal and external utilities of a web page. Extensive performance analyses show that our approach is very efficient for both static and incremental mining of high utility web access sequences.","PeriodicalId":266363,"journal":{"name":"2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2010-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"67","resultStr":"{\"title\":\"Mining High Utility Web Access Sequences in Dynamic Web Log Data\",\"authors\":\"Chowdhury Farhan Ahmed, S. Tanbeer, Byeong-Soo Jeong\",\"doi\":\"10.1109/SNPD.2010.21\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Mining web access sequences can discover very useful knowledge from web logs with broad applications. By considering non-binary occurrences of web pages as internal utilities in web access sequences, e.g., time spent by each user in a web page, more realistic information can be extracted. However, the existing utility-based approach has many limitations such as considering only forward references of web access sequences, not applicable for incremental mining, suffers in the level-wise candidate generation-and-test methodology, needs several database scans and does not show how to mine web traversal sequences with external utility, i.e., different impacts/significances for different web pages. In this paper, we propose a new approach to solve these problems. Moreover, we propose two novel tree structures, called UWAS-tree (utility-based web access sequence tree), and IUWAS-tree (incremental UWAS tree), for mining web access sequences in static and dynamic databases respectively. Our approach can handle both forward and backward references, static and dynamic data, avoids the level-wise candidate generation-and-test methodology, does not scan databases several times and considers both internal and external utilities of a web page. Extensive performance analyses show that our approach is very efficient for both static and incremental mining of high utility web access sequences.\",\"PeriodicalId\":266363,\"journal\":{\"name\":\"2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-06-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"67\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SNPD.2010.21\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SNPD.2010.21","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Mining High Utility Web Access Sequences in Dynamic Web Log Data
Mining web access sequences can discover very useful knowledge from web logs with broad applications. By considering non-binary occurrences of web pages as internal utilities in web access sequences, e.g., time spent by each user in a web page, more realistic information can be extracted. However, the existing utility-based approach has many limitations such as considering only forward references of web access sequences, not applicable for incremental mining, suffers in the level-wise candidate generation-and-test methodology, needs several database scans and does not show how to mine web traversal sequences with external utility, i.e., different impacts/significances for different web pages. In this paper, we propose a new approach to solve these problems. Moreover, we propose two novel tree structures, called UWAS-tree (utility-based web access sequence tree), and IUWAS-tree (incremental UWAS tree), for mining web access sequences in static and dynamic databases respectively. Our approach can handle both forward and backward references, static and dynamic data, avoids the level-wise candidate generation-and-test methodology, does not scan databases several times and considers both internal and external utilities of a web page. Extensive performance analyses show that our approach is very efficient for both static and incremental mining of high utility web access sequences.