A New Approach for Reactive Web Usage Data Processing

Murat Ali Bayir, I. H. Toroslu, A. Cosar
Published in: 22nd International Conference on Data Engineering Workshops (ICDEW'06), 2006-04-03
DOI: 10.1109/ICDEW.2006.13 (https://doi.org/10.1109/ICDEW.2006.13)
Citations: 20

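Abstract

Web usage mining applies data mining techniques to discover valuable information in the navigation behavior of World Wide Web (WWW) users. The required information is captured by web servers and stored in web usage logs. The first phase of web usage mining is data processing: relevant information is first filtered from the logs, and sessions are then reconstructed using heuristics that select and group the requests belonging to the same user session. Processing requests after the web server has handled them is called "reactive"; in "proactive" techniques, the same (pre)processing occurs while the user is interactively browsing the web site. Reactive session reconstruction uses "time"- and "navigation"-oriented heuristics. We propose combining these heuristics with "site topology" information to increase the accuracy of the reconstructed sessions. In this work, we implemented an agent simulator that models the behavior of web users and generates both the user navigation and the log data kept by the web server. In this way, we know the actual user sessions and can accurately evaluate and compare the performance of alternative session reconstruction heuristics, which use only the web server log data. To the best of our knowledge, this paper is the first work to use such an agent simulator and is therefore able to accurately evaluate different session reconstruction heuristics. Using the agent simulator, we attempt to show that our new heuristic discovers more accurate sessions than previous heuristics.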
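The abstract does not spell out the heuristics, so the following is only a minimal sketch of how a time-oriented split and a topology check might be combined. It is not the authors' algorithm. The record layout `(user_id, timestamp_in_seconds, page)`, the 600-second gap threshold, the `links` adjacency map, and both function names are assumptions made for illustration.

```python
from collections import defaultdict

# Assumed log record: (user_id, timestamp_in_seconds, page).
# Assumed site topology: page -> set of pages it links to.


def split_by_time(requests, max_gap=600):
    """Time-oriented heuristic (illustrative): start a new session when the
    gap between two consecutive requests of a user exceeds max_gap seconds."""
    by_user = defaultdict(list)
    for user, ts, page in requests:
        by_user[user].append((ts, page))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > max_gap:
                sessions.append((user, [p for _, p in current]))
                current = []
            current.append(hit)
        sessions.append((user, [p for _, p in current]))
    return sessions


def split_by_topology(sessions, links):
    """Topology check (an assumption of this sketch): cut a session when the
    next page is not linked from any page already in the current session."""
    refined = []
    for user, pages in sessions:
        current = [pages[0]]
        for page in pages[1:]:
            reachable = any(page in links.get(p, set()) for p in current)
            if not reachable:
                refined.append((user, current))
                current = []
            current.append(page)
        refined.append((user, current))
    return refined


# Toy data, invented purely to exercise the two functions.
log = [
    ("u1", 0, "A"), ("u1", 30, "B"), ("u1", 70, "D"),
    ("u1", 2000, "A"), ("u1", 2040, "C"),
    ("u2", 10, "A"), ("u2", 50, "C"),
]
links = {"A": {"B", "C"}, "B": {"C"}, "C": {"D"}}

time_sessions = split_by_time(log)
topology_sessions = split_by_topology(time_sessions, links)
```

In this toy log, the time rule alone keeps page D in the same session as A and B, while the topology check separates it because neither A nor B links to D. Whether such a cut matches the real user session is exactly what a simulator with known sessions would be needed to verify.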