Estimating the size and evolution of categorised topics in web directories

I. Anagnostopoulos, C. Anagnostopoulos
Web Intell. Agent Syst. DOI: 10.3233/WIA-2010-0179 (https://doi.org/10.3233/WIA-2010-0179)
Citations: 1

Abstract

This paper proposes a statistical approach for estimating how populations of categorized web pages in web directories evolve. The approach is based on the capture-recapture method used in wildlife biology, adapted with the assumptions and amendments needed to conduct experiments on the web. In these experiments, web pages are likened to animals, and specific categories of web pages are likened to particular species whose abundance, birth rates, and survival rates are estimated. The capture-recapture model used treats the populations under study as open: over time the population evolves as new web pages enter the study while others are removed or become inactive, resembling the natural processes of migration or death. Artificial intelligence classifiers capable of categorizing web pages play the role of the biologists who recognize the species under study. Four simulations, based on four real classification cases, were conducted to evaluate the robustness of the model in the web setting. The paper provides the implementation details of the proposed web-based capture-recapture model, along with its initial assessment.
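To make the open-population idea concrete, the following is a minimal sketch of how abundance, survival, and births could be estimated from repeated samples of pages that a classifier assigned to one category. The abstract does not name the estimator the authors use; the sketch assumes the classical Jolly-Seber point estimates for open populations. The function name `jolly_seber`, the page identifiers, and the assumption that every sampled page stays "marked and released" (so the number released equals the number caught) are illustrative choices, not details taken from the paper.

```python
from typing import Dict, List, Optional, Set

Estimate = Dict[str, Optional[float]]


def jolly_seber(samples: List[Set[str]]) -> List[Estimate]:
    """Classical Jolly-Seber point estimates (an assumed estimator, not
    necessarily the paper's). samples[i] is the set of page IDs that were
    "captured" (classified into the category) on sampling occasion i."""
    k = len(samples)
    stats = []
    for i, caught in enumerate(samples):
        before = set().union(*samples[:i])      # pages seen on earlier occasions
        after = set().union(*samples[i + 1:])   # pages seen on later occasions
        n = len(caught)                         # caught at occasion i
        m = len(caught & before)                # of those, already marked
        r = len(caught & after)                 # released at i and seen again
        z = len((before & after) - caught)      # marked, missed at i, seen later
        stats.append((n, m, r, z))

    marked: List[Optional[float]] = []
    abundance: List[Optional[float]] = []
    for n, m, r, z in stats:
        # M_i = m_i + R_i * z_i / r_i, with R_i = n_i (all pages released)
        m_hat = m + n * z / r if r > 0 else None
        # N_i = n_i * M_i / m_i
        n_hat = n * m_hat / m if (m_hat is not None and m > 0) else None
        marked.append(m_hat)
        abundance.append(n_hat)

    results: List[Estimate] = []
    for i, (n, m, r, z) in enumerate(stats):
        phi = None
        births = None
        if i + 1 < k and marked[i] is not None and marked[i + 1] is not None:
            denom = marked[i] - m + n
            if denom > 0:
                phi = marked[i + 1] / denom     # survival from i to i + 1
        if phi is not None and abundance[i] is not None and abundance[i + 1] is not None:
            # B_i = N_{i+1} - phi_i * (N_i - n_i + R_i); with R_i = n_i this is N_i
            births = abundance[i + 1] - phi * abundance[i]
        results.append({"M": marked[i], "N": abundance[i], "phi": phi, "B": births})
    return results


# Hypothetical page IDs sampled on four occasions.
occasions = [
    {"a", "b", "c", "d"},
    {"a", "b", "e", "f"},
    {"a", "c", "e", "g"},
    {"b", "e", "g", "h"},
]
estimates = jolly_seber(occasions)
for i, est in enumerate(estimates):
    print(i, est)
```

Entries that cannot be estimated are returned as `None`: abundance on the first occasion (no page is marked yet) and all quantities that depend on recaptures after the last occasion. In this analogy, a low survival estimate corresponds to pages being removed or becoming inactive, and a positive births estimate corresponds to new pages entering the category.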