Taxonomy-based adaptive Web search method

Said Mirza Pahlevi, H. Kitagawa
Proceedings. International Conference on Information Technology: Coding and Computing, 2002-04-08
DOI: 10.1109/ITCC.2002.1000409 (https://doi.org/10.1109/ITCC.2002.1000409)
Citations: 11

Abstract

Crawler-based search engines usually return long result lists that contain many irrelevant (noise) documents. Taxonomy-based search engines improve result quality by indexing the documents they collect under topic paths in a taxonomy, but their searches are limited to their own locally compiled databases. We propose an adaptive Web search method that improves result quality while still letting users search the many databases available on the Web. The method combines a taxonomy-based search engine with a machine learning technique. Specifically, we construct a rule-based classifier from the pre-classified documents that a taxonomy-based search engine provides for a context category selected in its taxonomy, and then use the classifier to modify the user query. The modified query is sent to crawler-based search engines, and the returned results are presented to the user. We evaluate the method by showing that the results returned for the modified query consist almost entirely of documents that would be categorized into the selected context category.
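The pipeline described in the abstract (learn rules from pre-classified documents, then use them to modify the query) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say which rule learner or query syntax they use. The sketch assumes single-term rules of the form "term present implies document is in the context category", selected by a simple precision threshold, and a hypothetical boolean AND/OR query syntax. All function names and parameters are illustrative.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase alphabetic terms in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))


def learn_rule_terms(positive_docs, negative_docs, max_terms=3, min_precision=0.8):
    """Select terms whose presence predicts the context category.

    positive_docs: documents pre-classified into the selected category.
    negative_docs: documents pre-classified into other categories.
    Assumption: each rule is a single term; a real rule learner would
    typically build conjunctions of conditions.
    """
    # Document frequency of each term in each class.
    pos_df = Counter(t for d in positive_docs for t in tokenize(d))
    neg_df = Counter(t for d in negative_docs for t in tokenize(d))

    rules = []
    for term, pos in pos_df.items():
        neg = neg_df.get(term, 0)
        precision = pos / (pos + neg)
        if precision >= min_precision:
            rules.append((pos, precision, term))

    # Prefer terms that cover more positive documents, then higher
    # precision, then alphabetical order for a deterministic result.
    rules.sort(key=lambda r: (-r[0], -r[1], r[2]))
    return [term for _, _, term in rules[:max_terms]]


def modify_query(user_query, rule_terms):
    """Restrict the user query with the learned terms.

    Assumption: the target search engine accepts AND/OR operators and
    parentheses; real engines differ in query syntax.
    """
    if not rule_terms:
        return user_query
    return f"({user_query}) AND ({' OR '.join(rule_terms)})"


if __name__ == "__main__":
    # Toy pre-classified documents for a context category about animals.
    animals = [
        "jaguar habitat in rainforest",
        "jaguar is a big cat",
        "big cat habitat conservation",
    ]
    others = [
        "jaguar car dealership prices",
        "used jaguar car for sale",
    ]
    terms = learn_rule_terms(animals, others)
    print(modify_query("jaguar", terms))
```

In this toy setup the ambiguous term "jaguar" appears in both classes and is rejected as a rule, while terms that occur only in the selected category are kept and appended to the query. The modified query would then be sent to a crawler-based engine in place of the original one.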