Deep Web Data Source Classification Based on Query Interface Context

Zilu Cui, Yuchen Fu
Published in: 2012 Fourth International Conference on Computational and Information Sciences
Publication date: 2012-08-17
DOI: 10.1109/ICCIS.2012.117 (https://doi.org/10.1109/ICCIS.2012.117)
Citations: 5

Abstract

As the volume of information in the Deep Web grows, this paper presents a Deep Web data source classification algorithm based on query interface context. Two methods are combined to compute the similarity between search interfaces. The first is based on the vector space model and uses classical TF-IDF statistics to measure the similarity between search interfaces. The second computes the semantic similarity of two pages using HowNet. Building on the K-NN algorithm, a Web database (WDB) classification algorithm is presented. Experimental results show that the algorithm generates high-quality clusters, as measured by both entropy and F-measure, which indicates its practical value.
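
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the TF-IDF cosine similarity and a semantic similarity are blended with a weight `alpha`, and that K-NN then votes over the most similar labeled interfaces. The `CONCEPTS` dictionary is a hypothetical toy stand-in for HowNet, and all function names, the smoothing in the IDF term, the weighted vote, and the sample data are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for HowNet: maps a term to a coarse concept label.
CONCEPTS = {"author": "creator", "writer": "creator",
            "from": "origin", "departure": "origin"}

def tfidf_vectors(docs):
    """Return one TF-IDF weight dict per tokenized document (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def semantic_similarity(a, b):
    """Assumed proxy for HowNet similarity: Jaccard overlap of concept sets."""
    ca = {CONCEPTS.get(t, t) for t in a}
    cb = {CONCEPTS.get(t, t) for t in b}
    return len(ca & cb) / len(ca | cb) if ca | cb else 0.0

def knn_classify(query, labeled, k=3, alpha=0.5):
    """Classify a query interface by a similarity-weighted vote of its k neighbors."""
    docs = [tokens for tokens, _ in labeled] + [query]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scored = []
    for (tokens, label), vec in zip(labeled, vectors):
        sim = alpha * cosine(query_vec, vec) + (1 - alpha) * semantic_similarity(query, tokens)
        scored.append((sim, label))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for sim, label in scored[:k]:
        votes[label] += sim
    return max(votes, key=votes.get)

# Toy labeled query interfaces, each reduced to its form-field terms.
labeled = [
    (["title", "author", "isbn", "publisher"], "books"),
    (["author", "title", "keyword", "price"], "books"),
    (["departure", "destination", "date", "passengers"], "flights"),
    (["from", "to", "departure", "return", "class"], "flights"),
]
predicted = knn_classify(["writer", "title", "isbn"], labeled)
print(predicted)
```

In this toy setup the query shares no literal term with "author", so the concept lookup is what lets "writer" contribute to the match. A real system would replace `CONCEPTS` with actual HowNet similarity scores and tune `alpha` and `k` on held-out data.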