Query Indexing and Cluster-based Indexing Model for the Document Retrieval

Sathish Vuyyala
Multimedia Research. DOI: 10.46253/j.mr.v4i4.a2 (https://doi.org/10.46253/j.mr.v4i4.a2)

Abstract

Query optimization plays an important role in retrieving appropriate documents on the basis of query indexing. During query retrieval, information is retrieved from the documents according to a measured distance. Although several methods exist for query processing and indexing, extracting matched and appropriate documents still poses numerous challenges for the research community. To retrieve appropriate documents efficiently, this paper adopts and implements a cluster-based inverted indexing model. Stop word removal and stemming remove unnecessary and redundant words. Document indexing is then carried out with a cluster-based inverted indexing approach that integrates Possibilistic fuzzy c-means (PFCM) clustering. User queries, such as multigram or semantic queries, are matched on the basis of the Bhattacharyya distance, and the enhanced query matching results are those with the least distance. Query optimization, built on interactive query optimization, uses the Pearson correlation coefficient to retrieve appropriate documents efficiently. The performance of the developed cluster-based inverted indexing approach is evaluated with precision, recall, and F-measure, and the approach obtains enhanced performance in terms of these measures.
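The abstract names its measures but not their formulas, so the following is only a minimal Python sketch of the standard definitions under stated assumptions, not the paper's implementation. Documents and queries are assumed to be represented as relative term-frequency distributions over a shared vocabulary. The stop word list, the toy documents, and all function names are hypothetical. Stemming and PFCM clustering are omitted. How the paper applies the Pearson coefficient during query optimization is not specified in the abstract, so `pearson` is shown only as a plain coefficient between two numeric vectors.

```python
import math
from collections import Counter

# Hypothetical toy stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def preprocess(text):
    """Lowercase, split on whitespace, and drop stop words (stemming omitted)."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def term_distribution(tokens, vocabulary):
    """Relative term frequencies over a fixed, ordered vocabulary."""
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values())
    return [counts[term] / total if total else 0.0 for term in vocabulary]


def bhattacharyya_distance(p, q):
    """D_B = -ln(sum_i sqrt(p_i * q_i)); infinite when p and q do not overlap."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient) if coefficient > 0 else math.inf


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_by_distance(query, documents):
    """Return document ids sorted by increasing Bhattacharyya distance to the query."""
    doc_tokens = {doc_id: preprocess(text) for doc_id, text in documents.items()}
    query_tokens = preprocess(query)
    vocabulary = sorted(set(query_tokens).union(*doc_tokens.values()))
    q = term_distribution(query_tokens, vocabulary)
    distances = {
        doc_id: bhattacharyya_distance(q, term_distribution(tokens, vocabulary))
        for doc_id, tokens in doc_tokens.items()
    }
    return sorted(distances, key=distances.get)


def precision_recall_f(retrieved, relevant):
    """Set-based precision, recall, and F-measure (harmonic mean of the two)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical toy collection, for illustration only.
    documents = {
        "d1": "cluster based inverted index for document retrieval",
        "d2": "malaria diagnosis using a convolution neural network",
        "d3": "query matching and document retrieval",
    }
    ranking = rank_by_distance("document retrieval index", documents)
    print(ranking)
    print(precision_recall_f(ranking[:2], ["d1"]))
```

In this sketch the least-distance document is ranked first, which mirrors the abstract's "least distance" criterion for query matching. Documents that share no term with the query receive an infinite distance and sort last.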