Hybrid optimization and ontology-based semantic model for efficient text-based information retrieval.

IF 2.7 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Journal of Supercomputing | Pub Date: 2023-01-01 | DOI: 10.1007/s11227-022-04708-9

Ram Kumar, S C Sharma

Journal of Supercomputing, vol. 79, no. 2, pp. 2251-2280. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9364863/pdf/

Citations: 9

Abstract

Query expansion is an important approach for improving the efficiency of data retrieval tasks. Researchers have proposed numerous methods that produce reasonable results; however, these methods do not give acceptable results for all kinds of queries, particularly phrase and individual queries. The main cause is the use of identical data sources and weighting strategies for expanding such terms, which leaves the model unable to capture the comprehensive relationship between the query terms. To tackle this issue, we developed a novel query expansion approach that analyzes different data sources, namely WordNet, Wikipedia, and the Text REtrieval Conference (TREC). This paper presents an Improved Aquila Optimization-based COOT (IAOCOOT) algorithm for query expansion, which retrieves the semantic aspects that match the query term. The semantic heterogeneity associated with document retrieval mainly affects relevance matching between the query and the document, chiefly because the similarity among words is not evaluated correctly. To overcome this problem, we use a Modified Needleman-Wunsch algorithm to deal with uncertainty and imprecision in the information retrieval process and with the semantic ambiguity of indexed terms, from both local and global perspectives. The k most similar words are determined and returned from a candidate set through top-k word selection, a technique widely used in different tasks. The proposed IAOCOOT model is evaluated with standard information retrieval performance metrics, and its validity is assessed by comparing it with other state-of-the-art techniques.
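As a rough illustration of the last two steps named in the abstract (string similarity scoring followed by top-k word selection), the sketch below uses the textbook Needleman-Wunsch global alignment score. It is not the authors' modified variant, whose details the abstract does not give. The function names `nw_score` and `top_k_similar`, the scoring values (match +1, mismatch -1, gap -1), and the sample words are all assumptions made for illustration.

```python
import heapq


def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Textbook Needleman-Wunsch global alignment score (assumed scoring values)."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Either align a[i-1] with b[j-1], or open a gap in one of the strings.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


def top_k_similar(query: str, candidates: list[str], k: int = 3) -> list[str]:
    """Return the k candidates with the highest alignment score against query."""
    return heapq.nlargest(k, candidates, key=lambda w: nw_score(query, w))


if __name__ == "__main__":
    # Hypothetical candidate set; a real system would draw it from an expansion source.
    print(top_k_similar("retrieval", ["retrieve", "retrieval", "search"], k=2))
```

Character-level alignment only captures surface similarity between words; a semantic variant would need a different scoring scheme, which this sketch does not attempt.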




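Source journal: Journal of Supercomputing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 6.30

Self-citation rate: 12.10%

Articles published: 734

Review time: 13 months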

About the journal: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.