Combining offline and on-the-fly disambiguation to perform semantic-aware XML querying

Computer Science and Information Systems (IF 1.2; Zone 4, Computer Science; JCR Q4, Computer Science, Information Systems) Pub Date: 2023-01-01 DOI: 10.2298/csis220228063t
Joe Tekli, Gilbert Tekli, R. Chbeir
Citations: 0

Abstract

The IR community has devoted considerable effort to extending free-text query processing toward semi-structured XML search. Most methods rely on the concept of the Lowest Common Ancestor (LCA) of two or more structural nodes to identify the most specific XML elements containing the query keywords submitted by the user. Yet few existing approaches consider XML semantics, and those that do generally either rely on computationally expensive word sense disambiguation (WSD) techniques or, to reduce processing time, apply semantic analysis in one stage only: query relaxation/refinement over the bag-of-words retrieval model. In this paper, we describe a new approach to XML keyword search that aims to address these limitations. Our solution first transforms the XML document collection (offline) and the keyword query (on-the-fly) into meaningful semantic representations using context-based and global disambiguation methods, specially designed for almost linear computational efficiency. We use a semantic-aware inverted index to support semantic-aware search, result selection, and result ranking. The semantically augmented XML data tree is then processed for structural node clustering, based on semantic query concepts (i.e., key-concepts), in order to identify and rank candidate answer sub-trees containing related occurrences of query key-concepts. Dedicated weighting functions and various search algorithms have been developed for that purpose and are presented here. Experimental results highlight the quality and potential of our approach.
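To make the two building blocks named in the abstract more concrete, here is a minimal Python sketch of LCA-based keyword search over a concept-keyed inverted index. It is an illustration under stated assumptions, not the authors' algorithm: the `TOY_CONCEPTS` lexicon is a hypothetical stand-in for a real disambiguation step, the sample document and all function names are invented for the example, and result selection is reduced to keeping the smallest (most specific) LCAs with no weighting or ranking.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import product

# Hypothetical toy lexicon: surface word -> concept id. It stands in for a
# real word sense disambiguation step and is NOT the paper's method.
TOY_CONCEPTS = {
    "movie": "c:film", "film": "c:film", "picture": "c:film",
    "star": "c:actor", "actor": "c:actor",
    "director": "c:director",
}

def build_concept_index(root):
    """Offline step: map each concept id to the Dewey-style paths
    (tuples of child positions) of the nodes that mention it."""
    index = defaultdict(list)

    def visit(node, path):
        words = [node.tag] + (node.text or "").split()
        for word in words:
            concept = TOY_CONCEPTS.get(word.lower())
            if concept is not None and path not in index[concept]:
                index[concept].append(path)
        for position, child in enumerate(node):
            visit(child, path + (position,))

    visit(root, ())
    return index

def lca(paths):
    """The LCA of several Dewey paths is their longest common prefix."""
    prefix = []
    for parts in zip(*paths):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)

def search(index, keywords):
    """Query step: map keywords to concepts, then return the smallest LCAs
    that cover one occurrence of every query concept."""
    postings = []
    for keyword in keywords:
        concept = TOY_CONCEPTS.get(keyword.lower())
        if concept is None or not index.get(concept):
            return []  # a keyword with no known concept or no match
        postings.append(index[concept])
    candidates = {lca(combo) for combo in product(*postings)}
    # Drop any candidate that is a proper ancestor of another candidate.
    smallest = [p for p in candidates
                if not any(q != p and q[:len(p)] == p for q in candidates)]
    return sorted(smallest, key=lambda p: (-len(p), p))

DOC = """
<films>
  <picture><title>Rear Window</title><director>Hitchcock</director><star>Stewart</star></picture>
  <picture><title>Vertigo</title><director>Hitchcock</director><star>Novak</star></picture>
</films>
""".strip()

root = ET.fromstring(DOC)
index = build_concept_index(root)
results = search(index, ["movie", "actor"])
print(results)  # [(0,), (1,)]: the two <picture> elements
```

The query words "movie" and "actor" never occur in the sample document; they match only because the index is keyed by concept rather than by surface word, which is the intuition behind a semantic-aware index. Enumerating every combination of postings is exponential in the number of keywords, so this form is suitable only for illustration.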
Source journal
Computer Science and Information Systems
Categories: Computer Science, Information Systems; Computer Science, Software Engineering
CiteScore: 2.30
Self-citation rate: 21.40%
Articles published: 76
Review time: 7.5 months
About the journal: Computer Science and Information Systems (ComSIS) is an international refereed journal, published in Serbia. The objective of ComSIS is to communicate important research and development results in the areas of computer science, software engineering, and information systems.