Efficient parsing-based search over structured data

Aditya G. Parameswaran, R. Kaushik, A. Arasu
{"title":"Efficient parsing-based search over structured data","authors":"Aditya G. Parameswaran, R. Kaushik, A. Arasu","doi":"10.1145/2505515.2505764","DOIUrl":null,"url":null,"abstract":"Parsing-based search, i.e., parsing keyword search queries using grammars, is often used to override the traditional \"bag-of-words'\" semantics in web search and enterprise search scenarios. Compared to the \"bag-of-words\" semantics, the parsing-based semantics is richer and more customizable. While a formalism for parsing-based semantics for keyword search has been proposed in prior work and ad-hoc implementations exist, the problem of designing efficient algorithms to support the semantics is largely unstudied. In this paper, we present a suite of efficient algorithms and auxiliary indexes for this problem. Our algorithms work for a broad classes of grammars used in practice, and cover a variety of database matching functions (set- and substring-containment, approximate and exact equality) and scoring functions (to filter and rank different parses). We formally analyze the time complexity of our algorithms and provide an empirical evaluation over real-world data to show that our algorithms scale well with the size of the database and grammar.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"197 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2505515.2505764","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Parsing-based search, i.e., parsing keyword search queries using grammars, is often used to override the traditional "bag-of-words'" semantics in web search and enterprise search scenarios. Compared to the "bag-of-words" semantics, the parsing-based semantics is richer and more customizable. While a formalism for parsing-based semantics for keyword search has been proposed in prior work and ad-hoc implementations exist, the problem of designing efficient algorithms to support the semantics is largely unstudied. In this paper, we present a suite of efficient algorithms and auxiliary indexes for this problem. Our algorithms work for a broad classes of grammars used in practice, and cover a variety of database matching functions (set- and substring-containment, approximate and exact equality) and scoring functions (to filter and rank different parses). We formally analyze the time complexity of our algorithms and provide an empirical evaluation over real-world data to show that our algorithms scale well with the size of the database and grammar.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
高效的基于解析的结构化数据搜索
基于解析的搜索,即使用语法解析关键字搜索查询,通常用于在web搜索和企业搜索场景中覆盖传统的“词袋”语义。与“词袋”语义相比,基于解析的语义更丰富,更可定制。虽然在之前的工作中已经提出了一种基于解析的关键字搜索语义的形式,并且存在特定的实现,但设计有效的算法来支持语义的问题在很大程度上没有研究。在本文中,我们提出了一套有效的算法和辅助指标。我们的算法适用于实践中使用的各种语法,涵盖了各种数据库匹配函数(集合和子字符串包含,近似和精确相等)和评分函数(过滤和排序不同的解析)。我们正式分析了算法的时间复杂性,并提供了对现实世界数据的经验评估,以表明我们的算法可以很好地随数据库和语法的大小进行扩展。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Exploring XML data is as easy as using maps Mining-based compression approach of propositional formulae Flexible and dynamic compromises for effective recommendations Efficient parsing-based search over structured data Recommendation via user's personality and social contextual
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1