Reporting L Most Favorite Objects in Uncertain Databases with Probabilistic Reverse Top-k Queries

Guoqing Xiao, Kenli Li, Keqin Li
{"title":"Reporting L Most Favorite Objects in Uncertain Databases with Probabilistic Reverse Top-k Queries","authors":"Guoqing Xiao, Kenli Li, Keqin Li","doi":"10.1109/ICDMW.2015.47","DOIUrl":null,"url":null,"abstract":"Top-k queries are widely studied for identifying a ranked set of the k most interesting objects based on the individual user preference. Reverse top-k queries are proposed from the perspective of the product manufacturer, which are essential for manufacturers to assess the potential market and impacts of their products. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact. Due to the intrinsic differences between uncertain and certain data, these methods are designed only in certain databases and cannot be applied to uncertain case directly. Motivated by this, in this paper, we firstly model the probabilistic reverse top-k queries in the context of uncertain data. Moreover, we formulate the challenging problem of processing queries that report l most favorite objects to users, where impact factor of an object is defined as the cardinality of the probabilistic reverse top-k query result set. For speeding up the query, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the solution space of this problem. In addition, an upper bound of the potential users is estimated to reduce the cost of computing the probabilistic reverse top-k queries for the candidate objects. Furthermore, effective pruning heuristics are presented to further reduce the search space of query processing. Finally, efficient query algorithms are presented seamlessly with integration of the proposed pruning strategies. Extensive experiments demonstrate the efficiency and effectiveness of our proposed algorithms with various experimental settings.","PeriodicalId":192888,"journal":{"name":"2015 IEEE International Conference on Data Mining Workshop (ICDMW)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Conference on Data Mining Workshop (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2015.47","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Top-k queries are widely studied for identifying a ranked set of the k most interesting objects based on the individual user preference. Reverse top-k queries are proposed from the perspective of the product manufacturer, which are essential for manufacturers to assess the potential market and impacts of their products. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact. Due to the intrinsic differences between uncertain and certain data, these methods are designed only in certain databases and cannot be applied to uncertain case directly. Motivated by this, in this paper, we firstly model the probabilistic reverse top-k queries in the context of uncertain data. Moreover, we formulate the challenging problem of processing queries that report l most favorite objects to users, where impact factor of an object is defined as the cardinality of the probabilistic reverse top-k query result set. For speeding up the query, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the solution space of this problem. In addition, an upper bound of the potential users is estimated to reduce the cost of computing the probabilistic reverse top-k queries for the candidate objects. Furthermore, effective pruning heuristics are presented to further reduce the search space of query processing. Finally, efficient query algorithms are presented seamlessly with integration of the proposed pruning strategies. Extensive experiments demonstrate the efficiency and effectiveness of our proposed algorithms with various experimental settings.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
用概率反向Top-k查询报告不确定数据库中L个最喜欢的对象
Top-k查询被广泛研究,用于根据个人用户偏好确定k个最有趣对象的排序集。从产品制造商的角度提出反向top-k查询,这对于制造商评估其产品的潜在市场和影响至关重要。然而,现有的反向top-k查询方法都是基于底层数据是精确的假设。由于不确定数据与确定数据的本质区别,这些方法仅针对特定数据库设计,不能直接应用于不确定情况。基于此,本文首先对不确定数据背景下的概率反向top-k查询进行建模。此外,我们还提出了一个具有挑战性的问题,即处理向用户报告l个最喜欢的对象的查询,其中对象的影响因子被定义为概率反向top-k查询结果集的基数。为了加快查询速度,我们利用了概率阈值top-k查询和概率天际线查询的一些特性来减小该问题的解空间。此外,还估计了潜在用户的上限,以减少计算候选对象的概率反向top-k查询的成本。在此基础上,提出了有效的剪枝启发式算法,进一步缩小查询处理的搜索空间。最后,结合所提出的修剪策略,无缝地提出了高效的查询算法。大量的实验证明了我们提出的算法在各种实验设置下的效率和有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Large-Scale Linear Support Vector Ordinal Regression Solver Joint Recovery and Representation Learning for Robust Correlation Estimation Based on Partially Observed Data Accurate Classification of Biological Data Using Ensembles Large-Scale Unusual Time Series Detection Sentiment Polarity Classification Using Structural Features
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1