A Framework for Identifying Skylines over Incomplete Data

2014 3rd International Conference on Advanced Computer Science Applications and Technologies. Pub Date: 2014-12-29. DOI: 10.1109/ACSAT.2014.21

A. Alwan, H. Ibrahim, N. Udzir


Citations: 6

Abstract

Skyline queries provide a flexible query operator that returns the data items (skylines) that are not dominated by any other data item over the dimensions (attributes) of the database. Most existing skyline techniques determine the skylines by assuming that the value of every dimension is available for every data item, that is, that the data are complete. This assumption does not always hold, particularly for multidimensional databases, where some values may be missing. Incomplete data causes the dominance relation to lose its transitivity property and causes dominance tests to fail, because some data items become incomparable to each other. Incompleteness also makes finding skylines more costly, since it leads to exhaustive pairwise comparisons between the data items. This paper proposes a framework for processing skyline queries over incomplete data that aims to avoid the issue of cyclic dominance when deriving skylines. The framework consists of four components: Data Clustering Builder, Group Constructor and Local Skylines Identifier, k-dom Skyline Generator, and Incomplete Skylines Identifier. Together, these components optimize the identification of skylines in an incomplete database: they reduce the number of pairwise comparisons required by eliminating dominated data items as early as possible, before the skyline technique is applied.
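The loss of transitivity described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's framework: it assumes that smaller values are preferred and that two items are compared only on the dimensions where both have a value, which is a common convention for incomplete data but is not spelled out in the abstract. The names `dominates` and `naive_skyline` and the sample items are hypothetical.

```python
from typing import List, Optional, Sequence

# A data item is a sequence of values; None marks a missing value (assumption).
Item = Sequence[Optional[float]]


def dominates(a: Item, b: Item) -> bool:
    """Return True if a dominates b on the dimensions both items have values for.

    Assumes smaller values are better. Items with no shared dimension
    are treated as incomparable, so neither dominates the other.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return False
    no_worse = all(x <= y for x, y in shared)
    strictly_better = any(x < y for x, y in shared)
    return no_worse and strictly_better


def naive_skyline(items: List[Item]) -> List[Item]:
    """Exhaustive pairwise skyline: keep every item that no other item dominates."""
    return [
        p
        for i, p in enumerate(items)
        if not any(dominates(q, p) for j, q in enumerate(items) if j != i)
    ]


# Complete data: the dominance relation is transitive and the skyline is well defined.
complete = [(1, 2), (2, 3), (3, 1)]
complete_skyline = naive_skyline(complete)

# Incomplete data: a dominates b, b dominates c, and c dominates a (a cycle).
a = (1, None, 3)
b = (2, 1, None)
c = (None, 2, 1)
cyclic_skyline = naive_skyline([a, b, c])
```

In the incomplete case, each pair shares exactly one dimension, so `a` dominates `b`, `b` dominates `c`, and `c` dominates `a`. A naive pairwise test therefore discards all three items, which is the cyclic dominance issue the framework is meant to avoid.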



