Mei Bai, Yuxue Han, Peng Yin, Xite Wang, Guanyu Li, Bo Ning, Qian Ma
{"title":"S_IDS:在不完整数据流上高效的skyline查询算法","authors":"Mei Bai, Yuxue Han, Peng Yin, Xite Wang, Guanyu Li, Bo Ning, Qian Ma","doi":"10.1016/j.datak.2023.102258","DOIUrl":null,"url":null,"abstract":"<div><p>The efficient processing of mass stream data has attracted wide attention in the database field. The skyline query on the sensor data stream can monitor multiple targets in real time, to avoid abnormal events such as fire and explosion, which is very useful in the practical application of sensor data monitoring. However, real-world stream data may often contain incomplete data attributes due to faulty sensing devices or imperfect data collection techniques. Skyline queries over incomplete data streams may lead to a lack of transitivity and loop domination issues. To solve the problem of the skyline query over incomplete data streams, firstly, this paper uses differential dependency rule (DD) to fill the missing attribute values of data in the incomplete data stream. Then, the hierarchical grid index (HGrid) is introduced into the field of skyline query to improve pruning efficiency. In the process of skyline calculation, this paper only keeps as few calculation results as possible for the data that may affect the result to avoid a large number of repeated calculations. Thus, S_IDS (Skyline query algorithm over Incomplete Data Stream) is proposed to query skyline results with high confidence from the incomplete data stream. Finally, by comparing with the most advanced skyline query algorithms over incomplete data streams, the correctness and efficiency of the proposed S_IDS algorithm are proved.</p></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"149 ","pages":"Article 102258"},"PeriodicalIF":2.7000,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"S_IDS: An efficient skyline query algorithm over incomplete data streams\",\"authors\":\"Mei Bai, Yuxue Han, Peng Yin, Xite Wang, Guanyu Li, Bo Ning, Qian Ma\",\"doi\":\"10.1016/j.datak.2023.102258\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The efficient processing of mass stream data has attracted wide attention in the database field. The skyline query on the sensor data stream can monitor multiple targets in real time, to avoid abnormal events such as fire and explosion, which is very useful in the practical application of sensor data monitoring. However, real-world stream data may often contain incomplete data attributes due to faulty sensing devices or imperfect data collection techniques. Skyline queries over incomplete data streams may lead to a lack of transitivity and loop domination issues. To solve the problem of the skyline query over incomplete data streams, firstly, this paper uses differential dependency rule (DD) to fill the missing attribute values of data in the incomplete data stream. Then, the hierarchical grid index (HGrid) is introduced into the field of skyline query to improve pruning efficiency. In the process of skyline calculation, this paper only keeps as few calculation results as possible for the data that may affect the result to avoid a large number of repeated calculations. Thus, S_IDS (Skyline query algorithm over Incomplete Data Stream) is proposed to query skyline results with high confidence from the incomplete data stream. Finally, by comparing with the most advanced skyline query algorithms over incomplete data streams, the correctness and efficiency of the proposed S_IDS algorithm are proved.</p></div>\",\"PeriodicalId\":55184,\"journal\":{\"name\":\"Data & Knowledge Engineering\",\"volume\":\"149 \",\"pages\":\"Article 102258\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2023-11-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data & Knowledge Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0169023X23001180\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data & Knowledge Engineering","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169023X23001180","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
摘要
海量流数据的高效处理已经引起了数据库领域的广泛关注。对传感器数据流进行天际线查询,可以实时监控多个目标,避免火灾、爆炸等异常事件的发生,在传感器数据监控的实际应用中非常有用。然而,现实世界的流数据可能经常包含不完整的数据属性,这是由于故障的传感设备或不完善的数据收集技术。对不完整数据流的Skyline查询可能导致传递性不足和循环支配问题。为了解决不完整数据流上的天际线查询问题,首先利用差分依赖规则(DD)填充不完整数据流中缺失的数据属性值;然后,将层次网格索引(HGrid)引入天际线查询领域,以提高剪枝效率。在天际线计算过程中,对于可能影响结果的数据,本文只保留尽可能少的计算结果,以避免大量的重复计算。为此,提出S_IDS (Skyline query algorithm over Incomplete Data Stream)算法,从不完整数据流中以高置信度查询Skyline结果。最后,通过与目前最先进的不完全数据流天际线查询算法的比较,验证了S_IDS算法的正确性和有效性。
S_IDS: An efficient skyline query algorithm over incomplete data streams
The efficient processing of mass stream data has attracted wide attention in the database field. The skyline query on the sensor data stream can monitor multiple targets in real time, to avoid abnormal events such as fire and explosion, which is very useful in the practical application of sensor data monitoring. However, real-world stream data may often contain incomplete data attributes due to faulty sensing devices or imperfect data collection techniques. Skyline queries over incomplete data streams may lead to a lack of transitivity and loop domination issues. To solve the problem of the skyline query over incomplete data streams, firstly, this paper uses differential dependency rule (DD) to fill the missing attribute values of data in the incomplete data stream. Then, the hierarchical grid index (HGrid) is introduced into the field of skyline query to improve pruning efficiency. In the process of skyline calculation, this paper only keeps as few calculation results as possible for the data that may affect the result to avoid a large number of repeated calculations. Thus, S_IDS (Skyline query algorithm over Incomplete Data Stream) is proposed to query skyline results with high confidence from the incomplete data stream. Finally, by comparing with the most advanced skyline query algorithms over incomplete data streams, the correctness and efficiency of the proposed S_IDS algorithm are proved.
期刊介绍:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.