Bloom filter and its variants for the optimization of MapReduce's algorithms: A review

2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech). Pub Date: 2020-11-24. DOI: 10.1109/CloudTech49835.2020.9365876

F. Ezzaki, N. Abghour, A. Elomri, K. Moussaid, M. Rida


Citations: 0

Abstract


Bloom filter and its variants for the optimization of MapReduce’s algorithms: A review

A Bloom filter is a probabilistic data structure that tests whether an element belongs to a set: given any item, it answers a membership query for that candidate. Its simplicity and efficiency make it useful for data representation in many fields, and it is known as a space- and time-efficient randomized data structure because it filters out redundant data and reduces memory consumption. However, standard Bloom filters are limited to membership tests and do not support deleting elements. Because they are probabilistic, they also have a nonzero false-positive probability: a false positive occurs when the filter reports an element as a member of a set to which it does not belong. Our goal is to compare a number of well-established algorithms related to the Bloom filter, in preparation for future work on optimizing join algorithms in MapReduce. This paper gives an overview of the different Bloom filter variants and analyzes the studies that have addressed this area of research.
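The behavior described in the abstract (membership tests with no false negatives, occasional false positives, and no deletion) can be illustrated with a minimal Python sketch. It is not taken from the paper: the class name `BloomFilter`, the SHA-256 double-hashing scheme, and the parameters `m=10_000` and `k=7` are assumptions chosen only for illustration. The helper `expected_false_positive_rate` uses the standard textbook approximation (1 - e^(-kn/m))^k.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # bit array packed into bytes

    def _positions(self, item: str) -> list[int]:
        # Double hashing (an assumption of this sketch): derive k bit indexes
        # from two 64-bit slices of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        # Bits are only ever set, never cleared, which is why a plain
        # Bloom filter cannot support deletion.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def expected_false_positive_rate(m: int, k: int, n: int) -> float:
    """Standard approximation of the false-positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k


bf = BloomFilter(m=10_000, k=7)
members = [f"key-{i}" for i in range(1_000)]
for key in members:
    bf.add(key)

# Every inserted key is reported as present: no false negatives.
assert all(key in bf for key in members)

# Keys that were never inserted may still be reported as present.
probes = [f"other-{i}" for i in range(10_000)]
observed = sum(p in bf for p in probes) / len(probes)
print(f"observed false-positive rate: {observed:.4f}")
print(f"expected false-positive rate: {expected_false_positive_rate(10_000, 7, 1_000):.4f}")
```

In a MapReduce join, a filter of this kind would typically be built over the join keys of the smaller input and consulted on the map side to discard non-matching records from the larger input before the shuffle; that usage is a common pattern and an assumption here, not a detail given in the abstract.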
