F. Ezzaki, N. Abghour, A. Elomri, K. Moussaid, M. Rida
2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech), published 2020-11-24. DOI: 10.1109/CloudTech49835.2020.9365876 (https://doi.org/10.1109/CloudTech49835.2020.9365876)
Bloom filter and its variants for the optimization of MapReduce’s algorithms: A review
The Bloom filter is a probabilistic data structure used to test whether an element belongs to a set: for any given item, it answers a membership query on that candidate. Its appeal lies in its simplicity and efficiency. It is known as a space- and time-efficient randomized data structure that represents data compactly in many fields, supports membership queries, filters out redundant data, and reduces memory consumption. However, Bloom filters are limited to membership tests and do not support the deletion of elements. Because they are based on a probabilistic model, they also produce false positives: an element that does not belong to a set may be reported by the Bloom filter as a member of that set. Our goal is to compare a number of well-established algorithms related to the Bloom filter, with a view to future work on optimizing join algorithms in MapReduce. This paper provides an overview of the different variants of the Bloom filter and analyses the studies that have addressed this area of research.
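The behaviour described in the abstract can be illustrated with a minimal sketch of a standard Bloom filter. It is not taken from the paper: the class `BloomFilter`, the helper `filter_join_candidates`, the parameters `m` and `k`, and the use of salted SHA-256 as the hash family are all assumptions made for illustration. The helper hints at how such a filter might pre-filter the records of a large relation before a join, in the spirit of the MapReduce optimization the authors mention.

```python
import hashlib


class BloomFilter:
    """Hypothetical minimal Bloom filter: m bit slots probed by k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per slot, chosen for readability

    def _positions(self, item: str):
        # Derive k slot positions by salting SHA-256 with the hash index.
        # This is an illustrative choice, not the scheme of any surveyed variant.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present",
        # which is where false positives come from.
        return all(self.bits[pos] for pos in self._positions(item))


def filter_join_candidates(small_keys, large_records, m=1024, k=3):
    """Keep only records whose key (first field) may match the small relation."""
    bf = BloomFilter(m, k)
    for key in small_keys:
        bf.add(key)
    return [rec for rec in large_records if bf.might_contain(rec[0])]
```

An element that was added is always reported as possibly present, so no matching record is lost; a non-member may occasionally pass the test, which is the false positive case. The sketch has no `remove` method because clearing a slot could also erase evidence of other elements that share it, which is why the standard structure does not support deletion.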