A Forest-structured Bloom Filter with flash memory

Guanlin Lu, Biplob K. Debnath, D. Du
Published in: 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST), 2011-05-23
DOI: 10.1109/MSST.2011.5937232 (https://doi.org/10.1109/MSST.2011.5937232)
Citations: 26

Abstract

A Bloom Filter (BF) is a probabilistic data structure that compactly represents a set of elements (keys). It is widely used to identify efficiently whether a key has been seen before while using a minimal amount of space, and it is heavily used in chunking-based data de-duplication. Traditionally, a BF is implemented as an in-RAM data structure, so its size is limited by the RAM available on the machine. For applications such as data de-duplication that require a BF larger than the available RAM, it becomes necessary to store the BF on a secondary storage device. Because BF operations are inherently random, and magnetic disks perform poorly on random reads and writes, a magnetic disk is not a good fit for storing a large BF. The flash memory based Solid State Drive (SSD) is an emerging storage device that offers superior performance and can potentially replace disks as the preferred secondary storage device. However, several special characteristics of flash memory make designing a flash memory based BF very challenging. In this paper, our goal is to design an efficient flash memory based BF that is fully aware of these physical characteristics. To this end, we propose a Forest-structured BF design (FBF). FBF combines RAM and flash memory: the BF is stored on flash, while RAM helps to mitigate the impact of the slow write performance of flash memory. In addition, the in-flash BF is organized in a forest-like structure to improve lookup performance. Our experimental results show that the FBF design achieves twice the processing speed with 50% fewer flash write operations than existing flash memory based BF designs.
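
For readers unfamiliar with the underlying structure, the following is a minimal sketch of a conventional in-RAM Bloom filter, the baseline the abstract starts from. It is not the FBF design, whose forest layout and RAM buffering are not specified in the abstract. The class name `BloomFilter`, its parameters, and the use of SHA-256 with double hashing are assumptions made for illustration only.

```python
import hashlib


class BloomFilter:
    """Plain in-RAM Bloom filter (illustrative baseline, not the FBF design)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions from two halves of one SHA-256
        # digest (double hashing). The hash choice is an assumption of
        # this sketch, not something taken from the paper.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        # Set every bit position selected for this key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


bf = BloomFilter(num_bits=1 << 16, num_hashes=4)
bf.add(b"chunk-fingerprint-1")
print(b"chunk-fingerprint-1" in bf)  # True: an inserted key is always found
```

Each `add` or lookup touches several bit positions scattered across the whole bit array. That scattered access is the random I/O pattern the abstract refers to once the array no longer fits in RAM and must live on secondary storage.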