使用Hadoop对大型可搜索数据集进行初始加密

Feng Wang, Mathias Kohler, A. Schaad
{"title":"使用Hadoop对大型可搜索数据集进行初始加密","authors":"Feng Wang, Mathias Kohler, A. Schaad","doi":"10.1145/2752952.2752960","DOIUrl":null,"url":null,"abstract":"With the introduction and the widely use of external hosted infrastructures, secure storage of sensitive data becomes more and more important. There are systems available to store and query encrypted data in a database, but not all applications may start with empty tables rather than having sets of legacy data. Hence, there is a need to transform existing plaintext databases to encrypted form. Usually existing enterprise databases may contain terabytes of data. A single machine would require many months for the initial encryption of a large data set. We propose encrypting data in parallel using a Hadoop cluster which is a simple five step process including the Hadoop set up, target preparation, source data import, encrypting the data, and finally exporting it to the target. We evaluated our solution on real world data and report on performance and data consumption. The results show that encrypting data in parallel can be done in a very scalable manner. Using a parallelized encryption cluster compared to a single server machine reduces the encryption time from months down to days or even hours.","PeriodicalId":305802,"journal":{"name":"Proceedings of the 20th ACM Symposium on Access Control Models and Technologies","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Initial Encryption of large Searchable Data Sets using Hadoop\",\"authors\":\"Feng Wang, Mathias Kohler, A. Schaad\",\"doi\":\"10.1145/2752952.2752960\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the introduction and the widely use of external hosted infrastructures, secure storage of sensitive data becomes more and more important. There are systems available to store and query encrypted data in a database, but not all applications may start with empty tables rather than having sets of legacy data. Hence, there is a need to transform existing plaintext databases to encrypted form. Usually existing enterprise databases may contain terabytes of data. A single machine would require many months for the initial encryption of a large data set. We propose encrypting data in parallel using a Hadoop cluster which is a simple five step process including the Hadoop set up, target preparation, source data import, encrypting the data, and finally exporting it to the target. We evaluated our solution on real world data and report on performance and data consumption. The results show that encrypting data in parallel can be done in a very scalable manner. Using a parallelized encryption cluster compared to a single server machine reduces the encryption time from months down to days or even hours.\",\"PeriodicalId\":305802,\"journal\":{\"name\":\"Proceedings of the 20th ACM Symposium on Access Control Models and Technologies\",\"volume\":\"78 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 20th ACM Symposium on Access Control Models and Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2752952.2752960\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM Symposium on Access Control Models and Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2752952.2752960","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

随着外部托管基础设施的引入和广泛使用,敏感数据的安全存储变得越来越重要。有一些系统可用于在数据库中存储和查询加密数据,但并非所有应用程序都可以从空表开始,而不是从遗留数据集开始。因此,需要将现有的明文数据库转换为加密形式。通常,现有的企业数据库可能包含数tb的数据。一台机器对一个大数据集进行初始加密需要好几个月的时间。我们建议使用Hadoop集群并行加密数据,这是一个简单的五步过程,包括Hadoop设置,目标准备,源数据导入,数据加密,最后导出到目标。我们根据真实世界的数据评估了我们的解决方案,并报告了性能和数据消耗情况。结果表明,并行数据加密可以以一种非常可扩展的方式完成。与使用单个服务器机器相比,使用并行加密集群可以将加密时间从几个月减少到几天甚至几个小时。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Initial Encryption of large Searchable Data Sets using Hadoop
With the introduction and the widely use of external hosted infrastructures, secure storage of sensitive data becomes more and more important. There are systems available to store and query encrypted data in a database, but not all applications may start with empty tables rather than having sets of legacy data. Hence, there is a need to transform existing plaintext databases to encrypted form. Usually existing enterprise databases may contain terabytes of data. A single machine would require many months for the initial encryption of a large data set. We propose encrypting data in parallel using a Hadoop cluster which is a simple five step process including the Hadoop set up, target preparation, source data import, encrypting the data, and finally exporting it to the target. We evaluated our solution on real world data and report on performance and data consumption. The results show that encrypting data in parallel can be done in a very scalable manner. Using a parallelized encryption cluster compared to a single server machine reduces the encryption time from months down to days or even hours.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
On Missing Attributes in Access Control: Non-deterministic and Probabilistic Attribute Retrieval Towards Attribute-Based Authorisation for Bidirectional Programming Hard Instances for Verification Problems in Access Control Mitigating Access Control Vulnerabilities through Interactive Static Analysis A Logical Approach to Restricting Access in Online Social Networks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1