Information Leakage in Encrypted Deduplication via Frequency Analysis

2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). Pub Date: 2017-06-26. DOI: 10.1109/DSN.2017.28

Jingwei Li, Chuan Qin, P. Lee, Xiaosong Zhang


Citations: 5

Abstract

Encrypted deduplication seamlessly combines encryption and deduplication to simultaneously achieve both data security and storage efficiency. State-of-the-art encrypted deduplication systems mostly adopt a deterministic encryption approach that encrypts each plaintext chunk with a key derived from the content of the chunk itself, so that identical plaintext chunks are always encrypted into identical ciphertext chunks for deduplication. However, such deterministic encryption inherently reveals the underlying frequency distribution of the original plaintext chunks. This allows an adversary to launch frequency analysis against the resulting ciphertext chunks, and ultimately infer the content of the original plaintext chunks. In this paper, we study how frequency analysis practically affects information leakage in encrypted deduplication storage, from both attack and defense perspectives. We first propose a new inference attack that exploits chunk locality to increase the coverage of inferred chunks. We conduct trace-driven evaluation on both real-world and synthetic datasets, and show that the new inference attack can infer a significant fraction of plaintext chunks under backup workloads. To protect against frequency analysis, we borrow the idea of existing performance-driven deduplication approaches and consider an encryption scheme called MinHash encryption, which disturbs the frequency rank of ciphertext chunks by encrypting some identical plaintext chunks into multiple distinct ciphertext chunks. Our trace-driven evaluation shows that MinHash encryption effectively mitigates the inference attack, while maintaining high storage efficiency.
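The leakage described above, and the way a MinHash-style scheme blunts it, can be illustrated with a small Python sketch. It is not the authors' implementation. The abstract does not spell out the construction, so the sketch assumes that each fixed-size segment of chunks is encrypted under a key derived from the minimum chunk fingerprint in that segment. The helper names (`fingerprint`, `toy_encrypt`, `encrypt_per_chunk`, `encrypt_minhash`, `frequency_attack`), the sample chunks, and the XOR keystream cipher are all hypothetical stand-ins. The attack shown is plain rank-based frequency analysis, not the locality-based attack the paper proposes.

```python
import hashlib
from collections import Counter


def fingerprint(chunk: bytes) -> bytes:
    """Content fingerprint of a chunk (SHA-256 here, an assumption)."""
    return hashlib.sha256(chunk).digest()


def toy_encrypt(key: bytes, chunk: bytes) -> bytes:
    """Toy deterministic cipher: XOR with a SHA-256 keystream.
    NOT secure; it only stands in for a real deterministic cipher."""
    stream = b""
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(c ^ s for c, s in zip(chunk, stream))


def encrypt_per_chunk(chunks):
    """Deterministic baseline: each chunk's key comes from its own content."""
    return [toy_encrypt(fingerprint(c), c) for c in chunks]


def encrypt_minhash(chunks, segment_size):
    """Assumed MinHash-style variant: one key per segment, derived from the
    minimum chunk fingerprint found in that segment."""
    out = []
    for i in range(0, len(chunks), segment_size):
        segment = chunks[i:i + segment_size]
        min_fp = min(fingerprint(c) for c in segment)
        segment_key = hashlib.sha256(b"seg" + min_fp).digest()
        out.extend(toy_encrypt(segment_key, c) for c in segment)
    return out


def frequency_attack(ciphertexts, auxiliary_plaintexts):
    """Basic frequency analysis: pair the i-th most frequent ciphertext with
    the i-th most frequent plaintext from auxiliary data."""
    c_rank = [c for c, _ in Counter(ciphertexts).most_common()]
    p_rank = [p for p, _ in Counter(auxiliary_plaintexts).most_common()]
    return dict(zip(c_rank, p_rank))


# Order three hypothetical chunks by fingerprint so that the two segment
# types below are guaranteed to have different minimum fingerprints.
lo, mid, hi = sorted(
    [b"chunk-alpha" * 8, b"chunk-beta" * 8, b"chunk-gamma" * 8],
    key=fingerprint,
)
stream = [hi, lo, hi, mid, hi, lo]  # hi occurs 3 times, lo 2, mid 1

det = encrypt_per_chunk(stream)
mh = encrypt_minhash(stream, segment_size=2)

# The adversary is assumed to hold auxiliary data with the same distribution.
guess = frequency_attack(det, auxiliary_plaintexts=stream)
recovered = sum(guess[c] == p for c, p in zip(det, stream))

print("deterministic counts:", sorted(Counter(det).values(), reverse=True))
print("MinHash-style counts:", sorted(Counter(mh).values(), reverse=True))
print("recovered from deterministic ciphertexts:", recovered, "of", len(stream))
```

With this stream, the deterministic ciphertext counts are [3, 2, 1], mirroring the plaintext frequencies exactly, so rank matching recovers all six chunks. Under the MinHash-style variant the most frequent chunk is split across two segment keys and the counts become [2, 2, 1, 1]: the frequency rank is blurred at the cost of storing four ciphertext chunks instead of three.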



