Privacy-Preserving Ranked Fuzzy Keyword Search over Encrypted Cloud Data
Authors: Qunqun Xu, Hong Shen, Yingpeng Sang, Hui Tian
Venue: 2013 International Conference on Parallel and Distributed Computing, Applications and Technologies
Published: 2013-12-16
DOI: 10.1109/PDCAT.2013.44 (https://doi.org/10.1109/PDCAT.2013.44)
Citation count: 16
Abstract
As cloud computing becomes popular, more and more data owners prefer to store their data in the cloud for its flexibility and economic savings. To protect data privacy, sensitive data usually has to be encrypted before outsourcing, which makes effective data utilization a challenging task. Existing techniques each address only part of the problem: traditional searchable symmetric encryption schemes allow users to securely search over encrypted data through keywords and selectively retrieve files of interest, but they do not capture the relevance of data files to search keywords; fuzzy keyword search on encrypted data tolerates minor typos and format inconsistencies; and secure ranked keyword search captures the relevance of data files and returns the results users want most. Because each technique works in isolation, system usability and efficiency are greatly reduced. In this paper, for the first time, we define and solve the problem of privacy-preserving ranked fuzzy keyword search over encrypted cloud data. Ranked fuzzy keyword search greatly enhances system usability and efficiency when exact matching fails: it returns the matching files in ranked order with respect to certain relevance criteria (e.g., keyword frequency), based on keyword similarity semantics. In our solution, we use edit distance to quantify keyword similarity and dictionary-based fuzzy set construction to build fuzzy keyword sets, which greatly reduces the index size as well as storage and communication costs. We choose the efficient similarity measure of "coordinate matching", i.e., as many matches as possible, to obtain the relevance of data files to the search keywords.
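The three ingredients named in the abstract (edit distance, a dictionary-based fuzzy keyword set, and coordinate matching) can be illustrated with a minimal plaintext sketch. It is not the authors' scheme: it leaves out all encryption, trapdoors, and the secure index, and it assumes that a possibly misspelled query word is expanded to the dictionary words within edit distance `d`. The names `edit_distance`, `fuzzy_set`, `rank`, `DICTIONARY`, and `INDEX` are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Set, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_set(keyword: str, dictionary: Set[str], d: int) -> Set[str]:
    """The keyword itself plus every dictionary word within edit distance d."""
    return {keyword} | {w for w in dictionary if edit_distance(keyword, w) <= d}


def rank(query: List[str], index: Dict[str, Set[str]],
         dictionary: Set[str], d: int = 1) -> List[Tuple[str, int]]:
    """Coordinate matching: a file's score is the number of query keywords
    whose fuzzy set intersects the file's keyword set."""
    expanded = [fuzzy_set(q, dictionary, d) for q in query]
    scores = []
    for file_id, keywords in index.items():
        score = sum(1 for candidates in expanded if candidates & keywords)
        if score > 0:
            scores.append((file_id, score))
    # Highest score first; ties are broken by file id for a stable order.
    return sorted(scores, key=lambda s: (-s[1], s[0]))


# Hypothetical toy data, in plaintext purely for illustration.
DICTIONARY = {"cloud", "cloudy", "clout", "crypto", "search", "secure"}
INDEX = {
    "f1": {"cloud", "search"},
    "f2": {"secure", "search"},
    "f3": {"crypto"},
}
```

With this toy data, `rank(["clowd", "serch"], INDEX, DICTIONARY, d=1)` returns `[("f1", 2), ("f2", 1)]`: both misspelled query words resolve to keywords of `f1`, only "serch" resolves to a keyword of `f2`, and `f3` matches nothing. Restricting each fuzzy set to valid dictionary words keeps it small, which reflects the index-size reduction the abstract attributes to dictionary-based construction.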