Authors: Wu Tongtong, Z. Hongxia
Venue: 2021 International Conference on Computer Technology and Media Convergence Design (CTMCD)
Published: 2021-04-01
DOI: 10.1109/CTMCD53128.2021.00028 (https://doi.org/10.1109/CTMCD53128.2021.00028)
Citations: 0
ASIDL: A method for publishing missing data based on decomposition and slicing
Data mining and analysis can unlock the value of released data and promote the development of related industries, but malicious mining seriously threatens users' private information. Privacy-preserving data publishing, which aims to keep released data both secure and useful, has therefore developed rapidly and produced a body of research results. Most of this work, however, addresses general protection needs; data publishing under special protection needs still requires attention. To address the reconstruction errors that are almost unavoidable when missing data is reconstructed, this paper proposes the ASIDL method, which is based on decomposition and slicing. The TBBL (Tuple Bucket Build based on l-diversity) algorithm clusters tuples on their quasi-identifier (QI) attributes and randomly extracts l tuples with different sensitive values to form each bucket; additional tuples formed by permutation and combination are used to protect the identities of the individuals in the bucket. The TBSL (Tuple Bucket Split based on l-diversity) algorithm then splits existing tuple buckets, on the condition that l-diversity is still satisfied, to obtain smaller buckets and preserve the utility of the released data. In comparisons with existing algorithms, the proposed method shows lower information loss and higher efficiency.
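
The abstract names two steps, bucket construction and bucket splitting under l-diversity, but does not give their details. The following Python sketch illustrates only the general idea and is not the authors' TBBL or TBSL algorithm. It assumes distinct l-diversity (a bucket must hold at least l different sensitive values), represents each record as a hypothetical (qi, sensitive) pair, and omits the QI clustering and the permutation step. The function names `is_l_diverse`, `build_buckets`, and `split_bucket` are illustrative.

```python
import random
from collections import defaultdict


def is_l_diverse(bucket, l):
    """Distinct l-diversity: the bucket holds at least l different sensitive values."""
    return len({sensitive for _, sensitive in bucket}) >= l


def build_buckets(records, l, seed=0):
    """Greedy construction (an assumption, not the paper's TBBL): each bucket
    takes one random tuple from each of the l sensitive-value groups that
    currently have the most tuples left."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for qi, sensitive in records:
        groups[sensitive].append((qi, sensitive))

    buckets = []
    while True:
        nonempty = [g for g in groups.values() if g]
        if len(nonempty) < l:
            break  # fewer than l distinct values remain, so no new bucket can be built
        nonempty.sort(key=len, reverse=True)
        buckets.append([g.pop(rng.randrange(len(g))) for g in nonempty[:l]])

    # Tuples that could not be placed in an l-diverse bucket.
    leftovers = [t for g in groups.values() for t in g]
    return buckets, leftovers


def split_bucket(bucket, l):
    """Split a bucket in two only if both halves stay l-diverse (an assumption,
    not the paper's TBSL); otherwise return the bucket unchanged."""
    ordered = sorted(bucket, key=lambda t: t[0])  # order by QI so halves are QI-close
    mid = len(ordered) // 2
    left, right = ordered[:mid], ordered[mid:]
    if is_l_diverse(left, l) and is_l_diverse(right, l):
        return [left, right]
    return [bucket]


if __name__ == "__main__":
    # Hypothetical records: ((age, region_code), diagnosis)
    data = [((20, 1), "flu"), ((21, 1), "cold"), ((30, 2), "flu"),
            ((31, 2), "cancer"), ((40, 3), "cold"), ((41, 3), "cancer")]
    buckets, leftovers = build_buckets(data, l=2)
    for b in buckets:
        print(b)
    print("leftovers:", leftovers)
```

Splitting only when both halves remain l-diverse reflects the trade-off the abstract describes: smaller buckets lose less information, but never at the cost of the diversity guarantee.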