Priyank Jain, Neelam Pathak, Pratibhadevi Tapashetti, A. Umesh
Published in: 2013 9th International Conference on Information Assurance and Security (IAS), 2013-12-01. DOI: 10.1109/ISIAS.2013.6947739 (https://doi.org/10.1109/ISIAS.2013.6947739)
Privacy preserving processing of data decision tree based on sample selection and Singular Value Decomposition
Data mining is a set of automated techniques for extracting hidden or buried information from large databases. As data mining technologies have developed, privacy protection has become a challenge for data mining applications in many fields. Many privacy-preserving data mining methods have been proposed to address this problem, and one important family of such methods is based on Singular Value Decomposition (SVD). The proposed algorithm works in three steps. First, attributes are grouped according to their distance difference similarity by clustering the data set using decision tree classification. Second, within each group, the algorithm places the attributes into buckets according to their SA value. Third, for each group, it selects attributes from the smallest bucket and searches for similar attributes in the attributes-1 largest buckets of the same group, creating an equivalence class that follows the unique attribute-distinct diversity anonymization model. The proposed algorithm satisfies the “utility based anonymization” principle that crucial information is protected from being suppressed. In addition, weights assigned to attributes improve clustering and make it possible to control the depth of generalization. In the prototype, the decision tree combines clustering and classification techniques; such methods are called ensemble classifiers. The proposed method is more efficient in balancing data privacy and data utility.
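The bucketing and equivalence-class steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is one possible reading under stated assumptions. It assumes that "SA value" means a sensitive-attribute value, that the items being bucketed are records of the form (quasi-identifier, sensitive value), that similarity is the absolute difference of a numeric quasi-identifier, and that each class draws on a hypothetical parameter `l` of distinct buckets (one seed from the smallest bucket plus one record from each of the `l - 1` largest). The function name `build_equivalence_classes` is also hypothetical. The decision-tree grouping and the SVD step are not modeled here.

```python
from collections import defaultdict


def build_equivalence_classes(records, l):
    """Greedy sketch: each class holds l records with l distinct SA values.

    records: list of (quasi_identifier, sensitive_value) tuples; the numeric
    quasi_identifier is an assumed stand-in for the similarity measure.
    Returns (classes, leftovers).
    """
    # Bucket the records by their sensitive-attribute (SA) value.
    buckets = defaultdict(list)
    for qi, sa in records:
        buckets[sa].append((qi, sa))

    classes = []
    while True:
        non_empty = [sa for sa, rows in buckets.items() if rows]
        if len(non_empty) < l:
            break  # not enough distinct SA values left to form a class

        # Order buckets from smallest to largest (ties broken by SA name).
        non_empty.sort(key=lambda sa: (len(buckets[sa]), str(sa)))

        # Seed the class with a record from the smallest bucket.
        seed = buckets[non_empty[0]].pop()
        group = [seed]

        # Take the most similar record from each of the l - 1 largest buckets.
        largest = non_empty[-(l - 1):] if l > 1 else []
        for sa in largest:
            rows = buckets[sa]
            best = min(rows, key=lambda r: abs(r[0] - seed[0]))
            rows.remove(best)
            group.append(best)

        classes.append(group)

    # Records that could not be placed in a class with l distinct SA values.
    leftovers = [r for rows in buckets.values() for r in rows]
    return classes, leftovers


if __name__ == "__main__":
    data = [(25, "flu"), (27, "flu"), (40, "flu"),
            (26, "cold"), (41, "cold"), (30, "asthma")]
    classes, leftovers = build_equivalence_classes(data, l=2)
    for group in classes:
        print(group)
    print("leftovers:", leftovers)
```

Seeding from the smallest bucket uses up rare sensitive values first, so fewer records are left without a partner that has a different sensitive value. A real implementation would still need a policy for the leftover records, such as suppressing them or merging them into existing classes.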