Data streams and privacy: Two emerging issues in data classification
Authors: Radhika Kotecha, Sanjay Garg
Venue: 2015 5th Nirma University International Conference on Engineering (NUiCONE)
Publication date: 2015-11-01
DOI: https://doi.org/10.1109/NUICONE.2015.7449597
Citations: 5
Abstract
Several real-world applications generate data streams in which each instance can be examined only briefly. Effective classification of such data streams is an emerging issue in data mining. However, such classification can pose serious threats to privacy. In several applications, such as credit card fraud detection, disease outbreak or biological attack detection, and loan approval, the data is homogeneously distributed among different parties. These parties may wish to collaboratively build a classifier to obtain certain global patterns but are reluctant to disclose their private data. Privacy-preserving classification of such homogeneously distributed data is also a challenging issue. In this paper, we present a brief review of the work carried out in data stream classification and in privacy-preserving classification of homogeneously distributed data, followed by an empirical evaluation and performance comparison of some methods in both areas. We also propose and evaluate an approach that creates an ensemble of anonymous decision trees to classify homogeneously distributed data in a privacy-preserving manner. We further identify the need to develop efficient methods for privacy-preserving classification of homogeneously distributed data streams and propose a suitable approach for this task.