Michal Sramka, R. Safavi-Naini, J. Denzinger, Mina Askari, Jie Gao
{"title":"从未清理数据中提取的知识在应用于清理数据时的效用","authors":"Michal Sramka, R. Safavi-Naini, J. Denzinger, Mina Askari, Jie Gao","doi":"10.1109/PST.2008.30","DOIUrl":null,"url":null,"abstract":"Knowledge discovery systems extract knowledge from data that can be used for making prediction about incomplete data items. Utility is a measure of the usefulness of the discovered knowledge and satisfaction of the user with that knowledge. We motivate and address the question of usefulness of sanitized data using the notion of utility in data mining systems. For this we measure the success of patterns and rules discovered from the original data to make predictions about the sanitized data using a previously developed framework. Using experimental results on a set of medical data we demonstrate that it is possible to make useful predictions about the sanitized medical data when rules discovered from the original unsanitized medical data are used. We explain our results and compare it with the case where no sanitization is involved.","PeriodicalId":422934,"journal":{"name":"2008 Sixth Annual Conference on Privacy, Security and Trust","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Utility of Knowledge Extracted from Unsanitized Data when Applied to Sanitized Data\",\"authors\":\"Michal Sramka, R. Safavi-Naini, J. Denzinger, Mina Askari, Jie Gao\",\"doi\":\"10.1109/PST.2008.30\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Knowledge discovery systems extract knowledge from data that can be used for making prediction about incomplete data items. Utility is a measure of the usefulness of the discovered knowledge and satisfaction of the user with that knowledge. We motivate and address the question of usefulness of sanitized data using the notion of utility in data mining systems. For this we measure the success of patterns and rules discovered from the original data to make predictions about the sanitized data using a previously developed framework. Using experimental results on a set of medical data we demonstrate that it is possible to make useful predictions about the sanitized medical data when rules discovered from the original unsanitized medical data are used. We explain our results and compare it with the case where no sanitization is involved.\",\"PeriodicalId\":422934,\"journal\":{\"name\":\"2008 Sixth Annual Conference on Privacy, Security and Trust\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 Sixth Annual Conference on Privacy, Security and Trust\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PST.2008.30\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Sixth Annual Conference on Privacy, Security and Trust","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PST.2008.30","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Utility of Knowledge Extracted from Unsanitized Data when Applied to Sanitized Data
Knowledge discovery systems extract knowledge from data that can be used for making prediction about incomplete data items. Utility is a measure of the usefulness of the discovered knowledge and satisfaction of the user with that knowledge. We motivate and address the question of usefulness of sanitized data using the notion of utility in data mining systems. For this we measure the success of patterns and rules discovered from the original data to make predictions about the sanitized data using a previously developed framework. Using experimental results on a set of medical data we demonstrate that it is possible to make useful predictions about the sanitized medical data when rules discovered from the original unsanitized medical data are used. We explain our results and compare it with the case where no sanitization is involved.