Analysis for data preprocessing to prevent direct discrimination in data mining
Trupti A. Aneyrao, R. Fadnavis
2016 World Conference on Futuristic Trends in Research and Innovation for Social Welfare (Startup Conclave), 2016-02-01
DOI: 10.1109/STARTUP.2016.7583986
Data mining is a technology for extracting useful information from data. It raises two major issues: privacy violation and discrimination. Discrimination is unfair treatment based on features that should not be considered in decision making. For people, it means being treated unfairly on the basis of sensitive features such as gender, race, or religion. Discrimination can be of two types: direct and indirect. Direct discrimination consists of training rules based on sensitive attributes such as religion, race, or community. Indirect discrimination occurs when decisions are based on non-sensitive attributes that are closely related to the directly discriminatory attributes. Automated decision-making systems use data mining techniques to train the system for decision making: data from previous work is used to generate the rules that train the system. At first sight, automated decision systems appear fair, but if the training data sets are themselves discriminatory, the system cannot be free from discrimination. Data mining offers discrimination discovery and prevention techniques to remove such discrimination. This paper focuses mainly on removing direct discrimination from the data.
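The abstract does not name a specific discrimination measure. As a minimal sketch, assuming extended lift (a measure commonly used in the discrimination discovery literature) and a hypothetical toy data set, the following shows how a rule that includes a sensitive attribute might be checked for direct discrimination. The field names, records, and function names are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches(record: Record, items: Dict[str, str]) -> bool:
    """Return True if the record has every attribute value in `items`."""
    return all(record.get(key) == value for key, value in items.items())


def confidence(records: List[Record], premise: Dict[str, str],
               outcome: Dict[str, str]) -> float:
    """Confidence of the rule premise -> outcome: the fraction of records
    matching the premise that also match the outcome."""
    covered = [r for r in records if matches(r, premise)]
    if not covered:
        raise ValueError("no record matches the premise")
    return sum(matches(r, outcome) for r in covered) / len(covered)


def extended_lift(records: List[Record], sensitive: Dict[str, str],
                  context: Dict[str, str], outcome: Dict[str, str]) -> float:
    """Ratio of conf(sensitive + context -> outcome) to conf(context -> outcome).
    Assumed measure; the abstract does not specify one."""
    base = confidence(records, context, outcome)
    if base == 0:
        raise ValueError("the outcome never occurs in this context")
    return confidence(records, {**context, **sensitive}, outcome) / base


# Hypothetical historical decisions used as training data.
records: List[Record] = [
    {"gender": "female", "city": "A", "decision": "deny"},
    {"gender": "female", "city": "A", "decision": "deny"},
    {"gender": "female", "city": "A", "decision": "deny"},
    {"gender": "female", "city": "A", "decision": "grant"},
    {"gender": "male", "city": "A", "decision": "deny"},
    {"gender": "male", "city": "A", "decision": "grant"},
    {"gender": "male", "city": "A", "decision": "grant"},
    {"gender": "male", "city": "A", "decision": "grant"},
]

elift = extended_lift(
    records,
    sensitive={"gender": "female"},
    context={"city": "A"},
    outcome={"decision": "deny"},
)
print(elift)  # 0.75 / 0.5 = 1.5
```

A value well above 1 indicates that adding the sensitive attribute to the rule raises the rate of the negative decision. Which threshold counts as discriminatory is a policy choice that the abstract does not specify.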