Analysis for data preprocessing to prevent direct discrimination in data mining

Trupti A. Aneyrao, R. Fadnavis
Published in: 2016 World Conference on Futuristic Trends in Research and Innovation for Social Welfare (Startup Conclave)
Publication date: 2016-02-01
DOI: 10.1109/STARTUP.2016.7583986 (https://doi.org/10.1109/STARTUP.2016.7583986)
Citations: 2

Abstract

Data mining is a technology for extracting useful information from data. It raises two major issues: privacy violation and discrimination. Discrimination is unfair treatment based on features that should not be considered in decision making. For people, it means being treated unfairly on the basis of sensitive features such as gender, race, or religion. Discrimination can be direct or indirect. Direct discrimination arises when the rules learned from training data are based on sensitive attributes such as religion, race, or community. Indirect discrimination occurs when decisions are based on non-sensitive attributes that are closely related to the directly discriminatory ones. Automated decision-making systems use data mining techniques to train the system: data from previous work is used to generate the rules that train it. At first sight, automated decision systems appear fair, but if the training data sets are themselves discriminatory, the system cannot be free from discrimination. Discrimination discovery and prevention techniques in data mining aim to remove such discrimination. This paper focuses mainly on removing direct discrimination from the data.
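
As a rough illustration of what discovering direct discrimination in training data can look like, the sketch below compares how often a positive decision is given to a protected group versus everyone else. The abstract does not name a specific measure, so the rate gap used here, the function names, the `gender` and `decision` fields, and the toy records are all assumptions made for illustration, not the authors' method.

```python
def positive_rate(records, label="decision", positive="grant"):
    """Fraction of records whose decision is the positive outcome."""
    if not records:
        raise ValueError("cannot compute a rate over an empty group")
    return sum(1 for r in records if r[label] == positive) / len(records)


def rate_gap(records, attribute, protected_value):
    """Positive rate of the non-protected group minus that of the protected group.

    A value well above zero suggests the historical decisions favour the
    non-protected group on the given sensitive attribute.
    """
    protected = [r for r in records if r[attribute] == protected_value]
    others = [r for r in records if r[attribute] != protected_value]
    return positive_rate(others) - positive_rate(protected)


# Hypothetical historical decisions (toy data, not taken from the paper).
training_data = [
    {"gender": "female", "decision": "deny"},
    {"gender": "female", "decision": "deny"},
    {"gender": "female", "decision": "grant"},
    {"gender": "female", "decision": "deny"},
    {"gender": "male", "decision": "grant"},
    {"gender": "male", "decision": "grant"},
    {"gender": "male", "decision": "deny"},
    {"gender": "male", "decision": "grant"},
]

gap = rate_gap(training_data, "gender", "female")
print(f"positive-rate gap: {gap:.2f}")
```

A large gap only flags the data for closer inspection; a preprocessing step that removes direct discrimination would then modify the training data or the rules derived from it before the decision system is trained.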