Identifiability of Causal-based ML Fairness Notions

K. Makhlouf, Sami Zhioua, C. Palamidessi
Published in: 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN), 2022-12-04. DOI: 10.1109/CICN56167.2022.10008263

Abstract

Machine learning algorithms can produce biased outcomes or predictions, typically against minorities and under-represented sub-populations. Fairness is therefore emerging as an important requirement for the safe application of machine-learning-based technologies. The most commonly used fairness notions (e.g. statistical parity, equalized odds, predictive parity) are observational and rely on mere correlation between variables. These notions fail to identify bias in the presence of statistical anomalies such as Simpson's or Berkson's paradox. Causality-based fairness notions (e.g. counterfactual fairness, no-proxy discrimination) are immune to such anomalies and hence more reliable for assessing fairness. The problem with causality-based fairness notions, however, is that they are defined in terms of quantities (e.g. causal, counterfactual, and path-specific effects) that are not always measurable. This is known as the identifiability problem and is the topic of a large body of work in the causal inference literature. The first contribution of this paper is a compilation of the major identifiability results of particular relevance to machine learning fairness. To the best of our knowledge, no previous work in the field of ML fairness or causal inference provides such a systematization of knowledge. The second contribution is more general and addresses the main problem of using causality in machine learning: how to extract causal knowledge from observational data in real scenarios. This paper shows how this can be achieved using identifiability.
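The failure mode the abstract describes can be made concrete with a small sketch. The counts below are synthetic and purely illustrative (they are not from the paper): aggregate acceptance rates suggest bias against group B, yet within each department group B does as well or better, a Simpson's-paradox reversal that a purely observational notion like statistical parity cannot distinguish from genuine discrimination. When the confounder (here, department choice) satisfies the backdoor criterion, the causal effect is identifiable and can be recovered from the same observational data by the adjustment formula P(Y | do(A=a)) = Σ_z P(Y | A=a, Z=z) P(Z=z):

```python
# Hypothetical counts: counts[group][department] = (accepted, applied).
counts = {
    "A": {"d1": (80, 100), "d2": (10, 100)},
    "B": {"d1": (16, 20),  "d2": (45, 180)},
}

def rate(accepted, applied):
    return accepted / applied

def overall(group):
    # Aggregate (observational) acceptance rate, ignoring department.
    acc = sum(a for a, _ in counts[group].values())
    app = sum(n for _, n in counts[group].values())
    return acc / app

def adjusted(group):
    # Backdoor adjustment over department: sum_d P(acc | group, d) * P(d),
    # with P(d) estimated from the pooled applicant counts.
    total_apps = sum(n for g in counts for _, n in counts[g].values())
    return sum(
        rate(*counts[group][d]) * sum(counts[g][d][1] for g in counts) / total_apps
        for d in ("d1", "d2")
    )

# Per-department rates: B matches or beats A in every department...
for d in ("d1", "d2"):
    print(d, f"A={rate(*counts['A'][d]):.2f}", f"B={rate(*counts['B'][d]):.2f}")

# ...yet the aggregates reverse (A=0.45 vs B=0.305), while the
# department-adjusted causal rates agree with the per-department view.
print("overall ", f"A={overall('A'):.3f}", f"B={overall('B'):.3f}")
print("adjusted", f"A={adjusted('A'):.3f}", f"B={adjusted('B'):.3f}")
```

The adjusted quantity is computable here only because the effect is identifiable under the (assumed) backdoor structure; with unobserved confounding, no such formula over observational probabilities exists, which is exactly the identifiability problem the paper surveys.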