Luiz Fernando F. P. de Lima, Danielle Rousy D. Ricarte, Clauirton A. Siebra
{"title":"A Benchmark Proposal for Non-Generative Fair Adversarial Learning Strategies Using a Fairness-Utility Trade-off Metric","authors":"Luiz Fernando F. P. de Lima, Danielle Rousy D. Ricarte, Clauirton A. Siebra","doi":"10.1111/coin.70003","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>AI systems for decision-making have become increasingly popular in several areas. However, it is possible to identify biased decisions in many applications, which have become a concern for the computer science, artificial intelligence, and law communities. Therefore, researchers are proposing solutions to mitigate bias and discrimination among decision-makers. Some explored strategies are based on GANs to generate fair data. Others are based on adversarial learning to achieve fairness by encoding fairness constraints through an adversarial model. Moreover, it is usual for each proposal to assess its model with a specific metric, making comparing current approaches a complex task. Therefore, this work proposes a systematical benchmark procedure to assess the fair machine learning models. The proposed procedure comprises a fairness-utility trade-off metric (<span></span><math>\n <semantics>\n <mrow>\n <mi>FU-score</mi>\n </mrow>\n <annotation>$$ FU\\hbox{-} score $$</annotation>\n </semantics></math>), the utility and fairness metrics to compose this assessment, the used datasets and preparation, and the statistical test. A previous work presents some of these definitions. The present work enriches the procedure by increasing the applied datasets and statistical guarantees when comparing the models' results. We performed this benchmark evaluation for the non-generative adversarial models, analyzing the literature models from the same metric perspective. This assessment could not indicate a single model which better performs for all datasets. However, we built an understanding of how each model performs on each dataset with statistical confidence.</p>\n </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":null,"pages":null},"PeriodicalIF":1.8000,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Intelligence","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/coin.70003","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
AI systems for decision-making have become increasingly popular in several areas. However, it is possible to identify biased decisions in many applications, which have become a concern for the computer science, artificial intelligence, and law communities. Therefore, researchers are proposing solutions to mitigate bias and discrimination among decision-makers. Some explored strategies are based on GANs to generate fair data. Others are based on adversarial learning to achieve fairness by encoding fairness constraints through an adversarial model. Moreover, it is usual for each proposal to assess its model with a specific metric, making comparing current approaches a complex task. Therefore, this work proposes a systematical benchmark procedure to assess the fair machine learning models. The proposed procedure comprises a fairness-utility trade-off metric (), the utility and fairness metrics to compose this assessment, the used datasets and preparation, and the statistical test. A previous work presents some of these definitions. The present work enriches the procedure by increasing the applied datasets and statistical guarantees when comparing the models' results. We performed this benchmark evaluation for the non-generative adversarial models, analyzing the literature models from the same metric perspective. This assessment could not indicate a single model which better performs for all datasets. However, we built an understanding of how each model performs on each dataset with statistical confidence.
期刊介绍:
This leading international journal promotes and stimulates research in the field of artificial intelligence (AI). Covering a wide range of issues - from the tools and languages of AI to its philosophical implications - Computational Intelligence provides a vigorous forum for the publication of both experimental and theoretical research, as well as surveys and impact studies. The journal is designed to meet the needs of a wide range of AI workers in academic and industrial research.