Metrics and Algorithms for Locally Fair and Accurate Classifications using Ensembles.

Datenbank-Spektrum : Zeitschrift fur Datenbanktechnologie : Organ der Fachgruppe Datenbanken der Gesellschaft fur Informatik e.V; Pub Date: 2022-01-01; Epub Date: 2022-01-17; DOI: 10.1007/s13222-021-00401-y

Nico Lässig, Sarah Oppold, Melanie Herschel

Volume 22 (1), pages 23-43. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762451/pdf/

Citations: 2

Abstract


Metrics and Algorithms for Locally Fair and Accurate Classifications using Ensembles.

Model ensembles comprising multiple trained machine learning models are now commonly used to obtain accurate classifier predictions. In particular, dynamic model ensembles pick the most accurate model for each query object by applying the model that performed best on similar data. However, like single machine learning models, dynamic model ensembles may suffer from bias, which can eventually lead to unfair treatment of certain groups of a general population. To mitigate unfair classification, recent work has proposed fair model ensembles that, instead of focusing solely on accuracy, also optimize global fairness. While such global fairness minimizes bias globally, imbalances may persist in different regions of the data, e.g., caused by local bias maxima that lead to local unfairness. Therefore, we extend our previous work with a framework that bridges the gap between dynamic model ensembles and fair model ensembles. More precisely, we investigate the problem of devising locally fair and accurate dynamic model ensembles, which ultimately optimize for equal opportunity of similar subjects. We propose a general framework to perform this task and present several algorithms implementing the framework components. We also present a runtime-efficient adaptation of the framework that keeps the quality of the results at a similar level. Furthermore, we present new fairness metrics as well as detailed information about the necessary data preparation. Our evaluation of the framework implementations and metrics shows that our approach outperforms the state of the art for different types and degrees of bias present in the training data in terms of both local and global fairness, while reaching comparable accuracy.
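The abstract does not spell out the selection rule, so the following is only an illustrative sketch of the general idea of a locally fair and accurate dynamic ensemble, not the authors' framework. It assumes binary labels, a binary protected attribute, Euclidean k-nearest neighbours as the notion of "similar data", and a hypothetical weight `lam` that trades the local equal-opportunity gap (difference in true positive rates between the two groups) against the local error. All function and variable names are invented for this example.

```python
import numpy as np

def k_nearest(X, query, k):
    """Indices of the k rows of X closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(X - query, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def accuracy(y_true, y_pred):
    """Fraction of matching labels."""
    return float(np.mean(y_true == y_pred))

def equal_opportunity_gap(y_true, y_pred, group):
    """|TPR(group 0) - TPR(group 1)|.

    Assumption of this sketch: returns 0.0 when either group has no
    positive examples, because the gap is undefined in that case.
    """
    tprs = []
    for g in (0, 1):
        positives = (y_true == 1) & (group == g)
        if not positives.any():
            return 0.0
        tprs.append(np.mean(y_pred[positives] == 1))
    return float(abs(tprs[0] - tprs[1]))

def select_model(preds, y_val, group_val, neighbours, lam=0.5):
    """Index of the model with the lowest combined local score.

    score = lam * local equal-opportunity gap + (1 - lam) * local error,
    both computed only on the validation rows listed in `neighbours`.
    """
    y_loc, g_loc = y_val[neighbours], group_val[neighbours]
    best_idx, best_score = None, np.inf
    for idx, pred in enumerate(preds):
        p_loc = pred[neighbours]
        gap = equal_opportunity_gap(y_loc, p_loc, g_loc)
        err = 1.0 - accuracy(y_loc, p_loc)
        score = lam * gap + (1.0 - lam) * err
        if score < best_score:
            best_idx, best_score = idx, score
    return best_idx

# Toy validation data (made up): one feature, labels, protected group.
X_val = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [5.1]])
y_val = np.array([1, 1, 1, 1, 0, 0])
group_val = np.array([0, 0, 1, 1, 0, 1])

# Validation predictions of three hypothetical pre-trained models.
preds = [
    np.array([1, 1, 0, 0, 0, 0]),  # misses every positive in group 1
    np.array([1, 0, 1, 0, 0, 0]),  # equal TPR across groups, less accurate
    np.array([1, 1, 1, 0, 0, 0]),  # most accurate, but unequal TPR
]

query = np.array([0.15])
neighbours = k_nearest(X_val, query, k=4)

fair_choice = select_model(preds, y_val, group_val, neighbours, lam=0.5)
accurate_choice = select_model(preds, y_val, group_val, neighbours, lam=0.0)
print("locally fair and accurate choice:", fair_choice)
print("accuracy-only choice:", accurate_choice)
```

With `lam=0.0` the rule reduces to a plain accuracy-driven dynamic selection; raising `lam` shifts the choice toward the model that treats the two groups most evenly in the neighbourhood of the query.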
