{"title":"Mitigating Sex Bias in Audio Data-driven COPD and COVID-19 Breathing Pattern Detection Models","authors":"Rachel Pfeifer, Sudip Vhaduri, James Eric Dietz","doi":"arxiv-2409.10677","DOIUrl":null,"url":null,"abstract":"In the healthcare industry, researchers have been developing machine learning\nmodels to automate diagnosing patients with respiratory illnesses based on\ntheir breathing patterns. However, these models do not consider the demographic\nbiases, particularly sex bias, that often occur when models are trained with a\nskewed patient dataset. Hence, it is essential in such an important industry to\nreduce this bias so that models can make fair diagnoses. In this work, we\nexamine the bias in models used to detect breathing patterns of two major\nrespiratory diseases, i.e., chronic obstructive pulmonary disease (COPD) and\nCOVID-19. Using decision tree models trained with audio recordings of breathing\npatterns obtained from two open-source datasets consisting of 29 COPD and 680\nCOVID-19-positive patients, we analyze the effect of sex bias on the models.\nWith a threshold optimizer and two constraints (demographic parity and\nequalized odds) to mitigate the bias, we witness 81.43% (demographic parity\ndifference) and 71.81% (equalized odds difference) improvements. These findings\nare statistically significant.","PeriodicalId":501284,"journal":{"name":"arXiv - EE - Audio and Speech Processing","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - EE - Audio and Speech Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.10677","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
In the healthcare industry, researchers have been developing machine learning models to automate the diagnosis of respiratory illnesses from patients' breathing patterns. However, these models do not account for the demographic biases, particularly sex bias, that often arise when models are trained on a skewed patient dataset. Reducing this bias is therefore essential so that models can make fair diagnoses. In this work, we examine sex bias in models used to detect the breathing patterns of two major respiratory diseases: chronic obstructive pulmonary disease (COPD) and COVID-19. Using decision tree models trained on audio recordings of breathing patterns from two open-source datasets comprising 29 COPD patients and 680 COVID-19-positive patients, we analyze the effect of sex bias on the models. After mitigating the bias with a threshold optimizer under two constraints, demographic parity and equalized odds, we observe improvements of 81.43% in demographic parity difference and 71.81% in equalized odds difference. These findings are statistically significant.
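
The sketch below illustrates the kind of post-processing pipeline the abstract describes, assuming Python with scikit-learn and Fairlearn: a decision tree classifier whose group-wise decision thresholds are re-tuned by Fairlearn's ThresholdOptimizer under a demographic parity (or equalized odds) constraint, with demographic parity difference and equalized odds difference compared before and after mitigation. The feature matrix, labels, and sex attribute are hypothetical placeholders, not the paper's data or code; the actual features would be derived from the breathing audio recordings, which the abstract does not specify.

```python
# Minimal sketch (not the authors' implementation) of threshold-based
# bias mitigation for a decision tree classifier using Fairlearn.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import (
    demographic_parity_difference,
    equalized_odds_difference,
)

# Hypothetical placeholder data: 200 samples, 13 audio-derived features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))
y = rng.integers(0, 2, size=200)           # 1 = disease-positive breathing pattern
sex = rng.choice(["female", "male"], 200)  # sensitive attribute

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, sex, test_size=0.3, random_state=0, stratify=y
)

# Baseline decision tree classifier.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)

# Post-process the tree's scores so per-group thresholds satisfy the constraint;
# swap "demographic_parity" for "equalized_odds" to match the second setting.
mitigator = ThresholdOptimizer(
    estimator=tree,
    constraints="demographic_parity",
    prefit=True,
    predict_method="predict_proba",
)
mitigator.fit(X_tr, y_tr, sensitive_features=s_tr)

y_base = tree.predict(X_te)
y_fair = mitigator.predict(X_te, sensitive_features=s_te)

# Fairness metrics before and after mitigation (closer to 0 is fairer).
for name, preds in [("baseline tree", y_base), ("threshold-optimized", y_fair)]:
    dpd = demographic_parity_difference(y_te, preds, sensitive_features=s_te)
    eod = equalized_odds_difference(y_te, preds, sensitive_features=s_te)
    print(f"{name}: DPD={dpd:.3f}, EOD={eod:.3f}")
```

The reported 81.43% and 71.81% improvements correspond to relative reductions in these two difference metrics; with real audio features and the paper's datasets, the same before/after comparison would quantify how much the threshold optimizer narrows the gap between female and male patients.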