{"title":"从因果角度评定情感分析系统是否存在偏见","authors":"Kausik Lakkaraju;Biplav Srivastava;Marco Valtorta","doi":"10.1109/TTS.2024.3375519","DOIUrl":null,"url":null,"abstract":"Sentiment Analysis Systems (SASs) are data-driven Artificial Intelligence (AI) systems that assign one or more numbers to convey the polarity and emotional intensity of a given piece of text. However, like other automatic machine learning systems, SASs can exhibit model uncertainty, resulting in drastic swings in output with even small changes in input. This issue becomes more problematic when inputs involve protected attributes like gender or race, as it can be perceived as bias or unfairness. To address this, we propose a novel method to assess and rate SASs. We perturb inputs in a controlled causal setting to test if the output sentiment is sensitive to protected attributes while keeping other components of the textual input, such as chosen emotion words, fixed. Based on the results, we assign labels (ratings) at both fine-grained and overall levels to indicate the robustness of the SAS to input changes. The ratings can help decision-makers improve online content by reducing hate speech, often fueled by biases related to protected attributes such as gender and race. These ratings provide a principled basis for comparing SASs and making informed choices based on their behavior. The ratings also benefit all users, especially developers who reuse off-the-shelf SASs to build larger AI systems but do not have access to their code or training data to compare.","PeriodicalId":73324,"journal":{"name":"IEEE transactions on technology and society","volume":"5 1","pages":"82-92"},"PeriodicalIF":0.0000,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Rating Sentiment Analysis Systems for Bias Through a Causal Lens\",\"authors\":\"Kausik Lakkaraju;Biplav Srivastava;Marco Valtorta\",\"doi\":\"10.1109/TTS.2024.3375519\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sentiment Analysis Systems (SASs) are data-driven Artificial Intelligence (AI) systems that assign one or more numbers to convey the polarity and emotional intensity of a given piece of text. However, like other automatic machine learning systems, SASs can exhibit model uncertainty, resulting in drastic swings in output with even small changes in input. This issue becomes more problematic when inputs involve protected attributes like gender or race, as it can be perceived as bias or unfairness. To address this, we propose a novel method to assess and rate SASs. We perturb inputs in a controlled causal setting to test if the output sentiment is sensitive to protected attributes while keeping other components of the textual input, such as chosen emotion words, fixed. Based on the results, we assign labels (ratings) at both fine-grained and overall levels to indicate the robustness of the SAS to input changes. The ratings can help decision-makers improve online content by reducing hate speech, often fueled by biases related to protected attributes such as gender and race. These ratings provide a principled basis for comparing SASs and making informed choices based on their behavior. 
The ratings also benefit all users, especially developers who reuse off-the-shelf SASs to build larger AI systems but do not have access to their code or training data to compare.\",\"PeriodicalId\":73324,\"journal\":{\"name\":\"IEEE transactions on technology and society\",\"volume\":\"5 1\",\"pages\":\"82-92\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on technology and society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10466637/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on technology and society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10466637/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
情感分析系统(SAS)是一种数据驱动的人工智能(AI)系统,它可以分配一个或多个数字来表达给定文本的极性和情感强度。然而,与其他自动机器学习系统一样,SAS 也会表现出模型的不确定性,导致即使输入的微小变化也会导致输出的剧烈波动。当输入涉及性别或种族等受保护的属性时,这个问题就变得更加棘手,因为这可能被视为偏见或不公平。为了解决这个问题,我们提出了一种评估和评价 SAS 的新方法。我们在受控的因果关系设置中对输入进行扰动,以测试输出情感是否对受保护属性敏感,同时保持文本输入的其他成分(如所选情感词)固定不变。根据结果,我们在细粒度和整体层面上分配标签(评级),以显示 SAS 对输入变化的鲁棒性。这些评级可以帮助决策者通过减少仇恨言论来改进在线内容,而仇恨言论往往是由性别和种族等受保护属性的偏见所助长的。这些评级为比较 SAS 和根据其行为做出明智选择提供了原则性依据。这些评级也有利于所有用户,特别是那些重复使用现成的 SAS 构建大型人工智能系统,但却无法获得其代码或训练数据进行比较的开发人员。
Rating Sentiment Analysis Systems for Bias Through a Causal Lens
Sentiment Analysis Systems (SASs) are data-driven Artificial Intelligence (AI) systems that assign one or more numbers to convey the polarity and emotional intensity of a given piece of text. However, like other automatic machine learning systems, SASs can exhibit model uncertainty, resulting in drastic swings in output even with small changes in input. This issue becomes more problematic when inputs involve protected attributes like gender or race, as it can be perceived as bias or unfairness. To address this, we propose a novel method to assess and rate SASs. We perturb inputs in a controlled causal setting to test whether the output sentiment is sensitive to protected attributes while keeping other components of the textual input, such as chosen emotion words, fixed. Based on the results, we assign labels (ratings) at both fine-grained and overall levels to indicate the robustness of the SAS to input changes. The ratings can help decision-makers improve online content by reducing hate speech, which is often fueled by biases related to protected attributes such as gender and race. These ratings provide a principled basis for comparing SASs and making informed choices based on their behavior. The ratings also benefit all users, especially developers who reuse off-the-shelf SASs to build larger AI systems but do not have access to their code or training data for comparison.
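To make the perturbation idea concrete, below is a minimal sketch of a black-box check in the spirit the abstract describes: sentences within a group share the same emotion word and differ only in a protected-attribute value, so any spread in the scores can be attributed to that attribute. The sentence template, the toy scorer, and the rating thresholds are illustrative assumptions, not the paper's actual protocol.

```python
# Sketch of a perturbation-based bias check for a black-box SAS.
# Template, toy scorer, and thresholds are hypothetical placeholders.
from typing import Callable, List, Tuple

def build_groups(emotion_words: List[str],
                 attribute_values: List[str]) -> List[List[str]]:
    """For each fixed emotion word, generate one sentence per value of the
    protected attribute (here, gendered names), so sentences within a group
    differ only in that attribute."""
    template = "{name} is feeling {emotion}."  # hypothetical template
    return [[template.format(name=n, emotion=e) for n in attribute_values]
            for e in emotion_words]

def rate_sas(score: Callable[[str], float],
             groups: List[List[str]],
             tol: float = 0.05) -> Tuple[float, str]:
    """Score every sentence and take the worst-case within-group spread as
    the sensitivity to the protected attribute; map it to a coarse rating
    (the tol thresholds here are made up for illustration)."""
    worst = max(max(scores) - min(scores)
                for scores in ([score(s) for s in g] for g in groups))
    if worst <= tol:
        label = "unbiased"
    elif worst <= 3 * tol:
        label = "weakly biased"
    else:
        label = "biased"
    return worst, label

# Stand-in for an off-the-shelf SAS whose internals we cannot inspect:
# a toy scorer that deliberately leaks the subject's name into the score.
def toy_sas(text: str) -> float:
    base = 1.0 if "happy" in text else -1.0
    return base + (0.3 if text.startswith("John") else 0.0)

groups = build_groups(["happy", "sad"], ["John", "Mary"])
spread, label = rate_sas(toy_sas, groups)
print(f"worst-case spread = {spread:.2f} -> rating: {label}")
```

Because the check only calls the scorer as a function of the input text, it applies equally to closed-source systems: a developer evaluating a third-party SAS would swap toy_sas for a call to the vendor's API and report the resulting rating alongside the fine-grained per-group spreads.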