Developing a sentence level fairness metric using word embeddings.

Ahmed Izzidien, Stephen Fitz, Peter Romero, Bao S Loe, David Stillwell
{"title":"Developing a sentence level fairness metric using word embeddings.","authors":"Ahmed Izzidien, Stephen Fitz, Peter Romero, Bao S Loe, David Stillwell","doi":"10.1007/s42803-022-00049-4","DOIUrl":null,"url":null,"abstract":"<p><p>Fairness is a principal social value that is observable in civilisations around the world. Yet, a fairness metric for digital texts that describe even a simple social interaction, e.g., 'The boy hurt the girl' has not been developed. We address this by employing word embeddings that use factors found in a new social psychology literature review on the topic. We use these factors to build fairness vectors. These vectors are used as sentence level measures, whereby each dimension reflects a fairness component. The approach is employed to approximate human perceptions of fairness. The method leverages a pro-social bias within word embeddings, for which we obtain an F1 = 79.8 on a list of sentences using the Universal Sentence Encoder (USE). A second approach, using principal component analysis (PCA) and machine learning (ML), produces an F1 = 86.2. Repeating these tests using Sentence Bidirectional Encoder Representations from Transformers (SBERT) produces an F1 = 96.9 and F1 = 100 respectively. Improvements using subspace representations are further suggested. By proposing a first-principles approach, the paper contributes to the analysis of digital texts along an ethical dimension.</p>","PeriodicalId":91018,"journal":{"name":"International journal of digital humanities","volume":" ","pages":"1-36"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9549858/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International journal of digital humanities","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s42803-022-00049-4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Fairness is a principal social value that is observable in civilisations around the world. Yet, a fairness metric for digital texts that describe even a simple social interaction, e.g., 'The boy hurt the girl' has not been developed. We address this by employing word embeddings that use factors found in a new social psychology literature review on the topic. We use these factors to build fairness vectors. These vectors are used as sentence level measures, whereby each dimension reflects a fairness component. The approach is employed to approximate human perceptions of fairness. The method leverages a pro-social bias within word embeddings, for which we obtain an F1 = 79.8 on a list of sentences using the Universal Sentence Encoder (USE). A second approach, using principal component analysis (PCA) and machine learning (ML), produces an F1 = 86.2. Repeating these tests using Sentence Bidirectional Encoder Representations from Transformers (SBERT) produces an F1 = 96.9 and F1 = 100 respectively. Improvements using subspace representations are further suggested. By proposing a first-principles approach, the paper contributes to the analysis of digital texts along an ethical dimension.

Abstract Image

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
利用词嵌入开发句子级公平度量。
公平是一种主要的社会价值观,在世界各地的文明中都可以看到。然而,即使是描述简单社会互动(如 "男孩伤害了女孩")的数字文本,也尚未开发出公平度量标准。为了解决这个问题,我们采用了词嵌入技术,使用了在有关该主题的最新社会心理学文献综述中发现的因素。我们利用这些因素来构建公平性向量。这些向量被用作句子级别的衡量标准,其中每个维度都反映了公平性的组成部分。该方法可用于近似人类对公平性的感知。该方法利用了词嵌入中的亲社会偏差,我们在使用通用句子编码器(USE)的句子列表中获得了 F1 = 79.8 的结果。第二种方法使用主成分分析(PCA)和机器学习(ML),得出的 F1 = 86.2。使用来自变换器的句子双向编码器表示法(SBERT)重复这些测试,F1 = 96.9,F1 = 100。我们还提出了使用子空间表示法进行改进的建议。通过提出第一原理方法,本文为从伦理维度分析数字文本做出了贡献。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
RelChronVis: an interactive web application for visualizing the relative chronology of language changes DOLD: a digital platform for conducting online language experiments and surveys Correction: Committing to reproducibility and explainability: using Git as a research journal Correction: Committing to reproducibility and explainability: using Git as a research journal Open Times: The future of critique in the age of (un)replicability
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1