Bias in human data: A feedback from social sciences

Savaş Takan, Duygu Ergün, Sinem Getir Yaman, Onur Kılınççeker
{"title":"Bias in human data: A feedback from social sciences","authors":"Savaş Takan, Duygu Ergün, Sinem Getir Yaman, Onur Kılınççeker","doi":"10.1002/widm.1498","DOIUrl":null,"url":null,"abstract":"Abstract The fairness of human‐related software has become critical with its widespread use in our daily lives, where life‐changing decisions are made. However, with the use of these systems, many erroneous results emerged. Technologies have started to be developed to tackle unexpected results. As for the solution to the issue, companies generally focus on algorithm‐oriented errors. The utilized solutions usually only work in some algorithms. Because the cause of the problem is not just the algorithm; it is also the data itself. For instance, deep learning cannot establish the cause–effect relationship quickly. In addition, the boundaries between statistical or heuristic algorithms are unclear. The algorithm's fairness may vary depending on the data related to context. From this point of view, our article focuses on how the data should be, which is not a matter of statistics. In this direction, the picture in question has been revealed through a scenario specific to “vulnerable and disadvantaged” groups, which is one of the most fundamental problems today. With the joint contribution of computer science and social sciences, it aims to predict the possible social dangers that may arise from artificial intelligence algorithms using the clues obtained in this study. To highlight the potential social and mass problems caused by data, Gerbner's “cultivation theory” is reinterpreted. To this end, we conduct an experimental evaluation on popular algorithms and their data sets, such as Word2Vec, GloVe, and ELMO. The article stresses the importance of a holistic approach combining the algorithm, data, and an interdisciplinary assessment. This article is categorized under: Algorithmic Development > Statistics","PeriodicalId":500599,"journal":{"name":"WIREs Data Mining and Knowledge Discovery","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"WIREs Data Mining and Knowledge Discovery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/widm.1498","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The fairness of human-related software has become critical as it is widely used in daily life to make life-changing decisions. With the use of these systems, however, many erroneous results have emerged, and technologies are now being developed to address them. Companies generally focus on algorithm-oriented errors, but the resulting solutions typically work only for particular algorithms, because the cause of the problem is not just the algorithm; it is also the data itself. Deep learning, for instance, cannot readily establish cause-effect relationships, the boundaries between statistical and heuristic algorithms are unclear, and an algorithm's fairness may vary with the context of the data. From this point of view, our article focuses on what the data should look like, which is not purely a matter of statistics. The picture in question is revealed through a scenario specific to "vulnerable and disadvantaged" groups, one of the most fundamental problems today. Drawing on the joint contribution of computer science and the social sciences, the article aims to anticipate the social dangers that may arise from artificial intelligence algorithms, using the clues obtained in this study. To highlight the potential social and mass-scale problems caused by data, Gerbner's "cultivation theory" is reinterpreted. To this end, we conduct an experimental evaluation on popular algorithms and their data sets, such as Word2Vec, GloVe, and ELMo. The article stresses the importance of a holistic approach combining the algorithm, the data, and an interdisciplinary assessment. This article is categorized under: Algorithmic Development > Statistics
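To give a concrete sense of the kind of data-driven bias probe the abstract alludes to, the sketch below measures gender associations in pretrained GloVe word vectors via cosine similarity, in the spirit of common word-embedding association tests. This is an illustrative minimal example, not the authors' exact experimental protocol; it assumes the gensim library and its downloadable "glove-wiki-gigaword-100" vectors are available.

```python
# Minimal sketch of an embedding-bias probe (assumed setup, not the paper's protocol):
# compare how strongly occupation words associate with male vs. female attribute terms
# in pretrained GloVe vectors, using cosine similarity.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-100")  # pretrained GloVe KeyedVectors

male_terms = ["he", "man", "male"]
female_terms = ["she", "woman", "female"]
occupations = ["doctor", "nurse", "engineer", "teacher", "programmer"]

def mean_similarity(word, attribute_terms):
    """Average cosine similarity between a word and a set of attribute terms."""
    return sum(model.similarity(word, a) for a in attribute_terms) / len(attribute_terms)

for occ in occupations:
    bias = mean_similarity(occ, male_terms) - mean_similarity(occ, female_terms)
    # Positive values lean toward the male terms, negative toward the female terms.
    print(f"{occ:12s} gender-association score: {bias:+.3f}")
```

Scores far from zero for occupation words are a signal that the association comes from the training corpus itself, which is the article's point that fairness problems originate in the data as much as in the algorithm.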
