Big Data & Society: Latest Articles

Formally comparing topic models and human-generated qualitative coding of physician mothers’ experiences of workplace discrimination
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517221149106
Adam S. Miner, Sheridan A Stewart, M. Halley, Laura K. Nelson, Eleni Linos
Differences between computationally generated and human-generated themes in unstructured text are important to understand yet difficult to assess formally. In this study, we bridge these approaches through two contributions. First, we formally compare a primarily computational approach, topic modeling, to a primarily human-driven approach, qualitative thematic coding, in an impactful context: physician mothers’ experience of workplace discrimination. Second, we compare our chosen topic model to a principled alternative topic model to make explicit study design decisions meriting consideration in future research. By formally contrasting computationally generated (i.e. topic modeling) and human-generated (i.e. thematic coding) knowledge, we shed light on issues of interest to several audiences, notably computational social scientists who wish to understand study design tradeoffs, and qualitative researchers who may wish to leverage computational methods to improve the speed and reproducibility of labor-intensive coding. Although useful in other domains, we highlight the value of fast, reproducible methods to better understand experiences of workplace discrimination.
Citations: 0
Deleterious consequences: How Google's original sociotechnical affordances ultimately shaped ‘trusted users’ in surveillance capitalism
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231171058
Renée Ridgway
Google dominates around 92% of the search market worldwide (as of November 2022), with most of its revenue derived from search advertising. However, Google's hegemony over search and the resulting implications are not necessarily accidental, arbitrary or (un)intentional. This article revisits Brin and Page's original paper, drawing on six of their key innovations, concerns and design choices (counting citations or backlinks, trusted user, advertising, personalization, usage data, smart algorithms) to explain the evolution of Google's hypertext search engine technologies through ‘moments of contingency’, which led to corporate lock-ins. Underpinned by analyses of patents, statements and secondary sources, it elucidates how early Google considerations and certain affordances not only came to shape the web (backlinks, trusted user, advertising) but subsequently facilitated contemporary surveillance capitalism. Building upon Zuboff's ‘Big Other’, it describes the ways in which Google as an infrastructure is intertwined with Big Data's platformization and the ad infinitum collection of usage data, beyond just personalization. This extraction and refinement of usage data as ‘behavioural surplus’ results in ‘deleterious consequences’: a ‘habit of automaticity,’ which shapes the trusted user through ‘ubiquitous googling’ and smart algorithms, whilst simultaneously generating prediction products for surveillance capitalism. Advancing Latour's ‘predicting the path’ of technological innovation, this cause-and-effect story contributes a new taxonomy of Google sociotechnical affordances to critical STS, media history and web search literature.
Citations: 1
Learning machine learning: On the political economy of big tech's online AI courses
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231153806
Inga Luchs, C. Apprich, M. Broersma
Machine learning (ML) algorithms are still a novel research object in the field of media studies. While existing research focuses on concrete software on the one hand and the socio-economic context of the development and use of these systems on the other, this paper studies online ML courses as a research object that has received little attention so far. By pursuing a walkthrough and critical discourse analysis of Google's Machine Learning Crash Course and IBM's introductory course to Machine Learning with Python, we not only shed light on the technical knowledge, assumptions, and dominant infrastructures of ML as a field of practice, but also on the economic interests of the companies providing the courses. We demonstrate how the online courses further support Google and IBM to consolidate and even expand their position of power by recruiting new AI talent and by securing their infrastructures and models to become the dominant ones. Further, we show how the companies not only influence greatly how ML is represented, but also how these representations in turn influence and direct current ML research and development, as well as the societal effects of their products. Here, they boast an image of fair and democratic artificial intelligence, which stands in stark contrast to the ubiquity of their corporate products and the advertised directives of efficiency and performativity the companies strive for. This underlines the need for alternative infrastructures and perspectives.
Citations: 2
Machine learning, meaning making: On reading computer science texts
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231166887
Louise Amoore, Alexander Campolo, Benjamin N. Jacobsen, Ludovico Rella
Computer science tends to foreclose the reading of its texts by social science and humanities scholars – via code and scale, mathematics, black box opacities, secret or proprietary models. Yet, when computer science papers are read in order to better understand what machine learning means for societies, a form of reading is brought to bear that is not primarily about excavating the hidden meaning of a text or exposing underlying truths about science. Not strictly reading to make sense or to discern definitive meaning of computer science texts, reading is an engagement with the sense-making and meaning-making that takes place. We propose a strategy for reading computer science that is attentive to the act of reading itself, that stays close to the difficulty involved in all forms of reading, and that works with the text as already properly belonging to the ethico-politics that this difficulty engenders. Addressing a series of three “reading problems” – genre, readability, and meaning – we discuss machine learning textbooks and papers as sites where today's algorithmic models are actively giving accounts of their paradigmatic worldview. Much more than matters of technical definition or proof of concept, texts are sites where concepts are forged and contested. In our times, when the political application of AI and machine learning is so commonly geared to settle or predict difficult societal problems in advance, a reading strategy must open the gaps and difficulties of that which cannot be settled or resolved.
Citations: 1
Clicks and particulates: Value, alienation, and attunement as unifying themes in big data studies
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231184891
G. Ottinger, K. Bronson, D. Nafus
Critiques of data colonialism and surveillance capitalism focus on data collected from online behavior. We propose that analytical concepts from these critiques—namely, regimes of value and patterns of alienation and attunement—could be applied more widely to better understand the threats that datafication poses to equity and democracy in the social and environmental realms. Regimes of value, which include the institutions and technologies that make data meaningful and render them selectively available for appropriation, are relevant both to for-profit companies’ data practices and to states’ participation in the datafication of the environment; examining regimes of value raises questions about how data are exploited and how they are neglected. Patterns of alienation associated with datafication include the potential for alienation from the environment; however, at least in some value regimes, alienation may be accompanied by possibilities for attunement to natural and social phenomena that might otherwise have escaped notice.
Citations: 0
Digital identity as platform-mediated surveillance
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517221135176
S. Masiero
Digital identity systems are usually viewed as datafiers of existing populations. Yet a platform view finds limited space in the digital identity discourse, with the result that the platform features of digital identity systems are not seen in relation to their surveillance outcomes. In this commentary I illuminate how the core platform properties of digital identity systems afford the undue surveillance of vulnerable groups, leading users into the binary condition of either registering and being profiled, or giving up essential benefits from providers of development programmes. By doing so I contest the “dark side” narrative often applied to digital identity, arguing that, rather than just a side, it is the very inner matter of digital identity platforms that enables surveillance outcomes.
Citations: 2
The ethical dimensions of Google autocomplete
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231156518
Rosie Graham
What questions should we ask of Google’s Autocomplete suggestions? This article highlights some of the key ethical issues raised by Google’s automated suggestion tool that provides potential queries below a user’s search box. Much of the discourse surrounding Google’s suggestions has been framed through legal cases in which complex issues can become distilled into black-and-white questions of the law. For example, do Google have to remove a particular suggestion and do they have to pay a settlement for damages? This commentary argues that shaping this discourse along primarily legal lines obscures many of these other moral dimensions raised by Google Autocomplete. Building from existing typologies, this commentary first outlines the legal discourse before exploring five additional ethical challenges, each framed around a particular moral question in which all users have a stake. Written in the form of a commentary, the purpose of this article is not to conclusively answer the ethical questions raised, but rather to give an account of why these particular questions are worth debating. Autocomplete’s suggestions are not simply a mirror of what users are typing into Google’s search bar. Google’s official statement is that “Autocomplete is a time-saving but complex feature. It doesn’t simply display the most common queries on a given topic” but “also predict[s] individual words and phrases that are based on both real searches as well as word patterns found across the web” (Google, 2022). Both its underlying methods and associated terminology have changed throughout time, shifting between providing completions, suggestions, and predictions. In doing so, the grounds for potential critique are ever-changing, which means that Google’s approach to Autocomplete deserves significant scrutiny.
Citations: 2
FAIR data sharing: An international perspective on why medical researchers are lagging behind
IF 8.5 | Zone 1, Sociology | Q1 Social Sciences | Pub Date: 2023-01-01 | DOI: 10.1177/20539517231171052
L. Rainey, J. Lutomski, M. Broeders
FAIR data, that is, Findable, Accessible, Interoperable, and Reusable data, and Big Data intersect across issues related to data storage, access, and processing. The solution-oriented FAIR principles serve an integral role in improving Big Data; yet to date, the implementation of FAIR in multiple sectors has been fragmented. We conducted an exploratory analysis to identify incentives and barriers in creating FAIR data in the medical sector using digital concept mapping, a systematic mixed methods approach. Thirty-eight principal investigators (PIs) were recruited from North America, Europe, and Oceania. Our analysis revealed five clusters rated according to perceived relevance: ‘Efficiency and collaboration’ (rating 7.23), ‘Privacy and security’ (rating 7.18), ‘Data management standards’ (rating 7.16), ‘Organization of services’ (rating 6.98), and ‘Ownership’ (rating 6.28). All five clusters scored relatively high and within a narrow range (i.e., 6.28–7.69), implying that each cluster likely influences researchers’ decision-making processes. PIs harbor a positive view of FAIR data sharing, as exemplified by participants highly prioritizing ‘Efficiency and collaboration’. However, the other four clusters received only modestly lower ratings and largely contained barriers to FAIR data sharing. When viewed collectively, the benefits of efficiency and collaboration may not be sufficient in propelling FAIR data sharing. Arguably, until more of these reported barriers are addressed, widespread support of FAIR data will not translate into widespread practice. This research lays the preliminary foundation for conducting targeted large-scale research into FAIR data practices in the medical research community.
Citations: 0
Because the machine can discriminate: How machine learning serves and transforms biological explanations of human difference.
IF 8.5 Zone 1 Sociology Q1 Social Sciences Pub Date : 2023-01-01 Epub Date: 2023-02-20 DOI: 10.1177/20539517231155060
Jeffrey W Lockhart

Research on scientific/intellectual movements, and social movements generally, tends to focus on resources and conditions outside the substance of the movements, such as funding and publication opportunities or the prestige and networks of movement actors. Drawing on Pinch's theory of technologies as institutions, I argue that research methods can also serve as resources for scientific movements by institutionalizing their ideas in research practice. I demonstrate the argument with the case of neuroscience, where the adoption of machine learning changed how scientists think about measurement and modeling of group difference. This provided an opportunity for members of the sex difference movement by offering a 'truly categorical' quantitative methodology that aligned more closely with their understanding of male and female brains and bodies as categorically distinct. The result was a flurry of publications and symbiotic relationships with other researchers that rescued a scientific movement which had been growing increasingly untenable under the prior methodological regime of univariate, frequentist analyses. I call for increased sociological attention to the inner workings of technologies that we typically black box in light of their potential consequences for the social world. I also suggest that machine learning in particular might have wide-reaching implications for how we conceive of human groups beyond sex, including race, sexuality, criminality, and political position, where scientists are just beginning to adopt its methods.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10704893/pdf/
Citations: 1
Data arenas: The relational dynamics of data activism
IF 8.5 Zone 1 Sociology Q1 Social Sciences Pub Date : 2023-01-01 DOI: 10.1177/20539517231177617
Bartosz Ślosarski
The article proposes the theoretical category of data arenas as a relational field for strategic actors in diverse areas of the contentious politics of data (Beraldo and Milan, 2019). The paper argues that the conceptualization of data activism needs to be related to the immediate data arena in which the action takes place, in order to select the interactive opportunities and threats for emerging data-driven repertoires of action. To fully work through the relational dynamics of data activism, it is necessary to move from a conceptualization of data infrastructure to the notion of data arenas as an ‘open-ended bundle of rules and resources that allows certain kinds of interaction to proceed’ (Jasper, 2006: 141). Using the case of environmental data activism, I highlight four key dimensions to study: (a) strategic use of data as capital that differentiates and positions actors, as well as influences their further choices; (b) practices of defining the boundaries of the problem on which the arena focuses and outlining the pool of actors who participate in the process of solving it; (c) sets of relationships among the outlined pool of actors which represent opportunities and threats for the actors, related to the position they occupy within an arena; and (d) power as the ability to control and shape an arena. The data arena approach sheds new light on data activism as a relational practice, combining the latest developments in research on data contexts and the political situatedness of data with the emerging field of research on data activism.
Citations: 0