Big Data & Society: Latest Articles (Page 4)

Learning machine learning: On the political economy of big tech's online AI courses

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231153806

Inga Luchs, C. Apprich, M. Broersma

Machine learning (ML) algorithms are still a novel research object in the field of media studies. While existing research focuses on concrete software on the one hand and the socio-economic context of the development and use of these systems on the other, this paper studies online ML courses as a research object that has received little attention so far. By pursuing a walkthrough and critical discourse analysis of Google's Machine Learning Crash Course and IBM's introductory course to Machine Learning with Python, we not only shed light on the technical knowledge, assumptions, and dominant infrastructures of ML as a field of practice, but also on the economic interests of the companies providing the courses. We demonstrate how the online courses further support Google and IBM to consolidate and even expand their position of power by recruiting new AI talent and by securing their infrastructures and models to become the dominant ones. Further, we show how the companies not only influence greatly how ML is represented, but also how these representations in turn influence and direct current ML research and development, as well as the societal effects of their products. Here, they boast an image of fair and democratic artificial intelligence, which stands in stark contrast to the ubiquity of their corporate products and the advertised directives of efficiency and performativity the companies strive for. This underlines the need for alternative infrastructures and perspectives.

Citations: 2

Machine learning, meaning making: On reading computer science texts

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231166887

Louise Amoore, Alexander Campolo, Benjamin N. Jacobsen, Ludovico Rella

Computer science tends to foreclose the reading of its texts by social science and humanities scholars – via code and scale, mathematics, black box opacities, secret or proprietary models. Yet, when computer science papers are read in order to better understand what machine learning means for societies, a form of reading is brought to bear that is not primarily about excavating the hidden meaning of a text or exposing underlying truths about science. Not strictly reading to make sense or to discern definitive meaning of computer science texts, reading is an engagement with the sense-making and meaning-making that takes place. We propose a strategy for reading computer science that is attentive to the act of reading itself, that stays close to the difficulty involved in all forms of reading, and that works with the text as already properly belonging to the ethico-politics that this difficulty engenders. Addressing a series of three “reading problems” – genre, readability, and meaning – we discuss machine learning textbooks and papers as sites where today's algorithmic models are actively giving accounts of their paradigmatic worldview. Much more than matters of technical definition or proof of concept, texts are sites where concepts are forged and contested. In our times, when the political application of AI and machine learning is so commonly geared to settle or predict difficult societal problems in advance, a reading strategy must open the gaps and difficulties of that which cannot be settled or resolved.

Citations: 1

Formally comparing topic models and human-generated qualitative coding of physician mothers’ experiences of workplace discrimination

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517221149106

Adam S. Miner, Sheridan A Stewart, M. Halley, Laura K. Nelson, Eleni Linos

Differences between computationally generated and human-generated themes in unstructured text are important to understand yet difficult to assess formally. In this study, we bridge these approaches through two contributions. First, we formally compare a primarily computational approach, topic modeling, to a primarily human-driven approach, qualitative thematic coding, in an impactful context: physician mothers’ experience of workplace discrimination. Second, we compare our chosen topic model to a principled alternative topic model to make explicit study design decisions meriting consideration in future research. By formally contrasting computationally generated (i.e. topic modeling) and human-generated (i.e. thematic coding) knowledge, we shed light on issues of interest to several audiences, notably computational social scientists who wish to understand study design tradeoffs, and qualitative researchers who may wish to leverage computational methods to improve the speed and reproducibility of labor-intensive coding. Although useful in other domains, we highlight the value of fast, reproducible methods to better understand experiences of workplace discrimination.

Citations: 0

Based and confused: Tracing the political connotations of a memetic phrase across the web

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231163175

S. Hagen, D. de Zeeuw

Current research on the weaponisation of far-right discourse online has mostly focused on the dangers of normalising hate speech. However, this often operates on questionable assumptions about how far-right terms retain problematic meanings over time and across different platforms. Yet contextual meaning-change, we argue, is key to assessing the normalisation of problematic but fuzzy terms as they spread across the Web. To redress this, our article traces the changing meaning of the term ‘based’, a word that was appropriated from Black Twitter to become a staple of online far-right slang in the mid-2010s. Through a quali-quantitative cross-platform approach, we analyse the evolution of the term between 2010 and 2021 on Twitter, Reddit and 4chan. We find that while the far-right meaning of ‘based’ partially survived, its meaning changed and was rendered diffuse as it was adopted by other communities, afforded by a repurposable kernel of meaning to ‘based’ as ‘not caring about what other people think’ and ‘being true to yourself’ to which different (political) connotations were attached. This challenges the understanding of far-right memes and hate speech as carrying a single and persistent problematic message, and instead emphasises their varied meanings and subcultural functions within specific online communities.

Citations: 1

‘I started seeing shadows everywhere’: The diverse chilling effects of surveillance in Zimbabwe

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231158631

A. Stevens, P. Fussey, Daragh Murray, Kuda Hove, Otto Saki

Recent years have witnessed growing ubiquity and potency of state surveillance measures with heightened implications for human rights and social justice. While impacts of surveillance are routinely framed through ‘privacy’ narratives, emphasising ‘chilling effects’ surfaces a more complex range of harms and rights implications for those who are, or believe they are, subjected to surveillance. Although first emphasised during the McCarthy era, surveillance ‘chilling effects’ remain under-researched, particularly in Africa. Drawing on rare interview data from participants subjected to state-sponsored surveillance in Zimbabwe, the paper reveals complex assemblages of state and non-state actors involved in diverse and expansive hybrid online–offline monitoring. While scholarship has recently emphasised the importance of large-scale digital mass surveillance, the Zimbabwean context reveals complex assemblages of ‘big data’, social media and other digital monitoring combining with more traditional human surveillance practices. Such inseparable online–offline imbrications compound the scale, scope and impact of surveillance and invite analyses as an integrated ensemble. The paper evidences how these surveillance activities exert chilling effects that vary in form, scope and intensity, and implicate rights essential to the development of personal identity and effective functioning of participatory democracy. Moreover, the data reveals impacts beyond the individual to the vicarious and collective. These include gendered dimensions, eroded interpersonal trust and the depleted ability of human rights defenders to organise and participate in democratic processes. Overall, surveillance chilling effects exert a wide spectrum of outcomes which consequently interfere with enjoyment of multiple rights and hold both short- and long-term implications for democratic participation.

Citations: 2

Clicks and particulates: Value, alienation, and attunement as unifying themes in big data studies

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231184891

G. Ottinger, K. Bronson, D. Nafus

Critiques of data colonialism and surveillance capitalism focus on data collected from online behavior. We propose that analytical concepts from these critiques—namely, regimes of value and patterns of alienation and attunement—could be applied more widely to better understand the threats that datafication poses to equity and democracy in the social and environmental realms. Regimes of value, which include the institutions and technologies that make data meaningful and render them selectively available for appropriation, are relevant both to for-profit companies’ data practices and to states’ participation in the datafication of the environment; examining regimes of value raises questions about how data are exploited and how they are neglected. Patterns of alienation associated with datafication include the potential for alienation from the environment; however, at least in some value regimes, alienation may be accompanied by possibilities for attunement to natural and social phenomena that might otherwise have escaped notice.

Citations: 0

FAIR data sharing: An international perspective on why medical researchers are lagging behind

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231171052

L. Rainey, J. Lutomski, M. Broeders

FAIR data, that is, Findable, Accessible, Interoperable, and Reusable data, and Big Data intersect across issues related to data storage, access, and processing. The solution-oriented FAIR principles serve an integral role in improving Big Data; yet to date, the implementation of FAIR in multiple sectors has been fragmented. We conducted an exploratory analysis to identify incentives and barriers in creating FAIR data in the medical sector using digital concept mapping, a systematic mixed methods approach. Thirty-eight principal investigators (PIs) were recruited from North America, Europe, and Oceania. Our analysis revealed five clusters rated according to perceived relevance: ‘Efficiency and collaboration’ (rating 7.23), ‘Privacy and security’ (rating 7.18), ‘Data management standards’ (rating 7.16), ‘Organization of services’ (rating 6.98), and ‘Ownership’ (rating 6.28). All five clusters scored relatively high and within a narrow range (i.e., 6.28–7.69), implying that each cluster likely influences researchers’ decision-making processes. PIs harbor a positive view of FAIR data sharing, as exemplified by participants highly prioritizing ‘Efficiency and collaboration’. However, the other four clusters received only modestly lower ratings and largely contained barriers to FAIR data sharing. When viewed collectively, the benefits of efficiency and collaboration may not be sufficient in propelling FAIR data sharing. Arguably, until more of these reported barriers are addressed, widespread support of FAIR data will not translate into widespread practice. This research lays the preliminary foundation for conducting targeted large-scale research into FAIR data practices in the medical research community.

Citations: 0

Data arenas: The relational dynamics of data activism

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231177617

Bartosz Ślosarski

The article proposes the theoretical category of data arenas as a relational field for strategic actors in diverse areas of the contentious politics of data (Beraldo and Milan, 2019). The paper argues that the conceptualization of data activism needs to be related to the immediate data arena in which the action takes place, in order to select the interactive opportunities and threats for emerging data-driven repertoires of action. To fully work through the relational dynamics of data activism, it is necessary to move from a conceptualization of data infrastructure to the notion of data arenas as an ‘open-ended bundle of rules and resources that allows certain kinds of interaction to proceed’ (Jasper, 2006: 141). Using the case of environmental data activism, I highlight four key dimensions to study: (a) strategic use of data as capital that differentiates and positions actors, as well as influences their further choices; (b) practices of defining the boundaries of the problem on which the arena focuses and outlining the pool of actors who participate in the process of solving it; (c) sets of relationships among the outlined pool of actors which represent opportunities and threats for the actors, related to the position they occupy within an arena; and (d) power as the ability to control and shape an arena. The data arena approach sheds new light on data activism as a relational practice, combining the latest developments in research on data contexts and the political situatedness of data with the emerging field of research on data activism.

Citations: 0

Digital identity as platform-mediated surveillance

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517221135176

S. Masiero

Digital identity systems are usually viewed as datafiers of existing populations. Yet a platform view finds limited space in the digital identity discourse, with the result that the platform features of digital identity systems are not seen in relation to their surveillance outcomes. In this commentary I illuminate how the core platform properties of digital identity systems afford the undue surveillance of vulnerable groups, leading users into the binary condition of either registering and being profiled, or giving up essential benefits from providers of development programmes. By doing so I contest the “dark side” narrative often applied to digital identity, arguing that, rather than just a side, it is the very inner matter of digital identity platforms that enables surveillance outcomes.


Citations: 2

The ethical dimensions of Google autocomplete

IF 8.5 Zone 1 (Sociology) Q1 SOCIAL SCIENCES, INTERDISCIPLINARY

Big Data & Society

Pub Date : 2023-01-01 DOI: 10.1177/20539517231156518

Rosie Graham

What questions should we ask of Google’s Autocomplete suggestions? This article highlights some of the key ethical issues raised by Google’s automated suggestion tool that provides potential queries below a user’s search box. Much of the discourse surrounding Google’s suggestions has been framed through legal cases in which complex issues can become distilled into black-and-white questions of the law. For example, do Google have to remove a particular suggestion and do they have to pay a settlement for damages? This commentary argues that shaping this discourse along primarily legal lines obscures many of these other moral dimensions raised by Google Autocomplete. Building from existing typologies, this commentary first outlines the legal discourse before exploring five additional ethical challenges, each framed around a particular moral question in which all users have a stake. Written in the form of a commentary, the purpose of this article is not to conclusively answer the ethical questions raised, but rather to give an account of why these particular questions are worth debating. Autocomplete’s suggestions are not simply a mirror of what users are typing into Google’s search bar. Google’s official statement is that “Autocomplete is a time-saving but complex feature. It doesn’t simply display the most common queries on a given topic” but “also predict[s] individual words and phrases that are based on both real searches as well as word patterns found across the web” (Google, 2022). Both its underlying methods and associated terminology have changed throughout time, shifting between providing completions, suggestions, and predictions. In doing so, the grounds for potential critique are ever-changing, which means that Google’s approach to Autocomplete deserves significant scrutiny.



Citations: 2