
Latest publications in Ethics and Information Technology

The fundamental rights risks of countering cognitive warfare with artificial intelligence.
IF 4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-10-06 · DOI: 10.1007/s10676-025-09868-9
Henning Lahmann, Bart Custers, Benjamyn I Scott

This article analyses ideas to use AI-supported systems to counter 'cognitive warfare' and critically examines the implications of such systems for fundamental rights and values. After explicating the notion of 'cognitive warfare' as used in contemporary public security discourse, the article describes the emergence of generative AI tools that are expected to exacerbate the problem of adversarial activities against the online information ecosystems of democratic societies. In response, researchers and policymakers have proposed to utilize AI to devise countermeasures, ranging from AI-based early warning systems to state-run content moderation tools. These interventions, however, interfere, to different degrees, with fundamental rights and values such as privacy, communication rights, and self-determination. This article argues that such proposals insufficiently account for the complexity of contemporary online information ecosystems, particularly the inherent difficulty in establishing causality and attribution. Reliance on the precautionary principle might offer a justificatory frame for AI-enabled measures to counter 'cognitive warfare' in the absence of conclusive empirical evidence of harm. However, any such state intervention must be based in law and adhere to strict proportionality.

Citations: 0
Establishing human responsibility and accountability at early stages of the lifecycle for AI-based defence systems.
IF 4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-10-06 · DOI: 10.1007/s10676-025-09862-1
Ariel Conn, Ingvild Bode

The use of AI technologies in weapons systems has triggered a decade-long international debate, especially with regard to human control, responsibility, and accountability around autonomous and intelligent systems (AIS) in defence. However, most of these ethical and legal discussions have revolved primarily around the point of use of a hypothetical AIS, and in doing so, one critical component still remains under-appreciated: human decision-making across the full timeline of the AIS lifecycle. When discussions around human involvement start at the point at which a hypothetical AIS has taken some undesirable action, they typically prompt the question: "what happens next?" This approach primarily concerns the technology at the time of use and may be appropriate for conventional weapons systems, for which humans have clear lines of control and therefore accountability at the time of use. However, this is not precisely the case for AIS. Rather than focusing first on the system in its comparatively most autonomous state, it is more helpful to consider when, along the lifecycle, humans have more clear, direct control over the system (e.g. through research, design, testing, or procurement) and how, at those earlier times, human decision-makers can take steps to decrease the likelihood that an AIS will perform 'inappropriately' or take incorrect actions. In this paper, we therefore argue that addressing many arising concerns requires a shift in how and when participants of the international debate on AI in the military domain think about, talk about, and plan for human involvement across the full lifecycle of AIS in defence. This shift includes a willingness to hold human decision-makers accountable, even if their roles occurred at much earlier stages of the lifecycle. Of course, this raises another question: "How?" We close by formulating a number of recommendations, including the adoption of the IEEE-SA Lifecycle Framework, the consideration of policy knots, and the adoption of Human Readiness Levels.

Citations: 0
Reasons underdetermination in meaningful human control.
IF 4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-10-15 · DOI: 10.1007/s10676-025-09858-x
Atay Kozlovski

The rapid proliferation of AI systems has raised many concerns about safety and responsibility in their design and use. The philosophical framework of Meaningful Human Control (MHC) was developed in response to these concerns, and tries to provide a standard for designing and evaluating such systems. While promising, the framework still requires further theoretical and practical refinement. This paper contributes to that effort by drawing on research in axiology and rational decision theory to identify a critical gap in the framework. Specifically, it argues that while 'reasons' play a central role in MHC, there has been little discussion of the possibility that, when weighed against each other, reasons may not always point to a single, rationally preferable course of action. I refer to these cases as instances of reasons underdetermination, and this paper discusses the need to address this issue within the MHC framework. The paper begins by providing an overview of the key concepts of the MHC framework and then examines the role of 'reasons' in the framework's two main conditions - Tracking and Tracing. It then discusses the phenomenon of reasons underdetermination and shows how it poses a challenge for the achievement of both Tracking and Tracing.

Citations: 0
Personalised care, youth mental health, and digital technology: A value sensitive design perspective and framework.
IF 4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-10-22 · DOI: 10.1007/s10676-025-09866-x
Adam Poulsen, Ian B Hickie, Min K Chong, Haley M LaMonica, Ashlee Turner, Frank Iorfino

Digital health is typically driven, in part, by the principle of personalised care. However, the underlying values and associated ethical design considerations at the intersection of personalised care, youth mental health, and digital technology are underexplored. Through a value sensitive design lens, this work aims to contribute a prototype conceptual framework for the ethical design and evaluation of personalised youth digital mental health technology, which comprises three values (personalisation, empowerment, and autonomy) and 15 design norms as fundamental yet non-exhaustive ethical criteria. Furthermore, it provides illustrative applications of the framework by applying it to (1) the proactive design of two exemplary digital mental health technologies to draw out emerging ethical considerations and (2) the retrospective evaluation of three existing technologies to assess whether they are designed to support personalisation, empowerment, and autonomy. This work creates an understanding of personalised care and related values in this socio-technical context, with key design recommendations going forward for youth digital mental health research, practice, and associated policy.

Supplementary information: The online version contains supplementary material available at 10.1007/s10676-025-09866-x.

Citations: 0
Urban Digital Twins and metaverses towards city multiplicities: uniting or dividing urban experiences?
IF 3.4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2024-11-23 · DOI: 10.1007/s10676-024-09812-3
Javier Argota Sánchez-Vaquerizo

Urban Digital Twins (UDTs) have become the new buzzword for researchers, planners, policymakers, and industry experts when it comes to designing, planning, and managing sustainable and efficient cities. The concept encapsulates the latest iteration of the technocratic and ultra-efficient, post-modernist vision of smart cities. However, while more applications branded as UDTs appear around the world, their conceptualization remains ambiguous. Beyond being technically prescriptive about what UDTs are, this article focuses on their aspects of interaction and operationalization in connection to people in cities, and how, enhanced by metaverse ideas, they can deepen societal divides by offering divergent urban experiences based on different stakeholder preferences. Therefore, this article first repositions the term UDT by comparing existing concrete and located applications that have a focus on interaction and participation, including some that may be closer to the concept of UDT than is commonly assumed. Based on the components found separately in the different studied cases, it is possible to hypothesize about possible future, more advanced realizations of UDTs. This enables us to contrast their positive and negative societal impacts. While the development of new immersive interactive digital worlds can improve planning using collective knowledge for more inclusive and diverse cities, they pose significant risks: not only the common ones regarding privacy, transparency, or fairness, but also social fragmentation based on urban digital multiplicities. The potential benefits and challenges of integrating this multiplicity of UDTs into participatory urban governance emphasize the need for human-centric approaches to promote socio-technical frameworks able to mitigate risks such as social division.

Citations: 0
Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback.
IF 3.4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-06-04 · DOI: 10.1007/s10676-025-09837-2
Adam Dahlgren Lindström, Leila Methnani, Lea Krause, Petter Ericson, Íñigo Martínez de Rituerto de Troya, Dimitri Coelho Mollo, Roel Dobbe

This paper critically evaluates the attempts to align Artificial Intelligence (AI) systems, especially Large Language Models (LLMs), with human values and intentions through Reinforcement Learning from Feedback methods, involving either human feedback (RLHF) or AI feedback (RLAIF). Specifically, we show the shortcomings of the broadly pursued alignment goals of honesty, harmlessness, and helpfulness. Through a multidisciplinary sociotechnical critique, we examine both the theoretical underpinnings and practical implementations of RLHF techniques, revealing significant limitations in their approach to capturing the complexities of human ethics, and contributing to AI safety. We highlight tensions inherent in the goals of RLHF, as captured in the HHH principle (helpful, harmless and honest). In addition, we discuss ethically-relevant issues that tend to be neglected in discussions about alignment and RLHF, among which the trade-offs between user-friendliness and deception, flexibility and interpretability, and system safety. We offer an alternative vision for AI safety and ethics which positions RLHF approaches within a broader context of comprehensive design across institutions, processes and technological systems, and suggest the establishment of AI safety as a sociotechnical discipline that is open to the normative and political dimensions of artificial intelligence.
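For readers unfamiliar with the mechanics under critique, below is a minimal, illustrative sketch of the pairwise (Bradley-Terry) preference loss commonly used to train RLHF reward models. It is not taken from the paper; the function name, tensor names, and toy reward values are hypothetical placeholders.

```python
# Illustrative sketch of the pairwise preference loss behind RLHF reward models.
# Hypothetical example; not code from the paper under discussion.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: push rewards of human-preferred responses above rejected ones."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy scalar rewards for two (chosen, rejected) response pairs.
reward_chosen = torch.tensor([1.2, 0.3])
reward_rejected = torch.tensor([0.7, 0.9])
print(float(preference_loss(reward_chosen, reward_rejected)))
```

In a standard RLHF pipeline, a reward model trained on such pairwise human comparisons is then used to fine-tune the language model, typically with a policy-gradient method such as PPO; the article's critique concerns what these preference labels can and cannot capture about 'helpful, harmless and honest' behaviour.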

Citations: 0
A critique of current approaches to privacy in machine learning.
IF 3.4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2025-01-01 · Epub Date: 2025-06-20 · DOI: 10.1007/s10676-025-09843-4
Florian van Daalen, Marine Jacquemin, Johan van Soest, Nina Stahl, David Townend, Andre Dekker, Inigo Bermejo

Access to large datasets, the rise of the Internet of Things (IoT), and the ease of collecting personal data have led to significant breakthroughs in machine learning. However, they have also raised new concerns about privacy and data protection. Controversies like the Facebook-Cambridge Analytica scandal highlight unethical practices in today's digital landscape. Historical privacy incidents have led to the development of technical and legal solutions to protect data subjects' right to privacy. However, within machine learning, these problems have largely been approached from a mathematical point of view, ignoring the larger context in which privacy is relevant. This technical approach has benefited data-controllers and failed to protect individuals adequately. Moreover, it has aligned with Big Tech organizations' interests and allowed them to further push the discussion in a direction that is favorable to their interests. This paper reflects on current privacy approaches in machine learning and explores how various big organizations guide the public discourse, and how this harms data subjects. It also critiques the current data protection regulations, as they allow superficial compliance without addressing deeper ethical issues. Finally, it argues that redefining privacy to focus on harm to data subjects rather than on data breaches would benefit data subjects as well as society at large.
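As a concrete reference point for the 'mathematical point of view' the authors argue is too narrow, below is a minimal sketch of the Laplace mechanism from differential privacy, one of the most common formal approaches to privacy in machine learning. The example is hypothetical and not drawn from the paper.

```python
# Minimal sketch of the Laplace mechanism (epsilon-differential privacy).
# Hypothetical example; not code from the paper under discussion.
import numpy as np

def laplace_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise scaled to sensitivity / epsilon."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Smaller epsilon means more noise and a stronger formal guarantee, but the
# guarantee itself says nothing about downstream harms to data subjects,
# which is the gap the article emphasises.
print(laplace_count(true_count=1024, epsilon=0.5))
```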

Citations: 0
Engineers on responsibility: feminist approaches to who’s responsible for ethical AI
IF 3.6 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2024-01-02 · DOI: 10.1007/s10676-023-09739-1
Eleanor Drage, Kerry McInerney, Jude Browne
{"title":"Engineers on responsibility: feminist approaches to who’s responsible for ethical AI","authors":"Eleanor Drage, Kerry McInerney, Jude Browne","doi":"10.1007/s10676-023-09739-1","DOIUrl":"https://doi.org/10.1007/s10676-023-09739-1","url":null,"abstract":"","PeriodicalId":51495,"journal":{"name":"Ethics and Information Technology","volume":"109 21","pages":"1-13"},"PeriodicalIF":3.6,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139391278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"哲学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
AI and the need for justification (to the patient).
IF 3.6 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2024-01-01 · Epub Date: 2024-03-04 · DOI: 10.1007/s10676-024-09754-w
Anantharaman Muralidharan, Julian Savulescu, G Owen Schaefer

This paper argues that one problem that besets black-box AI is that it lacks algorithmic justifiability. We argue that the norm of shared decision making in medical care presupposes that treatment decisions ought to be justifiable to the patient. Medical decisions are justifiable to the patient only if they are compatible with the patient's values and preferences and the patient is able to see that this is so. Patient-directed justifiability is threatened by black-box AIs because the lack of rationale provided for the decision makes it difficult for patients to ascertain whether there is adequate fit between the decision and the patient's values. This paper argues that achieving algorithmic transparency does not help patients bridge the gap between their medical decisions and values. We introduce a hypothetical model we call Justifiable AI to illustrate this argument. Justifiable AI aims at modelling normative and evaluative considerations in an explicit way so as to provide a stepping stone for patient and physician to jointly decide on a course of treatment. If our argument succeeds, we should prefer these justifiable models over alternatives if the former are available and aim to develop said models if not.

Citations: 0
Trustworthiness of voting advice applications in Europe.
IF 3.4 · CAS Zone 2 (Philosophy) · Q1 ETHICS · Pub Date: 2024-01-01 · Epub Date: 2024-08-12 · DOI: 10.1007/s10676-024-09790-6
Elisabeth Stockinger, Jonne Maas, Christofer Talvitie, Virginia Dignum

Voting Advice Applications (VAAs) are interactive tools used to assist in one's choice of a party or candidate to vote for in an upcoming election. They have the potential to increase citizens' trust and participation in democratic structures. However, there is no established ground truth for one's electoral choice, and VAA recommendations depend strongly on architectural and design choices. We assessed several representative European VAAs according to the Ethics Guidelines for Trustworthy AI provided by the European Commission using publicly available information. We found scores to be comparable across VAAs and low in most requirements, with differences reflecting the kind of developing institution. Across VAAs, we identify the need for improvement in (i) transparency regarding the subjectivity of recommendations, (ii) diversity of stakeholder participation, (iii) user-centric documentation of algorithm, and (iv) disclosure of the underlying values and assumptions.

Supplementary information: The online version contains supplementary material available at 10.1007/s10676-024-09790-6.

Citations: 0