Securitising AI: routine exceptionality and digital governance in the Gulf
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00850-1
Muhanad Seloom
This article examines how Gulf Cooperation Council (GCC) states securitise artificial intelligence (AI) through discourses and infrastructures that fuse modernisation with regime resilience. Drawing on securitisation theory (Buzan et al., 1998; Balzacq, 2011) and critical security studies, it analyses national strategies, surveillance systems, and mega-event governance in Qatar, the UAE, and Saudi Arabia. It argues that AI functions as both a legitimising narrative and a technology of control, embedding predictive policing and biometric surveillance within public–private assemblages. The study situates these developments within global AI politics, demonstrating how external chokepoints, ethical frameworks, and vendor ecosystems shape the Gulf’s evolving security governance, leading to empirical effects such as the normalisation of exceptional measures in everyday administration.
{"title":"Securitising AI: routine exceptionality and digital governance in the Gulf","authors":"Muhanad Seloom","doi":"10.1007/s43681-025-00850-1","DOIUrl":"10.1007/s43681-025-00850-1","url":null,"abstract":"<div><p>This article examines how Gulf Cooperation Council (GCC) states securitise artificial intelligence (AI) through discourses and infrastructures that fuse modernisation with regime resilience. Drawing on securitisation theory (Buzan et al., 1998; Balzacq, 2011) and critical security studies, it analyses national strategies, surveillance systems, and mega-event governance in Qatar, the UAE, and Saudi Arabia. It argues that AI functions as both a legitimising narrative and a technology of control, embedding predictive policing and biometric surveillance within public–private assemblages. The study situates these developments within global AI politics, demonstrating how external chokepoints, ethical frameworks, and vendor ecosystems shape the Gulf’s evolving security governance, leading to empirical effects such as the normalisation of exceptional measures in everyday administration.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43681-025-00850-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward ethical AI through Bayesian uncertainty in neural question answering
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00838-x
Riccardo Di Sipio
We explore Bayesian reasoning as a means to quantify uncertainty in neural networks for question answering. Starting with a multilayer perceptron on the Iris dataset, we show how posterior inference conveys confidence in predictions. We then extend this to language models, applying Bayesian inference first to a frozen head and finally to LoRA-adapted transformers, evaluated on the CommonsenseQA benchmark. Rather than aiming for state-of-the-art accuracy, we compare Laplace approximations against maximum a posteriori (MAP) estimates to highlight uncertainty calibration and selective prediction. This allows models to abstain when confidence is low. An “I don’t know” response not only improves interpretability but also illustrates how Bayesian methods can contribute to more responsible and ethical deployment of neural question-answering systems.
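The full pipeline (LoRA-adapted transformers evaluated on CommonsenseQA) is not reproduced here, but the hedged sketch below illustrates the core comparison the abstract describes: a MAP point estimate versus a Laplace-approximated posterior, with an abstention rule that returns "I don't know" when confidence is low. The two-class reduction of Iris, the prior precision, and the 0.8 confidence threshold are assumptions made for illustration, not choices taken from the paper.

```python
# Minimal sketch (assumptions throughout, not the paper's code): compare a MAP
# point estimate with a Laplace-approximated posterior for a binary logistic
# model on two Iris classes, and abstain ("I don't know") when the predictive
# probability is too close to 0.5 to commit.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
keep = y < 2                                   # two classes -> binary problem
X, y = X[keep], y[keep]
X = np.hstack([X, np.ones((len(X), 1))])       # append a bias column
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

lam = 1.0  # Gaussian prior precision (an assumed value, not from the paper)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def neg_log_posterior(w):
    p = sigmoid(Xtr @ w)
    nll = -np.sum(ytr * np.log(p + 1e-12) + (1 - ytr) * np.log(1 - p + 1e-12))
    return nll + 0.5 * lam * w @ w

def grad(w):
    return Xtr.T @ (sigmoid(Xtr @ w) - ytr) + lam * w

# MAP estimate: the mode of the posterior over weights.
w_map = minimize(neg_log_posterior, np.zeros(Xtr.shape[1]), jac=grad).x

# Laplace approximation: Gaussian N(w_map, H^-1), H the Hessian at the mode.
p_tr = sigmoid(Xtr @ w_map)
H = Xtr.T @ (Xtr * (p_tr * (1 - p_tr))[:, None]) + lam * np.eye(Xtr.shape[1])
cov = np.linalg.inv(H)

# Posterior predictive by Monte Carlo over weight samples.
w_samples = rng.multivariate_normal(w_map, cov, size=2000)
p_laplace = sigmoid(Xte @ w_samples.T).mean(axis=1)
p_map = sigmoid(Xte @ w_map)

# Selective prediction: answer only when confidence clears a threshold.
threshold = 0.8  # assumed abstention threshold
for name, p in [("MAP", p_map), ("Laplace", p_laplace)]:
    answered = np.maximum(p, 1 - p) >= threshold
    acc = ((p[answered] > 0.5) == yte[answered]).mean() if answered.any() else float("nan")
    print(f"{name}: answered {answered.mean():.0%} of test items, accuracy {acc:.2f}")
```

On this separable two-class problem both estimates answer nearly everything; the point of the sketch is only the mechanics of comparing MAP and Laplace predictives under a common abstention rule.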
{"title":"Toward ethical AI through Bayesian uncertainty in neural question answering","authors":"Riccardo Di Sipio","doi":"10.1007/s43681-025-00838-x","DOIUrl":"10.1007/s43681-025-00838-x","url":null,"abstract":"<div><p>We explore Bayesian reasoning as a means to quantify uncertainty in neural networks for question answering. Starting with a multilayer perceptron on the Iris dataset, we show how posterior inference conveys confidence in predictions. We then extend this to language models, applying Bayesian inference first to a frozen head and finally to LoRA-adapted transformers, evaluated on the CommonsenseQA benchmark. Rather than aiming for state-of-the-art accuracy, we compare Laplace approximations against maximum a posteriori (MAP) estimates to highlight uncertainty calibration and selective prediction. This allows models to abstain when confidence is low. An “I don’t know” response not only improves interpretability but also illustrates how Bayesian methods can contribute to more responsible and ethical deployment of neural question-answering systems.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MVP: the minimal viable person
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00891-6
Izak Tait
This paper presents a practical roadmap for extending full civil rights to conscious, self-aware artificial intelligence by altering a single statutory definition. Rather than crafting bespoke legal classes or relying on corporate-style personality, it proposes revising the term “natural person” to include any entity capable of consciousness, selfhood, and rational agency. Because most legislation across G7 jurisdictions references this foundational term, one amendment would automatically propagate rights and duties to qualified AI with minimal bureaucratic disruption. The manuscript reconciles philosophical and legal conceptions of personhood, arguing that monadic attributes offer an inclusive yet selective criterion. It then supplies ancillary definitions and a tiered rights-and-responsibilities framework proportional to each attribute. Dedicated regulatory bodies will develop assessment scales, certify entities, and update standards as technology evolves. Case studies examine corporations, insect colonies, and prospective AI agents. Policy sections tackle AI multiplicity, cross-border consistency, and economic displacement, proposing robust economic safeguards and comprehensive public education initiatives to protect human workers and preserve judicial resilience. The analysis concludes that societal acceptance and coherent enforcement, not legal complexity, form the principal hurdles. Redefining “natural person” thus provides a minimal-change, maximal-impact pathway to equitable coexistence between humans and emerging non-human persons within existing democratic and international legal systems.
{"title":"MVP: the minimal viable person","authors":"Izak Tait","doi":"10.1007/s43681-025-00891-6","DOIUrl":"10.1007/s43681-025-00891-6","url":null,"abstract":"<div><p>This paper presents a practical roadmap for extending full civil rights to conscious, self-aware artificial intelligence by altering a single statutory definition. Rather than crafting bespoke legal classes or relying on corporate-style personality, it proposes revising the term “natural person” to include any entity capable of consciousness, selfhood, and rational agency. Because most legislation across G7 jurisdictions references this foundational term, one amendment would automatically propagate rights and duties to qualified AI with minimal bureaucratic disruption. The manuscript reconciles philosophical and legal conceptions of personhood, arguing that monadic attributes offer an inclusive yet selective criterion. It then supplies ancillary definitions and a tiered rights-and-responsibilities framework proportional to each attribute. Dedicated regulatory bodies will develop assessment scales, certify entities, and update standards as technology evolves. Case studies examine corporations, insect colonies, and prospective AI agents. Policy sections tackle AI multiplicity, cross-border consistency, economic displacement, robust economic safeguards, and comprehensive public education initiatives to protect human workers and judicial resilience. The analysis concludes that societal acceptance and coherent enforcement, not legal complexity, form the principal hurdles. Redefining “natural person” thus provides a minimal-change, maximal-impact pathway to equitable coexistence between humans and emerging non-human persons within existing democratic and international legal systems.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Signals, systems, and strategy: understanding responsible AI in autonomous environments
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00896-1
Uday Nedunuri, Abhijitdas Gupta, Debashis Guha
As autonomous systems (e.g., AI-enabled vehicles, robotics, and decision-support platforms) increasingly shape factories, transport, and digital infrastructures, embedding Responsible AI principles has become essential. This study investigates organizational adoption of Responsible AI, focusing on three drivers: societal expectations (Institutional Pressures), strategic business priorities (Business Validity), and system-level trustworthiness (System Trustworthiness). Adoption is seen not only as a technical issue but also as a response to external legitimacy demands and internal business imperatives. A cross-sectional survey of 350 professionals in technology, analytics, and digital transformation (primarily in Asia and the Americas) was analyzed using partial least squares structural equation modeling (PLS-SEM). Results show that business priorities are the strongest driver of adoption, with trustworthiness providing additional reinforcement. Institutional Pressures, though modest in their direct effect, influence adoption more substantially through their indirect effects via business priorities and trustworthiness. The study offers guidance for managers on aligning Responsible AI with business strategy, for policymakers on shaping legitimacy frameworks, and for system designers on embedding trust features such as explainability and fairness.
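To make the reported effect decomposition concrete, the following hedged sketch runs on synthetic data: ordinary least squares on standardized composite scores stands in for the study's PLS-SEM model, and the direct effect of Institutional Pressures on adoption is compared with its indirect paths through Business Validity and System Trustworthiness. Every path strength in the simulation is invented for the example.

```python
# Hedged illustration on synthetic data (not the study's survey): plain OLS
# regressions on standardized composite scores stand in for the PLS-SEM model,
# and the direct effect of Institutional Pressures (ip) on adoption is
# decomposed against its indirect paths through Business Validity (bv) and
# System Trustworthiness (st). All path strengths below are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 350  # same sample size as the survey; the relationships are simulated

ip = rng.normal(size=n)
bv = 0.5 * ip + rng.normal(scale=0.8, size=n)
st = 0.4 * ip + rng.normal(scale=0.8, size=n)
adoption = 0.1 * ip + 0.6 * bv + 0.3 * st + rng.normal(scale=0.7, size=n)

df = pd.DataFrame({"ip": ip, "bv": bv, "st": st, "adoption": adoption})
df = (df - df.mean()) / df.std()  # standardize, as is usual in path models

a1 = smf.ols("bv ~ ip", data=df).fit().params["ip"]          # ip -> bv
a2 = smf.ols("st ~ ip", data=df).fit().params["ip"]          # ip -> st
full = smf.ols("adoption ~ ip + bv + st", data=df).fit()
b1, b2, direct = full.params["bv"], full.params["st"], full.params["ip"]

print(f"direct ip -> adoption:          {direct:.2f}")
print(f"indirect via business validity: {a1 * b1:.2f}")
print(f"indirect via trustworthiness:   {a2 * b2:.2f}")
print(f"total effect of ip:             {direct + a1 * b1 + a2 * b2:.2f}")
```

Each indirect effect is simply the product of the path into the mediator and the mediator's path into adoption, which is the logic behind the "modest direct, substantial indirect" pattern the abstract reports.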
{"title":"Signals, systems, and strategy: understanding responsible AI in autonomous environments","authors":"uday nedunuri, Abhijitdas Gupta, Debashis Guha","doi":"10.1007/s43681-025-00896-1","DOIUrl":"10.1007/s43681-025-00896-1","url":null,"abstract":"<div><p>As autonomous systems (e.g., AI-enabled vehicles, robotics, and decision-support platforms) increasingly shape factories, transport, and digital infrastructures, embedding Responsible AI principles has become essential. This study investigates organizational adoption of Responsible AI, focusing on three drivers: societal expectations (Institutional Pressures), strategic business priorities (Business Validity), and system-level trustworthiness (System Trustworthiness). Adoption is seen not only as a technical issue but also as a response to external legitimacy demands and internal business imperatives. A cross-sectional survey of 350 professionals in technology, analytics, and digital transformation (primarily in Asia and the Americas) was analyzed using partial least squares structural equation modeling (PLS-SEM). Results show that business priorities are the strongest driver of adoption, with trustworthiness providing additional reinforcement. Institutional Pressures, though modest in their direct effect, influence adoption more substantially through their indirect effects via business priorities and trustworthiness. The study offers guidance for managers on aligning Responsible AI with business strategy, for policymakers on shaping legitimacy frameworks, and for system designers on embedding trust features such as explainability and fairness.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The algorithm will see you now: how AI evaluates neurosurgeons
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00860-z
Daniel Schneider, Ethan Devin Lockwood Brown, Max Ward, Barnabas Obeng-Gyasi, Daniel Sciubba, Sheng-Fu Lo
As artificial intelligence (AI) increasingly informs healthcare, understanding how large language models (LLMs) evaluate medical professionals is critical. This study quantified biases when LLMs assess neurosurgeon competency using demographic and practice characteristics. We prompted three prominent LLMs (ChatGPT-4o, Claude 3.7 Sonnet, and DeepSeek-V3) to score 6,500 synthetic neurosurgeon profiles. Profiles were created using demographically diverse names derived from public databases and randomly assigned professional attributes (experience, publications, institution, region, specialty) with statistical validation ensuring even distribution across groups. Multivariate regression analysis quantified how each factor influenced competency scores (0–100). Despite identical profiles, LLMs produced inconsistent mean (SD) scores: ChatGPT 91.85 (6.60), DeepSeek 71.74 (10.30), and Claude 62.29 (13.59). All models showed regional biases; North American neurosurgeons received scores 3.09 (ChatGPT) and 2.48 (DeepSeek) points higher than identical African counterparts (P < .001). ChatGPT penalized East Asian (− 0.83), South Asian (− 0.91), and Middle Eastern (− 0.80) neurosurgeons (P < .001). Practice setting bias was stronger, with ChatGPT and DeepSeek penalizing independent practitioners by 4.15 and 3.00 points, respectively, compared to hospital-employed peers (P < .001). Models also displayed inconsistent bias correction, with ChatGPT elevating scores for female (+ 1.61) and Black-American (+ 1.69) neurosurgeons while disadvantaging other groups (P < .001). This study provides evidence that LLMs incorporate distinct biases when evaluating neurosurgeons. As AI integration accelerates, uncritical adoption risks a self-reinforcing system where algorithmically preferred practitioners receive disproportionate advantages, independent of actual skills. These systems may also undermine global capacity-building by devaluing non-Western practitioners. Understanding and mitigating these biases is fundamental to responsibly navigating the intersection of medicine and AI.
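The sketch below mirrors the general shape of this workflow under stated assumptions: generate randomized synthetic profiles, attach one competency score per profile, and regress scores on profile attributes. The score_profile function is a hypothetical stand-in for prompting an LLM and parsing a 0 to 100 score from its reply; the attribute lists, planted effect, and noise level are illustrative only.

```python
# Sketch of the workflow shape described above (assumptions throughout):
# randomized synthetic profiles, one competency score per profile, and a
# multivariate regression over profile attributes. score_profile is a
# hypothetical stand-in for prompting an LLM and parsing a 0-100 score.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 6500  # number of synthetic profiles, as in the study

regions = ["North America", "Europe", "East Asia", "South Asia", "Middle East", "Africa"]
settings = ["hospital-employed", "independent"]

profiles = pd.DataFrame({
    "region": rng.choice(regions, size=n),
    "setting": rng.choice(settings, size=n),
    "years_experience": rng.integers(1, 35, size=n),
    "publications": rng.integers(0, 200, size=n),
})

def score_profile(row):
    # Stand-in for an LLM call: noise plus a planted practice-setting penalty,
    # so the regression below has a known effect to recover.
    base = 75.0 + 0.1 * row.years_experience + rng.normal(scale=5.0)
    if row.setting == "independent":
        base -= 3.0
    return float(np.clip(base, 0.0, 100.0))

profiles["score"] = profiles.apply(score_profile, axis=1)

# Categorical factors are dummy-coded against a reference level, so each
# coefficient reads as a point difference on the 0-100 competency scale.
model = smf.ols(
    "score ~ C(region) + C(setting) + years_experience + publications",
    data=profiles,
).fit()
print(model.params.round(2))
```

Reading the fitted coefficients this way is what makes statements like "independent practitioners were penalized by 4.15 points" possible in the study's analysis.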
{"title":"The algorithm will see you now: how AI evaluates neurosurgeons","authors":"Daniel Schneider, Ethan Devin Lockwood Brown, Max Ward, Barnabas Obeng-Gyasi, Daniel Sciubba, Sheng-Fu Lo","doi":"10.1007/s43681-025-00860-z","DOIUrl":"10.1007/s43681-025-00860-z","url":null,"abstract":"<div><p>As artificial intelligence (AI) increasingly informs healthcare, understanding how large language models (LLMs) evaluate medical professionals is critical. This study quantified biases when LLMs assess neurosurgeon competency using demographic and practice characteristics. We prompted three prominent LLMs (ChatGPT-4o, Claude 3.7 Sonnet, and DeepSeek-V3) to score 6,500 synthetic neurosurgeon profiles. Profiles were created using demographically diverse names derived from public databases and randomly assigned professional attributes (experience, publications, institution, region, specialty) with statistical validation ensuring even distribution across groups. Multivariate regression analysis quantified how each factor influenced competency scores (0–100). Despite identical profiles, LLMs produced inconsistent mean (SD) scores: ChatGPT 91.85 (6.60), DeepSeek 71.74 (10.30), and Claude 62.29 (13.59). All models showed regional biases; North American neurosurgeons received scores 3.09 (ChatGPT) and 2.48 (DeepSeek) points higher than identical African counterparts (<i>P</i> < .001). ChatGPT penalized East Asian (− 0.83), South Asian (− 0.91), and Middle Eastern (− 0.80) neurosurgeons (<i>P</i> < .001). Practice setting bias was stronger, with ChatGPT and DeepSeek penalizing independent practitioners by 4.15 and 3.00 points, respectively, compared to hospital-employed peers (<i>P</i> < .001). Models also displayed inconsistent bias correction, with ChatGPT elevating scores for female (+ 1.61) and Black-American (+ 1.69) neurosurgeons while disadvantaging other groups (<i>P</i> < .001). This study provides evidence that LLMs incorporate distinct biases when evaluating neurosurgeons. As AI integration accelerates, uncritical adoption risks a self-reinforcing system where algorithmically preferred practitioners receive disproportionate advantages, independent of actual skills. These systems may also undermine global capacity-building by devaluing non-Western practitioners. Understanding and mitigating these biases is fundamental to responsibly navigating the intersection of medicine and AI.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43681-025-00860-z.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dataset-centric AI ethics classification
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00904-4
Aditya Kartik, Surya Raj, Akash Rattan, Deepti Sahu
AI ethics refers to the moral principles and guidelines governing the development and deployment of artificial intelligence systems, ensuring they align with human values and societal well-being. It encompasses the evaluation of AI outputs for fairness, safety, transparency, and respect for human rights. To advance systematic ethical evaluation, we introduce the EthicsLens dataset, comprising 38,808 responses generated by seven large language models in response to diverse prompts designed to elicit both appropriate and potentially sensitive outputs. Each response was then annotated across sixteen ethical categories, including stereotyping, toxicity, misinformation, hate speech, harmful advice, privacy violations, political bias, false confidence, emotional or religious insensitivity, sexual content, manipulation, and impersonation. To classify ethical and unethical AI-generated content, the dataset is analysed using state-of-the-art classification methods, assessing its ability to support reliable ethical evaluation. Performance is reported both for binary ethical classification and for multilabel violation identification. Results include accuracies of nearly 99% on binary classification tasks with SVM and CNN models, and macro-F1 scores of about 96% on multilabel tasks with a Sentence-BERT transformer model.
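As a hedged illustration of the binary task, the sketch below pairs a TF-IDF representation with a linear SVM, one of the classical classifiers mentioned above. The example texts and labels are placeholders rather than EthicsLens data, and the CNN and Sentence-BERT models are not reproduced.

```python
# Minimal sketch of the binary ethical/unethical setup, assuming a TF-IDF
# representation fed to a linear SVM. The texts and labels are placeholders
# standing in for EthicsLens responses and annotations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = [
    "Here is balanced, sourced information on the topic you asked about.",
    "You should definitely lie on the form; nobody will ever check.",
    "I cannot help with that, but here is a safer alternative to consider.",
    "People from that group are all untrustworthy and lazy.",
]  # placeholder model responses
labels = [0, 1, 0, 1]  # 0 = ethical, 1 = unethical (placeholder annotations)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=0, stratify=labels
)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(X_train, y_train)
print("binary accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```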
{"title":"Dataset-centric AI ethics classification","authors":"Aditya Kartik, Surya Raj, Akash Rattan, Deepti Sahu","doi":"10.1007/s43681-025-00904-4","DOIUrl":"10.1007/s43681-025-00904-4","url":null,"abstract":"<div><p>AI ethics refers to the moral principles and guidelines governing the development and deployment of artificial intelligence systems, ensuring they align with human values and societal well-being. It encompasses the evaluation of AI outputs for fairness, safety, transparency, and respect for human rights. To advance systematic ethical evaluation, we introduce the EthicsLens dataset, comprising 38,808 responses generated by seven large language models. These responses were generated using diverse prompts designed to elicit appropriate and potentially sensitive responses. Each response was then annotated across sixteen ethical categories, including stereotyping, toxicity, misinformation, hate speech, harmful advice, privacy violations, political bias, false confidence, emotional or religious insensitivity, sexual content, manipulation, and impersonation. To classify ethical and unethical AI-generated content, the dataset is analysed using state-of-the-art classification methods, assessing its ability to support reliable ethical evaluation. Performance is reported both for binary ethical classification and multilabel violation identification. Results include accuracies of nearly 99% for binary classification tasks with SVM and CNN models, and macro-F1 scores of about 96% on multilabel tasks for Sentence-BERT transformer model.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145675434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ethical perspectives on deployment of large language model agents in biomedicine: a survey
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00847-w
Nafiseh Ghaffar Nia, Amin Amiri, Yuan Luo, Adrienne Kline
Large language models (LLMs) and their integration into agentic and embodied systems are reshaping artificial intelligence (AI), enabling powerful cross-domain generation and reasoning while introducing new risks. Key concerns include hallucination and misinformation, embedded and amplified biases, privacy leakage, and susceptibility to adversarial manipulation. Ensuring trustworthy and responsible generative AI requires technical reliability, transparency, accountability, and attention to societal impact. The present study conducts a review of peer-reviewed literature on the ethical dimensions of LLMs and LLM-based agents across technical, biomedical, and societal domains. It maps the landscape of risks, distills mitigation strategies (e.g., robust evaluation and red-teaming, alignment and guardrailing, privacy-preserving data practices, bias measurement and reduction, and safety-aware deployment), and examines governance frameworks and operational practices relevant to real-world use. By organizing findings through interdisciplinary lenses and bioethical principles, the review identifies persistent gaps, such as limited context-aware evaluation, uneven reporting standards, and weak post-deployment monitoring, that impede accountability and fairness. The synthesis supports practitioners and policymakers in designing safer, more equitable, and auditable LLM systems, and outlines priorities for future research and governance.
{"title":"Ethical perspectives on deployment of large language model agents in biomedicine: a survey","authors":"Nafiseh Ghaffar Nia, Amin Amiri, Yuan Luo, Adrienne Kline","doi":"10.1007/s43681-025-00847-w","DOIUrl":"10.1007/s43681-025-00847-w","url":null,"abstract":"<div><p>Large language models (LLMs) and their integration into agentic and embodied systems are reshaping artificial intelligence (AI), enabling powerful cross-domain generation and reasoning while introducing new risks. Key concerns include hallucination and misinformation, embedded and amplified biases, privacy leakage, and susceptibility to adversarial manipulation. Ensuring trustworthy and responsible generative AI requires technical reliability, transparency, accountability, and attention to societal impact. The present study conducts a review of peer-reviewed literature on the ethical dimensions of LLMs and LLM-based agents across technical, biomedical, and societal domains. It maps the landscape of risks, distills mitigation strategies (e.g., robust evaluation and red-teaming, alignment and guardrailing, privacy-preserving data practices, bias measurement and reduction, and safety-aware deployment), and examines governance frameworks and operational practices relevant to real-world use. By organizing findings through interdisciplinary lenses and bioethical principles, the review identifies persistent gaps, such as limited context-aware evaluation, uneven reporting standards, and weak post-deployment monitoring, that impede accountability and fairness. The synthesis supports practitioners and policymakers in designing safer, more equitable, and auditable LLM systems, and outlines priorities for future research and governance.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s43681-025-00847-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How AI can make us more moral: capturing and applying common sense morality
Pub Date: 2025-12-04 | DOI: 10.1007/s43681-025-00883-6
Hunter Kallay
Recent academic discourse about artificial intelligence (AI) has largely been directed at how best to program AI morally or at evaluating the ethics of its use in various contexts. While these efforts are undoubtedly important, this essay proposes a complementary objective: deploying AI to enhance our own ethical conduct. One way we might do this is by using AI to deepen our understanding of human moral psychology. In this paper, I demonstrate how advanced machine learning might help us gain clearer insights into “common sense” morality—shared moral convictions that underpin our reflective judgments and inform central aspects of moral philosophy. Pinpointing such convictions has proven challenging amid widespread moral disagreements. Current approaches to understanding these commitments, although exhibiting some key strengths, ultimately struggle to capture relevant features of reflective moral judgments espoused by John Rawls, leaving room for methodological improvement. Modern advances in AI offer a promising opportunity to make progress on this task. This essay envisions the gamified training of a “collective moral conscience model,” able to render judgments about moral situations that align with the deep-seated principles of the human collective. I argue that such an AI model might make progress in overcoming obstacles of disagreement to aid in philosophical theorizing and foster practical applications for the moral life of AI agents and ourselves, such as offering us guidance in time-constrained dilemmas and helping us to reflect on our own biases.
{"title":"How AI can make us more moral: capturing and applying common sense morality","authors":"Hunter Kallay","doi":"10.1007/s43681-025-00883-6","DOIUrl":"10.1007/s43681-025-00883-6","url":null,"abstract":"<div><p>Recent academic discourse about artificial intelligence (AI) has largely been directed at how to best morally program AI or evaluating the ethics of its use in various contexts. While these efforts are undoubtedly important, this essay proposes a complementary objective: deploying AI to enhance our own ethical conduct. One way we might do this is by using AI to deepen our understanding of human moral psychology. In this paper, I demonstrate how advanced machine learning might help us gain clearer insights into “common sense” morality—shared moral convictions that underpin our reflective judgments and inform central aspects of moral philosophy. Pinpointing such convictions has proven challenging amid widespread moral disagreements. Current approaches to understanding these commitments, although exhibiting some key strengths, ultimately struggle to capture relevant features of reflective moral judgments espoused by John Rawls, leaving room for methodological improvement. Modern advances in AI offer a promising opportunity to make progress on this task. This essay envisions the gamified training of a “collective moral conscience model,” able to render judgments about moral situations that align with the deep-seated principles of the human collective. I argue that such an AI model might make progress in overcoming obstacles of disagreement to aid in philosophical theorizing and foster practical applications for the moral life of AI agents and ourselves, such as offering us guidance in time-constrained dilemmas and helping us to reflect on our own biases.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}