Evaluating approaches for reducing catastrophic risks from AI

AI and Ethics. Pub Date: 2024-04-08. DOI: 10.1007/s43681-024-00475-w

Leonard Dung

Journal article. AI and Ethics, vol. 5, no. 2, pp. 1177-1188. Article page: https://link.springer.com/article/10.1007/s43681-024-00475-w (PDF: https://link.springer.com/content/pdf/10.1007/s43681-024-00475-w.pdf)

Abstract

According to a growing number of researchers, AI may pose catastrophic – or even existential – risks to humanity. Catastrophic risks may be taken to be risks of 100 million human deaths, or a similarly bad outcome. I argue that such risks – while contested – are sufficiently likely to demand rigorous discussion of potential societal responses. Subsequently, I propose four desiderata for approaches to the reduction of catastrophic risks from AI. The quality of such approaches can be assessed by their chance of success, degree of beneficence, degree of non-maleficence, and beneficent side effects. Then, I employ these desiderata to evaluate the promises, limitations and risks of alignment research, timelines research, policy research, halting or slowing down AI research, and compute governance for tackling catastrophic AI risks. While more research is needed, this investigation shows that several approaches for dealing with catastrophic AI risks are available, and where their respective strengths and weaknesses lie. It turns out that many approaches are complementary and that the approaches have a nuanced relationship to approaches to present AI harms. While some approaches are similarly useful for addressing catastrophic risks and present harms, this is not always the case.
