LLMs Will Always Hallucinate, and We Need to Live With This

Sourav Banerjee, Ayushi Agarwal, Saloni Singla
{"title":"LLMs Will Always Hallucinate, and We Need to Live With This","authors":"Sourav Banerjee, Ayushi Agarwal, Saloni Singla","doi":"arxiv-2409.05746","DOIUrl":null,"url":null,"abstract":"As Large Language Models become more ubiquitous across domains, it becomes\nimportant to examine their inherent limitations critically. This work argues\nthat hallucinations in language models are not just occasional errors but an\ninevitable feature of these systems. We demonstrate that hallucinations stem\nfrom the fundamental mathematical and logical structure of LLMs. It is,\ntherefore, impossible to eliminate them through architectural improvements,\ndataset enhancements, or fact-checking mechanisms. Our analysis draws on\ncomputational theory and Godel's First Incompleteness Theorem, which references\nthe undecidability of problems like the Halting, Emptiness, and Acceptance\nProblems. We demonstrate that every stage of the LLM process-from training data\ncompilation to fact retrieval, intent classification, and text generation-will\nhave a non-zero probability of producing hallucinations. This work introduces\nthe concept of Structural Hallucination as an intrinsic nature of these\nsystems. By establishing the mathematical certainty of hallucinations, we\nchallenge the prevailing notion that they can be fully mitigated.","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"32 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.05746","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. Our analysis draws on computational theory and Gödel's First Incompleteness Theorem, which references the undecidability of problems like the Halting, Emptiness, and Acceptance Problems. We demonstrate that every stage of the LLM process, from training data compilation to fact retrieval, intent classification, and text generation, will have a non-zero probability of producing hallucinations. This work introduces the concept of Structural Hallucination as an intrinsic nature of these systems. By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated.
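The abstract's compounding claim can be illustrated with a short probability sketch. The notation below (k pipeline stages with per-stage hallucination probabilities p_i) is our own and, for simplicity, assumes the stages err independently, an assumption the abstract does not state; it is meant only to show why non-zero per-stage probabilities force a non-zero end-to-end probability.

```latex
% Illustration only: k LLM pipeline stages (training-data compilation, fact
% retrieval, intent classification, text generation), each with a hallucination
% probability p_i > 0. Independence across stages is assumed for this sketch.
\[
  P(\text{no hallucination}) \;=\; \prod_{i=1}^{k} (1 - p_i) \;<\; 1,
  \qquad
  P(\text{at least one hallucination}) \;=\; 1 - \prod_{i=1}^{k} (1 - p_i) \;>\; 0.
\]
```

Even without the independence assumption, the direction of the claim survives: if some stage has p_i > 0 and its errors are not always caught downstream, the end-to-end hallucination probability cannot be driven to zero, which is the sense in which the authors call the phenomenon structural.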