Hallucination Reduction and Optimization for Large Language Model-Based Autonomous Driving

Symmetry · Pub Date: 2024-09-11 · DOI: 10.3390/sym16091196
Jue Wang
{"title":"Hallucination Reduction and Optimization for Large Language Model-Based Autonomous Driving","authors":"Jue Wang","doi":"10.3390/sym16091196","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) are widely integrated into autonomous driving systems to enhance their operational intelligence and responsiveness and improve self-driving vehicles’ overall performance. Despite these advances, LLMs still struggle between hallucinations—when models either misinterpret the environment or generate imaginary parts for downstream use cases—and taxing computational overhead that relegates their performance to strictly non-real-time operations. These are essential problems to solve to make autonomous driving as safe and efficient as possible. This work is thus focused on symmetrical trade-offs between the reduction of hallucination and optimization, leading to a framework for these two combined and at least specifically motivated by these limitations. This framework intends to generate a symmetry of mapping between real and virtual worlds. It helps in minimizing hallucinations and optimizing computational resource consumption reasonably. In autonomous driving tasks, we use multimodal LLMs that combine an image-encoding Visual Transformer (ViT) and a decoding GPT-2 with responses generated by the powerful new sequence generator from OpenAI known as GPT4. Our hallucination reduction and optimization framework leverages iterative refinement loops, RLHF—reinforcement learning from human feedback (RLHF)—along with symmetric performance metrics, e.g., BLEU, ROUGE, and CIDEr similarity scores between machine-generated answers specific to other human reference answers. This ensures that improvements in model accuracy are not overused to the detriment of increased computational overhead. Experimental results show a twofold improvement in decision-maker error rate and processing efficiency, resulting in an overall decrease of 30% for the model and a 25% improvement in processing efficiency across diverse driving scenarios. Not only does this symmetrical approach reduce hallucination, but it also better aligns the virtual and real-world representations.","PeriodicalId":501198,"journal":{"name":"Symmetry","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Symmetry","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/sym16091196","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Large language models (LLMs) are widely integrated into autonomous driving systems to enhance their operational intelligence and responsiveness and to improve self-driving vehicles’ overall performance. Despite these advances, LLMs still face two obstacles: hallucinations, in which the model misinterprets the environment or generates spurious content for downstream use, and heavy computational overhead, which restricts them to strictly non-real-time operation. Solving these problems is essential to making autonomous driving as safe and efficient as possible. Motivated by these limitations, this work focuses on the symmetrical trade-off between hallucination reduction and computational optimization, combining the two into a single framework. The framework aims to establish a symmetric mapping between the real and virtual worlds, minimizing hallucinations while keeping computational resource consumption reasonable. For autonomous driving tasks, we use a multimodal LLM that combines a Vision Transformer (ViT) image encoder with a GPT-2 decoder, alongside responses generated by OpenAI’s GPT-4. Our hallucination reduction and optimization framework leverages iterative refinement loops, reinforcement learning from human feedback (RLHF), and symmetric performance metrics such as BLEU, ROUGE, and CIDEr, which score the similarity between machine-generated answers and human reference answers. This ensures that gains in model accuracy are not bought at the price of increased computational overhead. Experimental results show improvements on both fronts: a 30% overall reduction in decision-making error rate and a 25% gain in processing efficiency across diverse driving scenarios. Not only does this symmetrical approach reduce hallucination, but it also better aligns the virtual and real-world representations.
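
The abstract does not include implementation details, but the ViT-encoder/GPT-2-decoder pairing it describes can be sketched with Hugging Face's `VisionEncoderDecoderModel`. The checkpoint names, input file, and generation settings below are illustrative assumptions, not the authors' configuration.

```python
# A minimal sketch of a ViT image encoder paired with a GPT-2 text decoder,
# assuming the Hugging Face `transformers` library. The checkpoints and the
# input image are placeholders, not the paper's actual setup.
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, GPT2TokenizerFast

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k",  # image encoder (assumed checkpoint)
    "gpt2",                               # text decoder (assumed checkpoint)
)
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# GPT-2 defines no pad token, so reuse EOS and tell the model how to start decoding.
tokenizer.pad_token = tokenizer.eos_token
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Encode a driving-scene frame and decode a textual description of it.
image = Image.open("driving_scene.jpg").convert("RGB")  # hypothetical input
pixel_values = processor(images=image, return_tensors="pt").pixel_values
output_ids = model.generate(pixel_values, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Without fine-tuning on paired driving images and descriptions, the decoder's output would be essentially noise; the snippet only shows how the encoder and decoder are wired together.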
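The metric-gated refinement loop can likewise be pictured as repeatedly scoring candidate answers against human references and keeping the best one, stopping early to cap compute. The sketch below uses the `evaluate` library for BLEU and ROUGE (CIDEr typically requires a separate package such as `pycocoevalcap`, and a full RLHF setup is out of scope here); the candidate generator, score threshold, and round budget are hypothetical stand-ins for the paper's procedure.

```python
# A minimal sketch of a metric-gated refinement loop, assuming the Hugging Face
# `evaluate` library for BLEU/ROUGE. The generator callable, threshold, and
# round budget are hypothetical, not the paper's RLHF-based procedure.
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

def symmetric_score(candidate: str, reference: str) -> float:
    """Average BLEU and ROUGE-L similarity between a model answer and a human reference."""
    b = bleu.compute(predictions=[candidate], references=[[reference]])["bleu"]
    r = rouge.compute(predictions=[candidate], references=[reference])["rougeL"]
    return (b + r) / 2.0

def refine(generate, prompt: str, reference: str,
           max_rounds: int = 5, threshold: float = 0.8) -> str:
    """Regenerate until the answer is close enough to the human reference."""
    best, best_score = "", -1.0
    for round_idx in range(max_rounds):
        candidate = generate(prompt, round_idx)  # hypothetical LLM call
        score = symmetric_score(candidate, reference)
        if score > best_score:
            best, best_score = candidate, score
        if best_score >= threshold:
            break  # good enough: stop early to limit computational overhead
    return best
```

The `max_rounds` cap reflects the trade-off the abstract emphasizes: each extra refinement round buys accuracy at the cost of latency.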