$\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding

Shuai Wang, Liang Ding, Li Shen, Yong Luo, Zheng He, Wei Yu, Dacheng Tao

arXiv:2409.05923 (2024-09-09)
Large language models (LLMs) have shown remarkable capabilities in code generation. However, hallucinations (e.g., output noise) make it particularly challenging for LLMs to generate high-quality code in a single pass. In this work, we propose a simple and effective \textbf{u}ncertainty-aware \textbf{s}elective \textbf{c}ontrastive \textbf{d}ecoding ($\mathbb{USCD}$) mechanism that improves the quality of one-pass code generation in LLMs and reduces the impact of output noise. Specifically, we first design a negative prompt (termed the lame prompt) that elicits output noise by removing the input-output examples from the standard few-shot prompt.
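As a concrete illustration (ours, not from the paper), a lame prompt can be built by simply stripping the few-shot demonstrations from the standard prompt; the function and argument names below are hypothetical:

```python
def make_lame_prompt(instruction: str, few_shot_examples: list[str]) -> tuple[str, str]:
    """Build a standard few-shot prompt and its lame (negative) counterpart.

    Hypothetical sketch: following the abstract's description, the lame
    prompt is the same prompt with the input-output examples removed.
    """
    standard_prompt = instruction + "\n\n" + "\n\n".join(few_shot_examples)
    lame_prompt = instruction  # input-output examples stripped out
    return standard_prompt, lame_prompt
```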
Our preliminary study shows that the Jensen-Shannon (JS) divergence between the token-distribution uncertainty and the output noise is relatively low (approximately $0.25$), indicating that the two are strongly related.
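For reference, the JS divergence between two distributions $P$ and $Q$ is the standard symmetrized, smoothed KL divergence:

$$\mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2} D_{\mathrm{KL}}(P \,\|\, M) + \tfrac{1}{2} D_{\mathrm{KL}}(Q \,\|\, M), \qquad M = \tfrac{1}{2}(P + Q).$$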
Then, we selectively eliminate the output noise induced by the lame prompt, based on the uncertainty of the prediction distribution under the standard prompt. Notably, the proposed plug-and-play mechanism is inference-only, offering appealing flexibility.
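To make the mechanism concrete, here is a minimal single-step sketch of uncertainty-aware selective contrastive decoding. It is our own illustration under stated assumptions, not the paper's exact formulation: the entropy-based uncertainty trigger and the parameters `tau` and `alpha` are hypothetical.

```python
import torch
import torch.nn.functional as F

def uscd_step(std_logits: torch.Tensor, lame_logits: torch.Tensor,
              tau: float = 0.5, alpha: float = 1.0) -> torch.Tensor:
    """One decoding step of a sketched USCD-style rule.

    std_logits:  next-token logits under the standard few-shot prompt
    lame_logits: next-token logits under the lame (negative) prompt
    tau:         uncertainty threshold (hypothetical value)
    alpha:       contrastive penalty strength (hypothetical value)
    """
    p_std = F.softmax(std_logits, dim=-1)
    # Entropy of the standard distribution serves as the uncertainty signal.
    uncertainty = -(p_std * torch.log(p_std + 1e-12)).sum(dim=-1)
    if uncertainty > tau:
        # High uncertainty: subtract the lame prompt's logits to cancel
        # the noise the two prompts share (contrastive decoding).
        return (std_logits - alpha * lame_logits).argmax(dim=-1)
    # Low uncertainty: keep the standard prediction untouched.
    return std_logits.argmax(dim=-1)
```

The selectivity enters through the threshold: the contrastive correction is applied only when the standard distribution is uncertain, so confident predictions are left untouched.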
Extensive experiments on widely used benchmarks (e.g., HumanEval, MBPP, and MultiPL-E) with several LLMs (i.e., InCoder-6B, CodeLlama-7B, WizardCoder-15B, StarCoder, and Llama2-7B) demonstrate that the proposed USCD significantly improves one-pass code generation, with an average \textit{pass@1} score improvement of 16.59\%. We will release the code and data on GitHub.