LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains

IF 1.3 | CAS Tier 3 (Computer Science) | Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS | Data Intelligence | Pub Date: 2024-04-11 | DOI: 10.1162/dint_a_00251
Songlin Chen, Weicheng Wang, Xiaoliang Chen, Peng Lu, Zaiyan Yang, Yajun Du
{"title":"LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains","authors":"Songlin Chen, Weicheng Wang, Xiaoliang Chen, Peng Lu, Zaiyan Yang, Yajun Du","doi":"10.1162/dint_a_00251","DOIUrl":null,"url":null,"abstract":"\n The exption of Chinese natural language processing (NLP) has stimulated research in the broader NLP domain. However, existing large language models have limitations in comprehending and reasoning in Chinese. This paper addresses these limitations by enhancing Chinese language models comprehension and reasoning capabilities while minimizing resource requirements. We propose LLaMA-LoRA, a neural prompt engineering framework that builds upon the LLaMA-13B model and incorporates the Low-Rank Adaptation(LoRA) of Large Language Models technique for refinement. Chain-of-Thought(CoT) are crucial for generating intermediate reasoning chains in language models, but their effectiveness can be limited by isolated language patterns. Erroneous reasoning resulting from conventional prompts negatively impacts model performance. Automatic prompts are introduced to encourage reasoning chain generation and accurate answer inference. Training the model with an extensive corpus of Chinese CoT data enhances its comprehension and reasoning abilities. The LLaMA-LoRA model demonstrates exceptional performance across numerous Chinese language tasks, surpassing benchmark performance achieved by related language models such as GPT-3.5, Chat-GLM, and OpenAssistant, delivering accurate, comprehensive, and professional answers. The availability of our open-source model code facilitates further research in the field of Chinese text logical reasoning thinking chains.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":null,"pages":null},"PeriodicalIF":1.3000,"publicationDate":"2024-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/dint_a_00251","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

The expansion of Chinese natural language processing (NLP) has stimulated research in the broader NLP domain. However, existing large language models have limited comprehension and reasoning capabilities in Chinese. This paper addresses these limitations by enhancing the comprehension and reasoning capabilities of Chinese language models while minimizing resource requirements. We propose LLaMA-LoRA, a neural prompt engineering framework that builds upon the LLaMA-13B model and incorporates the Low-Rank Adaptation (LoRA) technique for refinement. Chain-of-Thought (CoT) prompts are crucial for generating intermediate reasoning chains in language models, but their effectiveness can be limited by isolated language patterns. Erroneous reasoning resulting from conventional prompts negatively impacts model performance. We therefore introduce automatic prompts that encourage reasoning chain generation and accurate answer inference. Training the model on an extensive corpus of Chinese CoT data enhances its comprehension and reasoning abilities. The LLaMA-LoRA model demonstrates exceptional performance across numerous Chinese language tasks, surpassing benchmarks achieved by related language models such as GPT-3.5, Chat-GLM, and OpenAssistant, and delivers accurate, comprehensive, and professional answers. The availability of our open-source model code facilitates further research on logical reasoning chains for Chinese text.
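For readers who want to experiment with the recipe the abstract describes, the sketch below shows one plausible way to fine-tune a LLaMA-13B checkpoint with LoRA on a Chinese chain-of-thought corpus, using the Hugging Face transformers, peft, and datasets libraries. LoRA freezes each pretrained weight matrix W and learns a low-rank update ΔW = BA (rank r far smaller than the weight dimensions), so only a small fraction of parameters is trained. This is not the authors' released code: the checkpoint name, the chinese_cot.jsonl file, the prompt template, and all hyperparameters are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "huggyllama/llama-13b"  # assumption: any LLaMA-13B checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto")

# LoRA: freeze W and learn the low-rank update (alpha/r) * B @ A on the
# attention projections; only the A and B matrices receive gradients.
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of 13B weights

# Hypothetical CoT corpus: each record holds a question, a reasoning chain,
# and a final answer; the chain is part of the supervised target so the
# model learns to emit intermediate reasoning before the answer.
def format_example(ex):
    text = (f"问题：{ex['question']}\n"
            f"推理：{ex['chain']}\n"
            f"答案：{ex['answer']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=1024)

dataset = load_dataset("json", data_files="chinese_cot.jsonl")["train"]
dataset = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    # mlm=False gives plain causal-LM labels (input ids shifted internally)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    args=TrainingArguments(
        output_dir="llama-lora-cot",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
model.save_pretrained("llama-lora-cot")  # writes only the small LoRA adapter
```

At inference time the saved adapter can be reattached to the frozen base model with peft.PeftModel.from_pretrained, keeping the on-disk artifact to a few tens of megabytes rather than a full 13B-parameter checkpoint.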
Source Journal: Data Intelligence (COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 6.50
Self-citation rate: 15.40%
Articles per year: 40
Review time: 8 weeks