LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains

IF 1.3 | CAS Tier 3 (Computer Science) | Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS | Data Intelligence | Pub Date: 2024-04-11 | DOI: 10.1162/dint_a_00251
Songlin Chen, Weicheng Wang, Xiaoliang Chen, Peng Lu, Zaiyan Yang, Yajun Du
{"title":"LLaMA-LoRA Neural Prompt Engineering: A Deep Tuning Framework for Automatically Generating Chinese Text Logical Reasoning Thinking Chains","authors":"Songlin Chen, Weicheng Wang, Xiaoliang Chen, Peng Lu, Zaiyan Yang, Yajun Du","doi":"10.1162/dint_a_00251","DOIUrl":null,"url":null,"abstract":"\n The exption of Chinese natural language processing (NLP) has stimulated research in the broader NLP domain. However, existing large language models have limitations in comprehending and reasoning in Chinese. This paper addresses these limitations by enhancing Chinese language models comprehension and reasoning capabilities while minimizing resource requirements. We propose LLaMA-LoRA, a neural prompt engineering framework that builds upon the LLaMA-13B model and incorporates the Low-Rank Adaptation(LoRA) of Large Language Models technique for refinement. Chain-of-Thought(CoT) are crucial for generating intermediate reasoning chains in language models, but their effectiveness can be limited by isolated language patterns. Erroneous reasoning resulting from conventional prompts negatively impacts model performance. Automatic prompts are introduced to encourage reasoning chain generation and accurate answer inference. Training the model with an extensive corpus of Chinese CoT data enhances its comprehension and reasoning abilities. The LLaMA-LoRA model demonstrates exceptional performance across numerous Chinese language tasks, surpassing benchmark performance achieved by related language models such as GPT-3.5, Chat-GLM, and OpenAssistant, delivering accurate, comprehensive, and professional answers. The availability of our open-source model code facilitates further research in the field of Chinese text logical reasoning thinking chains.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":null,"pages":null},"PeriodicalIF":1.3000,"publicationDate":"2024-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/dint_a_00251","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

The expansion of Chinese natural language processing (NLP) has stimulated research in the broader NLP domain. However, existing large language models have limited comprehension and reasoning capabilities in Chinese. This paper addresses these limitations by enhancing the comprehension and reasoning capabilities of Chinese language models while minimizing resource requirements. We propose LLaMA-LoRA, a neural prompt engineering framework that builds upon the LLaMA-13B model and incorporates the Low-Rank Adaptation (LoRA) technique for refinement. Chain-of-Thought (CoT) prompts are crucial for generating intermediate reasoning chains in language models, but their effectiveness can be limited by isolated language patterns. Erroneous reasoning resulting from conventional prompts negatively impacts model performance. We therefore introduce automatic prompts that encourage reasoning chain generation and accurate answer inference. Training the model on an extensive corpus of Chinese CoT data enhances its comprehension and reasoning abilities. The LLaMA-LoRA model demonstrates exceptional performance across numerous Chinese language tasks, surpassing benchmarks achieved by related language models such as GPT-3.5, Chat-GLM, and OpenAssistant, and delivers accurate, comprehensive, and professional answers. The availability of our open-source model code facilitates further research on logical reasoning chains for Chinese text.
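For readers who want to experiment with the recipe the abstract describes, the sketch below shows one plausible way to fine-tune a LLaMA-13B checkpoint with LoRA on a Chinese chain-of-thought corpus, using the Hugging Face transformers, peft, and datasets libraries. LoRA freezes each pretrained weight matrix W and learns a low-rank update ΔW = BA (rank r far smaller than the weight dimensions), so only a small fraction of parameters is trained. This is not the authors' released code: the checkpoint name, the chinese_cot.jsonl file, the prompt template, and all hyperparameters are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "huggyllama/llama-13b"  # assumption: any LLaMA-13B checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto")

# LoRA: freeze W and learn the low-rank update (alpha/r) * B @ A on the
# attention projections; only the A and B matrices receive gradients.
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of 13B weights

# Hypothetical CoT corpus: each record holds a question, a reasoning chain,
# and a final answer; the chain is part of the supervised target so the
# model learns to emit intermediate reasoning before the answer.
def format_example(ex):
    text = (f"问题：{ex['question']}\n"
            f"推理：{ex['chain']}\n"
            f"答案：{ex['answer']}{tokenizer.eos_token}")
    return tokenizer(text, truncation=True, max_length=1024)

dataset = load_dataset("json", data_files="chinese_cot.jsonl")["train"]
dataset = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    train_dataset=dataset,
    # mlm=False gives plain causal-LM labels (input ids shifted internally)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    args=TrainingArguments(
        output_dir="llama-lora-cot",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
model.save_pretrained("llama-lora-cot")  # writes only the small LoRA adapter
```

At inference time the saved adapter can be reattached to the frozen base model with peft.PeftModel.from_pretrained, keeping the on-disk artifact to a few tens of megabytes rather than a full 13B-parameter checkpoint.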
Source Journal: Data Intelligence (COMPUTER SCIENCE, INFORMATION SYSTEMS)
CiteScore: 6.50
Self-citation rate: 15.40%
Articles per year: 40
Review time: 8 weeks