低成本语言模型:Python代码生成的调查和性能评估

IF 7.5 2区 计算机科学 Q1 AUTOMATION & CONTROL SYSTEMS Engineering Applications of Artificial Intelligence Pub Date : 2024-11-29 DOI:10.1016/j.engappai.2024.109490
Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri
{"title":"低成本语言模型:Python代码生成的调查和性能评估","authors":"Jessica López Espejel,&nbsp;Mahaman Sanoussi Yahaya Alassan,&nbsp;Merieme Bouhandi,&nbsp;Walid Dahhane,&nbsp;El Hassane Ettifouri","doi":"10.1016/j.engappai.2024.109490","DOIUrl":null,"url":null,"abstract":"<div><div>Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks due to their versatility and ability to produce high-quality results. Specifically, they are increasingly used for automatic code generation to help developers tackle repetitive coding tasks. However, LLMs’ substantial computational and memory requirements often make them inaccessible to users with limited resources. This paper focuses on very low-cost models which offer a more accessible alternative to resource-intensive LLMs. We notably: (1) propose a thorough semi-manual evaluation of their performance in generating Python code, (2) introduce a Chain-of-Thought (CoT) prompting strategy to improve model reasoning and code quality, and (3) propose a new dataset of 60 programming problems, with varied difficulty levels, designed to extend existing benchmarks like HumanEval and EvalPlus. Our findings show that some low-cost compatible models achieve competitive results compared to larger models like ChatGPT despite using significantly fewer resources. We will make our dataset and prompts publicly available to support further research.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"140 ","pages":"Article 109490"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Low-cost language models: Survey and performance evaluation on Python code generation\",\"authors\":\"Jessica López Espejel,&nbsp;Mahaman Sanoussi Yahaya Alassan,&nbsp;Merieme Bouhandi,&nbsp;Walid Dahhane,&nbsp;El Hassane Ettifouri\",\"doi\":\"10.1016/j.engappai.2024.109490\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks due to their versatility and ability to produce high-quality results. Specifically, they are increasingly used for automatic code generation to help developers tackle repetitive coding tasks. However, LLMs’ substantial computational and memory requirements often make them inaccessible to users with limited resources. This paper focuses on very low-cost models which offer a more accessible alternative to resource-intensive LLMs. We notably: (1) propose a thorough semi-manual evaluation of their performance in generating Python code, (2) introduce a Chain-of-Thought (CoT) prompting strategy to improve model reasoning and code quality, and (3) propose a new dataset of 60 programming problems, with varied difficulty levels, designed to extend existing benchmarks like HumanEval and EvalPlus. Our findings show that some low-cost compatible models achieve competitive results compared to larger models like ChatGPT despite using significantly fewer resources. We will make our dataset and prompts publicly available to support further research.</div></div>\",\"PeriodicalId\":50523,\"journal\":{\"name\":\"Engineering Applications of Artificial Intelligence\",\"volume\":\"140 \",\"pages\":\"Article 109490\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-11-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Engineering Applications of Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0952197624016488\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197624016488","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0

摘要

大型语言模型(llm)由于其通用性和产生高质量结果的能力,已成为许多自然语言处理(NLP)任务的热门选择。具体来说,它们越来越多地用于自动代码生成,以帮助开发人员处理重复的编码任务。然而,llm的大量计算和内存需求往往使资源有限的用户无法访问它们。本文的重点是非常低成本的模型,它为资源密集型法学硕士提供了一个更容易获得的替代方案。我们值得注意的是:(1)提出了一个全面的半手动评估它们在生成Python代码方面的性能,(2)引入了一个思维链(CoT)提示策略,以提高模型推理和代码质量,(3)提出了一个包含60个编程问题的新数据集,具有不同的难度级别,旨在扩展现有的基准,如HumanEval和EvalPlus。我们的研究结果表明,尽管使用的资源少得多,但与ChatGPT等大型模型相比,一些低成本兼容模型获得了具有竞争力的结果。我们将公开我们的数据集和提示,以支持进一步的研究。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Low-cost language models: Survey and performance evaluation on Python code generation
Large Language Models (LLMs) have become a popular choice for many Natural Language Processing (NLP) tasks due to their versatility and ability to produce high-quality results. Specifically, they are increasingly used for automatic code generation to help developers tackle repetitive coding tasks. However, LLMs’ substantial computational and memory requirements often make them inaccessible to users with limited resources. This paper focuses on very low-cost models which offer a more accessible alternative to resource-intensive LLMs. We notably: (1) propose a thorough semi-manual evaluation of their performance in generating Python code, (2) introduce a Chain-of-Thought (CoT) prompting strategy to improve model reasoning and code quality, and (3) propose a new dataset of 60 programming problems, with varied difficulty levels, designed to extend existing benchmarks like HumanEval and EvalPlus. Our findings show that some low-cost compatible models achieve competitive results compared to larger models like ChatGPT despite using significantly fewer resources. We will make our dataset and prompts publicly available to support further research.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Engineering Applications of Artificial Intelligence
Engineering Applications of Artificial Intelligence 工程技术-工程:电子与电气
CiteScore
9.60
自引率
10.00%
发文量
505
审稿时长
68 days
期刊介绍: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.
期刊最新文献
Navigating beyond the training set: A deep learning framework for inverse design of architected composite materials Learning adaptive distractor-aware-suppression appearance model for visual tracking Graph Neural Networks with scattering transform for network anomaly detection Consistency-based decision-making method with linguistic Q-rung orthopair fuzzy preference relation for power battery selection of new energy vehicles Self-supervised combustion state diagnosis using a noise-augmented generative adversarial network and flame image sequences
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1