SuperCoder2.0: Technical Report on Exploring the Feasibility of LLMs as Autonomous Programmer

Anmol Gautam, Kishore Kumar, Adarsh Jha, Mukunda NS, Ishaan Bhola
arXiv - CS - Software Engineering. Published 2024-09-17. DOI: arxiv-2409.11190 (https://doi.org/arxiv-2409.11190)

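Abstract

We present SuperCoder2.0, an advanced autonomous system designed to enhance software development through artificial intelligence. The system combines an AI-native development approach with intelligent agents to enable fully autonomous coding. Key focus areas include a retry mechanism that feeds the error output traceback back into the next attempt; comprehensive code rewriting and replacement using Abstract Syntax Tree (AST) parsing to minimize linting issues; a code embedding technique for retrieval-augmented generation; and a focus on localizing methods for problem-solving rather than identifying specific line numbers. The methodology employs a three-step hierarchical search space reduction approach for code base navigation and bug localization: (1) using Retrieval Augmented Generation (RAG) and a Repository File Level Map to identify candidate files, (2) narrowing down to the most relevant files using a File Level Schematic Map, and (3) extracting 'relevant locations' within these files. Code editing is performed through a two-part module comprising CodeGeneration and CodeEditing, which generates multiple solutions at different temperature values and replaces entire methods or classes to maintain code integrity. A feedback loop executes repository-level test cases to validate and refine solutions. Experiments conducted on the SWE-bench Lite dataset demonstrate SuperCoder2.0's effectiveness: the correct file appears among the top 5 candidates in 84.33% of cases, and the system successfully resolves 34% of test instances. This performance places SuperCoder2.0 fourth globally on the SWE-bench leaderboard. The system's ability to handle diverse repositories and problem types highlights its potential as a versatile tool for autonomous software development. Future work will focus on refining the code editing process and exploring advanced embedding models for improved natural-language-to-code mapping.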
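
The abstract describes replacing entire methods or classes through AST parsing instead of patching specific line numbers. The following is a minimal sketch of that idea using Python's standard `ast` module. The helper name `replace_function`, the re-indentation step, and the final syntax check are assumptions made for illustration; the report does not state that SuperCoder2.0 is implemented this way.

```python
import ast


def replace_function(source: str, name: str, new_code: str) -> str:
    """Replace the whole function, method, or class called `name` in `source`.

    Hypothetical helper: the node is located by name with ast, and its full
    line span is swapped out, so no caller has to guess line numbers.
    """
    tree = ast.parse(source)
    target = None
    for node in ast.walk(tree):
        is_def = isinstance(
            node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
        )
        if is_def and node.name == name:
            target = node
            break
    if target is None:
        raise ValueError(f"no function or class named {name!r}")

    lines = source.splitlines(keepends=True)
    # Decorators sit above the def line, so start from the earliest one.
    first_line = min([target.lineno] + [d.lineno for d in target.decorator_list])
    start = first_line - 1        # 0-based index of the first line to drop
    end = target.end_lineno       # 1-based inclusive, so an exclusive slice bound

    # Re-indent the replacement to the column of the original definition.
    indent = " " * target.col_offset
    new_lines = [
        indent + line if line.strip() else line
        for line in new_code.strip("\n").splitlines(keepends=True)
    ]
    if new_lines and not new_lines[-1].endswith("\n"):
        new_lines[-1] += "\n"

    patched = "".join(lines[:start] + new_lines + lines[end:])
    ast.parse(patched)  # raises SyntaxError if the edit broke the file
    return patched


# Usage: swap a buggy method for a corrected one, leaving its sibling untouched.
original = (
    "class Calc:\n"
    "    def add(self, a, b):\n"
    "        return a - b\n"
    "\n"
    "    def mul(self, a, b):\n"
    "        return a * b\n"
)
fixed_method = "def add(self, a, b):\n    return a + b\n"
patched = replace_function(original, "add", fixed_method)
```

Replacing the complete definition keeps indentation and block boundaries consistent, which is one plausible reading of why the abstract says whole-method replacement maintains code integrity and reduces linting issues. In a retry loop of the kind the abstract mentions, a `SyntaxError` raised by the final check could be passed back to the generator as error feedback; that wiring is likewise an assumption and is not shown here.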