SuperCoder2.0: Technical Report on Exploring the feasibility of LLMs as Autonomous Programmer

arXiv - CS - Software Engineering, Pub Date: 2024-09-17, DOI: arxiv-2409.11190

Anmol Gautam, Kishore Kumar, Adarsh Jha, Mukunda NS, Ishaan Bhola


Citations: 0

Abstract

We present SuperCoder2.0, an advanced autonomous system designed to enhance software development through artificial intelligence. The system combines an AI-native development approach with intelligent agents to enable fully autonomous coding. Key focus areas include a retry mechanism with error output traceback, comprehensive code rewriting and replacement using Abstract Syntax Tree (AST) parsing to minimize linting issues, a code embedding technique for retrieval-augmented generation, and a focus on localizing methods for problem-solving rather than identifying specific line numbers. The methodology employs a three-step hierarchical search space reduction approach for code base navigation and bug localization: (1) utilizing Retrieval Augmented Generation (RAG) and a Repository File Level Map to identify candidate files, (2) narrowing down to the most relevant files using a File Level Schematic Map, and (3) extracting 'relevant locations' within these files. Code editing is performed through a two-part module comprising CodeGeneration and CodeEditing, which generates multiple solutions at different temperature values and replaces entire methods or classes to maintain code integrity. A feedback loop executes repository-level test cases to validate and refine solutions. Experiments conducted on the SWE-bench Lite dataset demonstrate SuperCoder2.0's effectiveness: the correct file appears among the top 5 candidates in 84.33% of cases, and the system successfully resolves 34% of test instances. This performance places SuperCoder2.0 fourth globally on the SWE-bench leaderboard. The system's ability to handle diverse repositories and problem types highlights its potential as a versatile tool for autonomous software development. Future work will focus on refining the code editing process and exploring advanced embedding models for improved natural language to code mapping.
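The abstract gives no implementation details for the editing step, so the following is only a minimal sketch of what "replacing entire methods or classes" by name, rather than patching specific line numbers, could look like. It assumes Python's standard `ast` module and space-indented source. The function `replace_definition` and its parameters are hypothetical names chosen for illustration and do not come from the report.

```python
import ast
import textwrap

# Node types that count as a replaceable "definition" in this sketch.
_DEFS = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def replace_definition(source: str, name: str, new_code: str) -> str:
    """Swap the first function or class called `name` for `new_code`.

    Hypothetical helper, not the report's implementation. The target is
    located through the syntax tree, so the caller never supplies line
    numbers. Assumes the source is indented with spaces.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, _DEFS) and node.name == name:
            break
    else:
        raise LookupError(f"no function or class named {name!r}")

    lines = source.splitlines(keepends=True)
    # Start at the first decorator, if any, so decorators are replaced too.
    first_line = min([node.lineno] + [d.lineno for d in node.decorator_list])
    start = first_line - 1   # 0-based index of the first line to drop
    end = node.end_lineno    # 1-based inclusive, i.e. 0-based exclusive

    # Re-indent the replacement to the depth of the original definition.
    indent = " " * node.col_offset
    body = textwrap.indent(textwrap.dedent(new_code).strip("\n"), indent) + "\n"

    patched = "".join(lines[:start]) + body + "".join(lines[end:])
    # Re-parse so a syntactically broken edit fails here, not at run time.
    ast.parse(patched)
    return patched


if __name__ == "__main__":
    original = (
        "class A:\n"
        "    def f(self):\n"
        "        return 1\n"
        "\n"
        "    def g(self):\n"
        "        return 2\n"
    )
    candidate = "def f(self):\n    return 41 + 1\n"
    print(replace_definition(original, "f", candidate))
```

Re-parsing the patched file is one cheap way to reject a malformed candidate before any tests run. In a retry loop of the kind the abstract describes, the raised `SyntaxError` or a failing test's traceback could be passed back to the generator for another attempt.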

