SE Factual Knowledge in Frozen Giant Code Model: A Study on FQN and Its Retrieval

IF 8.9 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | IEEE Transactions on Knowledge and Data Engineering | Pub Date: 2024-11-12 | DOI: 10.1109/TKDE.2024.3436883

Qing Huang;Dianshu Liao;Zhenchang Xing;Zhiqiang Yuan;Qinghua Lu;Xiwei Xu;Jiaxing Lu

IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 9220-9234. https://ieeexplore.ieee.org/document/10750898/


Abstract

Giant pre-trained code models (PCMs) are entering developers’ daily practice. Understanding the type and amount of software knowledge in PCMs is essential for integrating them into software engineering (SE) tasks and unlocking their potential. In this work, we conduct the first systematic study of the SE factual knowledge in the state-of-the-art PCM Copilot, focusing on the Fully Qualified Names (FQNs) of APIs, the fundamental knowledge for effective code analysis, search, and reuse. Driven by the data distribution properties of FQNs, we design a novel lightweight in-context learning method on Copilot for FQN inference, which requires neither code compilation, as traditional methods do, nor gradient updates, as recent FQN prompt-tuning does. We systematically experiment with five in-context learning design factors to identify the best configuration for practical use. With this configuration, we investigate how example prompts and FQN data properties affect Copilot's FQN inference capability. Our results confirm that Copilot stores diverse FQN knowledge and can be applied to FQN inference because of its high accuracy and its independence from code analysis. Our extended study further shows that the in-context learning method generalizes to retrieving other SE factual knowledge embedded in giant PCMs. We also find that the advanced general-purpose model GPT-4 stores substantial SE knowledge. Comparing FQN inference between Copilot and GPT-4, we observe that the same prompts yield better results as model capabilities improve. Based on our experience interacting with Copilot, we discuss opportunities to improve human-Copilot interaction in the FQN inference task.
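To make the idea of in-context FQN inference concrete, the following is a minimal sketch of how a few-shot prompt could be assembled. The abstract does not give the authors' prompt template or their five design factors, so everything here is an assumption made for illustration: the `FqnExample` type, the `build_fqn_prompt` helper, the comment-style layout, and the Java demonstration lines are all hypothetical. The sketch only builds the prompt text; sending it to a model is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FqnExample:
    """One demonstration: a code line, a simple name in it, and its FQN."""
    code: str
    simple_name: str
    fqn: str


def build_fqn_prompt(examples, code, simple_name):
    """Assemble a few-shot prompt that ends where the model should write the FQN.

    The comment-style layout is an assumption for illustration only;
    it is not the prompt template used in the paper.
    """
    parts = []
    for ex in examples:
        parts.append(
            f"// code: {ex.code}\n"
            f"// simple name: {ex.simple_name}\n"
            f"// FQN: {ex.fqn}\n"
        )
    # The query block repeats the same layout but leaves the FQN open,
    # so a frozen model can complete it without any gradient update.
    parts.append(
        f"// code: {code}\n"
        f"// simple name: {simple_name}\n"
        f"// FQN:"
    )
    return "\n".join(parts)


# Hypothetical demonstrations using well-known Java standard library classes.
EXAMPLES = [
    FqnExample("List<String> names = new ArrayList<>();", "ArrayList", "java.util.ArrayList"),
    FqnExample("File f = new File(path);", "File", "java.io.File"),
]

prompt = build_fqn_prompt(EXAMPLES, "Map<String, Integer> m = new HashMap<>();", "HashMap")
print(prompt)
```

Because the prompt is plain text built from uncompiled code lines, this style of retrieval needs neither a build environment nor model fine-tuning, which matches the lightweight property the abstract emphasizes. How many examples to include and how to select them are the kind of choices the paper's design-factor experiments would settle; this sketch fixes them arbitrarily.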


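Source journal: IEEE Transactions on Knowledge and Data Engineering (Engineering Technology - Engineering: Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months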

Journal description: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering aspects of computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.