Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

IF 23.8 1区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS ACM Computing Surveys Pub Date : 2024-05-18 DOI:10.1145/3664597
Yao Wan, Zhangqian Bi, Yang He, Jianguo Zhang, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin, Philip Yu
{"title":"Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit","authors":"Yao Wan, Zhangqian Bi, Yang He, Jianguo Zhang, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin, Philip Yu","doi":"10.1145/3664597","DOIUrl":null,"url":null,"abstract":"<p>Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. Currently, there is already a thriving research community focusing on code intelligence, with efforts ranging from software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence models under the basis of code representation learning, and provide a comprehensive overview to enhance comprehension of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). At last, we also point out several challenging and promising directions for future research.</p>","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":null,"pages":null},"PeriodicalIF":23.8000,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Computing Surveys","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3664597","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. Currently, there is already a thriving research community focusing on code intelligence, with efforts ranging from software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence models under the basis of code representation learning, and provide a comprehensive overview to enhance comprehension of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). At last, we also point out several challenging and promising directions for future research.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
代码智能深度学习:调查、基准和工具包
代码智能利用机器学习技术从大量代码库中提取知识,目的是开发智能工具,提高计算机编程的质量和生产率。目前,专注于代码智能的研究社区已经蓬勃发展,研究领域涉及软件工程、机器学习、数据挖掘、自然语言处理和编程语言。在本文中,我们从代码表示学习、深度学习技术和应用任务等方面,对用于代码智能的深度学习进行了全面的文献综述。我们还对几种最先进的代码智能神经模型进行了基准测试,并为基于深度学习的代码智能模型的快速原型开发提供了一个开源工具包。特别是,我们在代码表示学习的基础上考察了现有的代码智能模型,并提供了一个全面的概述,以加深对代码智能现状的理解。此外,我们还公开发布了源代码和数据资源,为社会各界提供了一个现成可用的基准,便于对现有和未来的代码智能模型(https://xcodemind.github.io)进行评估和比较。最后,我们还指出了几个具有挑战性和前景的未来研究方向。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
ACM Computing Surveys
ACM Computing Surveys 工程技术-计算机:理论方法
CiteScore
33.20
自引率
0.60%
发文量
372
审稿时长
12 months
期刊介绍: ACM Computing Surveys is an academic journal that focuses on publishing surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. In terms of impact, CSUR has a high reputation with a 2022 Impact Factor of 16.6. It is ranked 3rd out of 111 journals in the field of Computer Science Theory & Methods. ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.
期刊最新文献
Security and Privacy on Generative Data in AIGC: A Survey Open-Ethical AI: Advancements in Open-Source Human-Centric Neural Language Models Fog Computing Technology Research: A Retrospective Overview and Bibliometric Analysis Evaluation Methodologies in Software Protection Research Gender Bias in Natural Language Processing and Computer Vision: A Comparative Survey
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1