Deep API Learning Revisited

James Martin, Jinrong Guo
{"title":"Deep API Learning Revisited","authors":"James Martin, Jinrong Guo","doi":"10.1145/3524610.3527872","DOIUrl":null,"url":null,"abstract":"Understanding the correct API usage sequences is one of the most important tasks for programmers when they work with unfamiliar libraries. However, programmers often encounter obstacles to finding the appropriate information due to either poor quality of API documentation or ineffective query-based searching strategy. To help solve this issue, researchers have proposed various methods to suggest the sequence of APIs given natural language queries representing the information needs from programmers. Among such efforts, Gu et al. adopted a deep learning method, in particular an RNN Encoder-Decoder architecture, to perform this task and obtained promising results on common APIs in Java. In this work, we aim to reproduce their results and apply the same methods for APIs in Python. Additionally, we compare the performance with a more recent Transformer-based method, i.e., CodeBERT, for the same task. Our experiment reveals a clear drop in performance measures when careful data cleaning is performed. Owing to the pretraining from a large number of source code files and effective encoding technique, CodeBERT outperforms the method by Gu et al., to a large extent.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"112 3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3524610.3527872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Understanding the correct API usage sequences is one of the most important tasks for programmers when they work with unfamiliar libraries. However, programmers often encounter obstacles to finding the appropriate information due to either poor quality of API documentation or ineffective query-based searching strategy. To help solve this issue, researchers have proposed various methods to suggest the sequence of APIs given natural language queries representing the information needs from programmers. Among such efforts, Gu et al. adopted a deep learning method, in particular an RNN Encoder-Decoder architecture, to perform this task and obtained promising results on common APIs in Java. In this work, we aim to reproduce their results and apply the same methods for APIs in Python. Additionally, we compare the performance with a more recent Transformer-based method, i.e., CodeBERT, for the same task. Our experiment reveals a clear drop in performance measures when careful data cleaning is performed. Owing to the pretraining from a large number of source code files and effective encoding technique, CodeBERT outperforms the method by Gu et al., to a large extent.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
深度API学习重访
当程序员使用不熟悉的库时,理解正确的API使用顺序是最重要的任务之一。但是,由于API文档质量差或基于查询的搜索策略无效,程序员在查找适当信息时经常遇到障碍。为了帮助解决这个问题,研究人员提出了各种方法来建议api的顺序,给出了代表程序员信息需求的自然语言查询。在这些努力中,Gu等人采用了深度学习方法,特别是RNN编码器-解码器架构来执行该任务,并在Java中的常见api上获得了令人满意的结果。在这项工作中,我们的目标是重现他们的结果,并将相同的方法应用于Python中的api。此外,对于相同的任务,我们将性能与最近的基于transformer的方法(即CodeBERT)进行比较。我们的实验显示,在执行仔细的数据清理时,性能指标明显下降。CodeBERT基于大量源代码文件的预训练和有效的编码技术,在很大程度上优于Gu等人的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Context-based Cluster Fault Localization Fine-Grained Code-Comment Semantic Interaction Analysis Find Bugs in Static Bug Finders Self-Supervised Learning of Smart Contract Representations An Exploratory Study of Analyzing JavaScript Online Code Clones
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1