As one of the most well-known programmer Q&A websites, Stack Overflow (SO) serves tens of thousands of developers every day. Previous work has shown that many developers reuse code snippets from SO when they find an answer that functionally matches a programming problem they encounter during development. To study how programmers reuse SO code during project development, we conduct a comprehensive empirical study. First, to capture programmers' development activities, we collect 342,148 modified code snippets from the commits of 793 open-source Java projects; these modified snippets reflect the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect code clones between the modified code from commits and the code from SO, and further analyze code reuse when a programmer solves a programming problem during development. We compute the code reuse ratios of the modified code snippets in each project's commits across different years; the results show that the average code reuse ratio is 6.32%, with a maximum of 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in newly established projects is higher than in older projects. We also find that some projects reuse code snippets from many years ago. Additionally, experienced developers seem more likely to reuse knowledge from SO. Moreover, the code reuse ratio in bug-related commits (6.67%) is slightly higher than in non-bug-related commits (6.59%). Furthermore, the code reuse ratio (14.44%) in Java class files that have undergone multiple modifications is more than double the overall code reuse ratio (6.32%).
{"title":"Towards Exploring the Code Reuse from Stack Overflow during Software Development","authors":"Yuan Huang, Furen Xu, Hao-Jie Zhou, Xiangping Chen, Xiaocong Zhou, Tongjie Wang","doi":"10.1145/3524610.3527923","DOIUrl":"https://doi.org/10.1145/3524610.3527923","url":null,"abstract":"As one of the most well-known programmer Q&A websites, Stack Overflow (i.e., SO) is serving tens of thousands of developers ev-ery day. Previous work has shown that many developers reuse the code snippets on SO when they find an answer (from SO) that functionally matches the programming problem they encounter in their development activities. To study how programmers reuse code on SO during project development, we conduct a comprehensive empirical study. First, to capture the development activities of pro-grammers, we collect 342,148 modified code snippets in commits from 793 open-source Java projects, and these modified code can reflect the programming problems encountered during development. We also collect the code snippets from 1,355,617 posts on SO. Then, we employ CCFinder to detect the code clone between the modified code from commits and the code from SO, and further analyze the code reuse when programmer solves a programming problem during development. We count the code reuse ratios of the modified code snippets in the commits of each project in different years, the results show that the average code reuse ratio is 6.32%, and the maximum is 8.38%. The code reuse ratio in project commits has increased year by year, and the proportion of code reuse in the newly established project is higher than that of old projects. We also find that some projects reuse the code snippets from many years ago. Additionally, we find that experienced developers seem to be more likely to reuse the knowledge on SO. Moreover, we find that the code reuse ratio in bug-related commits (6.67%) is slightly higher than that of in non-bug-related commits (6.59%). Furthermore, we also find that the code reuse ratio (14.44%) in Java class files that have undergone multiple modifications is more than double the overall code reuse ratio (6.32%).","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131409614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting refactorings in commit history is essential to improve the comprehension of code changes in code reviews and to provide valuable information for empirical studies on software evolution. Several techniques have been proposed to detect refactorings accurately at the granularity level of a single commit. However, refactorings may be performed over multiple commits because of code complexity or other real development problems, which is why attempting to detect refactorings at single-commit granularity is insufficient. We observe that some refactorings can be detected only at coarser granularity, that is, in changes spread across multiple commits. Herein, this type of refactoring is referred to as coarse-grained refactoring (CGR). We compared the refactorings detected at different granularities of commits from 19 open-source repositories. The results show that CGRs are common, and their frequency increases as the granularity becomes coarser. In addition, we found that Move-related refactorings tended to be the most frequent CGRs. We also analyzed the causes of CGRs and suggest that CGRs will be valuable in refactoring research.
{"title":"Impact of Change Granularity in Refactoring Detection","authors":"Lei Chen, Shinpei Hayashi","doi":"10.1145/3524610.3528386","DOIUrl":"https://doi.org/10.1145/3524610.3528386","url":null,"abstract":"Detecting refactorings in commit history is essential to improve the comprehension of code changes in code reviews and to provide valuable information for empirical studies on software evolution. Several techniques have been proposed to detect refactorings accurately at the granularity level of a single commit. However, refactorings may be performed over multiple commits because of code complexity or other real development problems, which is why attempting to detect refactorings at single-commit granularity is insufficient. We observe that some refactorings can be detected only at coarser granularity, that is, changes spread across multiple commits. Herein, this type of refactoring is referred to as coarse-grained refactoring (CGR). We compared the refactorings detected on different granularities of commits from 19 open-source repositories. The results show that CGRs are common, and their frequency increases as the granularity becomes coarser. In addition, we found that Move-related refactorings tended to be the most frequent CGRs. We also analyzed the causes of CGR and suggested that CGRs will be valuable in refactoring research.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122721704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pre-trained neural Language Models (PTLMs), such as CodeBERT, have recently been used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g., code clone detection) via fine-tuning. In natural language processing (NLP), an alternative way of transferring the knowledge of PTLMs is through adapters: compact, parameter-efficient modules inserted into the layers of the PTLM. Owing to their plug-and-play nature and parameter efficiency, adapters are known to facilitate adapting to many downstream tasks compared with fine-tuning, which requires retraining all of the model's parameters; however, their usage in software engineering has not been explored. Here, we explore knowledge transfer using adapters, motivated by the Naturalness Hypothesis proposed by Hindle et al. [12], and study the bimodality of adapters on two tasks, cloze test and code clone detection, compared against their benchmarks from the CodeXGLUE platform. These adapters are trained on programming languages and are inserted into a PTLM that is pre-trained on English corpora (N-PTLM). Three programming languages, C/C++, Python, and Java, are studied, along with extensive experiments on the best setup for adapters. Improving upon the results of the N-PTLM confirms the success of adapters in transferring knowledge to software engineering; the results are sometimes on par with or exceed those of a PTLM trained on source code, while being more efficient in terms of the number of parameters, memory usage, and inference time. Our results can open new directions for building smaller models for more software engineering tasks. We open-source all the scripts and the trained adapters.
{"title":"On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules","authors":"Divyam Goel, Raman Grover, F. H. Fard","doi":"10.1145/3524610.3527892","DOIUrl":"https://doi.org/10.1145/3524610.3527892","url":null,"abstract":"Pre-trained neural Language Models (PTLM), such as CodeBERT, are recently used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g. code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs are explored through using adapters, compact, parameter efficient modules inserted in the layers of the PTLM. Although adapters are known to facilitate adapting to many downstream tasks compared to fine-tuning the model that require retraining all of the models' parameters- which owes to the adapters' plug and play nature and being parameter efficient-their usage in software engineering is not explored. Here, we explore the knowledge transfer using adapters and based on the Naturalness Hypothesis proposed by Hindle et. al [12]. Thus, studying the bimodality of adapters for two tasks of cloze test and code clone detection, compared to their benchmarks from the CodeXGLUE platform. These adapters are trained using programming languages and are inserted in a PTLM that is pre-trained on English corpora (N-PTLM). Three programming languages, $mathrm{C}/mathrm{C}++$, Python, and Java, are studied along with extensive experiments on the best setup used for adapters. Improving the results of the N-PTLM confirms the success of the adapters in knowledge transfer to software engineering, which sometimes are in par with or exceed the results of a PTLM trained on source code; while being more efficient in terms of the number of parameters, memory usage, and inference time. Our results can open new directions to build smaller models for more software engineering tasks. We open source all the scripts and the trained adapters.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129832195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning program representations has been the core prerequisite of code intelligence tasks (e.g., code search and code clone detection). State-of-the-art pre-trained models such as CodeBERT require the availability of large-scale code corpora. However, gathering training samples can be costly and infeasible for domain-specific languages such as Solidity for smart contracts. In this paper, we propose Zecoler, a zero-shot learning approach for code representations. Zecoler is built upon a pre-trained programming language model. In order to elicit knowledge from the pre-trained model efficiently, Zecoler casts the downstream tasks into the same form as the pre-training tasks by inserting trainable prompts into the original input. Then, it employs the prompt learning technique to optimize the pre-trained model by merely adjusting the original input. This enables the representation model to efficiently fit the scarce task-specific data while reusing pre-trained knowledge. We evaluate Zecoler on three code intelligence tasks in two programming languages that have no training samples, namely Solidity and Go, with models trained on corpora of common languages such as Java. Experimental results show that our approach significantly outperforms baseline models in both zero-shot and few-shot settings.
{"title":"Zero-Shot Program Representation Learning","authors":"Nan Cui, Yuze Jiang, Xiaodong Gu, Beijun Shen","doi":"10.1145/3524610.3527888","DOIUrl":"https://doi.org/10.1145/3524610.3527888","url":null,"abstract":"Learning program representations has been the core prerequisite of code intelligence tasks (e.g., code search and code clone detection). The state-of-the-art pre-trained models such as CodeBERT require the availability of large-scale code corpora. However, gathering training samples can be costly and infeasible for domain-specific languages such as Solidity for smart contracts. In this paper, we propose Zecoler, a zero-shot learning approach for code representations. Zecoler is built upon a pre-trained programming language model. In order to elicit knowledge from the pre-trained models efficiently, Zecoler casts the downstream tasks to the same form of pre-training tasks by inserting trainable prompts into the original input. Then, it employs the prompt learning technique to optimize the pre-trained model by merely adjusting the original input. This enables the representation model to efficiently fit the scarce task-specific data while reusing pre-trained knowledge. We evaluate Zecoler in three code intelligence tasks in two programming languages that have no training samples, namely, Solidity and Go, with model trained in corpora of common languages such as Java. Experimental results show that our approach significantly outperforms baseline models in both zero-shot and few-shot settings.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130434564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although the literature has noted the effects of branch handling strategies on change recommendation based on evolutionary coupling, these effects have been tested only in a limited experimental setting. Additionally, the branch characteristics that lead to these effects have not been investigated. In this study, we revisited the investigation conducted by Kovalenko et al. on the effect on change recommendation of two different branch handling strategies: including changesets from commits on a branch and excluding them. In addition to the setting by Kovalenko et al., we introduced another setting to compare: extracting a changeset for a branch from a merge commit at once. We compared the change recommendation results and the similarity of the extracted co-changes to future co-changes obtained using the two strategies across 30 open-source software systems. The results show that handling commits on a branch separately is often more appropriate for change recommendation, although the comparison in the additional setting resulted in balanced performance among the branch handling strategies. Additionally, we found that merge commit size and branch length positively influence the change recommendation results.
{"title":"Revisiting the Effect of Branch Handling Strategies on Change Recommendation","authors":"Keisuke Isemoto, Takashi Kobayashi, Shinpei Hayashi","doi":"10.1145/3524610.3527870","DOIUrl":"https://doi.org/10.1145/3524610.3527870","url":null,"abstract":"Although literature has noted the effects of branch handling strate-gies on change recommendation based on evolutionary coupling, they have been tested in a limited experimental setting. Additionally, the branches characteristics that lead to these effects have not been investigated. In this study, we revisited the investigation conducted by Kovalenko et al. on the effect to change recommendation using two different branch handling strategies: including changesets from commits on a branch and excluding them. In addition to the setting by Kovalenko et al., we introduced another setting to compare: ex-tracting a changeset for a branch from a merge commit at once. We compared the change recommendation results and the similarity of the extracted co-changes to those in the future obtained using two strategies through 30 open-source software systems. The results show that handling commits on a branch separately is often more appropriate in change recommendation, although the comparison in an additional setting resulted in a balanced performance among the branch handling strategies. Additionally, we found that the merge commit size and the branch length positively influence the change recommendation results.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129352807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Source code repositories allow developers to manage multiple versions (or branches) of a software system. Pull requests are used to modify a branch, and backporting is a regular activity used to port changes from a current development branch to other versions. In open-source software, backports are common and often need to be adapted by hand, which motivates us to explore backports and backporting challenges and strategies. In our exploration of 68,424 backports from 10 GitHub projects, we found that bug, test, document, and feature changes are commonly backported. We identified a number of backporting challenges, including that backports were inconsistently linked to their original pull request (49%), that backports had incompatible code (13%), that backports failed to be accepted (10%), and that there were backporting delays (16 days to create, 5 days to merge). We identified some general strategies for addressing backporting issues. We also noted that backporting strategies depend on the project type and that further investigation is needed to determine their suitability. Furthermore, we created the first-ever backports dataset, which can be used by other researchers and practitioners for investigating backports and backporting.
{"title":"Backports: Change Types, Challenges and Strategies","authors":"Debasish Chakroborti, Kevin A. Schneider, C. Roy","doi":"10.1145/3524610.3527920","DOIUrl":"https://doi.org/10.1145/3524610.3527920","url":null,"abstract":"Source code repositories allow developers to manage multiple versions (or branches) of a software system. Pull-requests are used to modify a branch, and backporting is a regular activity used to port changes from a current development branch to other versions. In open-source software, backports are common and often need to be adapted by hand, which motivates us to explore backports and backporting challenges and strategies. In our exploration of 68,424 backports from 10 GitHub projects, we found that bug, test, document, and feature changes are commonly backported. We iden-tified a number of backporting challenges, including that backports were inconsistently linked to their original pull-request (49%), that backports had incompatible code (13%), that backports failed to be accepted (10%), and that there were backporting delays (16 days to create, 5 days to merge). We identified some general strategies for addressing backporting issues. We also noted that backporting strategies depend on the project type and that further investigation is needed to determine their suitability. Furthermore, we created the first-ever backports dataset that can be used by other researchers and practitioners for investigating backports and backporting.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115081718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A recent study by Ahmed and Devanbu reported that fine-tuning multilingual Pre-trained Language Models (PLMs) on a corpus of code written in multiple programming languages achieves higher performance than using a corpus of code written in just one programming language. However, no analysis was made with respect to fine-tuning monolingual PLMs. Furthermore, some programming languages are inherently different, and code written in one language usually cannot be interchanged with another; e.g., Ruby and Java code possess very different structures. To better understand how monolingual and multilingual PLMs affect different programming languages, we investigate 1) the performance of PLMs on Ruby for two popular software engineering tasks, Code Summarization and Code Search; 2) the strategy (for selecting programming languages) that works well for fine-tuning multilingual PLMs on Ruby; and 3) the performance of the fine-tuned PLMs on Ruby given different code lengths. In this work, we analyze over a hundred pre-trained and fine-tuned models. Our results show that 1) multilingual PLMs have a lower Performance-to-Time Ratio (the BLEU, METEOR, or MRR score over the fine-tuning duration) compared to monolingual PLMs; 2) our proposed strategy for selecting target programming languages to fine-tune multilingual PLMs is effective: it reduces the time to fine-tune yet achieves higher performance on the Code Summarization and Code Search tasks; and 3) our proposed strategy consistently shows good performance across different code lengths.
{"title":"On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages","authors":"Fuxiang Chen, F. Fard, David Lo, T. Bryksin","doi":"10.1145/3524610.3527917","DOIUrl":"https://doi.org/10.1145/3524610.3527917","url":null,"abstract":"A recent study by Ahmed and Devanbu reported that using a corpus of code written in multilingual datasets to fine-tune multilingual Pre-trained Language Models (PLMs) achieves higher performance as opposed to using a corpus of code written in just one programming language. However, no analysis was made with respect to fine-tuning monolingual PLMs. Furthermore, some programming languages are inherently different and code written in one language usually cannot be interchanged with the others, i.e., Ruby and Java code possess very different structure. To better understand how monolingual and multilingual PLMs affect different programming languages, we investigate 1) the performance of PLMs on Ruby for two popular Software Engineering tasks: Code Summarization and Code Search, 2) the strategy (to select programming languages) that works well on fine-tuning multilingual PLMs for Ruby, and 3) the performance of the fine-tuned PLMs on Ruby given different code lengths. In this work, we analyze over a hundred of pre-trained and fine-tuned models. Our results show that 1) multilingual PLMs have a lower Performance-to-Time Ratio (the BLEU, METEOR, or MRR scores over the fine-tuning duration) as compared to monolingual PLMs, 2) our proposed strategy to select target programming languages to fine-tune multilingual PLMs is effective — it reduces the time to fine-tune yet achieves higher performance in Code Summarization and Code Search tasks, and 3) our proposed strategy consistently shows good performance on different code lengths.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130228231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developers frequently use APIs to implement certain functionalities, such as parsing Excel files or reading and writing text files line by line. Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner. Existing approaches utilize information retrieval models to search for matching API sequences given a query, or use an RNN-based encoder-decoder to generate API sequences. As it stands, the first approach treats queries and API names as bags of words and lacks a deep comprehension of the semantics of the queries. The latter approach adapts a neural language model to encode a user query into a fixed-length context vector and generate API sequences from that context vector. We want to understand the effectiveness of recent Pre-trained Transformer-based Models (PTMs) for the API learning task. These PTMs are trained on large natural language corpora in an unsupervised manner to retain contextual knowledge about the language, and have found success in solving similar Natural Language Processing (NLP) problems. However, the applicability of PTMs has not yet been explored for the API sequence generation task. We use a dataset containing 7 million annotations collected from GitHub to evaluate the PTMs empirically; this dataset was also used to assess previous approaches. Based on our results, PTMs generate more accurate API sequences and outperform other related methods by ∼11%. We have also identified two different tokenization approaches that can contribute to a significant boost in PTMs' performance on the API sequence generation task.
{"title":"On the Effectiveness of Pretrained Models for API Learning","authors":"M. Hadi, Imam Nur Bani Yusuf, Ferdian Thung, K. Luong, Lingxiao Jiang, F. H. Fard, David Lo","doi":"10.1145/3524610.3527886","DOIUrl":"https://doi.org/10.1145/3524610.3527886","url":null,"abstract":"Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc. Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner. Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences. As it stands, the first approach treats queries and API names as bags of words. It lacks deep comprehension of the semantics of the queries. The latter approach adapts a neural language model to encode a user query into a fixed-length context vector and generate API sequences from the context vector. We want to understand the effectiveness of recent Pre-trained Transformer based Models (PTMs) for the API learning task. These PTMs are trained on large natural language corpora in an unsupervised manner to retain contextual knowledge about the language and have found success in solving similar Natural Language Processing (NLP) problems. However, the applicability of PTMs has not yet been explored for the API sequence generation task. We use a dataset that contains 7 million annotations collected from GitHub to evaluate the PTMs empirically. This dataset was also used to assess previous approaches. Based on our results, PTMs generate more accurate API sequences and outperform other related methods by ∼11%. We have also identified two different tokenization approaches that can contribute to a significant boost in PTMs' performance for the API sequence generation task.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133945350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many recent models in software engineering have introduced deep neural models based on the Transformer architecture or use Transformer-based Pre-trained Language Models (PLMs) trained on code. Although these models achieve state-of-the-art results on many downstream tasks, such as code summarization and bug detection, they are based on Transformers and PLMs, which are mainly studied in the Natural Language Processing (NLP) field. Current studies rely on the reasoning and practices from NLP for these models on code, despite the differences between natural languages and programming languages, and there is limited literature explaining how code is modeled. Here, we investigate the attention behavior of PLMs on code and compare it with natural language. We pre-trained BERT, a Transformer-based PLM, on code and explored what kind of information it learns, both semantic and syntactic. We ran several experiments to analyze the attention values of code constructs on each other and what BERT learns in each layer. Our analyses show that BERT pays more attention to syntactic entities, specifically identifiers and separators, in contrast to the most-attended token, [CLS], in NLP. This observation motivated us to leverage identifiers to represent the code sequence instead of the [CLS] token when used for code clone detection. Our results show that employing embeddings from identifiers increases the performance of BERT by 605% and 4% F1-score in its lower and upper layers, respectively. When identifiers' embeddings are used in CodeBERT, a code-based PLM, the performance is improved by 21-24% in the F1-score of clone detection. The findings can benefit the research community by using code-specific representations instead of applying the common embeddings used in NLP, and open new directions for developing smaller models with similar performance.
{"title":"An Exploratory Study on Code Attention in BERT","authors":"Rishab Sharma, Fuxiang Chen, Fatemeh H. Fard, David Lo","doi":"10.1145/3524610.3527921","DOIUrl":"https://doi.org/10.1145/3524610.3527921","url":null,"abstract":"Many recent models in software engineering introduced deep neural models based on the Transformer architecture or use transformer-based Pre-trained Language Models (PLM) trained on code. Although these models achieve the state of the arts results in many downstream tasks such as code summarization and bug detection, they are based on Transformer and PLM, which are mainly studied in the Natural Language Processing (NLP) field. The current studies rely on the reasoning and practices from NLP for these models in code, despite the differences between natural languages and programming languages. There is also limited literature on explaining how code is modeled. Here, we investigate the attention behavior of PLM on code and compare it with natural language. We pre-trained BERT, a Transformer based PLM, on code and explored what kind of information it learns, both semantic and syntactic. We run several experiments to analyze the attention values of code constructs on each other and what BERT learns in each layer. Our analyses show that BERT pays more attention to syntactic entities, specifically identifiers and separators, in contrast to the most attended token [CLS] in NLP. This observation motivated us to leverage identifiers to represent the code sequence instead of the [CLS] token when used for code clone detection. Our results show that employing embeddings from identifiers increases the performance of BERT by 605% and 4% F1-score in its lower layers and the upper layers, respectively. When identifiers' embeddings are used in CodeBERT, a code-based PLM, the performance is improved by 21-24% in the F1-score of clone detection. The findings can benefit the research community by using code-specific representations instead of applying the common embeddings used in NLP, and open new directions for developing smaller models with similar performance.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127906169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic language features are widely available in programming languages to implement functionality that can adapt to multiple usage contexts, enabling reuse. Functionality such as data binding, object-relational mapping, and user interface builders can be heavily dependent on these features. However, their use has risks and downsides, as they affect the soundness of static analyses and of techniques that rely on such analyses (such as bug detection and automated program repair). They can also make software more error-prone due to potential difficulties in understanding reflective code, loss of compile-time safety, and incorrect API usage. In this paper, we set out to quantify some of the effects of using dynamic language features in Java programs, namely the error-proneness of using those features with respect to a particular type of bug known as single statement bugs. By mining 2,024 GitHub projects, we found 139 single statement bug instances (falling under 10 different bug patterns), with the highest number of bugs belonging to three specific patterns: Wrong Function Name, Same Function More Args, and Change Identifier Used. These results can help practitioners quantify the risk of using dynamic techniques over alternatives (such as code generation). We hope this classification draws attention to choosing dynamic APIs that are likely to be error-prone, and gives developers a better understanding when designing bug detection tools for such features.
{"title":"A Study of Single Statement Bugs Involving Dynamic Language Features","authors":"Li Sui, Shawn Rasheed, Amjed Tahir, Jens Dietrich","doi":"10.1145/3524610.3527883","DOIUrl":"https://doi.org/10.1145/3524610.3527883","url":null,"abstract":"Dynamic language features are widely available in programming languages to implement functionality that can adapt to multiple usage contexts, enabling reuse. Functionality such as data binding, object-relational mapping and user interface builders can be heavily dependent on these features. However, their use has risks and downsides as they affect the soundness of static analyses and techniques that rely on such analyses (such as bug detection and automated program repair). They can also make software more error-prone due to potential difficulties in understanding reflective code, loss of compile-time safety and incorrect API usage. In this paper, we set out to quantify some of the effects of using dynamic language features in Java programs - that is, the error-proneness of using those features with respect to a particular type of bug known as single statement bugs. By mining 2,024 GitHub projects, we found 139 single statement bug instances (falling under 10 different bug patterns), with the highest number of bugs belonging to three specific patterns: Wrong Function Name, Same Function More Args and Change Identifier Used. These results can help practitioners to quantify the risk of using dynamic techniques over alternatives (such as code generation). We hope this classification raises attention on choosing dynamic APIs that are likely to be error-prone, and provides developers a better understanding when designing bug detection tools for such feature.","PeriodicalId":426634,"journal":{"name":"2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133281732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}