Utilization of pre-trained language models for adapter-based knowledge transfer in software engineering

IF 3.5 | CAS Zone 2, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Empirical Software Engineering | Pub Date: 2024-06-13 | DOI: 10.1007/s10664-024-10457-5

Iman Saberi, Fatemeh Fard, Fuxiang Chen

Abstract

Software Engineering (SE) Pre-trained Language Models (PLMs), such as CodeBERT, are pre-trained on large code corpora, and their learned knowledge has been successfully transferred to downstream tasks (e.g., code clone detection) through fine-tuning. In Natural Language Processing (NLP), an alternative way of transferring the knowledge of PLMs has been explored through adapters, compact and parameter-efficient modules that are inserted into a PLM. Although adapters have shown promising results in many NLP-based downstream tasks, their application and exploration in SE-based downstream tasks are limited. Here, we study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization. These adapters are trained on code corpora and are inserted into a PLM that is pre-trained on either English corpora or code corpora. We call these PLMs NL-PLM and C-PLM, respectively. We observed an improvement in results using NL-PLM with adapters over a PLM that does not have adapters, which suggests that adapters can transfer and utilize useful knowledge from NL-PLM for SE tasks. The results are sometimes on par with, or exceed, those of C-PLM, while being more efficient in terms of the number of parameters and training time. Interestingly, adapters inserted into a C-PLM generally yield better results than a traditionally fine-tuned C-PLM. Our results open new directions for building more compact models for SE tasks.
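To make the idea of a "compact and parameter efficient module" concrete, the following is a minimal sketch of a bottleneck adapter in plain Python. It assumes the bottleneck design common in the NLP adapter literature (down-projection, nonlinearity, up-projection, residual connection); the abstract does not specify the adapter architecture used in the paper, so this should not be read as the authors' implementation. The class name `BottleneckAdapter`, the ReLU nonlinearity, the zero-initialised up-projection, and the sizes 768 and 64 are all illustrative assumptions.

```python
import random


class BottleneckAdapter:
    """Toy bottleneck adapter (hypothetical sketch, not the paper's code).

    Computes: output = x + W_up * relu(W_down * x + b_down) + b_up
    """

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.bottleneck_size = bottleneck_size
        # Down-projection: hidden_size -> bottleneck_size, small random init.
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        self.b_down = [0.0] * bottleneck_size
        # Up-projection: bottleneck_size -> hidden_size. Zero init is an
        # assumption made here so the adapter starts as the identity and
        # does not disturb the frozen PLM before training.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]
        self.b_up = [0.0] * hidden_size

    def num_parameters(self):
        """Count trainable weights and biases in both projections."""
        h, m = self.hidden_size, self.bottleneck_size
        return (h * m + m) + (m * h + h)

    def forward(self, hidden_state):
        """Apply the adapter to one hidden-state vector (a list of floats)."""
        # Project down into the bottleneck and apply ReLU.
        z = [max(0.0, sum(w * x for w, x in zip(row, hidden_state)) + b)
             for row, b in zip(self.w_down, self.b_down)]
        # Project back up to the hidden size.
        delta = [sum(w * v for w, v in zip(row, z)) + b
                 for row, b in zip(self.w_up, self.b_up)]
        # Residual connection: the PLM's representation passes through.
        return [x + d for x, d in zip(hidden_state, delta)]


# Illustrative sizes: hidden size 768, bottleneck size 64 (both assumptions).
adapter = BottleneckAdapter(hidden_size=768, bottleneck_size=64)
dense_layer_params = 768 * 768 + 768  # one full hidden-to-hidden layer, for scale
adapter_params = adapter.num_parameters()
```

In this sketch only the adapter's weights would be trained while the PLM's own weights stay frozen, which is where the savings in parameter count and training time come from. Comparing `adapter_params` with `dense_layer_params` gives a rough sense of how small such a module is relative to a single dense layer of the same width.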

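Source journal

Empirical Software Engineering (Engineering and Technology, Computer Science: Software Engineering)

CiteScore: 8.50

Self-citation rate: 12.20%

Articles published: 169

Review time: >12 weeks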

Journal description: Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies (processes, methods, or tools) and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.