Mapping APIs in Dynamic-typed Programs by Leveraging Transfer Learning

IF 6.6 2区 计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING ACM Transactions on Software Engineering and Methodology Pub Date : 2024-01-22 DOI:10.1145/3641848
Zhenfei Huang, Junjie Chen, Jiajun Jiang, Yihua Liang, Hanmo You, Fengjie Li
{"title":"Mapping APIs in Dynamic-typed Programs by Leveraging Transfer Learning","authors":"Zhenfei Huang, Junjie Chen, Jiajun Jiang, Yihua Liang, Hanmo You, Fengjie Li","doi":"10.1145/3641848","DOIUrl":null,"url":null,"abstract":"<p>Application Programming Interface (API) migration is a common task for adapting software across different programming languages and platforms, where manually constructing the mapping relations between APIs is indeed time-consuming and error-prone. To facilitate this process, many automated API mapping approaches have been proposed. However, existing approaches were mainly designed and evaluated for mapping APIs of statically-typed languages, while their performance on dynamically-typed languages remains unexplored. </p><p>In this paper, we conduct the first extensive study to explore existing API mapping approaches’ performance for mapping APIs in dynamically-typed languages, for which we have manually constructed a high-quality dataset. According to the empirical results, we have summarized several insights. In particular, the source code implementations of APIs can significantly improve the effectiveness of API mapping. However, due to the confidentiality policy, they may not be available in practice. To overcome this, we propose a novel API mapping approach, named <span>Matl</span>, which leverages the transfer learning technique to learn the semantic embeddings of source code implementations from large-scale open-source repositories and then transfers the learned model to facilitate the mapping of APIs. In this way, <span>Matl</span> can produce more accurate API embedding of its functionality for more effective mapping without knowing the source code of the APIs. To evaluate the performance of <span>Matl</span>, we have conducted an extensive study by comparing <span>Matl</span> with state-of-the-art approaches. The results demonstrate that <span>Matl</span> is indeed effective as it improves the state-of-the-art approach by at least 18.36% for mapping APIs of dynamically-typed language and by 30.77% for mapping APIs of the statically-typed language.</p>","PeriodicalId":50933,"journal":{"name":"ACM Transactions on Software Engineering and Methodology","volume":"222 1","pages":""},"PeriodicalIF":6.6000,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3641848","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
引用次数: 0

Abstract

Application Programming Interface (API) migration is a common task for adapting software across different programming languages and platforms, where manually constructing the mapping relations between APIs is indeed time-consuming and error-prone. To facilitate this process, many automated API mapping approaches have been proposed. However, existing approaches were mainly designed and evaluated for mapping APIs of statically-typed languages, while their performance on dynamically-typed languages remains unexplored.

In this paper, we conduct the first extensive study to explore existing API mapping approaches’ performance for mapping APIs in dynamically-typed languages, for which we have manually constructed a high-quality dataset. According to the empirical results, we have summarized several insights. In particular, the source code implementations of APIs can significantly improve the effectiveness of API mapping. However, due to the confidentiality policy, they may not be available in practice. To overcome this, we propose a novel API mapping approach, named Matl, which leverages the transfer learning technique to learn the semantic embeddings of source code implementations from large-scale open-source repositories and then transfers the learned model to facilitate the mapping of APIs. In this way, Matl can produce more accurate API embedding of its functionality for more effective mapping without knowing the source code of the APIs. To evaluate the performance of Matl, we have conducted an extensive study by comparing Matl with state-of-the-art approaches. The results demonstrate that Matl is indeed effective as it improves the state-of-the-art approach by at least 18.36% for mapping APIs of dynamically-typed language and by 30.77% for mapping APIs of the statically-typed language.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
利用迁移学习在动态类型程序中映射应用程序接口
应用程序接口(API)迁移是在不同编程语言和平台之间调整软件的一项常见任务,而手动构建 API 之间的映射关系确实既耗时又容易出错。为了简化这一过程,人们提出了许多自动 API 映射方法。然而,现有的方法主要是针对静态类型语言的 API 映射而设计和评估的,而它们在动态类型语言上的性能仍有待探索。在本文中,我们首次对现有 API 映射方法在动态类型语言 API 映射中的性能进行了广泛的研究,并为此手动构建了一个高质量的数据集。根据实证结果,我们总结出了几点启示。其中,API 的源代码实现可以显著提高 API 映射的有效性。然而,由于保密政策的原因,在实践中可能无法获得这些源代码。为了克服这一问题,我们提出了一种名为 Matl 的新型 API 映射方法,它利用迁移学习技术从大规模开源资源库中学习源代码实现的语义嵌入,然后将学习到的模型迁移到 API 映射中。通过这种方式,Matl 可以为其功能生成更准确的 API 嵌入,从而在不知道 API 源代码的情况下实现更有效的映射。为了评估 Matl 的性能,我们进行了一项广泛的研究,将 Matl 与最先进的方法进行了比较。结果表明,Matl 确实是有效的,因为它在映射动态类型语言的应用程序接口方面比最先进的方法至少提高了 18.36%,在映射静态类型语言的应用程序接口方面提高了 30.77%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
ACM Transactions on Software Engineering and Methodology
ACM Transactions on Software Engineering and Methodology 工程技术-计算机:软件工程
CiteScore
6.30
自引率
4.50%
发文量
164
审稿时长
>12 weeks
期刊介绍: Designing and building a large, complex software system is a tremendous challenge. ACM Transactions on Software Engineering and Methodology (TOSEM) publishes papers on all aspects of that challenge: specification, design, development and maintenance. It covers tools and methodologies, languages, data structures, and algorithms. TOSEM also reports on successful efforts, noting practical lessons that can be scaled and transferred to other projects, and often looks at applications of innovative technologies. The tone is scholarly but readable; the content is worthy of study; the presentation is effective.
期刊最新文献
Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning Bitmap-Based Security Monitoring for Deeply Embedded Systems Harmonising Contributions: Exploring Diversity in Software Engineering through CQA Mining on Stack Overflow An Empirical Study on the Characteristics of Database Access Bugs in Java Applications Self-planning Code Generation with Large Language Models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1