Documentation-Guided API Sequence Search without Worrying about the Text-API Semantic Gap
Hongwei Wei, Xiaohong Su, Weining Zheng, Wenxin Tao
2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), March 2023
DOI: 10.1109/SANER56733.2023.00040
Abstract
Developers often search for application programming interfaces (APIs) and their usage patterns to speed up software development. This paper focuses on the API sequence search task: given a textual query describing a function, retrieve API sequences, mined from open-source software repositories, that implement that function. However, the severe semantic gap between text and APIs makes it challenging to discover the correspondence between natural language queries and the desired API sequences. We therefore propose documentation-guided API sequence search (DGAS), a method with which the semantic gap between text and APIs is no longer a concern. DGAS consists of two components: documentation-guided cross-modal attention (DGCA) and documentation-guided cross-modal matching (DGCM). DGCA computes the cross-modal attention map from features of the same modality (the API documentation sequence and the textual query) rather than from different modalities (the API sequence and the textual query), bridging the semantic gap during the cross-modal attention phase. DGCM uses API documentation as supplementary information for the API sequence, bridging the semantic gap during the cross-modal matching phase. To evaluate DGAS, we construct a dataset for API sequence search by extending an existing API sequence generation dataset with API documentation. Experimental results show that DGAS outperforms the baseline methods.
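The abstract does not give the exact formulation of DGCA or DGCM, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that query tokens, API tokens, and per-API documentation have already been embedded as vectors in a shared space. The attention map is computed from text-side features only (query against documentation) and then applied to the API features, and the matching score blends an API-side similarity with a documentation-side similarity. All function names, the mean pooling, the cosine similarity, and the `doc_weight` parameter are hypothetical choices made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def doc_guided_attention(query_feats, doc_feats):
    # Attention map built from same-modality (text-text) features:
    # one row per query token, one column per API position.
    return [softmax([dot(q, d) for d in doc_feats]) for q in query_feats]


def attend(attn, feats):
    # Apply the attention map: one weighted sum of `feats` per query token.
    dim = len(feats[0])
    return [
        [sum(w * f[j] for w, f in zip(row, feats)) for j in range(dim)]
        for row in attn
    ]


def mean_pool(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def match_score(query_feats, api_feats, doc_feats, doc_weight=0.5):
    # Assumed scoring rule: blend API-side and documentation-side similarity.
    attn = doc_guided_attention(query_feats, doc_feats)
    query_vec = mean_pool(query_feats)
    api_vec = mean_pool(attend(attn, api_feats))
    doc_vec = mean_pool(attend(attn, doc_feats))
    return (1 - doc_weight) * cosine(query_vec, api_vec) + doc_weight * cosine(
        query_vec, doc_vec
    )


# Toy data: two query tokens and two candidate API sequences of length two.
query = [[1.0, 0.0], [0.0, 1.0]]
relevant_docs = [[1.0, 0.0], [0.0, 1.0]]
relevant_apis = [[0.9, 0.1], [0.1, 0.9]]
unrelated_docs = [[-1.0, 0.0], [0.0, -1.0]]
unrelated_apis = [[-0.9, -0.1], [-0.1, -0.9]]

print(match_score(query, relevant_apis, relevant_docs))
print(match_score(query, unrelated_apis, unrelated_docs))
```

In this sketch the API features never take part in computing the attention weights; they only receive them. That is the sense in which the attention step avoids comparing text directly against API tokens.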