Improved Methods of Pointer Mixture Network for Code Completion
Authors: Cheng Wei, Zhiqiu Huang, Yaoshen Yu
Venue: 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)
Publication date: 2022-12-01
DOI: 10.1109/QRS57517.2022.00095 (https://doi.org/10.1109/QRS57517.2022.00095)
Code completion is a productivity feature of modern integrated development environments (IDEs): it predicts the most likely next code token(s) from the context of the code being completed, improving developers' work efficiency. The Pointer Mixture Network, proposed in recent years, has achieved good results in code completion; the contribution of this paper is a set of improvements to that method. In the data preprocessing phase, we use one-hot encoding, which makes the computed distances between tokens more reasonable and also expands the feature representation of the code. In addition, we add label smoothing to reduce overfitting of the neural language model and improve its generalization ability. Within the neural language model, we use a three-layer LSTM so that its hidden layers can learn the context information more fully. For the optimizer, we choose NAdam, which performs better than the Adam optimizer used in the Pointer Mixture Network and greatly accelerates model training. Experiments on code completion tasks for the Python and JavaScript programming languages show that our approach outperforms the results obtained by the Pointer Mixture Network.
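As a rough illustration of two of the techniques named in the abstract, the sketch below one-hot encodes a target token and applies uniform label smoothing before computing a cross-entropy loss. It is not the authors' implementation: the toy vocabulary, the logits, the helper names (`one_hot`, `smooth_labels`, `cross_entropy`), and the smoothing coefficient `epsilon=0.1` are all assumptions made for this example, since the abstract does not state them.

```python
import math


def one_hot(index, vocab_size):
    """Return a one-hot vector with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < vocab_size:
        raise ValueError("index out of range")
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]


def smooth_labels(target, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) * target + epsilon / K.

    epsilon=0.1 is an assumed value, not one taken from the paper.
    """
    k = len(target)
    return [(1.0 - epsilon) * t + epsilon / k for t in target]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Hypothetical four-token vocabulary and model scores for one prediction step.
vocab = ["def", "return", "self", "<unk>"]
logits = [0.2, -1.0, 3.0, 0.0]

hard = one_hot(vocab.index("self"), len(vocab))  # hard (one-hot) target
soft = smooth_labels(hard, epsilon=0.1)          # smoothed target

loss_hard = cross_entropy(hard, logits)
loss_soft = cross_entropy(soft, logits)
```

With a smoothed target, the loss stays above the hard-target loss even when the model is confident and correct, which discourages the network from pushing its output probabilities all the way to 0 or 1. This is the usual intuition for why label smoothing can reduce overfitting.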