Code Generation Method based on Structured Tree Input and AST Decoder Attention Augmentation

Wenjun Wei, Junhua Wu
Published in: 2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)
Publication date: 2022-12-01
DOI: 10.1109/QRS-C57518.2022.00077 (https://doi.org/10.1109/QRS-C57518.2022.00077)
Citations: 0

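Abstract

Automatic code generation from natural language input is an important research topic in software engineering. Earlier approaches mostly used a seq2seq architecture built on RNN models, treating input and output as plain sequences and often ignoring the syntactic structure of the source. This paper proposes Tx(Tree-Tree), a code generation method that replaces plain word sequences with structured trees so that the model can better learn the syntactic and semantic information in the source. This helps alleviate the long-range dependency problem caused by overly long source input. In addition, the decoder adopts an enhanced attention mechanism that distinguishes how much each historical action influences the action currently being predicted. The model is validated on three datasets: DJANGO, CONALA, and ATIS. Compared with some typical models, Tx(Tree-Tree) improves both accuracy and BLEU.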
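The abstract does not say what form the enhanced decoder attention takes. As a rough illustration only, the sketch below assumes plain scaled dot-product attention, where the decoder's current state (the query) is compared against vectors for previously predicted actions and each one receives a weight. The function name `attend_over_history` and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def attend_over_history(query, history):
    """Scaled dot-product attention of one query over past action vectors.

    Assumption: this is generic attention, not the paper's mechanism.
    `query` is a list of floats; `history` is a non-empty list of
    equally sized lists of floats. Returns (weights, context).
    """
    if not history:
        raise ValueError("history must contain at least one action vector")
    scale = math.sqrt(len(query))
    # One similarity score per historical action.
    scores = [sum(q * h for q, h in zip(query, vec)) / scale for vec in history]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted sum of the historical action vectors.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, history))
        for i in range(len(query))
    ]
    return weights, context


# Made-up embeddings for three previously predicted actions.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend_over_history(query, history)
```

In this toy run the first and third actions align with the query equally and so receive equal weight, while the second receives less. That is the sense in which attention can separate the influence of different historical actions on the current prediction.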