自然语言是一种编程语言:将自然语言处理应用于软件开发

Michael D. Ernst
{"title":"自然语言是一种编程语言:将自然语言处理应用于软件开发","authors":"Michael D. Ernst","doi":"10.4230/LIPIcs.SNAPL.2017.4","DOIUrl":null,"url":null,"abstract":"A powerful, but limited, way to view software is as source code alone. Treating a program as a sequence of instructions enables it to be formalized and makes it amenable to mathematical techniques such as abstract interpretation and model checking. \n \nA program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off of software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program. \n \nResearchers are beginning to make progress towards this vision. This paper gives, as examples, four results that find bugs and generate code by applying natural language processing techniques to software artifacts. The four techniques use as input error messages, variable names, procedure documentation, and user questions. They use four different NLP techniques: document similarity, word semantics, parse trees, and neural networks. \n \nThe initial results suggest that this is a promising avenue for future work.","PeriodicalId":231548,"journal":{"name":"Summit on Advances in Programming Languages","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":"{\"title\":\"Natural Language is a Programming Language: Applying Natural Language Processing to Software Development\",\"authors\":\"Michael D. Ernst\",\"doi\":\"10.4230/LIPIcs.SNAPL.2017.4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A powerful, but limited, way to view software is as source code alone. Treating a program as a sequence of instructions enables it to be formalized and makes it amenable to mathematical techniques such as abstract interpretation and model checking. \\n \\nA program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off of software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program. \\n \\nResearchers are beginning to make progress towards this vision. This paper gives, as examples, four results that find bugs and generate code by applying natural language processing techniques to software artifacts. The four techniques use as input error messages, variable names, procedure documentation, and user questions. They use four different NLP techniques: document similarity, word semantics, parse trees, and neural networks. \\n \\nThe initial results suggest that this is a promising avenue for future work.\",\"PeriodicalId\":231548,\"journal\":{\"name\":\"Summit on Advances in Programming Languages\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"41\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Summit on Advances in Programming Languages\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.4230/LIPIcs.SNAPL.2017.4\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Summit on Advances in Programming Languages","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.4230/LIPIcs.SNAPL.2017.4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 41

摘要

查看软件的一种强大但有限的方法是将其单独视为源代码。将程序视为指令序列,可以使其形式化,并使其适用于抽象解释和模型检查等数学技术。程序不仅仅是由一系列指令组成的。开发人员利用测试用例、文档、变量名、程序结构、版本控制存储库等等。我认为是时候揭开软件分析工具的面纱了:工具应该使用所有这些工件来推断关于程序的更强大和有用的信息。研究人员正开始朝着这一愿景取得进展。作为例子,本文给出了将自然语言处理技术应用于软件工件的四种发现错误和生成代码的结果。这四种技术用作输入错误消息、变量名、过程文档和用户问题。他们使用四种不同的NLP技术:文档相似度、词语义、解析树和神经网络。初步结果表明,这是未来工作的一个有希望的途径。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Natural Language is a Programming Language: Applying Natural Language Processing to Software Development
A powerful, but limited, way to view software is as source code alone. Treating a program as a sequence of instructions enables it to be formalized and makes it amenable to mathematical techniques such as abstract interpretation and model checking. A program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off of software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program. Researchers are beginning to make progress towards this vision. This paper gives, as examples, four results that find bugs and generate code by applying natural language processing techniques to software artifacts. The four techniques use as input error messages, variable names, procedure documentation, and user questions. They use four different NLP techniques: document similarity, word semantics, parse trees, and neural networks. The initial results suggest that this is a promising avenue for future work.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
From Theory to Systems: A Grounded Approach to Programming Language Education Linking Types for Multi-Language Software: Have Your Cake and Eat It Too AP: Artificial Programming Fission: Secure Dynamic Code-Splitting for JavaScript Migratory Typing: Ten Years Later
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1