Good comments help developers understand software faster and maintain it more effectively. However, comments are often missing, inaccurate, or out of date. Automatic comment generation can avoid many of these problems. This paper presents a method for generating informative comments directly from source code using general-purpose techniques from natural language processing. We generate comments using an existing natural language model that couples each word with its logical meaning and grammar rules, so that comment generation proceeds by search from declarative descriptions of program text. We evaluate our approach on several classic algorithms implemented in Python.
{"title":"Generating comments from source code with CCGs","authors":"Sergey Matskevich, Colin S. Gordon","doi":"10.1145/3283812.3283822","DOIUrl":"https://doi.org/10.1145/3283812.3283822","PeriodicalId":231305,"journal":{"name":"Proceedings of the 4th ACM SIGSOFT International Workshop on NLP for Software Engineering","volume":"13 1","pages":"0"},"publicationDate":"2018-10-15","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127349417"}
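The idea in the abstract above, a lexicon that pairs each word with a grammatical category and a meaning, plus generation as a search for words whose derivation yields a target meaning, can be illustrated with a toy sketch. This is not the authors' system: the five-word lexicon, the tuple encoding of meanings, and the brute-force enumeration are all assumptions made for illustration, and only the two basic CCG application rules are implemented (determiners are omitted to keep the example short).

```python
from itertools import product

NP, S = "NP", "S"

def fwd(result, arg):
    """Category X/Y: looks for a Y on its right and yields X."""
    return (result, "/", arg)

def bwd(result, arg):
    """Category X\\Y: looks for a Y on its left and yields X."""
    return (result, "\\", arg)

TRANSITIVE = fwd(bwd(S, NP), NP)  # (S\NP)/NP

def verb(name):
    # Meaning of a transitive verb: takes the object, then the subject.
    return lambda obj: lambda subj: (name, subj, obj)

# Hypothetical lexicon: word -> (category, meaning).
LEXICON = {
    "function": (NP, "function"),
    "list": (NP, "list"),
    "maximum": (NP, "maximum"),
    "returns": (TRANSITIVE, verb("returns")),
    "sorts": (TRANSITIVE, verb("sorts")),
}

def combine(left, right):
    """Apply forward and backward application to two adjacent constituents."""
    (lcat, lsem), (rcat, rsem) = left, right
    out = []
    if isinstance(lcat, tuple) and lcat[1] == "/" and lcat[2] == rcat:
        out.append((lcat[0], lsem(rsem)))   # X/Y  Y  =>  X
    if isinstance(rcat, tuple) and rcat[1] == "\\" and rcat[2] == lcat:
        out.append((rcat[0], rsem(lsem)))   # Y  X\Y  =>  X
    return out

def parses(words):
    """Return every (category, meaning) derivable for the word sequence."""
    if len(words) == 1:
        return [LEXICON[words[0]]]
    results = []
    for split in range(1, len(words)):
        for left in parses(words[:split]):
            for right in parses(words[split:]):
                results.extend(combine(left, right))
    return results

def generate(target, max_len=3):
    """Search word sequences, shortest first, for a sentence meaning `target`."""
    for n in range(1, max_len + 1):
        for words in product(LEXICON, repeat=n):
            if any(cat == S and sem == target for cat, sem in parses(words)):
                return " ".join(words)
    return None

# A declarative fact about some code, encoded as (predicate, subject, object).
comment = generate(("returns", "function", "maximum"))
```

Here `generate` returns the first sentence-category derivation whose meaning equals the target tuple, or `None` if the lexicon cannot express it within `max_len` words. A realistic generator would need a far larger lexicon and a guided search rather than exhaustive enumeration.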
A broad class of software engineering problems can be generalized as the "total recall problem". This short paper claims that identifying and exploring total recall problems in software engineering is an important task with wide applicability. To make that case, we show that by applying and adapting state-of-the-art active learning and natural language processing algorithms for the total recall problem, two important software engineering tasks can be addressed: (a) supporting large literature reviews and (b) identifying software security vulnerabilities. Furthermore, we conjecture that (c) test case prioritization and (d) static warning identification can also be framed as total recall problems and benefit from the same algorithms. The widespread applicability of "total recall" to software engineering suggests that there exists some underlying framework that encompasses not just natural language processing, but a wide range of important software engineering tasks.
{"title":"Total recall, language processing, and software engineering","authors":"Zhe Yu, T. Menzies","doi":"10.1145/3283812.3283818","DOIUrl":"https://doi.org/10.1145/3283812.3283818","PeriodicalId":231305,"journal":{"name":"Proceedings of the 4th ACM SIGSOFT International Workshop on NLP for Software Engineering","volume":"6 1","pages":"0"},"publicationDate":"2018-08-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126080286"}
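The total recall setting in the abstract above is commonly understood as finding nearly all relevant items in a large pool while a human reviews as few items as possible. The sketch below shows one way an active learning loop of that kind might look. It is an assumption-laden toy, not the authors' algorithm: the word-count scorer stands in for a real classifier, and the pool, the labels, and the names `total_recall_review`, `POOL`, and `RELEVANT` are invented for illustration.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(doc, pos_counts, neg_counts):
    """Crude relevance score: words seen in relevant items add to the
    score, words seen in irrelevant items subtract from it."""
    return sum(pos_counts[w] - neg_counts[w] for w in set(tokenize(doc)))

def total_recall_review(pool, oracle, seed_index, budget):
    """Review at most `budget` items, always choosing the unreviewed item
    that the current model ranks as most likely to be relevant."""
    pos_counts, neg_counts = Counter(), Counter()
    reviewed, found = [], []
    next_index = seed_index
    while next_index is not None and len(reviewed) < budget:
        i = next_index
        reviewed.append(i)
        is_relevant = oracle(i)  # the human reviewer's judgement
        counts = pos_counts if is_relevant else neg_counts
        counts.update(set(tokenize(pool[i])))  # "retrain" the toy model
        if is_relevant:
            found.append(i)
        remaining = [j for j in range(len(pool)) if j not in reviewed]
        next_index = (
            max(remaining, key=lambda j: score(pool[j], pos_counts, neg_counts))
            if remaining else None
        )
    return reviewed, found

# Hypothetical pool of commit messages; the task is to find the security fixes.
POOL = [
    "buffer overflow in parser",
    "update readme typo",
    "heap overflow in image decoder",
    "rename variable in tests",
    "integer overflow in parser length check",
    "bump version number",
]
RELEVANT = {0, 2, 4}

reviewed, found = total_recall_review(
    POOL, oracle=lambda i: i in RELEVANT, seed_index=0, budget=3
)
```

On this toy pool, the loop starts from one known relevant item and reaches all three relevant items after reviewing three of the six. The same loop shape would apply to the literature review and vulnerability tasks named in the abstract by changing what the pool contains and who plays the oracle.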