{"title":"Proceedings of the 4th ACM SIGSOFT International Workshop on NLP for Software Engineering","authors":"","doi":"10.1145/3283812","DOIUrl":"https://doi.org/10.1145/3283812","url":null,"abstract":"","PeriodicalId":231305,"journal":{"name":"Proceedings of the 4th ACM SIGSOFT International Workshop on NLP for Software Engineering","volume":"2019 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131506791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TestNMT: function-to-test neural machine translation
Robert White, J. Krinke
DOI: 10.1145/3283812.3283823

Test generation can have a large impact on the software engineering process by decreasing the time and effort required to maintain a high level of test coverage, which in turn increases the quality of the resulting software. In this paper, we present TestNMT, an experimental approach to test generation using neural machine translation. TestNMT aims to learn to translate from functions to tests, allowing a developer to generate an approximate test for a given function, which can then be adapted to produce the final desired test. We also present a preliminary quantitative and qualitative evaluation of TestNMT in both cross-project and within-project scenarios. This evaluation shows that TestNMT is potentially useful in the within-project scenario, where it achieves a maximum BLEU score of 21.2 and a maximum ROUGE-L score of 38.67, and is shown to be capable of generating approximate tests that are easy to adapt into working tests.

Mining monitoring concerns implementation in Java-based software systems
G. Cojocar, A. Guran
DOI: 10.1145/3283812.3283821

In this paper we describe a new approach for automatic identification of monitoring concerns implementation in Java-based software systems. We also present the results obtained by using our approach on 21 Java-based systems, ranging from small to very large systems.

Towards understanding code readability and its impact on design quality
Umme Ayda Mannan, Iftekhar Ahmed, A. Sarma
DOI: 10.1145/3283812.3283820

Readability of code is commonly believed to impact the overall quality of software. Poor readability not only hinders developers from understanding what the code is doing but can also cause them to make sub-optimal changes and introduce bugs. Developers recognize this risk and rank readability among their top information needs. Researchers have modeled readability scores; however, thus far, no one has investigated how readability evolves over time and how that impacts the design quality of software. We perform a large-scale study of 49 open source Java projects, spanning 8296 commits and 1766 files. We find that readability is high in open source projects and, unlike design quality, does not fluctuate over a project's lifetime. Readability also has a non-significant correlation of 0.151 (Kendall's τ) with code smell count (an indicator of design quality). Since the current readability measure is unable to capture the increased difficulty of reading code caused by degraded design quality, our results point to the need for better measurement and modeling of code readability.

Two perspectives on software documentation quality in stack overflow
Mathias Ellmann, M. Schnecke
DOI: 10.1145/3283812.3283816

This paper studies software documentation quality on Stack Overflow from two perspectives: that of the questioners, who accept answers, and that of the community, which votes on answers. We show what developers can do to increase the chance that their questions or answers get accepted by the community or by the questioners. We found differing expectations about what information, such as code or images, should be included in a question or an answer. We evaluated six different quality indicators (such as Flesch Reading Ease or the presence of images) that a developer should consider before posting a question or an answer. In addition, we found different quality indicators for different types of questions, in particular error, discrepancy, and how-to questions. Finally, we use a supervised machine-learning algorithm to predict when an answer will be accepted or upvoted.

3CAP: categorizing the cognitive capabilities of Alzheimer's patients in a smart home environment
Kate M. Bowers, Reihaneh H. Hariri, Katey A. Price
DOI: 10.1145/3283812.3283824

Alzheimer's disease is a progressive illness that affects more than 5.5 million people in the United States with no effective cure or treatment. Symptoms of the disease include declines in memory and speech abilities and increases in aggression and insomnia. Recent research suggests that NLP techniques can detect early cognitive decline as well as monitor the rate of decline over time. The processed data can be used in a smart home environment to enhance the level of home care for Alzheimer's patients. This paper proposes early-stage research in software engineering and natural language processing for quantifying and evaluating the patient's cognitive state to determine the required level of support in a smart home.

A fine-grained approach for automated conversion of JUnit assertions to English
Danielle Gonzalez, Suzanne Prentice, Mehdi Mirakhorli
DOI: 10.1145/3283812.3283819

Converting source or unit test code to English has been shown to improve the maintainability, understandability, and analysis of software and tests. Code summarizers identify 'important' statements in the source/tests and convert them to easily understood English sentences using static analysis and NLP techniques. However, current test summarization approaches handle only a subset of the variation and customization allowed in the JUnit assert API (a critical component of test cases), which may affect the accuracy of conversions. In this paper, we present our work towards improving JUnit test summarization with a detailed process for converting a total of 45 unique JUnit assertions to English, including 37 previously unhandled variations of the assertThat method. This process has also been implemented and released as the AssertConvert tool. Initial evaluations have shown that the tool generates English conversions that accurately represent a wide variety of assertion statements and could be used for code summarization or other NLP analyses.

Natural language processing (NLP) applied on issue trackers
Mathias Ellmann
DOI: 10.1145/3283812.3283825

In the domain of software engineering, NLP techniques are needed to find and reuse duplicate or similar development knowledge that is stored in development documentation, such as development tasks. To understand duplicate and similar development documentation, we discuss different NLP techniques: descriptive statistics and topic analysis; similarity algorithms such as N-grams, Jaccard, and LSI; and machine learning algorithms such as decision trees and support vector machines (SVMs). These techniques are used to reach a better understanding of the characteristics, the lexical relations (syntactic and semantic), and the classification and prediction of duplicate development tasks. We found that duplicate tasks share conceptual information and tend to be created by inexperienced developers. By tuning different features with a gradient or a fidelity loss function, a system can identify duplicate tasks with 100% accuracy.

LinkSO: a dataset for learning to retrieve similar question answer pairs on software development forums
Xueqing Liu, Chi Wang, Yue Leng, ChengXiang Zhai
DOI: 10.1145/3283812.3283815

We present LinkSO, a dataset for learning to rank similar questions on Stack Overflow. Stack Overflow contains a massive amount of high-quality, crowd-sourced question links, which provides a great opportunity for evaluating retrieval algorithms for community-based question answering (cQA) archives and for learning to rank over such archives. However, because of missing links, one question is whether question links can be readily used as the relevance judgment for evaluation. We study this question by measuring the closeness between question links and the relevance judgment, and find that their agreement rates range from 80% to 88%. We also conduct an empirical study of the performance of existing work on LinkSO. While existing work focuses on non-learning approaches, our results reveal that learning-based approaches have great potential to further improve retrieval performance.

Learning from code with graphs (keynote)
Marc Brockschmidt
DOI: 10.1145/3283812.3283813

Learning from large corpora of source code ("Big Code") has seen increasing interest over the past few years. A first wave of work has focused on leveraging off-the-shelf methods from other machine learning fields such as natural language processing. While these techniques have succeeded in showing the feasibility of learning from code, and have led to some initial practical solutions, they forgo explicit use of known program semantics. In a range of recent work, we have tried to address this issue by integrating deep learning techniques with program analysis methods on graphs. Graphs are a convenient, general formalism for modeling entities and their relationships, and are seeing increasing interest from machine learning researchers as well. In this talk, I present two applications of graph-based learning to understanding and generating programs and discuss a range of future work building on this success.