How do developers discuss rationale?
Rana Alkadhi, Manuel Nonnenmacher, Emitzá Guzmán, B. Brügge
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330223 | Pages: 357-369
Developers make various decisions during software development. The rationale behind these decisions is of great importance during the evolution of long-living software systems. However, current practices for documenting rationale often fall short, and rationale remains hidden in the heads of developers or embedded in development artifacts. Capturing rationale is even more challenging in OSS projects, in which developers are geographically distributed and rely mostly on written communication channels to support and coordinate their activities. In this paper, we present an empirical study to understand how OSS developers discuss rationale in IRC channels and explore the possibility of automatically extracting rationale elements by analyzing the IRC messages of development teams. To achieve this, we manually analyzed 7,500 messages from three large OSS projects and identified all fine-grained elements of rationale. We evaluated various machine learning algorithms for automatically detecting and classifying rationale in IRC messages. Our results show that 1) rationale is discussed on average in 25% of IRC messages, 2) code committers contributed on average 54% of the discussed rationale, and 3) machine learning algorithms can detect rationale with 0.76 precision and 0.79 recall, and classify messages into finer-grained rationale elements with an average of 0.45 precision and 0.43 recall.
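The detection step described here is essentially binary text classification over chat messages. As a rough illustration (not the authors' pipeline; the messages, labels, and model choice below are hypothetical), such a detector can be sketched in a few lines with scikit-learn:

```python
# Illustrative sketch of rationale detection as binary text classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled IRC messages: 1 = contains rationale, 0 = does not.
messages = [
    "we should use sqlite here because it avoids a server dependency",
    "anyone around?",
    "I rejected the patch since it breaks the 32-bit build",
    "lunch time, back in an hour",
]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(messages, labels)
print(model.predict(["switching to cmake because autotools is too slow"]))
```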
{"title":"How do developers discuss rationale?","authors":"Rana Alkadhi, Manuel Nonnenmacher, Emitzá Guzmán, B. Brügge","doi":"10.1109/SANER.2018.8330223","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330223","url":null,"abstract":"Developers make various decisions during software development. The rationale behind these decisions is of great importance during software evolution of long living software systems. However, current practices for documenting rationale often fall short and rationale remains hidden in the heads of developers or embedded in development artifacts. Further challenges are faced for capturing rationale in OSS projects; in which developers are geographically distributed and rely mostly on written communication channels to support and coordinate their activities. In this paper, we present an empirical study to understand how OSS developers discuss rationale in IRC channels and explore the possibility of automatic extraction of rationale elements by analyzing IRC messages of development teams. To achieve this, we manually analyzed 7,500 messages of three large OSS projects and identified all fine-grained elements of rationale. We evaluated various machine learning algorithms for automatically detecting and classifying rationale in IRC messages. Our results show that 1) rationale is discussed on average in 25% of IRC messages, 2) code committers contributed on average 54% of the discussed rationale, and 3) machine learning algorithms can detect rationale with 0.76 precision and 0.79 recall, and classify messages into finer-grained rationale elements with an average of 0.45 precision and 0.43 recall.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"12 1","pages":"357-369"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81607058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design patterns impact on software quality: Where are the theories?
Foutse Khomh, Yann-Gaël Guéhéneuc
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330193 | Pages: 15-25
Software engineers are creatures of habit. During software development, they follow the same patterns again and again when architecting, designing, and implementing programs. Alexander introduced such patterns in architecture in 1974 and, 20 years later, they made their way into software development thanks to the work of Gamma et al. Software design patterns were promoted to make the design of programs more "flexible, modular, reusable, and understandable". However, ten years later, these patterns, their roles, and their impact on software quality were not fully understood. We then set out to study the impact of design patterns on different quality attributes and published a paper entitled "Do Design Patterns Impact Software Quality Positively?" in the proceedings of the 12th European Conference on Software Maintenance and Reengineering (CSMR) in 2008. Ten years later, this paper received the Most Influential Paper award at the 25th International Conference on Software Analysis, Evolution, and Reengineering (SANER) in 2018. In this retrospective paper for the award, we report and reflect on our and others' studies on the impact of design patterns, discussing some key findings reported about design patterns. We also take a step back from these studies and re-examine the role that design patterns should play in software development. Finally, we outline some avenues for future research on design patterns, e.g., the identification of the patterns really used by developers, the theories explaining the impact of patterns, or their use to raise the abstraction level of programming languages.
{"title":"Design patterns impact on software quality: Where are the theories?","authors":"Foutse Khomh, Yann-Gaël Guéhéneuc","doi":"10.1109/SANER.2018.8330193","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330193","url":null,"abstract":"Software engineers are creators of habits. During software development, they follow again and again the same patterns when architecting, designing and implementing programs. Alexander introduced such patterns in architecture in 1974 and, 20 years later, they made their way in software development thanks to the work of Gamma et al. Software design patterns were promoted to make the design of programs more \"flexible, modular, reusable, and understandable\". However, ten years later, these patterns, their roles, and their impact on software quality were not fully understood. We then set out to study the impact of design patterns on different quality attributes and published a paper entitled \"Do Design Patterns Impact Software Quality Positively?\" in the proceedings of the 12th European Conference on Software Maintenance and Reengineering (CSMR) in 2008. Ten years later, this paper received the Most Influential Paper award at the 25th International Conference on Software Analysis, Evolution, and Reengineering (SANER) in 2018. In this retrospective paper for the award, we report and reflect on our and others' studies on the impact of design patterns, discussing some key findings reported about design patterns. We also take a step back from these studies and re-examine the role that design patterns should play in software development. Finally, we outline some avenues for future research work on design patterns, e.g., the identification of the patterns really used by developers, the theories explaining the impact of patterns, or their use to raise the abstraction level of programming languages.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"14 1","pages":"15-25"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81699932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient features for function matching between binary executables
Chariton Karamitas, A. Kehagias
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330221 | Pages: 335-345
Binary diffing is the process of reverse engineering two programs, when source code is not available, in order to study their syntactic and semantic differences. For large programs, binary diffing can be performed by function matching, which, in turn, is reduced to a graph isomorphism problem between the compared programs' CFGs (Control Flow Graphs) and/or CGs (Call Graphs). In this paper, we provide a set of carefully chosen features, extracted from a binary's CG and CFG, which can be used by BinDiff algorithm variants to, first, build a set of initial exact matches with minimal false positives (by scanning for unique perfect matches) and, second, propagate approximate matching information using, for example, a nearest-neighbor scheme. Furthermore, we investigate the benefits of applying Markov lumping techniques to function CFGs (to our knowledge, this technique has not been previously studied). The proposed function features are evaluated in a series of experiments on various versions of the Linux kernel (Intel64), the OpenSSH server (Intel64), and Firefox's xul.dll (IA-32). Our prototype system is also compared to Diaphora, the current state-of-the-art binary diffing software.
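As a rough sketch of the feature-plus-nearest-neighbor idea (the features below are illustrative stand-ins, not the paper's carefully chosen set), one can fingerprint each function with simple call-graph statistics and match functions across two binaries by distance in feature space:

```python
# Toy function matching between two call graphs via structural features.
import networkx as nx
import numpy as np

def cg_features(cg):
    # Per-function fingerprint: in/out-degree plus betweenness centrality.
    bc = nx.betweenness_centrality(cg)
    return {f: np.array([cg.in_degree(f), cg.out_degree(f), bc[f]])
            for f in cg.nodes}

def match_functions(cg_a, cg_b):
    fa, fb = cg_features(cg_a), cg_features(cg_b)
    # Nearest neighbor in feature space stands in for the matching step.
    return {f: min(fb, key=lambda g: np.linalg.norm(fa[f] - fb[g]))
            for f in fa}

a = nx.DiGraph([("main", "parse"), ("main", "run"), ("run", "parse")])
b = nx.DiGraph([("entry", "cfg_parse"), ("entry", "do_run"),
                ("do_run", "cfg_parse")])
print(match_functions(a, b))  # main->entry, parse->cfg_parse, run->do_run
```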
{"title":"Efficient features for function matching between binary executables","authors":"Chariton Karamitas, A. Kehagias","doi":"10.1109/SANER.2018.8330221","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330221","url":null,"abstract":"Binary diffing is the process of reverse engineering two programs, when source code is not available, in order to study their syntactic and semantic differences. For large programs, binary diffing can be performed by function matching which, in turn, is reduced to a graph isomorphism problem between the compared programs' CFGs (Control Flow Graphs) and/or CGs (Call Graphs). In this paper we provide a set of carefully chosen features, extracted from a binary's CG and CFG, which can be used by BinDiff algorithm variants to, first, build a set of initial exact matches with minimal false positives (by scanning for unique perfect matches) and, second, propagate approximate matching information using, for example, a nearest-neighbor scheme. Furthermore, we investigate the benefits of applying Markov lumping techniques to function CFGs (to our knowledge, this technique has not been previously studied). The proposed function features are evaluated in a series of experiments on various versions of the Linux kernel (Intel64), the OpenSSH server (Intel64) and Firefox's xul.dll (IA-32). Our prototype system is also compared to Diaphora, the current state-of-the-art binary diffing software.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"86 9 1","pages":"335-345"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87684820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting faulty empty cells in spreadsheets
Liang Xu, Shuo Wang, Wensheng Dou, Bo Yang, Chushu Gao, Jun Wei, Tao Huang
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330229 | Pages: 423-433
Spreadsheets play an important role in various business tasks, such as financial reporting and data analysis. In spreadsheets, empty cells are widely used for different purposes, e.g., separating different tables or representing the default value "0". However, a user may delete a formula unintentionally and leave a cell empty. Such an ad-hoc modification may introduce a faulty empty cell that should have contained a formula. We observe that the context of an empty cell can help determine whether the empty cell is faulty. For example, is the empty cell next to a cell array in which all cells share the same semantics? Does the empty cell have headers similar to those of other non-empty cells? In this paper, we propose EmptyCheck, a technique to detect faulty empty cells in spreadsheets. By analyzing the context of an empty cell, EmptyCheck validates whether the cell belongs to a cell array. If yes, the empty cell is faulty, since it does not contain a formula. We evaluate EmptyCheck on 100 randomly sampled EUSES spreadsheets. The experimental results show that EmptyCheck can detect faulty empty cells with high precision (75.00%) and recall (87.04%). Existing techniques can detect only 4.26% of the true faulty empty cells that EmptyCheck detects.
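The core cell-array intuition can be illustrated with a minimal heuristic (a toy version, not EmptyCheck itself, assuming formulas are already normalized to a copy-relative form such as R1C1): an empty cell inside a column whose non-empty cells all share one formula is suspicious.

```python
# Toy check: flag empty cells that break an otherwise uniform cell array.
def faulty_empty_cells(column):
    """column: list of normalized (R1C1-style) formula strings; None = empty.
    Returns indices of empty cells when all non-empty cells share a formula."""
    filled = [f for f in column if f is not None]
    # Candidate cell array: at least two cells carrying one identical formula.
    if len(filled) >= 2 and len(set(filled)) == 1:
        return [i for i, f in enumerate(column) if f is None]
    return []

print(faulty_empty_cells(["=RC[-2]*RC[-1]", "=RC[-2]*RC[-1]", None,
                          "=RC[-2]*RC[-1]"]))  # -> [2]
```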
{"title":"Detecting faulty empty cells in spreadsheets","authors":"Liang Xu, Shuo Wang, Wensheng Dou, Bo Yang, Chushu Gao, Jun Wei, Tao Huang","doi":"10.1109/SANER.2018.8330229","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330229","url":null,"abstract":"Spreadsheets play an important role in various business tasks, such as financial reports and data analysis. In spreadsheets, empty cells are widely used for different purposes, e.g., separating different tables, or default value \"0\". However, a user may delete a formula unintentionally, and leave a cell empty. Such ad-hoc modification may introduce a faulty empty cell that should have a formula. We observe that the context of an empty cell can help determine whether the empty cell is faulty. For example, is the empty cell next to a cell array in which all cells share the same semantics? Does the empty cell have headers similar to other non-empty cells'? In this paper, we propose EmptyCheck, to detect faulty empty cells in spreadsheets. By analyzing the context of an empty cell, EmptyCheck validates whether the cell belong to a cell array. If yes, the empty cell is faulty since it does not contain a formula. We evaluate EmptyCheck on 100 randomly sampled EUSES spreadsheets. The experimental result shows that EmptyCheck can detect faulty empty cells with high precision (75.00%) and recall (87.04%). Existing techniques can detect only 4.26% of the true faulty empty cells that EmptyCheck detects.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"85 1","pages":"423-433"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80334925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DeepWeak: Reasoning common software weaknesses via knowledge graph embedding
Zhuobing Han, Xiaohong Li, Hongtao Liu, Zhenchang Xing, Zhiyong Feng
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330232 | Pages: 456-466
Common software weaknesses, such as improper input validation and integer overflow, can harm system security directly or indirectly, causing adverse effects such as denial of service or execution of unauthorized code. The Common Weakness Enumeration (CWE) maintains a standard list and classification of common software weaknesses. Although CWE contains rich information about software weaknesses, including textual descriptions, common consequences, and relations between software weaknesses, the current data representation, i.e., hyperlinked documents, does not support advanced reasoning tasks on software weaknesses, such as prediction of missing relations and common consequences of CWEs. Such reasoning tasks become critical to managing and analyzing large numbers of common software weaknesses and their relations. In this paper, we propose to represent common software weaknesses and their relations as a knowledge graph, and we develop a translation-based, description-embodied knowledge representation learning method to embed both software weaknesses and their relations in the knowledge graph into a semantic vector space. The vector representations (i.e., embeddings) of software weaknesses and their relations can be exploited for knowledge acquisition and inference. We conduct extensive experiments to evaluate the performance of software weakness and relation embeddings in three reasoning tasks: CWE link prediction, CWE triple classification, and common consequence prediction. Our knowledge graph embedding approach outperforms other description- and/or structure-based representation learning methods.
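Translation-based embedding in the TransE style models a relation as a vector translation, so a plausible triple (h, r, t) satisfies h + r ≈ t. The toy sketch below shows only this scoring idea with random vectors and illustrative triples; DeepWeak additionally incorporates CWE textual descriptions, which this sketch omits.

```python
# TransE-style scoring sketch: relations as translations in vector space.
import numpy as np

rng = np.random.default_rng(0)
dim = 50
# Untrained random embeddings; CWE ids and the triples are illustrative only.
entities = {c: rng.normal(size=dim) for c in ["CWE-20", "CWE-190", "CWE-400"]}
relations = {"ChildOf": rng.normal(size=dim)}

def score(h, r, t):
    # Lower score = more plausible triple, since we want h + r close to t.
    return np.linalg.norm(entities[h] + relations[r] - entities[t])

def margin_loss(pos, neg, margin=1.0):
    # Hinge loss pushing a true triple below a corrupted one by a margin.
    return max(0.0, margin + score(*pos) - score(*neg))

print(margin_loss(("CWE-190", "ChildOf", "CWE-20"),
                  ("CWE-400", "ChildOf", "CWE-20")))
```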
{"title":"DeepWeak: Reasoning common software weaknesses via knowledge graph embedding","authors":"Zhuobing Han, Xiaohong Li, Hongtao Liu, Zhenchang Xing, Zhiyong Feng","doi":"10.1109/SANER.2018.8330232","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330232","url":null,"abstract":"Common software weaknesses, such as improper input validation, integer overflow, can harm system security directly or indirectly, causing adverse effects such as denial-of-service, execution of unauthorized code. Common Weakness Enumeration (CWE) maintains a standard list and classification of common software weakness. Although CWE contains rich information about software weaknesses, including textual descriptions, common sequences and relations between software weaknesses, the current data representation, i.e., hyperlined documents, does not support advanced reasoning tasks on software weaknesses, such as prediction of missing relations and common consequences of CWEs. Such reasoning tasks become critical to managing and analyzing large numbers of common software weaknesses and their relations. In this paper, we propose to represent common software weaknesses and their relations as a knowledge graph, and develop a translation-based, description-embodied knowledge representation learning method to embed both software weaknesses and their relations in the knowledge graph into a semantic vector space. The vector representations (i.e., embeddings) of software weaknesses and their relations can be exploited for knowledge acquisition and inference. We conduct extensive experiments to evaluate the performance of software weakness and relation embeddings in three reasoning tasks, including CWE link prediction, CWE triple classification, and common consequence prediction. Our knowledge graph embedding approach outperforms other description- and/or structure-based representation learning methods.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"8 1","pages":"456-466"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83740330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Re-evaluating method-level bug prediction
L. Pascarella, Fabio Palomba, Alberto Bacchelli
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330264 | Pages: 592-601
Bug prediction aims at supporting developers in identifying code artifacts that are more likely to be defective. Researchers have proposed prediction models to identify bug-prone methods and provided promising evidence that it is possible to operate at this level of granularity. In particular, models based on a mixture of product and process metrics, used as independent variables, led to the best results. In this study, we first replicate previous research on method-level bug prediction on different systems/timespans. Afterwards, we reflect on the evaluation strategy and propose a more realistic one. Key results of our study show that the performance of the method-level bug prediction model is similar to what was previously reported, also for different systems/timespans, when evaluated with the same strategy. However, when evaluated with a more realistic strategy, all the models show a dramatic drop in performance, exhibiting results close to those of a random classifier. Our replication and negative results indicate that method-level bug prediction is still an open challenge.
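One common way such an evaluation becomes more realistic is to respect time: train only on data available before the tested release, instead of randomly mixing past and future as cross-validation does. The sketch below contrasts the two setups on synthetic data (the paper's exact protocol, features, and metrics are not reproduced here):

```python
# Contrast: random cross-validation vs. a time-aware (release-based) split.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8))                    # hypothetical method metrics
y = rng.integers(0, 2, size=500)                 # hypothetical defect labels
release = np.sort(rng.integers(0, 5, size=500))  # chronological release ids

clf = RandomForestClassifier(random_state=0)
print("random CV:", cross_val_score(clf, X, y, cv=5).mean())

# Realistic: fit on releases 0-3, test on the newest release only.
train, test = release < 4, release == 4
clf.fit(X[train], y[train])
print("time-aware:", clf.score(X[test], y[test]))
```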
{"title":"Re-evaluating method-level bug prediction","authors":"L. Pascarella, Fabio Palomba, Alberto Bacchelli","doi":"10.1109/SANER.2018.8330264","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330264","url":null,"abstract":"Bug prediction is aimed at supporting developers in the identification of code artifacts more likely to be defective. Researchers have proposed prediction models to identify bug prone methods and provided promising evidence that it is possible to operate at this level of granularity. Particularly, models based on a mixture of product and process metrics, used as independent variables, led to the best results. In this study, we first replicate previous research on method-level bug prediction on different systems/timespans. Afterwards, we reflect on the evaluation strategy and propose a more realistic one. Key results of our study show that the performance of the method-level bug prediction model is similar to what previously reported also for different systems/timespans, when evaluated with the same strategy. However—when evaluated with a more realistic strategy—all the models show a dramatic drop in performance exhibiting results close to that of a random classifier. Our replication and negative results indicate that method-level bug prediction is still an open challenge.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"14 1","pages":"592-601"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72684146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keep it simple: Is deep learning good for linguistic smell detection?
Sarah Fakhoury, V. Arnaoudova, Cedric Noiseux, Foutse Khomh, G. Antoniol
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330265 | Pages: 602-611
Deep neural networks are a popular technique that has been applied successfully to domains such as image processing, sentiment analysis, speech recognition, and computational linguistics. Deep neural networks are machine learning algorithms that, in general, require a labeled set of positive and negative examples, which is used to tune hyper-parameters and adjust model coefficients to learn a prediction function. Recently, deep neural networks have also been successfully applied to certain software engineering problem domains (e.g., bug prediction); however, they have been outperformed by traditional machine learning approaches in other domains (e.g., recovering links between entries in a discussion forum). In this paper, we report our experience in building an automatic Linguistic Antipattern Detector (LAPD) using deep neural networks. We manually build and validate an oracle of around 1,700 instances and create binary classification models using traditional machine learning approaches and Convolutional Neural Networks. Our experience is that, considering the size of the oracle, the available hardware and software, as well as the theory needed to interpret results, deep neural networks are outperformed by traditional machine learning algorithms in terms of all the evaluation metrics we used, as well as resources (time and memory). Therefore, although deep learning is reported to produce results comparable and even superior to those of human experts for certain complex tasks, it does not seem to be a good fit for simple classification tasks like smell detection. Researchers and practitioners should be careful when selecting machine learning models for the problem at hand.
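In the spirit of this conclusion, a comparison of simple classifiers is cheap to run before reaching for a deep model. The sketch below uses synthetic features and labels sized like the paper's oracle; the feature set and model choices are illustrative, not the LAPD setup:

```python
# Baseline sweep: cross-validated F1 for a few traditional classifiers.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1700, 20))      # placeholder features
y = rng.integers(0, 2, size=1700)    # placeholder smell/no-smell labels

for clf in (LogisticRegression(max_iter=1000),
            DecisionTreeClassifier(random_state=0),
            RandomForestClassifier(random_state=0)):
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(type(clf).__name__, round(f1, 3))
```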
{"title":"Keep it simple: Is deep learning good for linguistic smell detection?","authors":"Sarah Fakhoury, V. Arnaoudova, Cedric Noiseux, Foutse Khomh, G. Antoniol","doi":"10.1109/SANER.2018.8330265","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330265","url":null,"abstract":"Deep neural networks is a popular technique that has been applied successfully to domains such as image processing, sentiment analysis, speech recognition, and computational linguistic. Deep neural networks are machine learning algorithms that, in general, require a labeled set of positive and negative examples that are used to tune hyper-parameters and adjust model coefficients to learn a prediction function. Recently, deep neural networks have also been successfully applied to certain software engineering problem domains (e.g., bug prediction), however, results are shown to be outperformed by traditional machine learning approaches in other domains (e.g., recovering links between entries in a discussion forum). In this paper, we report our experience in building an automatic Linguistic Antipattern Detector (LAPD) using deep neural networks. We manually build and validate an oracle of around 1,700 instances and create binary classification models using traditional machine learning approaches and Convolutional Neural Networks. Our experience is that, considering the size of the oracle, the available hardware and software, as well as the theory to interpret results, deep neural networks are outperformed by traditional machine learning algorithms in terms of all evaluation metrics we used and resources (time and memory). Therefore, although deep learning is reported to produce results comparable and even superior to human experts for certain complex tasks, it does not seem to be a good fit for simple classification tasks like smell detection. Researchers and practitioners should be careful when selecting machine learning models for the problem at hand.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"28 1","pages":"602-611"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81090953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A deep neural network language model with contexts for source code
A. Nguyen, Trong Duc Nguyen, H. Phan, T. Nguyen
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330220 | Pages: 323-334
Statistical language models (LMs) have been applied in several software engineering applications. However, they have issues in dealing with ambiguities in the names of program and API elements (classes and method calls). In this paper, inspired by the success of Deep Neural Networks (DNNs) in natural language processing, we present Dnn4C, a DNN language model that complements the local context of lexical code elements with both syntactic and type contexts. We designed a context-incorporating method that uses syntactic and type annotations for source code in order to learn to distinguish lexical tokens in different syntactic and type contexts. Our empirical evaluation on code completion for real-world projects shows that Dnn4C achieves relative improvements in top-1 accuracy of 11.6%, 16.3%, 27.1%, and 44.7% over the state-of-the-art language models for source code used with the same features: RNN LM, DNN LM, SLAMC, and n-gram LM, respectively. For another application, we show that Dnn4C helps improve accuracy over an n-gram LM in migrating source code from Java to C# with a machine translation model.
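The n-gram LM, the weakest baseline in this comparison, is easy to sketch: it predicts the next code token from the preceding tokens by counting. The toy trigram model below shows only this baseline idea; Dnn4C itself additionally encodes syntactic and type contexts in a neural model:

```python
# Toy trigram LM over code tokens: predict the next token from the last two.
from collections import Counter, defaultdict

def train_trigram(tokens):
    model = defaultdict(Counter)
    for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
        model[(a, b)][c] += 1
    return model

tokens = "for ( int i = 0 ; i < n ; i ++ )".split()
lm = train_trigram(tokens)

def complete(a, b):
    # Return the most frequent continuation of the bigram (a, b), if any.
    options = lm.get((a, b))
    return options.most_common(1)[0][0] if options else None

print(complete("i", "<"))  # -> 'n'
```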
{"title":"A deep neural network language model with contexts for source code","authors":"A. Nguyen, Trong Duc Nguyen, H. Phan, T. Nguyen","doi":"10.1109/SANER.2018.8330220","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330220","url":null,"abstract":"Statistical language models (LMs) have been applied in several software engineering applications. However, they have issues in dealing with ambiguities in the names of program and API elements (classes and method calls). In this paper, inspired by the success of Deep Neural Network (DNN) in natural language processing, we present Dnn4C, a DNN language model that complements the local context of lexical code elements with both syntactic and type contexts. We designed a context-incorporating method to use with syntactic and type annotations for source code in order to learn to distinguish the lexical tokens in different syntactic and type contexts. Our empirical evaluation on code completion for real-world projects shows that Dnn4C relatively improves 11.6%, 16.3%, 27.1%, and 44.7% top-1 accuracy over the state-of-the-art language models for source code used with the same features: RNN LM, DNN LM, SLAMC, and n-gram LM, respectively. For another application, we showed that Dnn4C helps improve accuracy over n-gram LM in migrating source code from Java to C# with a machine translation model.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"26 1 1","pages":"323-334"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79745870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comparison of software engineering domain specific sentiment analysis tools
M. R. Islam, M. Zibran
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330245 | Pages: 487-491
Sentiment Analysis (SA) of software engineering (SE) text has drawn immense interest recently. The poor performance of general-purpose SA tools when operated on SE text has led to the recent emergence of domain-specific SA tools especially designed for SE text. However, these domain-specific tools were tested on a single dataset, and their performance was compared mainly against general-purpose tools. Thus, two things remain unclear: (i) how well these tools really work on other datasets, and (ii) which tool to choose in which context. To address these concerns, we operate three recent domain-specific SA tools on three separate datasets. Using standard accuracy measurement metrics, we compute and compare their accuracies in detecting sentiments in SE text.
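The comparison itself reduces to scoring each tool's predictions against each dataset's gold labels with the same standard metrics. A minimal sketch follows; the tool names, predictions, and labels are placeholders, not the study's data:

```python
# Score each tool's sentiment predictions against gold labels.
from sklearn.metrics import precision_recall_fscore_support

gold = ["pos", "neg", "neu", "neg", "pos"]       # placeholder gold labels
tool_output = {                                  # placeholder predictions
    "tool_A": ["pos", "neg", "neu", "neu", "pos"],
    "tool_B": ["pos", "neg", "neg", "neg", "pos"],
}
for tool, pred in tool_output.items():
    p, r, f, _ = precision_recall_fscore_support(
        gold, pred, average="macro", zero_division=0)
    print(f"{tool}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```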
{"title":"A comparison of software engineering domain specific sentiment analysis tools","authors":"M. R. Islam, M. Zibran","doi":"10.1109/SANER.2018.8330245","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330245","url":null,"abstract":"Sentiment Analysis (SA) in software engineering (SE) text has drawn immense interests recently. The poor performance of general-purpose SA tools, when operated on SE text, has led to recent emergence of domain-specific SA tools especially designed for SE text. However, these domain-specific tools were tested on single dataset and their performances were compared mainly against general-purpose tools. Thus, two things remain unclear: (i) how well these tools really work on other datasets, and (ii) which tool to choose in which context. To address these concerns, we operate three recent domain-specific SA tools on three separate datasets. Using standard accuracy measurement metrics, we compute and compare their accuracies in the detection of sentiments in SE text.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"51 1","pages":"487-491"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84005339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How do scientists develop scientific software? An external replication
G. Pinto, I. Wiese, Luiz Felipe Dias
Pub Date: 2018-03-20 | DOI: 10.1109/SANER.2018.8330263 | Pages: 582-591
Although the goal of scientists is to do science, not to develop software, many scientists have extended their roles to include software development among their skills. However, since scientists have different backgrounds, it remains unclear how they perceive software engineering practices or how they acquire software engineering knowledge. In this paper, we conducted an external replication of an influential 10-year-old paper about how scientists develop and use scientific software. In particular, we employed the same method (an online questionnaire) on a different population (R developers). When analyzing the more than 1,574 responses received, enriched with data gathered from the respondents' GitHub repositories, we correlated our findings with the original study. We found that the results were consistent in many ways, including: (1) scientists who develop software work mostly alone, (2) they decide themselves what they want to work on next, and (3) most of what they learned came from self-study rather than formal education. However, we also uncovered new facts, for example, that some of the "pain points" of software development are not related to technical activities (interruptions, lack of collaborators, and the lack of a reward system play a role). Our replication can help researchers, practitioners, and educators to better focus their efforts on topics that are important to the scientific community that develops software.
{"title":"How do scientists develop scientific software? An external replication","authors":"G. Pinto, I. Wiese, Luiz Felipe Dias","doi":"10.1109/SANER.2018.8330263","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330263","url":null,"abstract":"Although the goal of scientists is to do science, not to develop software, many scientists have extended their roles to include software development to their skills. However, since scientists have different background, it remains unclear how do they perceive software engineering practices or how do they acquire software engineering knowledge. In this paper we conducted an external replication of one influential 10 years paper about how scientists develop and use scientific software. In particular, we employed the same method (an on-line questionnaire) in a different population (R developers). When analyzing the more than 1,574 responses received, enriched with data gathered from their GitHub repositories, we correlated our findings with the original study. We found that the results were consistent in many ways, including: (1) scientists that develop software work mostly alone, (2) they decide themselves what they want to work on next, and (3) most of what they learnt came from self-study, rather than a formal education. However, we also uncover new facts, such as: some of the \"pain points\" regarding software development are not related to technical activities (e.g., interruptions, lack of collaborators, and lack of a reward system play a role). Our replication can help researchers, practitioners, and educators to better focus their efforts on topics that are important to the scientific community that develops software.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"9 1","pages":"582-591"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79530759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}