Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330194
C. Roy, J. Cordy
There have been a great many methods and tools proposed for software clone detection. While some work has been done on assessing and comparing performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In order to cope with this issue, over the last 10 years we have been working towards building cloning benchmarks for objectively evaluating clone detection tools. Beginning with our WCRE 2008 paper, where we conducted a modestly large empirical study with the NiCad clone detection tool, over the past ten years we have extended and grown our work to include several languages, much larger datasets, and model clones in languages such as Simulink. From a modest set of 15 C and Java systems comprising a total of 7 million lines in 2008, our work has progressed to a benchmark called BigCloneBench with eight million manually validated clone pairs in a large inter-project source dataset of more than 25,000 projects and 365 million lines of code. In this paper, we present a history and overview of software clone detection benchmarks, and review the steps of ourselves and others to come to this stage. We outline a future for clone detection benchmarks and hope to encourage researchers to both use existing benchmarks and to contribute to building the benchmarks of the future.
{"title":"Benchmarks for software clone detection: A ten-year retrospective","authors":"C. Roy, J. Cordy","doi":"10.1109/SANER.2018.8330194","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330194","url":null,"abstract":"There have been a great many methods and tools proposed for software clone detection. While some work has been done on assessing and comparing performance of these tools, very little empirical evaluation has been done. In particular, accuracy measures such as precision and recall have only been roughly estimated, due both to problems in creating a validated clone benchmark against which tools can be compared, and to the manual effort required to hand check large numbers of candidate clones. In order to cope with this issue, over the last 10 years we have been working towards building cloning benchmarks for objectively evaluating clone detection tools. Beginning with our WCRE 2008 paper, where we conducted a modestly large empirical study with the NiCad clone detection tool, over the past ten years we have extended and grown our work to include several languages, much larger datasets, and model clones in languages such as Simulink. From a modest set of 15 C and Java systems comprising a total of 7 million lines in 2008, our work has progressed to a benchmark called BigCloneBench with eight million manually validated clone pairs in a large inter-project source dataset of more than 25,000 projects and 365 million lines of code. In this paper, we present a history and overview of software clone detection benchmarks, and review the steps of ourselves and others to come to this stage. We outline a future for clone detection benchmarks and hope to encourage researchers to both use existing benchmarks and to contribute to building the benchmarks of the future.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"10 1","pages":"26-37"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84647222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330224
Xin Chen, He Jiang, Xiaochen Li, Tieke He, Zhenyu Chen
In crowdsourced mobile application testing, crowd workers help developers perform testing and submit test reports for unexpected behaviors. These submitted test reports usually provide critical information for developers to understand and reproduce the bugs. However, due to the poor performance of workers and the inconvenience of editing on mobile devices, the quality of test reports may vary sharply. At times developers have to spend a significant portion of their available resources to handle the low-quality test reports, thus heavily decreasing their efficiency. In this paper, to help developers predict whether a test report should be selected for inspection within limited resources, we propose a new framework named TERQAF to automatically model the quality of test reports. TERQAF defines a series of quantifiable indicators to measure the desirable properties of test reports and aggregates the numerical values of all indicators to determine the quality of test reports by using step transformation functions. Experiments conducted over five crowdsourced test report datasets of mobile applications show that TERQAF can correctly predict the quality of test reports with accuracy of up to 88.06% and outperform baselines by up to 23.06%. Meanwhile, the experimental results also demonstrate that the four categories of measurable indicators have positive impacts on TERQAF in evaluating the quality of test reports.
{"title":"Automated quality assessment for crowdsourced test reports of mobile applications","authors":"Xin Chen, He Jiang, Xiaochen Li, Tieke He, Zhenyu Chen","doi":"10.1109/SANER.2018.8330224","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330224","url":null,"abstract":"In crowdsourced mobile application testing, crowd workers help developers perform testing and submit test reports for unexpected behaviors. These submitted test reports usually provide critical information for developers to understand and reproduce the bugs. However, due to the poor performance of workers and the inconvenience of editing on mobile devices, the quality of test reports may vary sharply. At times developers have to spend a significant portion of their available resources to handle the low-quality test reports, thus heavily decreasing their efficiency. In this paper, to help developers predict whether a test report should be selected for inspection within limited resources, we propose a new framework named TERQAF to automatically model the quality of test reports. TERQAF defines a series of quantifiable indicators to measure the desirable properties of test reports and aggregates the numerical values of all indicators to determine the quality of test reports by using step transformation functions. Experiments conducted over five crowdsourced test report datasets of mobile applications show that TERQAF can correctly predict the quality of test reports with accuracy of up to 88.06% and outperform baselines by up to 23.06%. Meanwhile, the experimental results also demonstrate that the four categories of measurable indicators have positive impacts on TERQAF in evaluating the quality of test reports.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"1 1","pages":"368-379"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83091898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330226
N. Obbink, I. Malavolta, Gian Luca Scoccia, P. Lago
JavaScript is becoming the de-facto programming language of the Web. Large-scale web applications (web apps) written in Javascript are commonplace nowadays, with big technology players (e.g., Google, Facebook) using it in their core flagship products. Today, it is common practice to reuse existing JavaScript code, usually in the form of third-party libraries and frameworks. If on one side this practice helps in speeding up development time, on the other side it comes with the risk of bringing dead code, i.e., JavaScript code which is never executed, but still downloaded from the network and parsed in the browser. This overhead can negatively impact the overall performance and energy consumption of the web app. In this paper we present Lacuna, an approach for JavaScript dead code elimination, where existing JavaScript analysis techniques are applied in combination. The proposed approach supports both static and dynamic analyses, it is extensible, and independent of the specificities of the used JavaScript analysis techniques. Lacuna can be applied to any JavaScript code base, without imposing any constraints to the developer, e.g., on her coding style or on the use of some specific JavaScript feature (e.g., modules). Lacuna has been evaluated on a suite of 29 publicly-available web apps, composed of 15,946 JavaScript functions, and built with different JavaScript frameworks (e.g., Angular, Vue.js, jQuery). Despite being a prototype, Lacuna obtained promising results in terms of analysis execution time and precision.
{"title":"An extensible approach for taming the challenges of JavaScript dead code elimination","authors":"N. Obbink, I. Malavolta, Gian Luca Scoccia, P. Lago","doi":"10.1109/SANER.2018.8330226","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330226","url":null,"abstract":"JavaScript is becoming the de-facto programming language of the Web. Large-scale web applications (web apps) written in Javascript are commonplace nowadays, with big technology players (e.g., Google, Facebook) using it in their core flagship products. Today, it is common practice to reuse existing JavaScript code, usually in the form of third-party libraries and frameworks. If on one side this practice helps in speeding up development time, on the other side it comes with the risk of bringing dead code, i.e., JavaScript code which is never executed, but still downloaded from the network and parsed in the browser. This overhead can negatively impact the overall performance and energy consumption of the web app. In this paper we present Lacuna, an approach for JavaScript dead code elimination, where existing JavaScript analysis techniques are applied in combination. The proposed approach supports both static and dynamic analyses, it is extensible, and independent of the specificities of the used JavaScript analysis techniques. Lacuna can be applied to any JavaScript code base, without imposing any constraints to the developer, e.g., on her coding style or on the use of some specific JavaScript feature (e.g., modules). Lacuna has been evaluated on a suite of 29 publicly-available web apps, composed of 15,946 JavaScript functions, and built with different JavaScript frameworks (e.g., Angular, Vue.js, jQuery). Despite being a prototype, Lacuna obtained promising results in terms of analysis execution time and precision.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"14 1","pages":"291-401"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78257347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330259
Markus Exler, M. Moser, J. Pichler, Günter Fleck, B. Dorninger
Complex engineering problems are typically solved by running a batch of software programs. Data exchange between these software programs is frequently based on semi-structured text files. These files are edited by text editors providing basic input support, however without proper input validation prior program execution. Consequently, even minor lexical or syntactic errors cause software programs to stop without delivering a result. To tackle these problems a more specific editor support, which is aware of language concepts of data exchange files, needs to be provided. In this paper, we investigate if and in what quality a language grammar can be inferred from a set of existing text files, in order to provide a basis for the desired editing support. For this experiment, we chose a Minimal Adequate Teacher (MAT) method together with specific preprocessing of the existing text files. Thereby, we were able to construct complete grammar rules for most of the language constructs found in a corpus of semi-structured text files. The inferred grammar, however, requires refactoring towards a suitable and maintainable basis for the desired editor support.
{"title":"Grammatical inference from data exchange files: An experiment on engineering software","authors":"Markus Exler, M. Moser, J. Pichler, Günter Fleck, B. Dorninger","doi":"10.1109/SANER.2018.8330259","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330259","url":null,"abstract":"Complex engineering problems are typically solved by running a batch of software programs. Data exchange between these software programs is frequently based on semi-structured text files. These files are edited by text editors providing basic input support, however without proper input validation prior program execution. Consequently, even minor lexical or syntactic errors cause software programs to stop without delivering a result. To tackle these problems a more specific editor support, which is aware of language concepts of data exchange files, needs to be provided. In this paper, we investigate if and in what quality a language grammar can be inferred from a set of existing text files, in order to provide a basis for the desired editing support. For this experiment, we chose a Minimal Adequate Teacher (MAT) method together with specific preprocessing of the existing text files. Thereby, we were able to construct complete grammar rules for most of the language constructs found in a corpus of semi-structured text files. The inferred grammar, however, requires refactoring towards a suitable and maintainable basis for the desired editor support.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"5 1","pages":"557-561"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76631822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330196
Manishankar Mondal, C. Roy, Kevin A. Schneider
Detection, tracking, and refactoring of code clones (i.e., identical or nearly similar code fragments in the code-base of a software system) have been extensively investigated by a great many studies. Code clones have often been considered bad smells. While clone refactoring is important for removing code clones from the code-base, clone tracking is important for consistently updating code clones that are not suitable for refactoring. In this research we investigate the importance of micro-clones (i.e., code clones of less than five lines of code) in consistent updating of the code-base. While the existing clone detectors and trackers have ignored micro clones, our investigation on thousands of commits from six subject systems imply that around 80% of all consistent updates during system evolution occur in micro clones. The percentage of consistent updates occurring in micro clones is significantly higher than that in regular clones according to our statistical significance tests. Also, the consistent updates occurring in micro-clones can be up to 23% of all updates during the whole period of evolution. According to our manual analysis, around 83% of the consistent updates in micro-clones are non-trivial. As micro-clones also require consistent updates like the regular clones, tracking or refactoring micro-clones can help us considerably minimize effort for consistently updating such clones. Thus, micro-clones should also be taken into proper consideration when making clone management decisions.
{"title":"Micro-clones in evolving software","authors":"Manishankar Mondal, C. Roy, Kevin A. Schneider","doi":"10.1109/SANER.2018.8330196","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330196","url":null,"abstract":"Detection, tracking, and refactoring of code clones (i.e., identical or nearly similar code fragments in the code-base of a software system) have been extensively investigated by a great many studies. Code clones have often been considered bad smells. While clone refactoring is important for removing code clones from the code-base, clone tracking is important for consistently updating code clones that are not suitable for refactoring. In this research we investigate the importance of micro-clones (i.e., code clones of less than five lines of code) in consistent updating of the code-base. While the existing clone detectors and trackers have ignored micro clones, our investigation on thousands of commits from six subject systems imply that around 80% of all consistent updates during system evolution occur in micro clones. The percentage of consistent updates occurring in micro clones is significantly higher than that in regular clones according to our statistical significance tests. Also, the consistent updates occurring in micro-clones can be up to 23% of all updates during the whole period of evolution. According to our manual analysis, around 83% of the consistent updates in micro-clones are non-trivial. As micro-clones also require consistent updates like the regular clones, tracking or refactoring micro-clones can help us considerably minimize effort for consistently updating such clones. Thus, micro-clones should also be taken into proper consideration when making clone management decisions.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"15 1","pages":"50-60"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85461957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330199
Nathan Jay, B. Miller
Decoding binary executable files is a critical facility for software analysis, including debugging, performance monitoring, malware detection, cyber forensics, and sandboxing, among other techniques. As a foundational capability, binary decoding must be consistently correct for the techniques that rely on it to be viable. Unfortunately, modern instruction sets are huge and the encodings are complex, so as a result, modern binary decoders are buggy. In this paper, we present a testing methodology that automatically infers structural information for an instruction set and uses the inferred structure to efficiently generate structured-random test cases independent of the instruction set being tested. Our testing methodology includes automatic output verification using differential analysis and reassembly to generate error reports. This testing methodology requires little instruction-set-specific knowledge, allowing rapid testing of decoders for new architectures and extensions to existing ones. We have implemented our testing procedure in a tool name Fleece and used it to test multiple binary decoders (Intel XED, libopcodes, LLVM, Dyninst and Capstone) on multiple architectures (x86, ARM and PowerPC). Our testing efficiently covered thousands of instruction format variations for each instruction set and uncovered decoding bugs in every decoder we tested.
{"title":"Structured random differential testing of instruction decoders","authors":"Nathan Jay, B. Miller","doi":"10.1109/SANER.2018.8330199","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330199","url":null,"abstract":"Decoding binary executable files is a critical facility for software analysis, including debugging, performance monitoring, malware detection, cyber forensics, and sandboxing, among other techniques. As a foundational capability, binary decoding must be consistently correct for the techniques that rely on it to be viable. Unfortunately, modern instruction sets are huge and the encodings are complex, so as a result, modern binary decoders are buggy. In this paper, we present a testing methodology that automatically infers structural information for an instruction set and uses the inferred structure to efficiently generate structured-random test cases independent of the instruction set being tested. Our testing methodology includes automatic output verification using differential analysis and reassembly to generate error reports. This testing methodology requires little instruction-set-specific knowledge, allowing rapid testing of decoders for new architectures and extensions to existing ones. We have implemented our testing procedure in a tool name Fleece and used it to test multiple binary decoders (Intel XED, libopcodes, LLVM, Dyninst and Capstone) on multiple architectures (x86, ARM and PowerPC). Our testing efficiently covered thousands of instruction format variations for each instruction set and uncovered decoding bugs in every decoder we tested.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"11 9","pages":"84-94"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91496192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330204
Yuan Zhang, Jiarun Dai, Xiaohan Zhang, S. Huang, Zhemin Yang, Min Yang, Hao Chen
Third-party libraries are widely used in Android applications to ease development and enhance functionalities. However, the incorporated libraries also bring new security & privacy issues to the host application, and blur the accounting between application code and library code. Under this situation, a precise and reliable library detector is highly desirable. In fact, library code may be customized by developers during integration and dead library code may be eliminated by code obfuscators during application build process. However, existing research on library detection has not gracefully handled these problems, thus facing severe limitations in practice. In this paper, we propose LibPecker, an obfuscation-resilient, highly precise and reliable library detector for Android applications. LibPecker adopts signature matching to give a similarity score between a given library and an application. By fully utilizing the internal class dependencies inside a library, LibPecker generates a strict signature for each class. To tolerate library code customization and elimination as much as possible, LibPecker introduces adaptive class similarity threshold and weighted class similarity score when calculating library similarity. To quantitatively evaluate the precision and the recall of LibPecker, we perform the first such experiment (to the best of our knowledge) with a large number of libraries and applications. Results show that LibPecker significantly outperforms the state-of-the-art tools in both recall and precision (91% and 98.1% respectively).
{"title":"Detecting third-party libraries in Android applications with high precision and recall","authors":"Yuan Zhang, Jiarun Dai, Xiaohan Zhang, S. Huang, Zhemin Yang, Min Yang, Hao Chen","doi":"10.1109/SANER.2018.8330204","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330204","url":null,"abstract":"Third-party libraries are widely used in Android applications to ease development and enhance functionalities. However, the incorporated libraries also bring new security & privacy issues to the host application, and blur the accounting between application code and library code. Under this situation, a precise and reliable library detector is highly desirable. In fact, library code may be customized by developers during integration and dead library code may be eliminated by code obfuscators during application build process. However, existing research on library detection has not gracefully handled these problems, thus facing severe limitations in practice. In this paper, we propose LibPecker, an obfuscation-resilient, highly precise and reliable library detector for Android applications. LibPecker adopts signature matching to give a similarity score between a given library and an application. By fully utilizing the internal class dependencies inside a library, LibPecker generates a strict signature for each class. To tolerate library code customization and elimination as much as possible, LibPecker introduces adaptive class similarity threshold and weighted class similarity score when calculating library similarity. To quantitatively evaluate the precision and the recall of LibPecker, we perform the first such experiment (to the best of our knowledge) with a large number of libraries and applications. Results show that LibPecker significantly outperforms the state-of-the-art tools in both recall and precision (91% and 98.1% respectively).","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"53 1","pages":"141-152"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75946663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330247
Reno Dantas, Antonio Carvalho, Diego Marcilio, Luisa Fantin, Uriel Silva, Walter Lucas, R. Bonifácio
Software systems change frequently over time, either due to new business requirements or technology pressures. Programming languages evolve in a similar constant fashion, though when a language release introduces new programming constructs, older constructs and idioms might become obsolete. The coexistence between newer and older constructs leads to several problems, such as increased maintenance efforts and steeper learning curve for developers. In this paper we present a RASCAL Java transformation library that evolves legacy systems to use more recent programming language constructs (such as multi-catch and lambda expressions). In order to understand how relevant automatic software rejuvenation is, we submitted 2462 transformations to 40 open source projects via the GitHub pull request mechanism. Initial results show that simple transformations, for instance the introduction of the diamond operator, are more likely to be accepted than transformations that change the code substantially, such as refactoring enhanced for loops to the newer functional style.
{"title":"Reconciling the past and the present: An empirical study on the application of source code transformations to automatically rejuvenate Java programs","authors":"Reno Dantas, Antonio Carvalho, Diego Marcilio, Luisa Fantin, Uriel Silva, Walter Lucas, R. Bonifácio","doi":"10.1109/SANER.2018.8330247","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330247","url":null,"abstract":"Software systems change frequently over time, either due to new business requirements or technology pressures. Programming languages evolve in a similar constant fashion, though when a language release introduces new programming constructs, older constructs and idioms might become obsolete. The coexistence between newer and older constructs leads to several problems, such as increased maintenance efforts and steeper learning curve for developers. In this paper we present a RASCAL Java transformation library that evolves legacy systems to use more recent programming language constructs (such as multi-catch and lambda expressions). In order to understand how relevant automatic software rejuvenation is, we submitted 2462 transformations to 40 open source projects via the GitHub pull request mechanism. Initial results show that simple transformations, for instance the introduction of the diamond operator, are more likely to be accepted than transformations that change the code substantially, such as refactoring enhanced for loops to the newer functional style.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"23 1","pages":"497-501"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82177113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330243
Y. Li, Sandro Schulze, G. Saake
Analyzing and extracting features and variability from different artifacts is an indispensable activity to support systematic integration of single software systems and Software Product Line (SPL). Beyond manually extracting variability, a variety of approaches, such as feature location in source code and feature extraction in requirements, has been proposed for automating the identification of features and their variation points. While requirements contain more complete variability information and provide traceability links to other artifacts, current techniques exhibit a lack of accuracy as well as a limited degree of automation. In this paper, we propose an unsupervised learning structure to overcome the abovementioned limitations. In particular, our technique consists of two steps: First, we apply Laplacian Eigenmaps, an unsupervised dimensionality reduction technique, to embed text requirements into compact binary codes. Second, requirements are transformed into a matrix representation by looking up a pre-trained word embedding. Then, the matrix is fed into CNN to learn linguistic characteristics of the requirements. Furthermore, we train CNN by matching the output of CNN with the pre-trained binary codes. Initial results show that accuracy is still limited, but that our approach allows to automate the entire process.
{"title":"Extracting features from requirements: Achieving accuracy and automation with neural networks","authors":"Y. Li, Sandro Schulze, G. Saake","doi":"10.1109/SANER.2018.8330243","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330243","url":null,"abstract":"Analyzing and extracting features and variability from different artifacts is an indispensable activity to support systematic integration of single software systems and Software Product Line (SPL). Beyond manually extracting variability, a variety of approaches, such as feature location in source code and feature extraction in requirements, has been proposed for automating the identification of features and their variation points. While requirements contain more complete variability information and provide traceability links to other artifacts, current techniques exhibit a lack of accuracy as well as a limited degree of automation. In this paper, we propose an unsupervised learning structure to overcome the abovementioned limitations. In particular, our technique consists of two steps: First, we apply Laplacian Eigenmaps, an unsupervised dimensionality reduction technique, to embed text requirements into compact binary codes. Second, requirements are transformed into a matrix representation by looking up a pre-trained word embedding. Then, the matrix is fed into CNN to learn linguistic characteristics of the requirements. Furthermore, we train CNN by matching the output of CNN with the pre-trained binary codes. Initial results show that accuracy is still limited, but that our approach allows to automate the entire process.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"16 1","pages":"477-481"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81890656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-03-20DOI: 10.1109/SANER.2018.8330246
Di Liu, Xiaofang Zhang, Yang Feng, James A. Jones
Crowdsourced software testing has been shown to be capable of detecting many bugs and simulating real usage scenarios. As such, it is popular in mobile-application testing. However in mobile testing, test reports often consist of only some screenshots and short text descriptions. Inspecting and under-standing the overwhelming number of mobile crowdsourced test reports becomes a time-consuming but inevitable task. The paucity and potential inaccuracy of textual information and the well-defined screenshots of activity views within mobile applications motivate us to propose a novel technique to assist developers in understanding crowdsourced test reports by automatically describing the screenshots. To reach this goal, in this paper, we propose a fully automatic technique to generate descriptive words for the well-defined screenshots. We employ the test reports written by professional testers to build up language models. We use the computer-vision technique, namely Spatial Pyramid Matching (SPM), to measure similarities and extract features from the screenshot images. The experimental results, based on more than 1000 test reports from 4 industrial crowdsourced projects, show that our proposed technique is promising for developers to better understand the mobile crowdsourced test reports.
{"title":"Generating descriptions for screenshots to assist crowdsourced testing","authors":"Di Liu, Xiaofang Zhang, Yang Feng, James A. Jones","doi":"10.1109/SANER.2018.8330246","DOIUrl":"https://doi.org/10.1109/SANER.2018.8330246","url":null,"abstract":"Crowdsourced software testing has been shown to be capable of detecting many bugs and simulating real usage scenarios. As such, it is popular in mobile-application testing. However in mobile testing, test reports often consist of only some screenshots and short text descriptions. Inspecting and under-standing the overwhelming number of mobile crowdsourced test reports becomes a time-consuming but inevitable task. The paucity and potential inaccuracy of textual information and the well-defined screenshots of activity views within mobile applications motivate us to propose a novel technique to assist developers in understanding crowdsourced test reports by automatically describing the screenshots. To reach this goal, in this paper, we propose a fully automatic technique to generate descriptive words for the well-defined screenshots. We employ the test reports written by professional testers to build up language models. We use the computer-vision technique, namely Spatial Pyramid Matching (SPM), to measure similarities and extract features from the screenshot images. The experimental results, based on more than 1000 test reports from 4 industrial crowdsourced projects, show that our proposed technique is promising for developers to better understand the mobile crowdsourced test reports.","PeriodicalId":6602,"journal":{"name":"2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)","volume":"1 1","pages":"492-496"},"PeriodicalIF":0.0,"publicationDate":"2018-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89277915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}