Alaaeddin Swidan, Alexander Serebrenik, F. Hermans
Research shows the importance of selecting good names for identifiers in software code: more meaningful names improve readability. In particular, several guidelines encourage long and descriptive variable names. A recent study analyzed the use of variable names in five programming languages, focusing on single-letter variable names because of the apparent contradiction between their frequent use and the fact that these variables violate the aforementioned guidelines. In this paper, we analyze variables in Scratch, a popular block-based language aimed at children. We start by replicating the above single-letter study for Scratch. We augment this study by analyzing single-letter procedure names and by investigating the use of Scratch-specific naming patterns: spaces in variable names, numerics as variables, and textual labels in procedure names. The results of our analysis show that Scratch programmers often prefer longer identifier names than developers in other languages, and that Scratch procedure names are even longer than Scratch variable names. For single-letter variables, the most frequent names are x, y, and i. Single-letter procedure names are less popular but more often appear in upper case. Compared to the other programming languages, the usage of single uppercase letters in Scratch variables seems similar to the pattern found in Perl, while the usage of lowercase letters resembles the pattern found in Java. Concerning Scratch-specific features, 44% of the unique variable names and 34% of the projects in the dataset include at least one space. Textual labels between parameters in procedure names appear to be uncommon; however, the textual patterns that are used, for example brackets, imply an influence from textual languages. Previous research identifies identifier names as one significant issue in transitioning from visual block-based to textual programming languages. The naming patterns we found support this claim for Scratch programmers, who may encounter difficulties when transitioning to mainstream textual programming languages. Those languages restrict the use of spaces in identifiers and more often tend toward short and single-letter names, tendencies opposite to the naming preferences in Scratch.
{"title":"How do Scratch Programmers Name Variables and Procedures?","authors":"Alaaeddin Swidan, Alexander Serebrenik, F. Hermans","doi":"10.1109/SCAM.2017.12","DOIUrl":"https://doi.org/10.1109/SCAM.2017.12","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-10-30","platform":"Semanticscholar","paperid":"114290063"}
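The naming measurements described in the Scratch abstract above (single-letter names, names containing spaces, numeric names) can be illustrated with a small sketch. This is not the authors' tooling: the function name `naming_stats`, the input format (a flat list of variable names, one entry per occurrence), and the sample names are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern for "numerics as variables": names such as "1" or "2.5".
NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")

def naming_stats(names):
    """Summarize identifier names; `names` holds one entry per occurrence."""
    unique = set(names)
    # Frequency of single-letter alphabetic names, counted per occurrence.
    single_letter = Counter(n for n in names if len(n) == 1 and n.isalpha())
    # Unique names that contain at least one space (legal in Scratch).
    with_space = {n for n in unique if " " in n}
    # Unique names that are purely numeric.
    numeric = {n for n in unique if NUMERIC.fullmatch(n)}
    return {
        "single_letter": single_letter,
        "uppercase_single": sorted(n for n in single_letter if n.isupper()),
        "space_share": len(with_space) / len(unique) if unique else 0.0,
        "numeric": numeric,
    }

# Made-up sample input, not data from the study.
sample = ["x", "y", "i", "x", "player score", "Lives left", "N", "1", "speed"]
stats = naming_stats(sample)
print(stats["single_letter"].most_common(3))
print(f"{stats['space_share']:.0%} of unique names contain a space")
```

Counting per occurrence for frequency and per unique name for shares mirrors the distinction the abstract draws between frequent names and the share of unique names; whether the study computed them exactly this way is not stated in the abstract.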
Jinshui Wang, Xin Peng, Zhenchang Xing, Kun Fu, Wenyun Zhao
When performing feature location tasks, developers often need to explore a large number of program elements by following a variety of clues (such as program element location, dependency, and content). As there are often complex relationships among program elements, it is likely that some relevant program elements are omitted, especially when the implementation of a feature or concern is scattered across several source files. In this paper, we propose an approach for recommending potentially relevant program elements in an interactive feature location process. Our approach has two distinguishing characteristics: it considers the ongoing user context (i.e., confirmed or negated elements) interactively, and it performs example-based reasoning to determine the relevance of program elements. Based on an initial set of program elements confirmed by developers, our approach recommends additional program elements in an iterative process, in which developers can confirm relevant results, negate irrelevant results, and obtain an updated recommendation list. We have implemented our approach as an Eclipse plug-in called RecFL and conducted an experimental study. The results show that the participants using RecFL performed much better in their feature location tasks than the participants not using RecFL. The participants using RecFL also found it easier to accomplish their feature location tasks with its support.
{"title":"Contextual Recommendation of Relevant Program Elements in an Interactive Feature Location Process","authors":"Jinshui Wang, Xin Peng, Zhenchang Xing, Kun Fu, Wenyun Zhao","doi":"10.1109/SCAM.2017.14","DOIUrl":"https://doi.org/10.1109/SCAM.2017.14","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"124540016"}
PHP is one of the most popular web development tools in use today. A major concern, though, is the improper and insecure use of the language by application developers, which has motivated the development of various static analyses that detect security vulnerabilities in PHP programs. However, many of these approaches do not handle recent, important PHP features such as object orientation, which greatly limits their use in practice. In this paper, we present OOPIXY, a security analysis tool that extends the PHP security analyzer PIXY to support reasoning about object-oriented features in PHP applications. Our empirical evaluation shows that OOPIXY detects 88% of security vulnerabilities found in micro benchmarks. When used on real-world PHP applications, OOPIXY detects security vulnerabilities that could not be detected using state-of-the-art tools, while retaining a high level of precision. We have contacted the maintainers of those applications, and two applications' development teams verified the correctness of our findings. They are currently working on fixing the bugs that lead to those vulnerabilities.
{"title":"Detecting Security Vulnerabilities in Object-Oriented PHP Programs","authors":"Mona Nashaat, Karim Ali, James Miller","doi":"10.1109/SCAM.2017.20","DOIUrl":"https://doi.org/10.1109/SCAM.2017.20","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"125885566"}
The measurement of software quality, including the preparation and management of the necessary resources and libraries, is a major challenge in continuous software quality measurement and assessment. When applying code analysis tools to a large number of projects, the preparation of the source code and its dependencies, focusing on the completeness of these elements, is the basis for correct analysis results. In order to make this preparation process efficient and effective, there is a need to automate this process. Therefore, we built a tool infrastructure, which automates this preparation and analysis process. As part of the code preparation process, we developed the tool LibLoader, which automatically resolves missing dependencies in open source Java projects. This enables the analysis of complete projects in due time and with more accurate results from static code analysis tools.
{"title":"Automatically Adding Missing Libraries to Java Projects to Foster Better Results from Static Analysis","authors":"Thomas Atzenhofer, Reinhold Plösch","doi":"10.1109/SCAM.2017.10","DOIUrl":"https://doi.org/10.1109/SCAM.2017.10","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"126315194"}
With the widespread use of mobile devices relying on limited battery power, the burden of optimizing applications for energy has shifted towards application developers. In their quest to develop energy-efficient applications, developers face the hurdle of measuring the effect of software changes on energy consumption. A naive solution to this problem would be to have an exhaustive suite of test cases that are executed upon every change to measure their effect on energy consumption. This method is inefficient and also suffers from environment-dependent inconsistencies. A more generalized method would be to relate software structural metrics to energy consumption behavior. Previous attempts to relate changes in object-oriented metrics to their effects on energy consumption have been inconclusive. We observe that, because structural information is global while executed tests are rarely comprehensive in their coverage, this approach is prone to errors. In this paper, we present a methodology to relate software energy consumption to software structural metrics while taking the test case execution traces into account. Furthermore, using several versions of three open-source, iteratively developed Android applications, we demonstrate that software structural metrics can be reliably related to the energy consumption behavior of programs. We find that our approach identifies strong correlations between several software metrics and energy consumption behavior.
{"title":"A Methodology for Relating Software Structure with Energy Consumption","authors":"A. A. Bangash, Hareem Sahar, M. O. Beg","doi":"10.1109/SCAM.2017.18","DOIUrl":"https://doi.org/10.1109/SCAM.2017.18","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"123850452"}
The code behind dynamic webpages often includes calls to database libraries, with queries formed using a combination of static text and values computed at runtime. In this paper, we describe our work on a program analysis for extracting models of database queries that can compactly represent all queries that could be used in a specific database library call. We also describe our work on parsing partial queries, with holes representing parts of the query that are computed dynamically. Implemented in Rascal as part of the PHP AiR framework, the goal of this work is to enable empirical research on database usage in PHP scripts, to support developer tools for understanding existing queries, and to support program transformation tools to evolve existing systems and to improve the security of existing code.
{"title":"Supporting Analysis of SQL Queries in PHP AiR","authors":"David Anderson, M. Hills","doi":"10.1109/SCAM.2017.23","DOIUrl":"https://doi.org/10.1109/SCAM.2017.23","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"121834151"}
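The idea of a partial query "with holes" from the SQL abstract above can be illustrated with a small sketch. It is written in Python rather than Rascal and does not reproduce the PHP AiR query model; the types `Static` and `Hole`, the functions `to_template` and `statement_kind`, and the `?0`, `?1` placeholder notation are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass(frozen=True)
class Static:
    text: str   # query text known statically

@dataclass(frozen=True)
class Hole:
    expr: str   # the runtime expression that fills this part, e.g. "$id"

Fragment = Union[Static, Hole]

def to_template(fragments: List[Fragment]) -> str:
    """Render the query with each hole replaced by a numbered placeholder."""
    parts = []
    hole_index = 0
    for fragment in fragments:
        if isinstance(fragment, Static):
            parts.append(fragment.text)
        else:
            parts.append(f"?{hole_index}")
            hole_index += 1
    return "".join(parts)

def statement_kind(fragments: List[Fragment]) -> str:
    """Classify by the first static keyword; 'unknown' if the query starts with a hole."""
    if not fragments or not isinstance(fragments[0], Static):
        return "unknown"
    words = fragments[0].text.split()
    return words[0].upper() if words else "unknown"

# Model of the PHP expression:
#   "SELECT * FROM users WHERE id = " . $id . " AND status = '" . $status . "'"
query = [
    Static("SELECT * FROM users WHERE id = "),
    Hole("$id"),
    Static(" AND status = '"),
    Hole("$status"),
    Static("'"),
]
print(to_template(query))
print(statement_kind(query))
```

Keeping static text and holes as separate fragments means the statically known parts can still be inspected (here, to classify the statement) even though the full query string only exists at runtime.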
Configuration frameworks are routinely used in software systems to change application behavior without recompilation. Selecting a suitable configuration framework among the vast variety of existing choices is a crucial decision for developers, as it can impact project reliability and its maintenance profile. In this paper, we analyze almost 2,000 Java projects on GitHub to investigate the features and properties of 11 major Java configuration frameworks. We analyze the popularity of the frameworks and try to identify links between the maintenance effort involved with the usage of these frameworks and the frameworks' properties. More basic frameworks turn out to be the most popular, but in half of the cases are complemented by more complex frameworks. Furthermore, younger, more active frameworks with more detailed documentation, support for hierarchical configuration models and/or more data formats seem to require more maintenance by client developers.
{"title":"Does the Choice of Configuration Framework Matter for Developers? Empirical Study on 11 Java Configuration Frameworks","authors":"M. Sayagh, Zhen Dong, A. Andrzejak, Bram Adams","doi":"10.1109/SCAM.2017.25","DOIUrl":"https://doi.org/10.1109/SCAM.2017.25","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"122542466"}
The verification of the time behavior in distributed, multi-threaded programs is challenging, mainly because modern programming languages only provide means to represent time without a proper semantics. Current approaches to extract time models from source code represent time only as a sequence of events or require developers to manually provide a formal model of the time behavior. This makes it difficult for developers to verify various aspects of their systems, such as timeouts, delays, and periodicity of execution. In this paper, we introduce a definition of the time semantics of the Java programming language. Based on the semantics, we present an approach to automatically extract timed automata and their time constraints from the source code of Java methods. First, we detect Java statements that involve time, from which we then extract timed automata that are directly amenable to the verification of time properties of the methods. We evaluated the accuracy of our approach on ten open source Java projects that heavily use time in their source code. The results show a precision of 98.62% and a recall of 95.37% in extracting time constraints from Java code. Finally, we demonstrate the effectiveness of our approach with five reported bugs of four different Apache systems that we could confirm.
{"title":"Extracting Timed Automata from Java Methods","authors":"Giovanni Liva, M. Khan, M. Pinzger","doi":"10.1109/SCAM.2017.9","DOIUrl":"https://doi.org/10.1109/SCAM.2017.9","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"114193343"}
Mohammad Ghafari, Pascal Gadient, Oscar Nierstrasz
The ubiquity of smartphones, and their very broad capabilities and usage, make the security of these devices tremendously important. Unfortunately, despite all progress in security and privacy mechanisms, vulnerabilities continue to proliferate. Research has shown that many vulnerabilities are due to insecure programming practices. However, each study has often dealt with a specific issue, making the results less actionable for practitioners. To promote secure programming practices, we have reviewed related research, and identified avoidable vulnerabilities in Android-run devices and the security code smells that indicate their presence. In particular, we explain the vulnerabilities and their corresponding smells, and we discuss how they could be eliminated or mitigated during development. Moreover, we develop a lightweight static analysis tool and discuss the extent to which it successfully detects several vulnerabilities in about 46 000 apps hosted by the official Android market.
{"title":"Security Smells in Android","authors":"Mohammad Ghafari, Pascal Gadient, Oscar Nierstrasz","doi":"10.1109/SCAM.2017.24","DOIUrl":"https://doi.org/10.1109/SCAM.2017.24","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"124712667"}
D. Binkley, N. Gold, Syed S. Islam, J. Krinke, S. Yoo
Observation-based slicing is a recently introduced, language-independent slicing technique based on the dependencies observable from program behavior. The original algorithm processed traditional source code at the line-of-text level. A recent variation was developed to slice the tree-based XML representation of executable models. We ported the model slicer to source code using srcML to construct a tree-based representation of traditional source code. We present the results of a comparison of the two slicers using four experiments involving seventeen different programs, including classic benchmarks and larger production systems. The resulting slices had essentially the same size and quite often the same content. Where they differ, the use of tree structure traded the ability to remove unnecessary parts of a statement for the requirement of maintaining aspects of the code structure. The comparison shows that each slicer has its advantages. For example, when the tree representation facilitates the deletion of large chunks of code, the tree slicer was over eight times faster. In contrast, when slicing C++ code it was over nine times slower because of the multitude of small trees created to support C++ syntax. Given the pros and cons of the two, the results suggest the value of their hybrid combination.
{"title":"Tree-Oriented vs. Line-Oriented Observation-Based Slicing","authors":"D. Binkley, N. Gold, Syed S. Islam, J. Krinke, S. Yoo","doi":"10.1109/SCAM.2017.11","DOIUrl":"https://doi.org/10.1109/SCAM.2017.11","journal":{"name":"2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM)"},"publicationDate":"2017-09-01","platform":"Semanticscholar","paperid":"125133528"}
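The line-oriented form of observation-based slicing mentioned in the slicing abstract above can be sketched as a delete-and-observe loop: tentatively remove a window of lines, re-run the program, and keep the deletion only if the observed behavior is unchanged. The sketch below is a heavily simplified illustration, not the authors' implementation. It slices a tiny Python program, uses `exec` as a stand-in for compile-and-run, and observes only the final value of one variable; the names `observe`, `orbs_slice`, `max_window`, and the sample program are assumptions.

```python
def observe(lines, var, inputs):
    """Run the candidate program once per input; return the observed values
    of `var`, or None if any run fails or never defines `var`."""
    source = "\n".join(lines)
    observed = []
    for value in inputs:
        env = {"x": value}              # the program reads its input from `x`
        try:
            exec(source, env)           # stand-in for compile + execute
        except Exception:               # includes SyntaxError/IndentationError
            return None
        if var not in env:
            return None
        observed.append(env[var])
    return observed

def orbs_slice(lines, var, inputs, max_window=3):
    """Delete windows of lines as long as the observed behavior is preserved."""
    expected = observe(lines, var, inputs)
    if expected is None:
        raise ValueError("original program does not run on the given inputs")
    current = list(lines)
    changed = True
    while changed:                      # repeat passes until nothing is deleted
        changed = False
        i = 0
        while i < len(current):
            for width in range(1, max_window + 1):
                candidate = current[:i] + current[i + width:]
                if observe(candidate, var, inputs) == expected:
                    current = candidate
                    changed = True
                    break
            else:
                i += 1                  # no window at position i was deletable
    return current

PROGRAM = [
    "total = 0",
    "count = 0",
    "for i in range(x):",
    "    total += i",
    "    count += 1",
    "avg_marker = count * 2",
]
sliced = orbs_slice(PROGRAM, "total", inputs=[0, 3, 5])
print("\n".join(sliced))
```

Because a deletion is accepted purely on observed behavior, the result is only as trustworthy as the chosen inputs: here the input `0` is what prevents `total = 0` from being deleted, since the loop body never runs for it.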