A Symbolic Model Checking Approach to the Analysis of String and Length Constraints
Hung-En Wang, Shih-Yu Chen, Fang Yu, J. H. Jiang
Strings with length constraints are prominent in software security analysis. Recent endeavors have made significant progress in developing constraint solvers for strings and integers. Most prior methods are based either on deduction with inference rules or on analysis using automata. The former may be inefficient when the constraints involve complex string manipulations such as language replacement; the latter may not be easily extended to handle length constraints and may be inadequate for counterexample generation due to approximation. Inspired by recent work on string analysis with logic circuit representation, we propose a new method for solving string constraints combined with length constraints via an implicit representation of automata with length encoding. The length-encoded automata have infinitely many states and can represent languages beyond the regular languages. By converting string and length constraints into a dependency graph of manipulations over length-encoded automata, a symbolic model checker for infinite-state systems can be leveraged as an engine for the analysis of string and length constraints. Experiments show that our method is uniquely capable of handling complex string and length constraints not solvable by existing methods.
ASE 2018, pp. 623-633. DOI: https://doi.org/10.1145/3238147.3238189
OCELOT: A Search-Based Test-Data Generation Tool for C
Simone Scalabrino, Giovanni Grano, Dario Di Nucci, Michele Guerra, A. De Lucia, H. Gall, R. Oliveto
Automatically generating test cases plays an important role in reducing the time developers spend on testing. In recent years, several approaches have been proposed to tackle this problem; among others, search-based techniques have been shown to be particularly promising. In this paper we describe Ocelot, a search-based tool for the automatic generation of test cases in C. Ocelot allows practitioners to write skeletons of test cases for their programs and allows researchers to easily implement and experiment with new approaches for automatic test-data generation. We show that Ocelot achieves higher coverage than a competing tool in 81% of the cases. Ocelot is publicly available to support both researchers and practitioners.
{"title":"OCELOT: A Search-Based Test-Data Generation Tool for C","authors":"Simone Scalabrino, Giovanni Grano, Dario Di Nucci, Michele Guerra, A. De Lucia, H. Gall, R. Oliveto","doi":"10.1145/3238147.3240477","DOIUrl":"https://doi.org/10.1145/3238147.3240477","url":null,"abstract":"Automatically generating test cases plays an important role to reduce the time spent by developers during the testing phase. In last years, several approaches have been proposed to tackle such a problem: amongst others, search-based techniques have been shown to be particularly promising. In this paper we describe Ocelot, a search-based tool for the automatic generation of test cases in C. Ocelot allows practitioners to write skeletons of test cases for their programs and researchers to easily implement and experiment new approaches for automatic test-data generation. We show that Ocelot achieves a higher coverage compared to a competitive tool in 81% of the cases. Ocelot is publicly available to support both researchers and practitioners.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"46 1","pages":"868-871"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88598339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Genetic Algorithm for Goal-Conflict Identification
Renzo Degiovanni, F. Molina, Germán Regis, Nazareno Aguirre
Goal-conflict analysis has been widely used as an abstraction for risk analysis in goal-oriented requirements engineering approaches. In this context, where the expected behaviour of the system-to-be is captured in terms of domain properties and goals, it is of utmost importance to identify combinations of circumstances that may make the goals diverge, i.e., not be satisfiable as a whole. Various approaches have been proposed to automatically identify boundary conditions, i.e., formulas capturing goal-divergent situations, but they either apply only to specific forms of goal expressions or suffer from scalability issues that limit them to relatively small specifications. In this paper, we present a novel approach to automatically identify boundary conditions using evolutionary computation. More precisely, we develop a genetic algorithm that, given the LTL formulation of the domain properties and goals, searches for formulas that capture divergences in the specification. We exploit a modern LTL satisfiability checker to guide our genetic algorithm toward solutions. We assess our technique on a set of case studies, and show that our genetic algorithm finds boundary conditions that cannot be generated by related approaches and scales to LTL specifications that other approaches are unable to handle.
{"title":"A Genetic Algorithm for Goal-Conflict Identification","authors":"Renzo Degiovanni, F. Molina, Germán Regis, Nazareno Aguirre","doi":"10.1145/3238147.3238220","DOIUrl":"https://doi.org/10.1145/3238147.3238220","url":null,"abstract":"Goal-conflict analysis has been widely used as an abstraction for risk analysis in goal-oriented requirements engineering approaches. In this context, where the expected behaviour of the system-to-be is captured in terms of domain properties and goals, identifying combinations of circumstances that may make the goals diverge, i.e., not to be satisfied as a whole, is of most importance. Various approaches have been proposed in order to automatically identify boundary conditions, i.e., formulas capturing goal-divergent situations, but they either apply only to some specific goal expressions, or are affected by scalability issues that make them applicable only to relatively small specifications. In this paper, we present a novel approach to automatically identify boundary conditions, using evolutionary computation. More precisely, we develop a genetic algorithm that, given the LTL formulation of the domain properties and the goals, it searches for formulas that capture divergences in the specification. We exploit a modern LTL satisfiability checker to successfully guide our genetic algorithm to the solutions. We assess our technique on a set of case studies, and show that our genetic algorithm is able to find boundary conditions that cannot be generated by related approaches, and is able to efficiently scale to LTL specifications that other approaches are unable to deal with.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"520-531"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83167187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trimmer: Application Specialization for Code Debloating
Hashim Sharif, Muhammad Abubakar, Ashish Gehani, Fareed Zaffar
With the proliferation of new hardware architectures and ever-evolving user requirements, the software stack is becoming increasingly bloated. In practice, only a limited subset of the supported functionality is utilized in a particular usage context, presenting an opportunity to eliminate unused features. In the past, program specialization has been proposed as a mechanism for enabling automatic software debloating. In this work, we show that existing program specialization techniques lack the analyses required to simplify code in real-world programs. We present an approach that uses stronger analysis techniques to take advantage of constant configuration data, enabling more effective debloating. We developed Trimmer, an application specialization tool that leverages user-provided configuration data to specialize an application to its deployment context. The specialization process attempts to eliminate the application functionality that is unused in the user-defined context. Our evaluation demonstrates that Trimmer effectively reduces code bloat. For 13 applications spanning various domains, we observe a mean binary size reduction of 21% and a maximum reduction of 75%. We also show that specialization reduces the attack surface for code-reuse attacks by reducing the number of exploitable gadgets. For the evaluated programs, we observe a 20% mean reduction in the total gadget count and a maximum reduction of 87%.
{"title":"Trimmer: Application Specialization for Code Debloating","authors":"Hashim Sharif, Muhammad Abubakar, Ashish Gehani, Fareed Zaffar","doi":"10.1145/3238147.3238160","DOIUrl":"https://doi.org/10.1145/3238147.3238160","url":null,"abstract":"With the proliferation of new hardware architectures and ever-evolving user requirements, the software stack is becoming increasingly bloated. In practice, only a limited subset of the supported functionality is utilized in a particular usage context, thereby presenting an opportunity to eliminate unused features. In the past, program specialization has been proposed as a mechanism for enabling automatic software debloating. In this work, we show how existing program specialization techniques lack the analyses required for providing code simplification for real-world programs. We present an approach that uses stronger analysis techniques to take advantage of constant configuration data, thereby enabling more effective debloating. We developed Trimmer, an application specialization tool that leverages user-provided configuration data to specialize an application to its deployment context. The specialization process attempts to eliminate the application functionality that is unused in the user-defined context. Our evaluation demonstrates Trimmer can effectively reduce code bloat. For 13 applications spanning various domains, we observe a mean binary size reduction of 21% and a maximum reduction of 75%. We also show specialization reduces the surface for code-reuse attacks by reducing the number of exploitable gadgets. For the evaluated programs, we observe a 20% mean reduction in the total gadget count and a maximum reduction of 87%.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"64 1","pages":"329-339"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80729556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ReScue: Crafting Regular Expression DoS Attacks
Yuju Shen, Yanyan Jiang, Chang Xu, Ping Yu, Xiaoxing Ma, Jian Lu
Regular expressions (regexes) with modern extensions are among the most popular string-processing tools. However, a poorly designed regex can require exponentially many matching steps, leading to regex denial-of-service (ReDoS) attacks under carefully crafted string inputs. This paper presents ReScue, a three-phase gray-box analytical technique that automatically generates ReDoS strings to expose the vulnerabilities of given regexes. ReScue systematically seeds (by a genetic search), incubates (by another genetic search), and finally pumps (by a regex-dedicated algorithm) to generate strings that maximize matching time. We implemented the ReScue tool and evaluated it against 29,088 practical regexes from real-world projects. The evaluation shows that ReScue found 49% more attack strings than the best existing technique, and applying ReScue to popular GitHub projects discovered ten previously unknown ReDoS vulnerabilities.
ASE 2018, pp. 225-235. DOI: https://doi.org/10.1145/3238147.3238159
Experiences Applying Automated Architecture Analysis Tool Suites
Ran Mo, W. Snipes, Yuanfang Cai, S. Ramaswamy, R. Kazman, M. Naedele
In this paper, we report our experiences applying three complementary automated software architecture analysis techniques, supported by a tool suite called DV8, to eight industrial projects within a large company. DV8 includes two state-of-the-art architecture-level maintainability metrics (Decoupling Level and Propagation Cost), an architecture flaw detection tool, and an architecture root detection tool. We collected development process data from the project teams as input to these tools, reported the results back to the practitioners, and followed up with telephone conferences and interviews. Our experiences revealed that the metric scores, quantitative debt analysis, and architecture flaw visualization can effectively bridge the gap between management and development, helping both decide if, when, and where to refactor. In particular, the metric scores, compared against industrial benchmarks, faithfully reflected the practitioners' intuitions about the maintainability of their projects, and enabled them to better understand that maintainability relative to other projects within their company and to other industrial products. The automatically detected architecture flaws and roots enabled the practitioners to precisely pinpoint, visualize, and quantify the “hotspots” within their systems that are responsible for high maintenance costs. Except for the two smallest projects, for which both architecture metrics indicated high maintainability, all projects are planning or have already begun refactorings to address the problems detected by our analyses. We are working on further automating the tool chain and transforming the analysis suite into deployable services accessible to all projects within the company.
{"title":"Experiences Applying Automated Architecture Analysis Tool Suites","authors":"Ran Mo, W. Snipes, Yuanfang Cai, S. Ramaswamy, R. Kazman, M. Naedele","doi":"10.1145/3238147.3240467","DOIUrl":"https://doi.org/10.1145/3238147.3240467","url":null,"abstract":"In this paper, we report our experiences of applying three complementary automated software architecture analysis techniques, supported by a tool suite, called DV8, to 8 industrial projects within a large company. DV8 includes two state-of-the-art architecture-level maintainability metrics—Decoupling Level and Propagation Cost, an architecture flaw detection tool, and an architecture root detection tool. We collected development process data from the project teams as input to these tools, reported the results back to the practitioners, and followed up with telephone conferences and interviews. Our experiences revealed that the metrics scores, quantitative debt analysis, and architecture flaw visualization can effectively bridge the gap between management and development, help them decide if, when, and where to refactor. In particular, the metrics scores, compared against industrial benchmarks, faithfully reflected the practitioners' intuitions about the maintainability of their projects, and enabled them to better understand the maintainability relative to other projects internal to their company, and to other industrial products. The automatically detected architecture flaws and roots enabled the practitioners to precisely pinpoint, visualize, and quantify the “hotspots” within the systems that are responsible for high maintenance costs. Except for the two smallest projects for which both architecture metrics indicated high maintainability, all other projects are planning or have already begun refactorings to address the problems detected by our analyses. We are working on further automating the tool chain, and transforming the analysis suite into deployable services accessible by all projects within the company.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"63 1","pages":"779-789"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73994015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards Automatic Restrictification of CUDA Kernel Arguments
R. Diarra
Many procedural languages, such as C and C++, have pointers. Pointers are powerful and convenient, but pointer aliasing still hinders compiler optimizations, despite many years of research on pointer alias analysis. Because alias analysis is a difficult task and its results are not always accurate, the ISO C99 standard added the restrict keyword, which lets the programmer assert non-aliasing as an aid to the compiler's optimizer and thereby possibly improve performance. The task of annotating pointers with the restrict keyword is, however, left to the programmer. This task is, in general, tedious and error-prone, especially since C compilers perform no verification that the restrict keyword is not misplaced. In this paper we present a static analysis tool that (i) finds CUDA kernel call sites at which the actual parameters do not alias; (ii) clones the kernels called at such sites; (iii) after performing an alias analysis in these kernels, adds the restrict keyword to their arguments; and (iv) replaces the original kernel call with a call to the optimized clone whenever possible.
ASE 2018, pp. 928-931. DOI: https://doi.org/10.1145/3238147.3241533
Differential Program Analysis with Fuzzing and Symbolic Execution
Yannic Noller
Differential program analysis aims to identify behavioral divergences in one or more programs, and it can be classified into two categories: identifying behavioral divergences (1) between two program versions on the same input (a.k.a. regression analysis), and (2) for the same program on two different inputs (e.g., side-channel analysis). Most existing approaches to both subproblems rely on a single technique and therefore inherit its weaknesses, such as scalability issues or imprecision. This research proposes to combine two complementary techniques, fuzzing and symbolic execution, to tackle these problems and provide scalable solutions for real-world applications. The proposed approaches will be implemented on top of state-of-the-art tools such as AFL and Symbolic Pathfinder and evaluated against existing work.
ASE 2018, pp. 944-947. DOI: https://doi.org/10.1145/3238147.3241537
Scheduling Constraint Based Abstraction Refinement for Weak Memory Models
Liangze Yin, Wei Dong, Wanwei Liu, Ji Wang
Scheduling constraint based abstraction refinement (SCAR) is one of the most efficient methods for verifying programs under sequential consistency (SC). However, most multi-processor architectures implement weak memory models (WMMs) to improve performance. Because memory operations issued by the same thread may execute out of order, the behavior of a program under a WMM is much more complex than under SC, which significantly increases verification complexity. This paper extends the SCAR method to WMMs such as TSO and PSO. To capture the ordering requirements of an abstraction counterexample under WMMs, we enrich the event order graph (EOG) of a counterexample so that it is suitable for both SC and WMMs. We also propose a unified EOG generation method that always obtains a minimal EOG efficiently. Experimental results on a large set of multi-threaded C programs are promising: our method significantly outperforms state-of-the-art tools, and the time and memory it requires to verify a program under TSO and PSO are roughly comparable to those under SC.
ASE 2018, pp. 645-655. DOI: https://doi.org/10.1145/3238147.3238223
S-gram: Towards Semantic-Aware Security Auditing for Ethereum Smart Contracts
Han Liu, Chao Liu, Wenqi Zhao, Yu Jiang, Jiaguang Sun
Smart contracts, a promising and powerful application of the Ethereum blockchain, have been growing rapidly in the past few years. Since they are highly vulnerable to different forms of attack, their security has become a top priority. However, existing security auditing techniques are either limited in the vulnerabilities they can find (relying on pre-defined bug patterns) or very expensive (relying on heavyweight program analysis), and are thus insufficient for Ethereum. To mitigate these limitations, we propose a novel semantic-aware security auditing technique for Ethereum called S-GRAM. The key insight is the combination of N-gram language modeling and lightweight static semantic labeling, which can learn the statistical regularities of contract tokens while also capturing high-level semantics (e.g., the flow sensitivity of a transaction). S-GRAM can be used to predict potential vulnerabilities by identifying irregular token sequences and to optimize existing in-depth analyzers (e.g., symbolic execution engines, fuzzers, etc.). We have implemented S-GRAM for Solidity smart contracts on Ethereum. The evaluation demonstrates the potential of S-GRAM in identifying possible security issues.
{"title":"S-gram: Towards Semantic-Aware Security Auditing for Ethereum Smart Contracts","authors":"Han Liu, Chao Liu, Wenqi Zhao, Yu Jiang, Jiaguang Sun","doi":"10.1145/3238147.3240728","DOIUrl":"https://doi.org/10.1145/3238147.3240728","url":null,"abstract":"Smart contracts, as a promising and powerful application on the Ethereum blockchain, have been growing rapidly in the past few years. Since they are highly vulnerable to different forms of attacks, their security becomes a top priority. However, existing security auditing techniques are either limited in finding vulnerabilities (rely on pre-defined bug patterns) or very expensive (rely on program analysis), thus are insufficient for Ethereum. To mitigate these limitations, we proposed a novel semantic-aware security auditing technique called S-GRAM for Ethereum. The key insight is a combination of N-gram language modeling and lightweight static semantic labeling, which can learn statistical regularities of contract tokens and capture high-level semantics as well (e.g., flow sensitivity of a transaction). S-GRAM can be used to predict potential vulnerabilities by identifying irregular token sequences and optimize existing in-depth analyzers (e.g., symbolic execution engines, fuzzers etc.). We have implemented S-GRAM for Solidity smart contracts in Ethereum. The evaluation demonstrated the potential of S-GRAM in identifying possible security issues.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"34 1","pages":"814-819"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87234712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}