Symbolic execution is a well established technique for software testing and analysis. However, scalability continues to be a challenge, both in terms of constraint solving cost and path explosion. In this work, we present a novel approach for symbolic execution, which can enhance its scalability by aggressively prioritising execution paths that are already known to be feasible, and deferring all other paths. We evaluate our technique on nine applications, including SQLite3, make and tcpdump and show it can achieve higher coverage for both seeded and non-seeded exploration.
{"title":"Pending Constraints in Symbolic Execution for Better Exploration and Seeding","authors":"Timotej Kapus, Frank Busse, Cristian Cadar","doi":"10.1145/3324884.3416589","DOIUrl":"https://doi.org/10.1145/3324884.3416589","url":null,"abstract":"Symbolic execution is a well established technique for software testing and analysis. However, scalability continues to be a challenge, both in terms of constraint solving cost and path explosion. In this work, we present a novel approach for symbolic execution, which can enhance its scalability by aggressively prioritising execution paths that are already known to be feasible, and deferring all other paths. We evaluate our technique on nine applications, including SQLite3, make and tcpdump and show it can achieve higher coverage for both seeded and non-seeded exploration.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125282543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a novel approach to proving the termination of imperative programs by -induction. By our approach, the termination proving problem can be formalized as a -inductive invariant synthesis task. On the one hand, -induction uses weaker invariants than that required by the standard inductive approach. On the other hand, the base case of -induction, which unrolls the program, can provide stronger pre-condition for invariant synthesis. As a result, the termination arguments of our approach can be synthesized more efficiently than the standard method. We implement a prototype of our inductive approach. The experimental results show the significant effectiveness and efficiency of our approach.
{"title":"Proving Termination by k-Induction","authors":"Jianhui Chen, Fei He","doi":"10.1145/3324884.3418929","DOIUrl":"https://doi.org/10.1145/3324884.3418929","url":null,"abstract":"We propose a novel approach to proving the termination of imperative programs by -induction. By our approach, the termination proving problem can be formalized as a -inductive invariant synthesis task. On the one hand, -induction uses weaker invariants than that required by the standard inductive approach. On the other hand, the base case of -induction, which unrolls the program, can provide stronger pre-condition for invariant synthesis. As a result, the termination arguments of our approach can be synthesized more efficiently than the standard method. We implement a prototype of our inductive approach. The experimental results show the significant effectiveness and efficiency of our approach.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117010289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a static vulnerability detection and patching framework at both source code and binary level. It automatically identifies and collects known vulnerability information to build the signature. It matches vulnerable functions with similar signatures and filters out the ones that have been patched in the target program. For the vulnerable functions, the framework tries to generate hot patches by learning from the source code.
{"title":"Source Code and Binary Level Vulnerability Detection and Hot Patching","authors":"Zhengzi Xu","doi":"10.1145/3324884.3418914","DOIUrl":"https://doi.org/10.1145/3324884.3418914","url":null,"abstract":"This paper presents a static vulnerability detection and patching framework at both source code and binary level. It automatically identifies and collects known vulnerability information to build the signature. It matches vulnerable functions with similar signatures and filters out the ones that have been patched in the target program. For the vulnerable functions, the framework tries to generate hot patches by learning from the source code.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"295 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115220778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents Solar, a system for automatic synthesis of adversarial contracts that exploit vulnerabilities in a victim smart contract. To make the synthesis tractable, we introduce a query language as well as summary-based symbolic evaluation, which significantly reduces the number of instructions that our synthesizer needs to evaluate symbolically, without compromising the precision of the vulnerability query. We encoded common vulnerabilities of smart contracts and evaluated Solar on the entire data set from Etherscan. Our experiments demonstrate the benefits of summary-based symbolic evaluation and show that Solar outperforms state-of-the-art smart contracts analyzers, TEETHER, Mythril, and Contract Fuzzer, in terms of running time and precision.
{"title":"Summary-Based Symbolic Evaluation for Smart Contracts","authors":"Yu Feng, E. Torlak, R. Bodík","doi":"10.1145/3324884.3416646","DOIUrl":"https://doi.org/10.1145/3324884.3416646","url":null,"abstract":"This paper presents Solar, a system for automatic synthesis of adversarial contracts that exploit vulnerabilities in a victim smart contract. To make the synthesis tractable, we introduce a query language as well as summary-based symbolic evaluation, which significantly reduces the number of instructions that our synthesizer needs to evaluate symbolically, without compromising the precision of the vulnerability query. We encoded common vulnerabilities of smart contracts and evaluated Solar on the entire data set from Etherscan. Our experiments demonstrate the benefits of summary-based symbolic evaluation and show that Solar outperforms state-of-the-art smart contracts analyzers, TEETHER, Mythril, and Contract Fuzzer, in terms of running time and precision.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129633567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Pham, Shangshu Qian, Jiannan Wang, Thibaud Lutellier, Jonathan Rosenthal, Lin Tan, Yaoliang Yu, Nachiappan Nagappan
Deep learning (DL) training algorithms utilize nondeterminism to improve models' accuracy and training efficiency. Hence, multiple identical training runs (e.g., identical training data, algorithm, and network) produce different models with different accuracies and training times. In addition to these algorithmic factors, DL libraries (e.g., TensorFlow and cuDNN) introduce additional variance (referred to as implementation-level variance) due to parallelism, optimization, and floating-point computation. This work is the first to study the variance of DL systems and the awareness of this variance among researchers and practitioners. Our experiments on three datasets with six popular networks show large overall accuracy differences among identical training runs. Even after excluding weak models, the accuracy difference is 10.8%. In addition, implementation-level factors alone cause the accuracy difference across identical training runs to be up to 2.9%, the per-class accuracy difference to be up to 52.4%, and the training time difference to be up to 145.3%. All core libraries (TensorFlow, CNTK, and Theano) and low-level libraries (e.g., cuDNN) exhibit implementation-level variance across all evaluated versions. Our researcher and practitioner survey shows that 83.8% of the 901 participants are unaware of or unsure about any implementation-level variance. In addition, our literature survey shows that only 19.5±3% of papers in recent top software engineering (SE), artificial intelligence (AI), and systems conferences use multiple identical training runs to quantify the variance of their DL approaches. This paper raises awareness of DL variance and directs SE researchers to challenging tasks such as creating deterministic DL implementations to facilitate debugging and improving the reproducibility of DL software and results.
{"title":"Problems and Opportunities in Training Deep Learning Software Systems: An Analysis of Variance","authors":"H. Pham, Shangshu Qian, Jiannan Wang, Thibaud Lutellier, Jonathan Rosenthal, Lin Tan, Yaoliang Yu, Nachiappan Nagappan","doi":"10.1145/3324884.3416545","DOIUrl":"https://doi.org/10.1145/3324884.3416545","url":null,"abstract":"Deep learning (DL) training algorithms utilize nondeterminism to improve models' accuracy and training efficiency. Hence, multiple identical training runs (e.g., identical training data, algorithm, and network) produce different models with different accuracies and training times. In addition to these algorithmic factors, DL libraries (e.g., TensorFlow and cuDNN) introduce additional variance (referred to as implementation-level variance) due to parallelism, optimization, and floating-point computation. This work is the first to study the variance of DL systems and the awareness of this variance among researchers and practitioners. Our experiments on three datasets with six popular networks show large overall accuracy differences among identical training runs. Even after excluding weak models, the accuracy difference is 10.8%. In addition, implementation-level factors alone cause the accuracy difference across identical training runs to be up to 2.9%, the per-class accuracy difference to be up to 52.4%, and the training time difference to be up to 145.3%. All core libraries (TensorFlow, CNTK, and Theano) and low-level libraries (e.g., cuDNN) exhibit implementation-level variance across all evaluated versions. Our researcher and practitioner survey shows that 83.8% of the 901 participants are unaware of or unsure about any implementation-level variance. In addition, our literature survey shows that only 19.5±3% of papers in recent top software engineering (SE), artificial intelligence (AI), and systems conferences use multiple identical training runs to quantify the variance of their DL approaches. This paper raises awareness of DL variance and directs SE researchers to challenging tasks such as creating deterministic DL implementations to facilitate debugging and improving the reproducibility of DL software and results.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126556139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-Admitted Technical Debt (SATD) is a sub-type of technical debt. It is introduced to represent such technical debts that are intentionally introduced by developers in the process of software development. While being able to gain short-term benefits, the introduction of SATDs often requires to be paid back later with a higher cost, e.g., introducing bugs to the software or increasing the complexity of the software. To cope with these issues, our community has proposed various machine learning-based approaches to detect SATDs. These approaches, however, are either not generic that usually require manual feature engineering efforts or do not provide promising means to explain the predicted outcomes. To that end, we propose to the community a novel approach, namely HATD (Hybrid Attention-based method for self-admitted Technical Debt detection), to detect and explain SATDs using attention-based neural networks. Through extensive experiments on 445,365 comments in 20 projects, we show that HATD is effective in detecting SATDs on both in-the-lab and in-the-wild datasets under both within-project and cross-project settings. HATD also outperforms the state-of-the-art approaches in detecting and explaining SATDs.
{"title":"Detecting and Explaining Self-Admitted Technical Debts with Attention-based Neural Networks","authors":"Xin Wang","doi":"10.1145/3324884.3416583","DOIUrl":"https://doi.org/10.1145/3324884.3416583","url":null,"abstract":"Self-Admitted Technical Debt (SATD) is a sub-type of technical debt. It is introduced to represent such technical debts that are intentionally introduced by developers in the process of software development. While being able to gain short-term benefits, the introduction of SATDs often requires to be paid back later with a higher cost, e.g., introducing bugs to the software or increasing the complexity of the software. To cope with these issues, our community has proposed various machine learning-based approaches to detect SATDs. These approaches, however, are either not generic that usually require manual feature engineering efforts or do not provide promising means to explain the predicted outcomes. To that end, we propose to the community a novel approach, namely HATD (Hybrid Attention-based method for self-admitted Technical Debt detection), to detect and explain SATDs using attention-based neural networks. Through extensive experiments on 445,365 comments in 20 projects, we show that HATD is effective in detecting SATDs on both in-the-lab and in-the-wild datasets under both within-project and cross-project settings. HATD also outperforms the state-of-the-art approaches in detecting and explaining SATDs.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128548173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated debugging techniques, including fault localization and program repair, have been studied for over a decade. However, the only existing connection between fault localization and program repair is that fault localization computes the potential buggy elements for program repair to patch. Recently, a pioneering work, ProFL, explored the idea of unified debugging to unify fault localization and program repair in the other direction for thefi rst time to boost both areas. More specifically, ProFL utilizes the patch execution results from one state-of-the-art repair system, PraPR, to help improve state-of-the-art fault localization. In this way, ProFL not only improves fault localization for manual repair, but also extends the application scope of automated repair to all possible bugs (not only the small ratio of bugs that can be automaticallyfi xed). However, ProFL only considers one APR system (i.e., PraPR), and it is not clear how other existing APR systems based on different designs contribute to unified debugging. In this work, we perform an extensive study of the unified-debugging approach on 16 state-of-the-art program repair systems for thefi rst time. Our experimental results on the widely studied Defects4J benchmark suite reveal various practical guidelines for unified debugging, such as (1) nearly all the studied 16 repair systems can positively contribute to unified debugging despite their varying repairing capabilities, (2) repair systems targeting multi-edit patches can bring extraneous noise into unified debugging, (3) repair systems with more executed/plausible patches tend to perform better for unified debugging, and (4) unified debugging effectiveness does not rely on the availability of correct patches in automated repair. Based on our results, we further propose an advanced unified debugging technique, UniDebug++, which can localize over 20% more bugs within Top-1 positions than state-of-the-art unified debugging technique, ProFL.
{"title":"On the Effectiveness of Unified Debugging: An Extensive Study on 16 Program Repair Systems","authors":"Samuel Benton, Xia Li, Yiling Lou, Lingming Zhang","doi":"10.1145/3324884.3416566","DOIUrl":"https://doi.org/10.1145/3324884.3416566","url":null,"abstract":"Automated debugging techniques, including fault localization and program repair, have been studied for over a decade. However, the only existing connection between fault localization and program repair is that fault localization computes the potential buggy elements for program repair to patch. Recently, a pioneering work, ProFL, explored the idea of unified debugging to unify fault localization and program repair in the other direction for thefi rst time to boost both areas. More specifically, ProFL utilizes the patch execution results from one state-of-the-art repair system, PraPR, to help improve state-of-the-art fault localization. In this way, ProFL not only improves fault localization for manual repair, but also extends the application scope of automated repair to all possible bugs (not only the small ratio of bugs that can be automaticallyfi xed). However, ProFL only considers one APR system (i.e., PraPR), and it is not clear how other existing APR systems based on different designs contribute to unified debugging. In this work, we perform an extensive study of the unified-debugging approach on 16 state-of-the-art program repair systems for thefi rst time. Our experimental results on the widely studied Defects4J benchmark suite reveal various practical guidelines for unified debugging, such as (1) nearly all the studied 16 repair systems can positively contribute to unified debugging despite their varying repairing capabilities, (2) repair systems targeting multi-edit patches can bring extraneous noise into unified debugging, (3) repair systems with more executed/plausible patches tend to perform better for unified debugging, and (4) unified debugging effectiveness does not rely on the availability of correct patches in automated repair. Based on our results, we further propose an advanced unified debugging technique, UniDebug++, which can localize over 20% more bugs within Top-1 positions than state-of-the-art unified debugging technique, ProFL.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126418097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In software development, issue tracker systems are widely used to manage bug reports. In such a system, a bug report can be filed, diagnosed, assigned, and fixed. In the standard process, a bug can be resolved as fixed, invalid, duplicated or won't fix. Although the above resolutions are well-defined and easy to understand, a bug report can end with a less known resolution, i.e., workaround. Compared with other resolutions, the definition of workarounds is more ambiguous. Besides the problem that is reported in a bug report, the resolution of a workaround raises more questions. Some questions are important for users, especially those programmers who build their projects upon others (e.g., libraries). Although some early studies have been conducted to analyze API workarounds, many research questions on workarounds are still open. For example, which bugs are resolved as workarounds? Why is a bug report resolved as workarounds? What are the repairs of workarounds? In this experience paper, we conduct the first empirical study to explore the above research questions. In particular, we analyzed 221 real workarounds that were collected from Apache projects. Our results lead to some interesting and useful answers to all the above questions. For example, we find that most bug reports are resolved as workarounds, because their problems reside in libraries (24.43%), settings (18.55%), and clients (10.41%). Among them, many bugs are difficult to be fixed fully and perfectly. As a late breaking result, we can only briefly introduce our study, but we present a detailed plan to extend it to a full paper.ACM Reference Format: Daohan Song, Hao Zhong, and Li Jia. 2020. The Symptom, Cause and Repair of Workaround. In 35th IEEE/ACM International Conference on Automated Software Engineering (ASE ‘20), September 21–25, 2020, Virtual Event, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3324884.3418910
{"title":"The Symptom, Cause and Repair of Workaround","authors":"Daohan Song, Hao Zhong, Li Jia","doi":"10.1145/3324884.3418910","DOIUrl":"https://doi.org/10.1145/3324884.3418910","url":null,"abstract":"In software development, issue tracker systems are widely used to manage bug reports. In such a system, a bug report can be filed, diagnosed, assigned, and fixed. In the standard process, a bug can be resolved as fixed, invalid, duplicated or won't fix. Although the above resolutions are well-defined and easy to understand, a bug report can end with a less known resolution, i.e., workaround. Compared with other resolutions, the definition of workarounds is more ambiguous. Besides the problem that is reported in a bug report, the resolution of a workaround raises more questions. Some questions are important for users, especially those programmers who build their projects upon others (e.g., libraries). Although some early studies have been conducted to analyze API workarounds, many research questions on workarounds are still open. For example, which bugs are resolved as workarounds? Why is a bug report resolved as workarounds? What are the repairs of workarounds? In this experience paper, we conduct the first empirical study to explore the above research questions. In particular, we analyzed 221 real workarounds that were collected from Apache projects. Our results lead to some interesting and useful answers to all the above questions. For example, we find that most bug reports are resolved as workarounds, because their problems reside in libraries (24.43%), settings (18.55%), and clients (10.41%). Among them, many bugs are difficult to be fixed fully and perfectly. As a late breaking result, we can only briefly introduce our study, but we present a detailed plan to extend it to a full paper.ACM Reference Format: Daohan Song, Hao Zhong, and Li Jia. 2020. The Symptom, Cause and Repair of Workaround. In 35th IEEE/ACM International Conference on Automated Software Engineering (ASE ‘20), September 21–25, 2020, Virtual Event, Australia. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3324884.3418910","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128365970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their inter-operation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid in-teroperation bugs. Similarly, active research on static analysis and bug detection has been focusing on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.
{"title":"Broadening Horizons of Multilingual Static Analysis: Semantic Summary Extraction from C Code for JNI Program Analysis","authors":"Sungho Lee, Hyogun Lee, Sukyoung Ryu","doi":"10.1145/3324884.3416558","DOIUrl":"https://doi.org/10.1145/3324884.3416558","url":null,"abstract":"Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their inter-operation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid in-teroperation bugs. Similarly, active research on static analysis and bug detection has been focusing on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132390893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
P. Derakhshanfar, Xavier Devroey, A. Zaidman, A. Deursen, Annibale Panichella
Writing a test case reproducing a reported software crash is a common practice to identify the root cause of an anomaly in the software under test. However, this task is usually labor-intensive and time-taking. Hence, evolutionary intelligence approaches have been successfully applied to assist developers during debugging by generating a test case reproducing reported crashes. These approaches use a single fitness function called Crash Distance to guide the search process toward reproducing a target crash. Despite the reported achievements, these approaches do not always successfully reproduce some crashes due to a lack of test diversity (premature convergence). In this study, we introduce a new approach, called MOHO, that addresses this issue via multi-objectivization. In particular, we introduce two new Helper-Objectives for crash reproduction, namely test length (to minimize) and method sequence diversity (to maximize), in addition to Crash Distance. We assessed MO-HO using five multi-objective evolutionary algorithms (NSGA-II, SPEA2, PESA-II, MOEA/D, FEMO) on 124 non-trivial crashes stemming from open-source projects. Our results indicate that SPEA2 is the best-performing multi-objective algorithm for MO-HO. We evaluated this best-performing algorithm for MO-HO against the state-of-the-art: single-objective approach (Single-Objective Search) and decomposition-based multi-objectivization approach (De-MO). Our results show that MO-HO reproduces five crashes that cannot be reproduced by the current state-of-the-art. Besides, MO-HO improves the effectiveness (+10% and +8% in reproduction ratio) and the efficiency in 34.6% and 36% of crashes (i.e., significantly lower running time) compared to Single-Objective Search and De-MO, respectively. For some crashes, the improvements are very large, being up to +93.3% for reproduction ratio and −92% for the required running time.
{"title":"Good Things Come In Threes: Improving Search-based Crash Reproduction With Helper Objectives","authors":"P. Derakhshanfar, Xavier Devroey, A. Zaidman, A. Deursen, Annibale Panichella","doi":"10.1145/3324884.3416643","DOIUrl":"https://doi.org/10.1145/3324884.3416643","url":null,"abstract":"Writing a test case reproducing a reported software crash is a common practice to identify the root cause of an anomaly in the software under test. However, this task is usually labor-intensive and time-taking. Hence, evolutionary intelligence approaches have been successfully applied to assist developers during debugging by generating a test case reproducing reported crashes. These approaches use a single fitness function called Crash Distance to guide the search process toward reproducing a target crash. Despite the reported achievements, these approaches do not always successfully reproduce some crashes due to a lack of test diversity (premature convergence). In this study, we introduce a new approach, called MOHO, that addresses this issue via multi-objectivization. In particular, we introduce two new Helper-Objectives for crash reproduction, namely test length (to minimize) and method sequence diversity (to maximize), in addition to Crash Distance. We assessed MO-HO using five multi-objective evolutionary algorithms (NSGA-II, SPEA2, PESA-II, MOEA/D, FEMO) on 124 non-trivial crashes stemming from open-source projects. Our results indicate that SPEA2 is the best-performing multi-objective algorithm for MO-HO. We evaluated this best-performing algorithm for MO-HO against the state-of-the-art: single-objective approach (Single-Objective Search) and decomposition-based multi-objectivization approach (De-MO). Our results show that MO-HO reproduces five crashes that cannot be reproduced by the current state-of-the-art. Besides, MO-HO improves the effectiveness (+10% and +8% in reproduction ratio) and the efficiency in 34.6% and 36% of crashes (i.e., significantly lower running time) compared to Single-Objective Search and De-MO, respectively. For some crashes, the improvements are very large, being up to +93.3% for reproduction ratio and −92% for the required running time.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131910946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}