Modern distributed systems should be built to anticipate performance degradation. Often requests in these systems involve ten to thousands Remote Procedure Calls, each of which can be a source of performance degradation. The PhD programme presented here intends to address this issue by providing automated instruments to effectively drive performance fault injection in distributed systems. The envisioned approach exploits multi-objective search-based techniques to automatically find small combinations of tiny performance degradations induced by specific RPCs, which have significant impacts on the user-perceived performance. Automating the search of these events will improve the ability to inject performance issues in production in order to force developers to anticipate and mitigate them.
{"title":"A Multi-objective Framework for Effective Performance Fault Injection in Distributed Systems","authors":"L. Traini","doi":"10.1145/3238147.3241535","DOIUrl":"https://doi.org/10.1145/3238147.3241535","url":null,"abstract":"Modern distributed systems should be built to anticipate performance degradation. Often requests in these systems involve ten to thousands Remote Procedure Calls, each of which can be a source of performance degradation. The PhD programme presented here intends to address this issue by providing automated instruments to effectively drive performance fault injection in distributed systems. The envisioned approach exploits multi-objective search-based techniques to automatically find small combinations of tiny performance degradations induced by specific RPCs, which have significant impacts on the user-perceived performance. Automating the search of these events will improve the ability to inject performance issues in production in order to force developers to anticipate and mitigate them.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"21 1","pages":"936-939"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90166917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
While formal task models allow definition and time-based assessment of user-interaction, they have not yet been used as a baseline for assessing the variability patterns of user interaction with software. Consequently, operationalizing these variability patterns enables us to evaluate how suitable the defined task-execution paths are for the user to achieve a predefined goal. Improvements of the software could thereby be evidence-based on knowledge of changes' effects on functional suitability for the user rather than on prospective or opinion-based usage scenarios. Following a design thinking approach tailored to software engineering, understanding and observing these variability patterns are mandatory steps for continuous user-centered improvement of software. In practice however, due to the absence of an operational quality model for functional suitability it is hardly possible to effectively put these operational measures into context and to derive concrete software improvement actions. Having operational suitability metrics is even more important for software engineers as it allows to increasingly focus development activities especially on functionality which has business value for the user.
{"title":"Assessing and Evaluating Functional Suitability of Software","authors":"Philipp Haindl","doi":"10.1145/3238147.3241531","DOIUrl":"https://doi.org/10.1145/3238147.3241531","url":null,"abstract":"While formal task models allow definition and time-based assessment of user-interaction, they have not yet been used as a baseline for assessing the variability patterns of user interaction with software. Consequently, operationalizing these variability patterns enables us to evaluate how suitable the defined task-execution paths are for the user to achieve a predefined goal. Improvements of the software could thereby be evidence-based on knowledge of changes' effects on functional suitability for the user rather than on prospective or opinion-based usage scenarios. Following a design thinking approach tailored to software engineering, understanding and observing these variability patterns are mandatory steps for continuous user-centered improvement of software. In practice however, due to the absence of an operational quality model for functional suitability it is hardly possible to effectively put these operational measures into context and to derive concrete software improvement actions. Having operational suitability metrics is even more important for software engineers as it allows to increasingly focus development activities especially on functionality which has business value for the user.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"76 1","pages":"920-923"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90476564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Satisfiability modulo theories (SMT) solvers have been widely applied as the reasoning engine for diverse software analysis and verification technologies. The efficiency of the SMT solver has significant effects on the performance of these technologies. However, the current SMT solvers are designed for the general purpose of constraint solving. Many useful knowledge of programs cannot be utilized during the SMT solving. As a result, the SMT solver may spend a lot of effort to explore redundant search space. In this paper, we propose a novel approach for utilizing control-flow knowledge in SMT solving. With this technique, the search space can be considerably reduced and the efficiency of SMT solving is observably improved. We conducted extensive experiments on credible benchmarks, the results show orders of magnitude improvements of our approach.
{"title":"Control Flow-Guided SMT Solving for Program Verification","authors":"Jianhui Chen, Fei He","doi":"10.1145/3238147.3238218","DOIUrl":"https://doi.org/10.1145/3238147.3238218","url":null,"abstract":"Satisfiability modulo theories (SMT) solvers have been widely applied as the reasoning engine for diverse software analysis and verification technologies. The efficiency of the SMT solver has significant effects on the performance of these technologies. However, the current SMT solvers are designed for the general purpose of constraint solving. Many useful knowledge of programs cannot be utilized during the SMT solving. As a result, the SMT solver may spend a lot of effort to explore redundant search space. In this paper, we propose a novel approach for utilizing control-flow knowledge in SMT solving. With this technique, the search space can be considerably reduced and the efficiency of SMT solving is observably improved. We conducted extensive experiments on credible benchmarks, the results show orders of magnitude improvements of our approach.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"63 1","pages":"351-361"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83161672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. R. Gadelha, Felipe R. Monteiro, J. Morse, L. Cordeiro, B. Fischer, D. Nicole
ESBMC is a mature, permissively licensed open-source context-bounded model checker for the verification of single- and multithreaded C programs. It can verify both predefined safety properties (e.g., bounds check, pointer safety, overflow) and user-defined program assertions automatically. ESBMC provides C++ and Python APIs to access internal data structures, allowing inspection and extension at any stage of the verification process. We discuss improvements over previous versions of ESBMC, including the description of new front- and back-ends, IEEE floating-point support, and an improved k-induction algorithm. A demonstration is available at https://www.youtube.com/watch?v=YcJjXHlN1v8.
{"title":"ESBMC 5.0: An Industrial-Strength C Model Checker","authors":"M. R. Gadelha, Felipe R. Monteiro, J. Morse, L. Cordeiro, B. Fischer, D. Nicole","doi":"10.1145/3238147.3240481","DOIUrl":"https://doi.org/10.1145/3238147.3240481","url":null,"abstract":"ESBMC is a mature, permissively licensed open-source context-bounded model checker for the verification of single- and multithreaded C programs. It can verify both predefined safety properties (e.g., bounds check, pointer safety, overflow) and user-defined program assertions automatically. ESBMC provides C++ and Python APIs to access internal data structures, allowing inspection and extension at any stage of the verification process. We discuss improvements over previous versions of ESBMC, including the description of new front- and back-ends, IEEE floating-point support, and an improved k-induction algorithm. A demonstration is available at https://www.youtube.com/watch?v=YcJjXHlN1v8.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"13 1","pages":"888-891"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82033473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu
Developers can use different technologies for many software development tasks in their work. However, when faced with several technologies with comparable functionalities, it is not easy for developers to select the most appropriate one, as comparisons among technologies are time-consuming by trial and error. Instead, developers can resort to expert articles, read official documents or ask questions in Q&A sites for technology comparison, but it is opportunistic to get a comprehensive comparison as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system that exploits the crowdsourced discussions from Stack Overflow, and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We further mine prominent comparison aspects by clustering similar comparative sentences and represent each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model and we implement a practical website for public use.
{"title":"Tell Them Apart: Distilling Technology Differences from Crowd-Scale Comparison Discussions","authors":"Yi Huang, Chunyang Chen, Zhenchang Xing, Tian Lin, Yang Liu","doi":"10.1145/3238147.3238208","DOIUrl":"https://doi.org/10.1145/3238147.3238208","url":null,"abstract":"Developers can use different technologies for many software development tasks in their work. However, when faced with several technologies with comparable functionalities, it is not easy for developers to select the most appropriate one, as comparisons among technologies are time-consuming by trial and error. Instead, developers can resort to expert articles, read official documents or ask questions in Q&A sites for technology comparison, but it is opportunistic to get a comprehensive comparison as online information is often fragmented or contradictory. To overcome these limitations, we propose the diffTech system that exploits the crowdsourced discussions from Stack Overflow, and assists technology comparison with an informative summary of different comparison aspects. We first build a large database of comparable software technologies by mining tags in Stack Overflow, and locate comparative sentences about comparable technologies with NLP methods. We further mine prominent comparison aspects by clustering similar comparative sentences and represent each cluster with its keywords. The evaluation demonstrates both the accuracy and usefulness of our model and we implement a practical website for public use.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"34 1","pages":"214-224"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84829479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Code retrieval and summarization are two tasks often employed by software developers to reuse code that spreads over online repositories. In this paper, we present a neural framework that allows bidirectional mapping between source code and natural language to improve these two tasks. Our framework, BVAE, is designed to have two Variational AutoEncoders (VAEs) to model bimodal data: C-VAE for source code and L-VAE for natural language. Both VAEs are trained jointly to reconstruct their input as much as possible with regularization that captures the closeness between the latent variables of code and description. BVAE could learn semantic vector representations for both code and description and generate completely new descriptions for arbitrary code snippets. We design two instance models of BVAE for retrieval and summarization tasks respectively and evaluate their performance on a benchmark which involves two programming languages: C# and SQL. Experiments demonstrate BVAE's potential on the two tasks.
{"title":"A Neural Framework for Retrieval and Summarization of Source Code","authors":"Qingying Chen, Minghui Zhou","doi":"10.1145/3238147.3240471","DOIUrl":"https://doi.org/10.1145/3238147.3240471","url":null,"abstract":"Code retrieval and summarization are two tasks often employed by software developers to reuse code that spreads over online repositories. In this paper, we present a neural framework that allows bidirectional mapping between source code and natural language to improve these two tasks. Our framework, BVAE, is designed to have two Variational AutoEncoders (VAEs) to model bimodal data: C-VAE for source code and L-VAE for natural language. Both VAEs are trained jointly to reconstruct their input as much as possible with regularization that captures the closeness between the latent variables of code and description. BVAE could learn semantic vector representations for both code and description and generate completely new descriptions for arbitrary code snippets. We design two instance models of BVAE for retrieval and summarization tasks respectively and evaluate their performance on a benchmark which involves two programming languages: C# and SQL. Experiments demonstrate BVAE's potential on the two tasks.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"42 1","pages":"826-831"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88911603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Mukelabai, Damir Nesic, Salome Maro, T. Berger, Jan-Philipp Steghöfer
Highly configurable systems are complex pieces of software. To tackle this complexity, hundreds of dedicated analysis techniques have been conceived, many of which able to analyze system properties for all possible system configurations, as opposed to traditional, single-system analyses. Unfortunately, it is largely unknown whether these techniques are adopted in practice, whether they address actual needs, or what strategies practitioners actually apply to analyze highly configurable systems. We present a study of analysis practices and needs in industry. It relied on a survey with 27 practitioners engineering highly configurable systems and followup interviews with 15 of them, covering 18 different companies from eight countries. We confirm that typical properties considered in the literature (e.g., reliability) are relevant, that consistency between variability models and artifacts is critical, but that the majority of analyses for specifications of configuration options (a.k.a., variability model analysis) is not perceived as needed. We identified rather pragmatic analysis strategies, including practices to avoid the need for analysis. For instance, testing with experience-based sampling is the most commonly applied strategy, while systematic sampling is rarely applicable. We discuss analyses that are missing and synthesize our insights into suggestions for future research.
{"title":"Tackling Combinatorial Explosion: A Study of Industrial Needs and Practices for Analyzing Highly Configurable Systems","authors":"M. Mukelabai, Damir Nesic, Salome Maro, T. Berger, Jan-Philipp Steghöfer","doi":"10.1145/3238147.3238201","DOIUrl":"https://doi.org/10.1145/3238147.3238201","url":null,"abstract":"Highly configurable systems are complex pieces of software. To tackle this complexity, hundreds of dedicated analysis techniques have been conceived, many of which able to analyze system properties for all possible system configurations, as opposed to traditional, single-system analyses. Unfortunately, it is largely unknown whether these techniques are adopted in practice, whether they address actual needs, or what strategies practitioners actually apply to analyze highly configurable systems. We present a study of analysis practices and needs in industry. It relied on a survey with 27 practitioners engineering highly configurable systems and followup interviews with 15 of them, covering 18 different companies from eight countries. We confirm that typical properties considered in the literature (e.g., reliability) are relevant, that consistency between variability models and artifacts is critical, but that the majority of analyses for specifications of configuration options (a.k.a., variability model analysis) is not perceived as needed. We identified rather pragmatic analysis strategies, including practices to avoid the need for analysis. For instance, testing with experience-based sampling is the most commonly applied strategy, while systematic sampling is rarely applicable. We discuss analyses that are missing and synthesize our insights into suggestions for future research.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"8 1","pages":"155-166"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84278283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Statement coverage is commonly used as a measure of test suite quality. Coverage is often used as a part of a code review process: if a patch decreases overall coverage, or is itself not covered, then the patch is scrutinized more closely. Traditional studies of how coverage changes with code evolution have examined the overall coverage of the entire program, and more recent work directly examines the coverage of patches (changed statements). We present an evaluation much larger than prior studies and moreover consider a new, important kind of change – coverage changes of unchanged statements. We present a large-scale evaluation of code coverage evolution over 7,816 builds of 47 projects written in popular languages including Java, Python, and Scala. We find that in large, mature projects, simply measuring the change to statement coverage does not capture the nuances of code evolution. Going beyond considering statement coverage as a simple ratio, we examine how the set of statements covered evolves between project revisions. We present and study new ways to assess the impact of a patch on a project's test suite quality that both separates coverage of the patch from coverage of the non-patch, and separates changes in coverage from changes in the set of statements covered.
{"title":"A Large-Scale Study of Test Coverage Evolution","authors":"Michael C Hilton, Jonathan Bell, D. Marinov","doi":"10.1145/3238147.3238183","DOIUrl":"https://doi.org/10.1145/3238147.3238183","url":null,"abstract":"Statement coverage is commonly used as a measure of test suite quality. Coverage is often used as a part of a code review process: if a patch decreases overall coverage, or is itself not covered, then the patch is scrutinized more closely. Traditional studies of how coverage changes with code evolution have examined the overall coverage of the entire program, and more recent work directly examines the coverage of patches (changed statements). We present an evaluation much larger than prior studies and moreover consider a new, important kind of change – coverage changes of unchanged statements. We present a large-scale evaluation of code coverage evolution over 7,816 builds of 47 projects written in popular languages including Java, Python, and Scala. We find that in large, mature projects, simply measuring the change to statement coverage does not capture the nuances of code evolution. Going beyond considering statement coverage as a simple ratio, we examine how the set of statements covered evolves between project revisions. We present and study new ways to assess the impact of a patch on a project's test suite quality that both separates coverage of the patch from coverage of the non-patch, and separates changes in coverage from changes in the set of statements covered.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"16 1","pages":"53-63"},"PeriodicalIF":0.0,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85847403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benno Stein, Lazaro Clapp, Manu Sridharan, B. E. Chang
In stream-based programming, data sources are abstracted as a stream of values that can be manipulated via callback functions. Stream-based programming is exploding in popularity, as it provides a powerful and expressive paradigm for handling asynchronous data sources in interactive software. However, high-level stream abstractions can also make it difficult for developers to reason about control- and data-flow relationships in their programs. This is particularly impactful when asynchronous stream-based code interacts with thread-limited features such as UI frameworks that restrict UI access to a single thread, since the threading behavior of streaming constructs is often non-intuitive and insufficiently documented. In this paper, we present a type-based approach that can statically prove the thread-safety of UI accesses in stream-based software. Our key insight is that the fluent APIs of stream-processing frameworks enable the tracking of threads via type-refinement, making it possible to reason automatically about what thread a piece of code runs on - a difficult problem in general. We implement the system as an annotation-based Java type-checker for Android programs built upon the popular ReactiveX framework and evaluate its efficacy by annotating and analyzing 8 open-source apps, where we find 33 instances of unsafe UI access while incurring an annotation burden of only one annotation per 186 source lines of code. We also report on our experience applying the typechecker to two much larger apps from the Uber Technologies, Inc. codebase, where it currently runs on every code change and blocks changes that introduce potential threading bugs.
{"title":"Safe Stream-Based Programming with Refinement Types","authors":"Benno Stein, Lazaro Clapp, Manu Sridharan, B. E. Chang","doi":"10.1145/3238147.3238174","DOIUrl":"https://doi.org/10.1145/3238147.3238174","url":null,"abstract":"In stream-based programming, data sources are abstracted as a stream of values that can be manipulated via callback functions. Stream-based programming is exploding in popularity, as it provides a powerful and expressive paradigm for handling asynchronous data sources in interactive software. However, high-level stream abstractions can also make it difficult for developers to reason about control- and data-flow relationships in their programs. This is particularly impactful when asynchronous stream-based code interacts with thread-limited features such as UI frameworks that restrict UI access to a single thread, since the threading behavior of streaming constructs is often non-intuitive and insufficiently documented. In this paper, we present a type-based approach that can statically prove the thread-safety of UI accesses in stream-based software. Our key insight is that the fluent APIs of stream-processing frameworks enable the tracking of threads via type-refinement, making it possible to reason automatically about what thread a piece of code runs on - a difficult problem in general. We implement the system as an annotation-based Java type-checker for Android programs built upon the popular ReactiveX framework and evaluate its efficacy by annotating and analyzing 8 open-source apps, where we find 33 instances of unsafe UI access while incurring an annotation burden of only one annotation per 186 source lines of code. We also report on our experience applying the typechecker to two much larger apps from the Uber Technologies, Inc. codebase, where it currently runs on every code change and blocks changes that introduce potential threading bugs.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"331 1","pages":"565-576"},"PeriodicalIF":0.0,"publicationDate":"2018-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77621992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lingling Fan, Ting Su, Sen Chen, Guozhu Meng, Yang Liu, Lihua Xu, G. Pu
Android, the #1 mobile app framework, enforces the single-GUI-thread model, in which a single UI thread manages GUI rendering and event dispatching. Due to this model, it is vital to avoid blocking the UI thread for responsiveness. One common practice is to offload long-running tasks into async threads. To achieve this, Android provides various async programming constructs, and leaves developers themselves to obey the rules implied by the model. However, as our study reveals, more than 25% apps violate these rules and introduce hard-to-detect, fail-stop errors, which we term as aysnc programming errors (APEs). To this end, this paper introduces APEChecker, a technique to automatically and efficiently manifest APEs. The key idea is to characterize APEs as specific fault patterns, and synergistically combine static analysis and dynamic UI exploration to detect and verify such errors. Among the 40 real-world Android apps, APEChecker unveils and processes 61 APEs, of which 51 are confirmed (83.6% hit rate). Specifically, APEChecker detects 3X more APEs than the state-of-art testing tools (Monkey, Sapienz and Stoat), and reduces testing time from half an hour to a few minutes. On a specific type of APEs, APEChecker confirms 5X more errors than the data race detection tool, EventRacer, with very few false alarms.
{"title":"Efficiently Manifesting Asynchronous Programming Errors in Android Apps","authors":"Lingling Fan, Ting Su, Sen Chen, Guozhu Meng, Yang Liu, Lihua Xu, G. Pu","doi":"10.1145/3238147.3238170","DOIUrl":"https://doi.org/10.1145/3238147.3238170","url":null,"abstract":"Android, the #1 mobile app framework, enforces the single-GUI-thread model, in which a single UI thread manages GUI rendering and event dispatching. Due to this model, it is vital to avoid blocking the UI thread for responsiveness. One common practice is to offload long-running tasks into async threads. To achieve this, Android provides various async programming constructs, and leaves developers themselves to obey the rules implied by the model. However, as our study reveals, more than 25% apps violate these rules and introduce hard-to-detect, fail-stop errors, which we term as aysnc programming errors (APEs). To this end, this paper introduces APEChecker, a technique to automatically and efficiently manifest APEs. The key idea is to characterize APEs as specific fault patterns, and synergistically combine static analysis and dynamic UI exploration to detect and verify such errors. Among the 40 real-world Android apps, APEChecker unveils and processes 61 APEs, of which 51 are confirmed (83.6% hit rate). Specifically, APEChecker detects 3X more APEs than the state-of-art testing tools (Monkey, Sapienz and Stoat), and reduces testing time from half an hour to a few minutes. On a specific type of APEs, APEChecker confirms 5X more errors than the data race detection tool, EventRacer, with very few false alarms.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"13 1","pages":"486-497"},"PeriodicalIF":0.0,"publicationDate":"2018-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88021731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}