Current statistical language modeling techniques, including deep-learning based models, have proven to be quite effective for source code. We argue here that the special properties of source code can be exploited for further improvements. In this work, we enhance established language modeling approaches to handle the special challenges of modeling source code, such as: frequent changes, larger, changing vocabularies, deeply nested scopes, etc. We present a fast, nested language modeling toolkit specifically designed for software, with the ability to add & remove text, and mix & swap out many models. Specifically, we improve upon prior cache-modeling work and present a model with a much more expansive, multi-level notion of locality that we show to be well-suited for modeling software. We present results on varying corpora in comparison with traditional N-gram, as well as RNN, and LSTM deep-learning language models, and release all our source code for public use. Our evaluations suggest that carefully adapting N-gram models for source code can yield performance that surpasses even RNN and LSTM based deep-learning models.
{"title":"Are deep neural networks the best choice for modeling source code?","authors":"V. Hellendoorn, Premkumar T. Devanbu","doi":"10.1145/3106237.3106290","DOIUrl":"https://doi.org/10.1145/3106237.3106290","url":null,"abstract":"Current statistical language modeling techniques, including deep-learning based models, have proven to be quite effective for source code. We argue here that the special properties of source code can be exploited for further improvements. In this work, we enhance established language modeling approaches to handle the special challenges of modeling source code, such as: frequent changes, larger, changing vocabularies, deeply nested scopes, etc. We present a fast, nested language modeling toolkit specifically designed for software, with the ability to add & remove text, and mix & swap out many models. Specifically, we improve upon prior cache-modeling work and present a model with a much more expansive, multi-level notion of locality that we show to be well-suited for modeling software. We present results on varying corpora in comparison with traditional N-gram, as well as RNN, and LSTM deep-learning language models, and release all our source code for public use. Our evaluations suggest that carefully adapting N-gram models for source code can yield performance that surpasses even RNN and LSTM based deep-learning models.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129428140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree of elasticity for CPU, memory, and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented in a distributed setting. The vision that motivates this work is that every large-scale software repository such as GitHub, BitBucket, or Visual Studio Online will be able to perform static analysis on a large scale. We experimentally validate our implementation of the distributed call graph analysis using a combination of both synthetic and real benchmarks. To show scalability, we demonstrate how the analysis presented in this paper is able to handle inputs that are almost 10 million lines of code (LOC) in size, without running out of memory. Our results show that the analysis scales well in terms of memory pressure independently of the input size, as we add more virtual machines (VMs). As the number of worker VMs increases, we observe that the analysis time generally improves as well. Lastly, we demonstrate that querying the results can be performed with a median latency of 15 ms.
在本文中,我们提出了一个分布式、全程序静态分析框架的设计和实现,该框架被设计为随输入的大小而扩展。我们的方法基于参与者编程模型,并部署在云中。我们对云集群的依赖为CPU、内存和存储资源提供了一定程度的弹性。为了演示我们技术的潜力,我们将展示如何在分布式设置中实现典型的调用图分析。激励这项工作的愿景是,每个大型软件存储库(如GitHub、BitBucket或Visual Studio Online)都能够大规模地执行静态分析。我们通过实验验证了分布式调用图分析的实现,使用了合成基准和真实基准的组合。为了展示可伸缩性,我们演示了本文中提供的分析如何能够处理大小接近1000万行代码(LOC)的输入,而不会耗尽内存。我们的结果表明,当我们添加更多虚拟机(vm)时,就内存压力而言,分析的伸缩性与输入大小无关。随着工作虚拟机数量的增加,我们观察到分析时间通常也会提高。最后,我们证明查询结果可以在中位延迟为15 ms的情况下执行。
{"title":"Toward full elasticity in distributed static analysis: the case of callgraph analysis","authors":"D. Garbervetsky, Edgardo Zoppi, B. Livshits","doi":"10.1145/3106237.3106261","DOIUrl":"https://doi.org/10.1145/3106237.3106261","url":null,"abstract":"In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree of elasticity for CPU, memory, and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented in a distributed setting. The vision that motivates this work is that every large-scale software repository such as GitHub, BitBucket, or Visual Studio Online will be able to perform static analysis on a large scale. We experimentally validate our implementation of the distributed call graph analysis using a combination of both synthetic and real benchmarks. To show scalability, we demonstrate how the analysis presented in this paper is able to handle inputs that are almost 10 million lines of code (LOC) in size, without running out of memory. Our results show that the analysis scales well in terms of memory pressure independently of the input size, as we add more virtual machines (VMs). As the number of worker VMs increases, we observe that the analysis time generally improves as well. Lastly, we demonstrate that querying the results can be performed with a median latency of 15 ms.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127310078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yamilet R. Serrano Llerena, Guoxin Su, David S. Rosenblum
Probabilistic model checking is a formal verification technique that has been applied successfully in a variety of domains, providing identification of system errors through quantitative verification of stochastic system models. One domain that can benefit from probabilistic model checking is cloud computing, which must provide highly reliable and secure computational and storage services to large numbers of mission-critical software systems. For real-world domains like cloud computing, external system factors and environmental changes must be estimated accurately in the form of probabilities in system models; inaccurate estimates for the model probabilities can lead to invalid verification results. To address the effects of uncertainty in probability estimates, in previous work we have developed a variety of techniques for perturbation analysis of discrete- and continuous-time Markov chains (DTMCs and CTMCs). These techniques determine the consequences of the uncertainty on verification of system properties. In this paper, we present the first approach for perturbation analysis of Markov decision processes (MDPs), a stochastic formalism that is especially popular due to the significant expressive power it provides through the combination of both probabilistic and nondeterministic choice. Our primary contribution is a novel technique for efficiently analyzing the effects of perturbations of model probabilities on verification of reachability properties of MDPs. The technique heuristically explores the space of adversaries of an MDP, which encode the different ways of resolving the MDP's nondeterministic choices. We demonstrate the practical effectiveness of our approach by applying it to two case studies of cloud systems.
{"title":"Probabilistic model checking of perturbed MDPs with applications to cloud computing","authors":"Yamilet R. Serrano Llerena, Guoxin Su, David S. Rosenblum","doi":"10.1145/3106237.3106301","DOIUrl":"https://doi.org/10.1145/3106237.3106301","url":null,"abstract":"Probabilistic model checking is a formal verification technique that has been applied successfully in a variety of domains, providing identification of system errors through quantitative verification of stochastic system models. One domain that can benefit from probabilistic model checking is cloud computing, which must provide highly reliable and secure computational and storage services to large numbers of mission-critical software systems. For real-world domains like cloud computing, external system factors and environmental changes must be estimated accurately in the form of probabilities in system models; inaccurate estimates for the model probabilities can lead to invalid verification results. To address the effects of uncertainty in probability estimates, in previous work we have developed a variety of techniques for perturbation analysis of discrete- and continuous-time Markov chains (DTMCs and CTMCs). These techniques determine the consequences of the uncertainty on verification of system properties. In this paper, we present the first approach for perturbation analysis of Markov decision processes (MDPs), a stochastic formalism that is especially popular due to the significant expressive power it provides through the combination of both probabilistic and nondeterministic choice. Our primary contribution is a novel technique for efficiently analyzing the effects of perturbations of model probabilities on verification of reachability properties of MDPs. The technique heuristically explores the space of adversaries of an MDP, which encode the different ways of resolving the MDP's nondeterministic choices. We demonstrate the practical effectiveness of our approach by applying it to two case studies of cloud systems.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114975763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, M. D. Penta, Andrian Marcus, G. Bavota, Vincent Ng
Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time.
{"title":"Detecting missing information in bug descriptions","authors":"Oscar Chaparro, Jing Lu, Fiorella Zampetti, Laura Moreno, M. D. Penta, Andrian Marcus, G. Bavota, Vincent Ng","doi":"10.1145/3106237.3106285","DOIUrl":"https://doi.org/10.1145/3106237.3106285","url":null,"abstract":"Bug reports document unexpected software behaviors experienced by users. To be effective, they should allow bug triagers to easily understand and reproduce the potential reported bugs, by clearly describing the Observed Behavior (OB), the Steps to Reproduce (S2R), and the Expected Behavior (EB). Unfortunately, while considered extremely useful, reporters often miss such pieces of information in bug reports and, to date, there is no effective way to automatically check and enforce their presence. We manually analyzed nearly 3k bug reports to understand to what extent OB, EB, and S2R are reported in bug reports and what discourse patterns reporters use to describe such information. We found that (i) while most reports contain OB (i.e., 93.5%), only 35.2% and 51.4% explicitly describe EB and S2R, respectively; and (ii) reporters recurrently use 154 discourse patterns to describe such content. Based on these findings, we designed and evaluated an automated approach to detect the absence (or presence) of EB and S2R in bug descriptions. With its best setting, our approach is able to detect missing EB (S2R) with 85.9% (69.2%) average precision and 93.2% (83%) average recall. Our approach intends to improve bug descriptions quality by alerting reporters about missing EB and S2R at reporting time.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130424554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents the framework based on the emulator QEMU. Our framework provides set of multi-platform analysis tools for the virtual machines and mechanism for creating instrumentation and analysis tools. Our framework is based on a lightweight approach to dynamic analysis of binary code executed in virtual machines. This approach is non-intrusive and provides system-wide analysis capabilities. It does not require loading any guest agents and source code of the OS. Therefore it may be applied to ROM-based guest systems and enables using of record/replay of the system execution. We use application binary interface (ABI) of the platform to be analyzed for creating introspection tools. These tools recover the part of kernel-level information related to the system calls executed on the guest machine.
{"title":"QEMU-based framework for non-intrusive virtual machine instrumentation and introspection","authors":"P. Dovgalyuk, N. Fursova, I. Vasiliev, V. Makarov","doi":"10.1145/3106237.3122817","DOIUrl":"https://doi.org/10.1145/3106237.3122817","url":null,"abstract":"This paper presents the framework based on the emulator QEMU. Our framework provides set of multi-platform analysis tools for the virtual machines and mechanism for creating instrumentation and analysis tools. Our framework is based on a lightweight approach to dynamic analysis of binary code executed in virtual machines. This approach is non-intrusive and provides system-wide analysis capabilities. It does not require loading any guest agents and source code of the OS. Therefore it may be applied to ROM-based guest systems and enables using of record/replay of the system execution. We use application binary interface (ABI) of the platform to be analyzed for creating introspection tools. These tools recover the part of kernel-level information related to the system calls executed on the guest machine.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130476422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study presents DecisionDroid, a supervised learning based system to identify cloned Android app pairs. DecisionDroid is trained using a manually verified diverse dataset of 12,000 Android app pairs. On a hundred ten-fold cross validations, DecisionDroid achieved 97.9% precision, 98.3% recall, and 98.4% accuracy.
{"title":"DecisionDroid: a supervised learning-based system to identify cloned Android applications","authors":"Ayush Kohli","doi":"10.1145/3106237.3121277","DOIUrl":"https://doi.org/10.1145/3106237.3121277","url":null,"abstract":"This study presents DecisionDroid, a supervised learning based system to identify cloned Android app pairs. DecisionDroid is trained using a manually verified diverse dataset of 12,000 Android app pairs. On a hundred ten-fold cross validations, DecisionDroid achieved 97.9% precision, 98.3% recall, and 98.4% accuracy.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"383 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132944258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Concurrent systems may fail in the field due to various elusive faults such as race conditions. Reproducing such failures is hard because (1) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which cannot be handled by existing tools for reproducing intra-process (thread-level) failures; (2) detailed field data, such as user input, file content and interleaving schedule, may not be available to developers; and (3) the debugging environment may differ from the deployed environment, which further complicates failure reproduction. To address these problems, we present DESCRY, the first fully automated tool for reproducing system-level concurrency failures based only on default log messages collected from the field. DESCRY uses a combination of static and dynamic analysis techniques, together with symbolic execution, to synthesize both the failure-inducing data input and the interleaving schedule, and leverages them to deterministically replay the failed execution using existing virtual platforms. We have evaluated DESCRY on 22 real-world multi-process Linux applications with a total of 236,875 lines of code to demonstrate both its effectiveness and its efficiency in reproducing failures that no other tool can reproduce.
{"title":"DESCRY: reproducing system-level concurrency failures","authors":"Tingting Yu, T. S. Zaman, Chao Wang","doi":"10.1145/3106237.3106266","DOIUrl":"https://doi.org/10.1145/3106237.3106266","url":null,"abstract":"Concurrent systems may fail in the field due to various elusive faults such as race conditions. Reproducing such failures is hard because (1) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which cannot be handled by existing tools for reproducing intra-process (thread-level) failures; (2) detailed field data, such as user input, file content and interleaving schedule, may not be available to developers; and (3) the debugging environment may differ from the deployed environment, which further complicates failure reproduction. To address these problems, we present DESCRY, the first fully automated tool for reproducing system-level concurrency failures based only on default log messages collected from the field. DESCRY uses a combination of static and dynamic analysis techniques, together with symbolic execution, to synthesize both the failure-inducing data input and the interleaving schedule, and leverages them to deterministically replay the failed execution using existing virtual platforms. We have evaluated DESCRY on 22 real-world multi-process Linux applications with a total of 236,875 lines of code to demonstrate both its effectiveness and its efficiency in reproducing failures that no other tool can reproduce.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133034888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Fielding, R. Taylor, Justin R. Erenkrantz, M. Gorlick, Jim Whitehead, Rohit Khare, P. Oreizy
Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work.
在ICSE 2000上首次发表17年后,Representational State Transfer (REST)架构风格作为理解万维网如何设计工作的指南,以及通过架构风格的应用,原则性设计如何影响大规模软件架构的开发和理解的一个例子,仍然具有重要意义。然而,REST也已经成为一个行业流行语:经常被滥用以适应特定的论点,与使用HTTP的一般概念混淆,并因不更像编程方法或实现框架而受到诋毁。在本文中,我们从加州大学欧文分校软件研究所的一系列博士论文的角度,描绘了REST的历史、演变和缺点,以及它所启发的几个相关的架构风格。这些连续的论文都有一个共同的主题:将REST的见解扩展到新的领域,并以自己的方式探索软件工程的边界,因为它适用于分散的软件体系结构和体系结构设计。最后,我们将讨论产生本研究的环境、环境和组织特征。
{"title":"Reflections on the REST architectural style and \"principled design of the modern web architecture\" (impact paper award)","authors":"R. Fielding, R. Taylor, Justin R. Erenkrantz, M. Gorlick, Jim Whitehead, Rohit Khare, P. Oreizy","doi":"10.1145/3106237.3121282","DOIUrl":"https://doi.org/10.1145/3106237.3121282","url":null,"abstract":"Seventeen years after its initial publication at ICSE 2000, the Representational State Transfer (REST) architectural style continues to hold significance as both a guide for understanding how the World Wide Web is designed to work and an example of how principled design, through the application of architectural styles, can impact the development and understanding of large-scale software architecture. However, REST has also become an industry buzzword: frequently abused to suit a particular argument, confused with the general notion of using HTTP, and denigrated for not being more like a programming methodology or implementation framework. In this paper, we chart the history, evolution, and shortcomings of REST, as well as several related architectural styles that it inspired, from the perspective of a chain of doctoral dissertations produced by the University of California's Institute for Software Research at UC Irvine. These successive theses share a common theme: extending the insights of REST to new domains and, in their own way, exploring the boundary of software engineering as it applies to decentralized software architectures and architectural design. We conclude with discussion of the circumstances, environment, and organizational characteristics that gave rise to this body of work.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127637650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic program repair offers the promise of significant reduction in debugging time, but still faces challenges in making the process efficient, accurate, and generalizable enough for practical application. Recent efforts such as Prophet demonstrate that machine learning can be used to develop heuristics about which patches are likely to be correct, reducing overfitting problems and improving speed of repair. SearchRepair takes a different approach to accuracy, using blocks of human-written code as patches to better constrain repairs and avoid overfitting. This project combines Prophet's learning techniques with SearchRepair's larger block size to create a method that is both fast and accurate, leading to higher-quality repairs. We propose a novel first-pass filter to substantially reduce the number of candidate patches in SearchRepair and demonstrate 85% reduction in runtime over standard SearchRepair on the IntroClass dataset.
{"title":"Improving performance of automatic program repair using learned heuristics","authors":"Liam Schramm","doi":"10.1145/3106237.3121281","DOIUrl":"https://doi.org/10.1145/3106237.3121281","url":null,"abstract":"Automatic program repair offers the promise of significant reduction in debugging time, but still faces challenges in making the process efficient, accurate, and generalizable enough for practical application. Recent efforts such as Prophet demonstrate that machine learning can be used to develop heuristics about which patches are likely to be correct, reducing overfitting problems and improving speed of repair. SearchRepair takes a different approach to accuracy, using blocks of human-written code as patches to better constrain repairs and avoid overfitting. This project combines Prophet's learning techniques with SearchRepair's larger block size to create a method that is both fast and accurate, leading to higher-quality repairs. We propose a novel first-pass filter to substantially reduce the number of candidate patches in SearchRepair and demonstrate 85% reduction in runtime over standard SearchRepair on the IntroClass dataset.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133663012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a memory-model-aware static program analysis method for accurately analyzing the behavior of concurrent software running on processors with weak consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of our method is a unified framework for deciding the feasibility of inter-thread interferences to avoid propagating spurious data flows during static analysis and thus boost the performance of the static analyzer. We formulate the checking of interference feasibility as a set of Datalog rules which are both efficiently solvable and general enough to capture a range of hardware-level memory models. Compared to existing techniques, our method can significantly reduce the number of bogus alarms as well as unsound proofs. We implemented the method and evaluated it on a large set of multithreaded C programs. Our experiments show the method significantly outperforms state-of-the-art techniques in terms of accuracy with only moderate runtime overhead.
{"title":"Thread-modular static analysis for relaxed memory models","authors":"Markus Kusano, Chao Wang","doi":"10.1145/3106237.3106243","DOIUrl":"https://doi.org/10.1145/3106237.3106243","url":null,"abstract":"We propose a memory-model-aware static program analysis method for accurately analyzing the behavior of concurrent software running on processors with weak consistency models such as x86-TSO, SPARC-PSO, and SPARC-RMO. At the center of our method is a unified framework for deciding the feasibility of inter-thread interferences to avoid propagating spurious data flows during static analysis and thus boost the performance of the static analyzer. We formulate the checking of interference feasibility as a set of Datalog rules which are both efficiently solvable and general enough to capture a range of hardware-level memory models. Compared to existing techniques, our method can significantly reduce the number of bogus alarms as well as unsound proofs. We implemented the method and evaluated it on a large set of multithreaded C programs. Our experiments show the method significantly outperforms state-of-the-art techniques in terms of accuracy with only moderate runtime overhead.","PeriodicalId":313494,"journal":{"name":"Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering","volume":"2501 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131238557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}