Block-based programming languages like Scratch support learners by providing high-level constructs that hide details and by preventing syntactically incorrect programs. Questions nevertheless frequently arise: Is this program satisfying the given task? Why is my program not working? To support learners and educators, automated program analysis is needed for answering such questions. While adapting existing analyses to process blocks instead of textual statements is straightforward, the domain of programs controlled by block-based languages like Scratch is very different from traditional programs: In Scratch multiple actors, represented as highly concurrent programs, interact on a graphical stage, controlled by user inputs, and while the block-based program statements look playful, they hide complex mathematical operations that determine visual aspects and movement. Analyzing such programs is further hampered by the absence of clearly defined semantics, often resulting from ad-hoc decisions made by the implementers of the programming environment. To enable program analysis, we define the semantics of Scratch using an intermediate language. Based on this intermediate language, we implement the Bastet program analysis framework for Scratch programs, using concepts from abstract interpretation and software model checking. Like Scratch, Bastet is based on Web technologies, written in TypeScript, and can be executed using NodeJS or even directly in a browser. Evaluation on 279 programs written by children suggests that Bastet offers a practical solution for analysis of Scratch programs, thus enabling applications such as automated hint generation, automated evaluation of learner progress, or automated grading.
{"title":"Verified from Scratch: Program Analysis for Learners' Programs","authors":"Andreas Stahlbauer, Christoph Frädrich, G. Fraser","doi":"10.1145/3324884.3416554","DOIUrl":"https://doi.org/10.1145/3324884.3416554","url":null,"abstract":"Block-based programming languages like Scratch support learners by providing high-level constructs that hide details and by preventing syntactically incorrect programs. Questions nevertheless frequently arise: Is this program satisfying the given task? Why is my program not working? To support learners and educators, automated program analysis is needed for answering such questions. While adapting existing analyses to process blocks instead of textual statements is straightforward, the domain of programs controlled by block-based languages like Scratch is very different from traditional programs: In Scratch multiple actors, represented as highly concurrent programs, interact on a graphical stage, controlled by user inputs, and while the block-based program statements look playful, they hide complex mathematical operations that determine visual aspects and movement. Analyzing such programs is further hampered by the absence of clearly defined semantics, often resulting from ad-hoc decisions made by the implementers of the programming environment. To enable program analysis, we define the semantics of Scratch using an intermediate language. Based on this intermediate language, we implement the Bastet program analysis framework for Scratch programs, using concepts from abstract interpretation and software model checking. Like Scratch, Bastet is based on Web technologies, written in TypeScript, and can be executed using NodeJS or even directly in a browser. Evaluation on 279 programs written by children suggests that Bastet offers a practical solution for analysis of Scratch programs, thus enabling applications such as automated hint generation, automated evaluation of learner progress, or automated grading.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"395 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123202100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tianqi Zhang, Yufeng Zhang, Zhenbang Chen, Ziqi Shuai, Ji Wang
Symbolic execution is still facing the scalability problem caused by path explosion and constraint solving overhead. The recently proposed MuSE framework supports exploring multiple paths by generating partial solutions in one time of solving. In this work, we improve MuSE from two aspects. Firstly, we use a light-weight check to reduce redundant partial solutions for avoiding the redundant executions having the same results. Secondly, we introduce online learning to devise an adaptive search strategy for the target programs. The preliminary experimental results indicate the promising of the proposed methods.
{"title":"Efficient Multiplex Symbolic Execution with Adaptive Search Strategy","authors":"Tianqi Zhang, Yufeng Zhang, Zhenbang Chen, Ziqi Shuai, Ji Wang","doi":"10.1145/3324884.3418902","DOIUrl":"https://doi.org/10.1145/3324884.3418902","url":null,"abstract":"Symbolic execution is still facing the scalability problem caused by path explosion and constraint solving overhead. The recently proposed MuSE framework supports exploring multiple paths by generating partial solutions in one time of solving. In this work, we improve MuSE from two aspects. Firstly, we use a light-weight check to reduce redundant partial solutions for avoiding the redundant executions having the same results. Secondly, we introduce online learning to devise an adaptive search strategy for the target programs. The preliminary experimental results indicate the promising of the proposed methods.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129565483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Code completion is one of the most useful features in the Integrated Development Environments (IDEs), which can accelerate software development by suggesting the next probable token based on the contextual code in real-time. Recent studies have shown that statistical language modeling techniques can improve the performance of code completion tools through learning from large-scale software repositories. However, these models suffer from two major drawbacks: a) Existing research uses static embeddings, which map a word to the same vector regardless of its context. The differences in the meaning of a token in varying contexts are lost when each token is associated with a single representation; b) Existing language model based code completion models perform poor on completing identifiers, and the type information of the identifiers is ignored in most of these models. To address these challenges, in this paper, we develop a multi-task learning based pre-trained language model for code understanding and code generation with a Transformer-based neural architecture. We pre-train it with hybrid objective functions that incorporate both code understanding and code generation tasks. Then we fine-tune the pre-trained model on code completion. During the completion, our model does not directly predict the next token. Instead, we adopt multi-task learning to predict the token and its type jointly and utilize the predicted type to assist the token prediction. Experiments results on two real-world datasets demonstrate the effectiveness of our model when compared with state-of-the-art methods.
{"title":"Multi-task Learning based Pre-trained Language Model for Code Completion","authors":"F. Liu, Ge Li, Yunfei Zhao, Zhi Jin","doi":"10.1145/3324884.3416591","DOIUrl":"https://doi.org/10.1145/3324884.3416591","url":null,"abstract":"Code completion is one of the most useful features in the Integrated Development Environments (IDEs), which can accelerate software development by suggesting the next probable token based on the contextual code in real-time. Recent studies have shown that statistical language modeling techniques can improve the performance of code completion tools through learning from large-scale software repositories. However, these models suffer from two major drawbacks: a) Existing research uses static embeddings, which map a word to the same vector regardless of its context. The differences in the meaning of a token in varying contexts are lost when each token is associated with a single representation; b) Existing language model based code completion models perform poor on completing identifiers, and the type information of the identifiers is ignored in most of these models. To address these challenges, in this paper, we develop a multi-task learning based pre-trained language model for code understanding and code generation with a Transformer-based neural architecture. We pre-train it with hybrid objective functions that incorporate both code understanding and code generation tasks. Then we fine-tune the pre-trained model on code completion. During the completion, our model does not directly predict the next token. Instead, we adopt multi-task learning to predict the token and its type jointly and utilize the predicted type to assist the token prediction. Experiments results on two real-world datasets demonstrate the effectiveness of our model when compared with state-of-the-art methods.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130123351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rohit Mehra, V. Sharma, Vikrant S. Kaulgud, Sanjay Podder, Adam P. Burden
While traditionally, software comprehension relies on approaches like reading through the code or looking at charts on screens, which are 2D mediums, there have been some recent approaches that advocate exploring 3D approaches like Augmented or Virtual Reality (AR/VR) to have a richer experience towards understanding software and its internal relationships. However, there is a dearth of objective studies that compare such 3D representations with their traditional 2D counterparts in the context of software comprehension. In this paper, we present an evaluation study to quantitatively and qualitatively compare 2D and 3D software representations with respect to typical comprehension tasks. For the 3D medium, we utilize an AR-based approach for 3D visualizations of a software system (XRaSE), while the 2D medium comprises of textual IDEs and 2D graph representations. The study, which has been conducted using 20 professional developers, shows that for most comprehension tasks, the developers perform much better using the 3D representation, especially in terms of velocity and recollection, while also displaying reduced cognitive load and better engagement.
{"title":"Towards Immersive Comprehension of Software Systems Using Augmented Reality - An Empirical Evaluation","authors":"Rohit Mehra, V. Sharma, Vikrant S. Kaulgud, Sanjay Podder, Adam P. Burden","doi":"10.1145/3324884.3418907","DOIUrl":"https://doi.org/10.1145/3324884.3418907","url":null,"abstract":"While traditionally, software comprehension relies on approaches like reading through the code or looking at charts on screens, which are 2D mediums, there have been some recent approaches that advocate exploring 3D approaches like Augmented or Virtual Reality (AR/VR) to have a richer experience towards understanding software and its internal relationships. However, there is a dearth of objective studies that compare such 3D representations with their traditional 2D counterparts in the context of software comprehension. In this paper, we present an evaluation study to quantitatively and qualitatively compare 2D and 3D software representations with respect to typical comprehension tasks. For the 3D medium, we utilize an AR-based approach for 3D visualizations of a software system (XRaSE), while the 2D medium comprises of textual IDEs and 2D graph representations. The study, which has been conducted using 20 professional developers, shows that for most comprehension tasks, the developers perform much better using the 3D representation, especially in terms of velocity and recollection, while also displaying reduced cognitive load and better engagement.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116440636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Serialisation related security vulnerabilities have recently been reported for numerous Java applications. Since serialisation presents both soundness and precision challenges for static analysis, it can be difficult for analyses to precisely pinpoint serialisation vulnerabilities in a Java library. In this paper, we propose a hybrid approach that extends a static analysis with fuzzing to detect serialisation vulnerabilities. The novelty of our approach is in its use of a heap abstraction to direct fuzzing for vulnerabilities in Java libraries. This guides fuzzing to produce results quickly and effectively, and it validates static analysis reports automatically. Our approach shows potential as it can detect known serialisation vulnerabilities in the Apache Commons Collections library.
{"title":"A Hybrid Analysis to Detect Java Serialisation Vulnerabilities","authors":"Shawn Rasheed, Jens Dietrich","doi":"10.1145/3324884.3418931","DOIUrl":"https://doi.org/10.1145/3324884.3418931","url":null,"abstract":"Serialisation related security vulnerabilities have recently been reported for numerous Java applications. Since serialisation presents both soundness and precision challenges for static analysis, it can be difficult for analyses to precisely pinpoint serialisation vulnerabilities in a Java library. In this paper, we propose a hybrid approach that extends a static analysis with fuzzing to detect serialisation vulnerabilities. The novelty of our approach is in its use of a heap abstraction to direct fuzzing for vulnerabilities in Java libraries. This guides fuzzing to produce results quickly and effectively, and it validates static analysis reports automatically. Our approach shows potential as it can detect known serialisation vulnerabilities in the Apache Commons Collections library.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115971604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Cao, Sahar Badihi, Khaled Ahmed, Peiyu Xiong, J. Rubin
This paper investigates the problem of classifying Android applications into malicious and benign. We analyze the performance of a popular malware detection tool, Drebin, and show that its correct classification decisions often stem from using benign rather than malicious features for making predictions. That, effectively, turns the classifier into a benign app detector rather than a malware detector. While such behavior allows the classifier to achieve a high detection accuracy, it also makes it vulnerable to attacks, e.g., by a malicious app pretending to be benign by using features similar to those of benign apps. In this paper, we propose an approach for deprioritizing benign features in malware detection, focusing the detection on truly malicious portions of the apps. We show that our proposed approach makes a classifier more resilient to attacks while still allowing it to maintain a high detection accuracy.
{"title":"On Benign Features in Malware Detection","authors":"Michael Cao, Sahar Badihi, Khaled Ahmed, Peiyu Xiong, J. Rubin","doi":"10.1145/3324884.3418926","DOIUrl":"https://doi.org/10.1145/3324884.3418926","url":null,"abstract":"This paper investigates the problem of classifying Android applications into malicious and benign. We analyze the performance of a popular malware detection tool, Drebin, and show that its correct classification decisions often stem from using benign rather than malicious features for making predictions. That, effectively, turns the classifier into a benign app detector rather than a malware detector. While such behavior allows the classifier to achieve a high detection accuracy, it also makes it vulnerable to attacks, e.g., by a malicious app pretending to be benign by using features similar to those of benign apps. In this paper, we propose an approach for deprioritizing benign features in malware detection, focusing the detection on truly malicious portions of the apps. We show that our proposed approach makes a classifier more resilient to attacks while still allowing it to maintain a high detection accuracy.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123805758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Programs are becoming increasingly complex and typically contain an abundance of unneeded features, which can degrade the performance and security of the software. Recently, we have witnessed a surge of debloating techniques that aim to create a reduced version of a program by eliminating the unneeded features therein. To de-bloat a program, most existing techniques require a usage profile of the program, typically provided as a set of inputs $I$. Unfortunately, these techniques tend to generate a reduced program that is overfitted to $I$ and thus fails to behave correctly for other inputs. To address this limitation, we propose Domgad, which has two main advantages over existing debloating approaches. First, it produces a reduced program that is guaranteed to work for subdomains, rather than for specific inputs. Second, it uses stochastic optimization to generate reduced programs that achieve a close-to-optimal tradeoff between reduction and generality (i.e., the extent to which the reduced program is able to correctly handle inputs in its whole domain). To assess the effectiveness of Domgad, we applied our approach to a benchmark of ten Unix utility programs. Our results are promising, as they show that Domgad could produce debloated programs that achieve, on average, 50% code reduction and 95% generality. Our results also show that Domgad performs well when compared with two state-of-the-art debloating approaches.
{"title":"Subdomain-Based Generality-Aware Debloating","authors":"Qi Xin, Myeongsoo Kim, Qirun Zhang, A. Orso","doi":"10.1145/3324884.3416644","DOIUrl":"https://doi.org/10.1145/3324884.3416644","url":null,"abstract":"Programs are becoming increasingly complex and typically contain an abundance of unneeded features, which can degrade the performance and security of the software. Recently, we have witnessed a surge of debloating techniques that aim to create a reduced version of a program by eliminating the unneeded features therein. To de-bloat a program, most existing techniques require a usage profile of the program, typically provided as a set of inputs $I$. Unfortunately, these techniques tend to generate a reduced program that is overfitted to $I$ and thus fails to behave correctly for other inputs. To address this limitation, we propose Domgad, which has two main advantages over existing debloating approaches. First, it produces a reduced program that is guaranteed to work for subdomains, rather than for specific inputs. Second, it uses stochastic optimization to generate reduced programs that achieve a close-to-optimal tradeoff between reduction and generality (i.e., the extent to which the reduced program is able to correctly handle inputs in its whole domain). To assess the effectiveness of Domgad, we applied our approach to a benchmark of ten Unix utility programs. Our results are promising, as they show that Domgad could produce debloated programs that achieve, on average, 50% code reduction and 95% generality. Our results also show that Domgad performs well when compared with two state-of-the-art debloating approaches.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127325318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Because creating and maintaining an in-house test lab is expensive and time-consuming, companies and app developers often use device clouds to test their apps. Because quality-assurance activities depend on such device clouds, it is important to understand possible issues related to their use. To this end, in this paper we present a preliminary study that investigates issues and highlights research opportunities in the context of managing and maintaining device clouds. In the study, we analyzed over 12 million test executions on 110 devices. We found that the management software of the cloud infrastructure we considered affected some test executions, and almost all the cloud devices had at least one security-related issue.
{"title":"Managing App Testing Device Clouds: Issues and Opportunities","authors":"M. Fazzini, A. Orso","doi":"10.1145/3324884.3418909","DOIUrl":"https://doi.org/10.1145/3324884.3418909","url":null,"abstract":"Because creating and maintaining an in-house test lab is expensive and time-consuming, companies and app developers often use device clouds to test their apps. Because quality-assurance activities depend on such device clouds, it is important to understand possible issues related to their use. To this end, in this paper we present a preliminary study that investigates issues and highlights research opportunities in the context of managing and maintaining device clouds. In the study, we analyzed over 12 million test executions on 110 devices. We found that the management software of the cloud infrastructure we considered affected some test executions, and almost all the cloud devices had at least one security-related issue.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132082760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoxue Ren, Xinyuan Ye, Zhenchang Xing, Xin Xia, Xiwei Xu, Liming Zhu, Jianling Sun
API misuses cause significant problem in software development. Existing methods detect API misuses against frequent API usage patterns mined from codebase. They make a naive assumption that API usage that deviates from the most-frequent API usage is a misuse. However, there is a big knowledge gap between API usage patterns and API usage caveats in terms of comprehensiveness, explainability and best practices. In this work, we propose a novel approach that detects API misuses directly against the API caveat knowledge, rather than API usage patterns. We develop open information extraction methods to construct a novel API-constraint knowledge graph from API reference documentation. This knowledge graph explicitly models two types of API-constraint relations (call-order and condition-checking) and enriches return and throw relations with return conditions and exception triggers. It empowers the detection of three types of frequent API misuses - missing calls, missing condition checking and missing exception handling, while existing detectors mostly focus on only missing calls. As a proof-of-concept, we apply our approach to Java SDK API Specification. Our evaluation confirms the high accuracy of the extracted API-constraint relations. Our knowledge-driven API misuse detector achieves 0.60 (68/113) precision and 0.28 (68/239) recall for detecting Java API misuses in the API misuse benchmark MuBench. This performance is significantly higher than that of existing pattern-based API misused detectors. A pilot user study with 12 developers shows that our knowledge-driven API misuse detection is very promising in helping developers avoid API misuses and debug the bugs caused by API misuses.
API误用会在软件开发中造成严重的问题。现有方法根据从代码库中挖掘的频繁API使用模式来检测API滥用。他们天真地认为,偏离最常用API的API使用是一种误用。然而,就全面性、可解释性和最佳实践而言,API使用模式和API使用警告之间存在很大的知识差距。在这项工作中,我们提出了一种新的方法,直接根据API警告知识而不是API使用模式检测API滥用。我们开发了开放的信息提取方法,从API参考文档中构建新的API约束知识图。这个知识图显式地为两种类型的api约束关系(调用顺序和条件检查)建模,并通过返回条件和异常触发器丰富了返回和抛出关系。它支持检测三种常见的API误用——缺失调用、缺失条件检查和缺失异常处理,而现有的检测器大多只关注缺失调用。作为概念验证,我们将我们的方法应用于Java SDK API Specification。我们的评估证实了提取的api约束关系的高准确性。我们的知识驱动的API误用检测器在API误用基准MuBench中检测Java API误用达到0.60(68/113)精度和0.28(68/239)召回率。这种性能明显高于现有的基于模式的API滥用检测器。一项针对12名开发人员的试点用户研究表明,我们的知识驱动的API误用检测在帮助开发人员避免API误用和调试由API误用引起的错误方面非常有前途。
{"title":"API-Misuse Detection Driven by Fine-Grained API-Constraint Knowledge Graph","authors":"Xiaoxue Ren, Xinyuan Ye, Zhenchang Xing, Xin Xia, Xiwei Xu, Liming Zhu, Jianling Sun","doi":"10.1145/3324884.3416551","DOIUrl":"https://doi.org/10.1145/3324884.3416551","url":null,"abstract":"API misuses cause significant problem in software development. Existing methods detect API misuses against frequent API usage patterns mined from codebase. They make a naive assumption that API usage that deviates from the most-frequent API usage is a misuse. However, there is a big knowledge gap between API usage patterns and API usage caveats in terms of comprehensiveness, explainability and best practices. In this work, we propose a novel approach that detects API misuses directly against the API caveat knowledge, rather than API usage patterns. We develop open information extraction methods to construct a novel API-constraint knowledge graph from API reference documentation. This knowledge graph explicitly models two types of API-constraint relations (call-order and condition-checking) and enriches return and throw relations with return conditions and exception triggers. It empowers the detection of three types of frequent API misuses - missing calls, missing condition checking and missing exception handling, while existing detectors mostly focus on only missing calls. As a proof-of-concept, we apply our approach to Java SDK API Specification. Our evaluation confirms the high accuracy of the extracted API-constraint relations. Our knowledge-driven API misuse detector achieves 0.60 (68/113) precision and 0.28 (68/239) recall for detecting Java API misuses in the API misuse benchmark MuBench. This performance is significantly higher than that of existing pattern-based API misused detectors. A pilot user study with 12 developers shows that our knowledge-driven API misuse detection is very promising in helping developers avoid API misuses and debug the bugs caused by API misuses.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132110214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Liu, Jinhui Xie, Jianbo Yang, Shi-ze Guo, Yuetang Deng, Shuqing Li, Yechang Wu, Yepang Liu
JavaScript is one of the most popular programming languages. WeChat Mini-Program is a large ecosystem of JavaScript applications that runs on the WeChat platform. Millions of Mini-Programs are accessed by WeChat users every week. Consequently, the performance and robustness of Mini-Programs are particularly important. Unfortunately, many Mini-Programs suffer from various defects and performance problems. Dynamic analysis is a useful technique to pinpoint application defects. However, due to the dynamic features of the JavaScript language and the complexity of the runtime environment, dynamic analysis techniques were rarely used to improve the quality of JavaScript applications running on industrial platforms such as WeChat Mini-Program previously. In this work, we report our experience of extending Jalangi, a dynamic analysis framework for JavaScript applications developed by academia, and applying the extended version, named WeJalangi, to diagnose defects in WeChat Mini-Programs. WeJalangi is compatible with existing dynamic analysis tools such as DLint, Smemory, and JITProf. We implemented a null pointer checker on WeJalangi and tested the tool's usability on 152 open-source Mini-Programs. We also conducted a case study in Tencent by applying WeJalangi on six popular commercial Mini-Programs. In the case study, WeJalangi accurately located six null pointer issues and three of them haven't been discovered previously. All of the reported defects have been confirmed by developers and testers.
{"title":"Industry Practice of JavaScript Dynamic Analysis on WeChat Mini-Programs","authors":"Yi Liu, Jinhui Xie, Jianbo Yang, Shi-ze Guo, Yuetang Deng, Shuqing Li, Yechang Wu, Yepang Liu","doi":"10.1145/3324884.3421842","DOIUrl":"https://doi.org/10.1145/3324884.3421842","url":null,"abstract":"JavaScript is one of the most popular programming languages. WeChat Mini-Program is a large ecosystem of JavaScript applications that runs on the WeChat platform. Millions of Mini-Programs are accessed by WeChat users every week. Consequently, the performance and robustness of Mini-Programs are particularly important. Unfortunately, many Mini-Programs suffer from various defects and performance problems. Dynamic analysis is a useful technique to pinpoint application defects. However, due to the dynamic features of the JavaScript language and the complexity of the runtime environment, dynamic analysis techniques were rarely used to improve the quality of JavaScript applications running on industrial platforms such as WeChat Mini-Program previously. In this work, we report our experience of extending Jalangi, a dynamic analysis framework for JavaScript applications developed by academia, and applying the extended version, named WeJalangi, to diagnose defects in WeChat Mini-Programs. WeJalangi is compatible with existing dynamic analysis tools such as DLint, Smemory, and JITProf. We implemented a null pointer checker on WeJalangi and tested the tool's usability on 152 open-source Mini-Programs. We also conducted a case study in Tencent by applying WeJalangi on six popular commercial Mini-Programs. In the case study, WeJalangi accurately located six null pointer issues and three of them haven't been discovered previously. All of the reported defects have been confirmed by developers and testers.","PeriodicalId":106337,"journal":{"name":"2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"42 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134357185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}