Word2Vec is a class of neural network models that, when trained on a large corpus of texts, produce for each unique word a corresponding vector in a continuous space in which the linguistic contexts of words can be observed. In this work, we study the characteristics of Word2Vec vectors, called API2VEC or API embeddings, for the API elements within the API sequences in source code. Our empirical study shows that close proximity of the API2VEC vectors for API elements reflects similar usage contexts containing the surrounding APIs of those API elements. Moreover, API2VEC can capture several similar semantic relations between API elements in API usages via vector offsets. We demonstrate the usefulness of API2VEC vectors for API elements in three applications. First, we build a tool that mines pairs of API elements that share the same usage relations. The other applications are in the code migration domain. We develop API2API, a tool that automatically learns API mappings between Java and C# using a characteristic of the API2VEC vectors for API elements in the two languages: semantic relations among API elements in their usages appear in the two vector spaces for the two languages as similar geometric arrangements among their API2VEC vectors. Our empirical evaluation shows that API2API relatively improves top-1 and top-5 accuracy by 22.6% and 40.1%, respectively, over a state-of-the-art mining approach for API mappings. Finally, as another application in code migration, we are able to migrate equivalent API usages from Java to C# with up to 90.6% recall and 87.2% precision.
{"title":"Exploring API Embedding for API Usages and Applications","authors":"Trong Duc Nguyen, A. Nguyen, H. Phan, T. Nguyen","doi":"10.1109/ICSE.2017.47","DOIUrl":"https://doi.org/10.1109/ICSE.2017.47","url":null,"abstract":"Word2Vec is a class of neural network models that as being trainedfrom a large corpus of texts, they can produce for each unique word acorresponding vector in a continuous space in which linguisticcontexts of words can be observed. In this work, we study thecharacteristics of Word2Vec vectors, called API2VEC or API embeddings, for the API elements within the API sequences in source code. Ourempirical study shows that the close proximity of the API2VEC vectorsfor API elements reflects the similar usage contexts containing thesurrounding APIs of those API elements. Moreover, API2VEC can captureseveral similar semantic relations between API elements in API usagesvia vector offsets. We demonstrate the usefulness of API2VEC vectorsfor API elements in three applications. First, we build a tool thatmines the pairs of API elements that share the same usage relationsamong them. The other applications are in the code migrationdomain. We develop API2API, a tool to automatically learn the APImappings between Java and C# using a characteristic of the API2VECvectors for API elements in the two languages: semantic relationsamong API elements in their usages are observed in the two vectorspaces for the two languages as similar geometric arrangements amongtheir API2VEC vectors. Our empirical evaluation shows that API2APIrelatively improves 22.6% and 40.1% top-1 and top-5 accuracy over astate-of-the-art mining approach for API mappings. Finally, as anotherapplication in code migration, we are able to migrate equivalent APIusages from Java to C# with up to 90.6% recall and 87.2% precision.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"36 1","pages":"438-449"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79224728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modern web applications use complex data models and access control rules which lead to data integrity and access control errors. One approach to find such errors is to use formal verification techniques. However, as a first step, most formal verification techniques require extraction of a formal model which is a difficult problem in itself due to dynamic features of modern languages, and it is typically done either manually, or using ad hoc techniques. In this paper, we present a technique called symbolic model extraction for extracting formal data models from web applications. The key ideas of symbolic model extraction are 1) to use the source language interpreter for model extraction, which enables us to handle dynamic features of the language, 2) to use code instrumentation so that execution of each instrumented piece of code returns the formal model that corresponds to that piece of code, 3) to instrument the code dynamically so that the models of methods that are created at runtime can also be extracted, and 4) to execute both sides of branches during instrumented execution so that all program behaviors can be covered in a single instrumented execution. We implemented the symbolic model extraction technique for the Rails framework and used it to extract data and access control models from web applications. Our experiments demonstrate that symbolic model extraction is scalable and extracts formal models that are precise enough to find bugs in real-world applications without reporting too many false positives.
{"title":"Symbolic Model Extraction for Web Application Verification","authors":"Ivan Bocic, T. Bultan","doi":"10.1109/ICSE.2017.72","DOIUrl":"https://doi.org/10.1109/ICSE.2017.72","url":null,"abstract":"Modern web applications use complex data models and access control rules which lead to data integrity and access control errors. One approach to find such errors is to use formal verification techniques. However, as a first step, most formal verification techniques require extraction of a formal model which is a difficult problem in itself due to dynamic features of modern languages, and it is typically done either manually, or using ad hoc techniques. In this paper, we present a technique called symbolic model extraction for extracting formal data models from web applications. The key ideas of symbolic model extraction are 1) to use the source language interpreter for model extraction, which enables us to handle dynamic features of the language, 2) to use code instrumentation so that execution of each instrumented piece of code returns the formal model that corresponds to that piece of code, 3) to instrument the code dynamically so that the models of methods that are created at runtime can also be extracted, and 4) to execute both sides of branches during instrumented execution so that all program behaviors can be covered in a single instrumented execution. We implemented the symbolic model extraction technique for the Rails framework and used it to extract data and access control models from web applications. Our experiments demonstrate that symbolic model extraction is scalable and extracts formal models that are precise enough to find bugs in real-world applications without reporting too many false positives.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"2 1","pages":"724-734"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73718881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combinatorial test design (CTD) is an effective test design technique, considered to be a testing best practice. CTD provides automatic test plan generation, but it requires a manual definition of the test space in the form of a combinatorial model. As the system under test evolves, e.g., due to iterative development processes and bug fixing, so does the test space, and thus, in the context of CTD, evolution translates into frequent manual model definition updates. Manually reasoning about the differences between versions of real-world models following such updates is infeasible due to their complexity and size. Moreover, representing the differences is challenging. In this work, we propose a first syntactic and semantic differencing technique for combinatorial models of test designs. We define a concise and canonical representation for differences between two models, and suggest a scalable algorithm for automatically computing and presenting it. We use our differencing technique to analyze the evolution of 42 real-world industrial models, demonstrating its applicability and scalability. Further, a user study with 16 CTD practitioners shows that comprehension of differences between real-world combinatorial model versions is challenging and that our differencing tool significantly improves the performance of less experienced practitioners. The analysis and user study provide evidence for the potential usefulness of our differencing approach. Our work advances the state-of-the-art in CTD with better capabilities for change comprehension and management.
{"title":"Syntactic and Semantic Differencing for Combinatorial Models of Test Designs","authors":"Rachel Tzoref, S. Maoz","doi":"10.1109/ICSE.2017.63","DOIUrl":"https://doi.org/10.1109/ICSE.2017.63","url":null,"abstract":"Combinatorial test design (CTD) is an effective test design technique, considered to be a testing best practice. CTD provides automatic test plan generation, but it requires a manual definition of the test space in the form of a combinatorial model. As the system under test evolves, e.g., due to iterative development processes and bug fixing, so does the test space, and thus, in the context of CTD, evolution translates into frequent manual model definition updates. Manually reasoning about the differences between versions of real-world models following such updates is infeasible due to their complexity and size. Moreover, representing the differences is challenging. In this work, we propose a first syntactic and semantic differencing technique for combinatorial models of test designs. We define a concise and canonical representation for differences between two models, and suggest a scalable algorithm for automatically computing and presenting it. We use our differencing technique to analyze the evolution of 42 real-world industrial models, demonstrating its applicability and scalability. Further, a user study with 16 CTD practitioners shows that comprehension of differences between real-world combinatorial model versions is challenging and that our differencing tool significantly improves the performance of less experienced practitioners. The analysis and user study provide evidence for the potential usefulness of our differencing approach. Our work advances the state-of-the-art in CTD with better capabilities for change comprehension and management.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"31 1","pages":"621-631"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74053260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Siegfried Rasthofer, Steven Arzt, S. Triller, Michael Pradel
Android applications, or apps, provide useful features to end-users, but many apps also contain malicious behavior. Modern malware makes understanding such behavior challenging by behaving maliciously only under particular conditions. For example, a malware app may check whether it runs on a real device and not an emulator, in a particular country, and alongside a specific target app, such as a vulnerable banking app. To observe the malicious behavior, a security analyst must find out and emulate all these app-specific constraints. This paper presents FuzzDroid, a framework for automatically generating an Android execution environment where an app exposes its malicious behavior. The key idea is to combine an extensible set of static and dynamic analyses through a search-based algorithm that steers the app toward a configurable target location. On recent malware, the approach reaches the target location in 75% of the apps. In total, we reach 240 code locations within an average time of only one minute. To reach these code locations, FuzzDroid generates 106 different environments, too many for a human analyst to create manually.
{"title":"Making Malory Behave Maliciously: Targeted Fuzzing of Android Execution Environments","authors":"Siegfried Rasthofer, Steven Arzt, S. Triller, Michael Pradel","doi":"10.1109/ICSE.2017.35","DOIUrl":"https://doi.org/10.1109/ICSE.2017.35","url":null,"abstract":"Android applications, or apps, provide useful features to end-users, but many apps also contain malicious behavior. Modern malware makes understanding such behavior challenging by behaving maliciously only under particular conditions. For example, a malware app may check whether it runs on a real device and not an emulator, in a particular country, and alongside a specific target app, such as a vulnerable banking app. To observe the malicious behavior, a security analyst must find out and emulate all these app-specific constraints. This paper presents FuzzDroid, a framework for automatically generating an Android execution environment where an app exposes its malicious behavior. The key idea is to combine an extensible set of static and dynamic analyses through a search-based algorithm that steers the app toward a configurable target location. On recent malware, the approach reaches the target location in 75% of the apps. In total, we reach 240 code locations within an average time of only one minute. To reach these code locations, FuzzDroid generates 106 different environments, too many for a human analyst to create manually.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"17 1","pages":"300-311"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78108435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
JavaScript is growing explosively and is now used in large mature projects even outside the web domain. JavaScript is also a dynamically typed language for which static type systems, notably Facebook's Flow and Microsoft's TypeScript, have been written. What benefits do these static type systems provide? Leveraging JavaScript project histories, we select a fixed bug and check out the code just prior to the fix. We manually add type annotations to the buggy code and test whether Flow and TypeScript report an error on the buggy code, thereby possibly prompting a developer to fix the bug before its public release. We then report the proportion of bugs on which these type systems reported an error. Evaluating static type systems against public bugs, which have survived testing and review, is conservative: it understates their effectiveness at detecting bugs during private development, not to mention their other benefits such as facilitating code search/completion and serving as documentation. Despite this uneven playing field, our central finding is that both static type systems find an important percentage of public bugs: both Flow 0.30 and TypeScript 2.0 successfully detect 15% of them.
{"title":"To Type or Not to Type: Quantifying Detectable Bugs in JavaScript","authors":"Zheng Gao, C. Bird, Earl T. Barr","doi":"10.1109/ICSE.2017.75","DOIUrl":"https://doi.org/10.1109/ICSE.2017.75","url":null,"abstract":"JavaScript is growing explosively and is now used in large mature projects even outside the web domain. JavaScript is also a dynamically typed language for which static type systems, notably Facebook's Flow and Microsoft's TypeScript, have been written. What benefits do these static type systems provide? Leveraging JavaScript project histories, we select a fixed bug and check out the code just prior to the fix. We manually add type annotations to the buggy code and test whether Flow and TypeScript report an error on the buggy code, thereby possibly prompting a developer to fix the bug before its public release. We then report the proportion of bugs on which these type systems reported an error. Evaluating static type systems against public bugs, which have survived testing and review, is conservative: it understates their effectiveness at detecting bugs during private development, not to mention their other benefits such as facilitating code search/completion and serving as documentation. Despite this uneven playing field, our central finding is that both static type systems find an important percentage of public bugs: both Flow 0.30 and TypeScript 2.0 successfully detect 15%!.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"23 1","pages":"758-769"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78698597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the thriving of the mobile app markets, third-party libraries are pervasively integrated into Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking services, making multi-functional app development much more productive. However, the spread of vulnerable or harmful third-party libraries may also hurt the entire mobile ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, third-party Android library identification has emerged as an important problem which is the basis of many security applications such as repackaging detection and malware analysis. According to our investigation, existing work on Android library detection still requires improvement in many aspects, including accuracy and obfuscation resilience. In response to these limitations, we propose a novel approach to identifying third-party Android libraries. Our method utilizes the internal code dependencies of an Android app to detect and classify library candidates. Different from most previous methods which classify detected library candidates based on similarity comparison, our method is based on feature hashing and can better handle code whose package and method names are obfuscated. Based on this approach, we have developed a prototypical tool called LibD and evaluated it with an up-to-date, large-scale dataset. Our experimental results on 1,427,395 apps show that compared to existing tools, LibD can better handle multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without the loss of scalability.
{"title":"LibD: Scalable and Precise Third-Party Library Detection in Android Markets","authors":"Menghao Li, Wei Wang, Pei Wang, Shuai Wang, Dinghao Wu, Jian Liu, Rui Xue, Wei Huo","doi":"10.1109/ICSE.2017.38","DOIUrl":"https://doi.org/10.1109/ICSE.2017.38","url":null,"abstract":"With the thriving of the mobile app markets, third-party libraries are pervasively integrated in the Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking services, making multi-functional app development much more productive. However, the spread of vulnerable or harmful third-party libraries may also hurt the entire mobile ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, third-party Android library identification has emerged as an important problem which is the basis of many security applications such as repackaging detection and malware analysis. According to our investigation, existing work on Android library detection still requires improvement in many aspects, including accuracy and obfuscation resilience. In response to these limitations, we propose a novel approach to identifying third-party Android libraries. Our method utilizes the internal code dependencies of an Android app to detect and classify library candidates. Different from most previous methods which classify detected library candidates based on similarity comparison, our method is based on feature hashing and can better handle code whose package and method names are obfuscated. Based on this approach, we have developed a prototypical tool called LibD and evaluated it with an update-to-date and large-scale dataset. Our experimental results on 1,427,395 apps show that compared to existing tools, LibD can better handle multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without the loss of scalability.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"60 1","pages":"335-346"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85993925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The behavior of software that uses the Java Reflection API is fundamentally hard to predict by analyzing code. Only recent static analysis approaches can resolve reflection under unsound yet pragmatic assumptions. We survey what approaches exist and what their limitations are. We then analyze how real-world Java code uses the Reflection API, and how many Java projects contain code challenging state-of-the-art static analysis. Using a systematic literature review we collected and categorized all known methods of statically approximating reflective Java code. Next to this we constructed a representative corpus of Java systems and collected descriptive statistics of the usage of the Reflection API. We then applied an analysis on the abstract syntax trees of all source code to count code idioms which go beyond the limitation boundaries of static analysis approaches. The resulting data answers the research questions. The corpus, the tool and the results are openly available. We conclude that the need for unsound assumptions to resolve reflection is widely supported. In our corpus, reflection cannot be ignored for 78% of the projects. Common challenges for analysis tools, such as non-exceptional exceptions, programmatic filtering of meta objects, semantics of collections, and dynamic proxies, occur widely in the corpus. For Java software engineers prioritizing robustness, we list tactics to obtain reflection code that is easier to analyze, and for static analysis tool builders we provide a list of opportunities to have significant impact on real Java code.
{"title":"Challenges for Static Analysis of Java Reflection - Literature Review and Empirical Study","authors":"D. Landman, Alexander Serebrenik, J. Vinju","doi":"10.1109/ICSE.2017.53","DOIUrl":"https://doi.org/10.1109/ICSE.2017.53","url":null,"abstract":"The behavior of software that uses the Java Reflection API is fundamentally hard to predict by analyzing code. Only recent static analysis approaches can resolve reflection under unsound yet pragmatic assumptions. We survey what approaches exist and what their limitations are. We then analyze how real-world Java code uses the Reflection API, and how many Java projects contain code challenging state-of-the-art static analysis. Using a systematic literature review we collected and categorized all known methods of statically approximating reflective Java code. Next to this we constructed a representative corpus of Java systems and collected descriptive statistics of the usage of the Reflection API. We then applied an analysis on the abstract syntax trees of all source code to count code idioms which go beyond the limitation boundaries of static analysis approaches. The resulting data answers the research questions. The corpus, the tool and the results are openly available. We conclude that the need for unsound assumptions to resolve reflection is widely supported. In our corpus, reflection can not be ignored for 78% of the projects. Common challenges for analysis tools such as non-exceptional exceptions, programmatic filtering meta objects, semantics of collections, and dynamic proxies, widely occur in the corpus. For Java software engineers prioritizing on robustness, we list tactics to obtain more easy to analyze reflection code, and for static analysis tool builders we provide a list of opportunities to have significant impact on real Java code.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"12 1","pages":"507-518"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91527610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Özgür Kafali, Jasmine Jones, Megan Petruso, L. Williams, Munindar P. Singh
Policy design is an important part of software development. As security breaches increase in variety, designing a security policy that addresses all potential breaches becomes a nontrivial task. A complete security policy would specify rules to prevent breaches. Systematically determining which, if any, policy clause has been violated by a reported breach is a means for identifying gaps in a policy. Our research goal is to help analysts measure the gaps between security policies and reported breaches by developing a systematic process based on semantic reasoning. We propose SEMAVER, a framework for determining coverage of breaches by policies via comparison of individual policy clauses and breach descriptions. We represent a security policy as a set of norms. Norms (commitments, authorizations, and prohibitions) describe expected behaviors of users, and formalize who is accountable to whom and for what. A breach corresponds to a norm violation. We develop a semantic similarity metric for pairwise comparison between the norm that represents a policy clause and the norm that has been violated by a reported breach. We use the US Health Insurance Portability and Accountability Act (HIPAA) as a case study. Our investigation of a subset of the breaches reported by the US Department of Health and Human Services (HHS) reveals the gaps between HIPAA and reported breaches, leading to a coverage of 65%. Additionally, our classification of the 1,577 HHS breaches shows that 44% of the breaches are accidental misuses and 56% are malicious misuses. We find that HIPAA's gaps regarding accidental misuses are significantly larger than its gaps regarding malicious misuses.
{"title":"How Good Is a Security Policy against Real Breaches? A HIPAA Case Study","authors":"Özgür Kafali, Jasmine Jones, Megan Petruso, L. Williams, Munindar P. Singh","doi":"10.1109/ICSE.2017.55","DOIUrl":"https://doi.org/10.1109/ICSE.2017.55","url":null,"abstract":"Policy design is an important part of software development. As security breaches increase in variety, designing a security policy that addresses all potential breaches becomes a nontrivial task. A complete security policy would specify rules to prevent breaches. Systematically determining which, if any, policy clause has been violated by a reported breach is a means for identifying gaps in a policy. Our research goal is to help analysts measure the gaps between security policies and reported breaches by developing a systematic process based on semantic reasoning. We propose SEMAVER, a framework for determining coverage of breaches by policies via comparison of individual policy clauses and breach descriptions. We represent a security policy as a set of norms. Norms (commitments, authorizations, and prohibitions) describe expected behaviors of users, and formalize who is accountable to whom and for what. A breach corresponds to a norm violation. We develop a semantic similarity metric for pairwise comparison between the norm that represents a policy clause and the norm that has been violated by a reported breach. We use the US Health Insurance Portability and Accountability Act (HIPAA) as a case study. Our investigation of a subset of the breaches reported by the US Department of Health and Human Services (HHS) reveals the gaps between HIPAA and reported breaches, leading to a coverage of 65%. Additionally, our classification of the 1,577 HHS breaches shows that 44% of the breaches are accidental misuses and 56% are malicious misuses. We find that HIPAA's gaps regarding accidental misuses are significantly larger than its gaps regarding malicious misuses.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"3 1","pages":"530-540"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87276253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fabio Palomba, P. Salza, Adelina Ciurumelea, Sebastiano Panichella, H. Gall, F. Ferrucci, A. D. Lucia
Researchers have proposed several approaches to extract information from user reviews that is useful for maintaining and evolving mobile apps. However, most of them just perform automatic classification of user reviews according to specific keywords (e.g., bugs, features). Moreover, they do not provide any support for linking user feedback to the source code components to be changed, thus requiring a manual, time-consuming, and error-prone task. In this paper, we introduce ChangeAdvisor, a novel approach that analyzes the structure, semantics, and sentiments of sentences contained in user reviews to extract user feedback useful from a maintenance perspective and recommend to developers changes to software artifacts. It relies on natural language processing and clustering algorithms to group user reviews around similar user needs and suggestions for change. Then, it uses text-based heuristics to determine the code artifacts that need to be maintained according to the recommended software changes. The quantitative and qualitative studies carried out on 44,683 user reviews of 10 open source mobile apps and their original developers showed a high accuracy of ChangeAdvisor in (i) clustering similar user change requests and (ii) identifying the code components impacted by the suggested changes. Moreover, the obtained results show that ChangeAdvisor is more accurate than a baseline approach for linking user feedback clusters to the source code in terms of both precision (+47%) and recall (+38%).
{"title":"Recommending and Localizing Change Requests for Mobile Apps Based on User Reviews","authors":"Fabio Palomba, P. Salza, Adelina Ciurumelea, Sebastiano Panichella, H. Gall, F. Ferrucci, A. D. Lucia","doi":"10.1109/ICSE.2017.18","DOIUrl":"https://doi.org/10.1109/ICSE.2017.18","url":null,"abstract":"Researchers have proposed several approaches to extract information from user reviews useful for maintaining and evolving mobile apps. However, most of them just perform automatic classification of user reviews according to specific keywords (e.g., bugs, features). Moreover, they do not provide any support for linking user feedback to the source code components to be changed, thus requiring a manual, time-consuming, and error-prone task. In this paper, we introduce ChangeAdvisor, a novel approach that analyzes the structure, semantics, and sentiments of sentences contained in user reviews to extract useful (user) feedback from maintenance perspectives and recommend to developers changes to software artifacts. It relies on natural language processing and clustering algorithms to group user reviews around similar user needs and suggestions for change. Then, it involves textual based heuristics to determine the code artifacts that need to be maintained according to the recommended software changes. The quantitative and qualitative studies carried out on 44,683 user reviews of 10 open source mobile apps and their original developers showed a high accuracy of ChangeAdvisor in (i) clustering similar user change requests and (ii) identifying the code components impacted by the suggested changes. Moreover, the obtained results show that ChangeAdvisor is more accurate than a baseline approach for linking user feedback clusters to the source code in terms of both precision (+47%) and recall (+38%).","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"76 1","pages":"106-117"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86501667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Junjie Chen, Y. Bai, Dan Hao, Yingfei Xiong, Hongyu Zhang, Bing Xie
Compiler testing is a crucial way of guaranteeing the reliability of compilers (and software systems in general). Many techniques have been proposed to facilitate automated compiler testing. These techniques rely on a large number of test programs (which are test inputs of compilers) generated by some test-generation tools (e.g., CSmith). However, these compiler testing techniques have serious efficiency problems as they usually take a long period of time to find compiler bugs. To accelerate compiler testing, it is desirable to prioritize the generated test programs so that the test programs that are more likely to trigger compiler bugs are executed earlier. In this paper, we propose the idea of learning to test, which learns the characteristics of bug-revealing test programs from previous test programs that triggered bugs. Based on the idea of learning to test, we propose LET, an approach to prioritizing test programs for compiler testing acceleration. LET consists of a learning process and a scheduling process. In the learning process, LET identifies a set of features of test programs, trains a capability model to predict the probability of a new test program for triggering compiler bugs and a time model to predict the execution time of a test program. In the scheduling process, LET prioritizes new test programs according to their bug-revealing probabilities in unit time, which is calculated based on the two trained models. Our extensive experiments show that LET significantly accelerates compiler testing. In particular, LET reduces more than 50% of the testing time in 24.64% of the cases, and reduces between 25% and 50% of the testing time in 36.23% of the cases.
{"title":"Learning to Prioritize Test Programs for Compiler Testing","authors":"Junjie Chen, Y. Bai, Dan Hao, Yingfei Xiong, Hongyu Zhang, Bing Xie","doi":"10.1109/ICSE.2017.70","DOIUrl":"https://doi.org/10.1109/ICSE.2017.70","url":null,"abstract":"Compiler testing is a crucial way of guaranteeing the reliability of compilers (and software systems in general). Many techniques have been proposed to facilitate automated compiler testing. These techniques rely on a large number of test programs (which are test inputs of compilers) generated by some test-generation tools (e.g., CSmith). However, these compiler testing techniques have serious efficiency problems as they usually take a long period of time to find compiler bugs. To accelerate compiler testing, it is desirable to prioritize the generated test programs so that the test programs that are more likely to trigger compiler bugs are executed earlier. In this paper, we propose the idea of learning to test, which learns the characteristics of bug-revealing test programs from previous test programs that triggered bugs. Based on the idea of learning to test, we propose LET, an approach to prioritizing test programs for compiler testing acceleration. LET consists of a learning process and a scheduling process. In the learning process, LET identifies a set of features of test programs, trains a capability model to predict the probability of a new test program for triggering compiler bugs and a time model to predict the execution time of a test program. In the scheduling process, LET prioritizes new test programs according to their bug-revealing probabilities in unit time, which is calculated based on the two trained models. Our extensive experiments show that LET significantly accelerates compiler testing. In particular, LET reduces more than 50% of the testing time in 24.64% of the cases, and reduces between 25% and 50% of the testing time in 36.23% of the cases.","PeriodicalId":6505,"journal":{"name":"2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"700-711"},"PeriodicalIF":0.0,"publicationDate":"2017-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83182909","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}