2018 IEEE International Conference on Software Maintenance and Evolution (ICSME): Latest Papers, Page 3

DiffViz: A Diff Algorithm Independent Visualization Tool for Edit Scripts

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00081

Veit Frick, Christoph Wedenig, M. Pinzger

A number of approaches and tools exist that extract and visualize the changes between two versions of a file and thereby help developers to understand them. DiffViz is an interactive visualization tool that visualizes changes independently of the differencing algorithm. It supports, but is not limited to, a granularity on the level of abstract syntax trees. Furthermore, it provides several new features, such as node matching and the mini-map, to navigate and analyze the changes. A demo of the installation and example usage of the tool is available here: https://youtu.be/RF93ey9GYoc

Citations: 4

A Reflexive and Automated Approach to Syntactic Pattern Matching in Code Transformations

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00052

J. Lecerf, J. Brant, T. Goubier, Stéphane Ducasse

Empowering software engineers often requires letting them write code transformations. However, existing automated or tool-supported approaches force developers to have a detailed knowledge of the internal representation of the underlying tool. While this knowledge is time-consuming to master, the syntax of the language, on the other hand, is already well known to developers and can serve as a strong foundation for pattern matching. Pattern languages with metavariables (that is, variables holding abstract syntax subtrees once the pattern has been matched) have been used to help programmers define program transformations at the language syntax level. The question raised is then the engineering cost of metavariable support. Our contribution is to show that, with a GLR parser, such patterns with metavariables can be supported by using a form of runtime reflexivity on the parser's internal structures. This approach allows one to directly implement such patterns on any parser generated by a parser generation framework, without asking the pattern writer to learn the AST structure and node types. As a use case for that approach, we describe the implementation built on top of the SmaCC (Smalltalk Compiler Compiler) GLR parser generator framework. This approach has been used in production for source code transformations on a large scale. We discuss perspectives for adapting this approach to other types of parsing technologies.
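To make the notion of a metavariable concrete, the following is a minimal Python sketch. It is not the SmaCC implementation and does not involve a GLR parser or parser reflexivity: it simply matches a pattern written in the language's own syntax against trees from Python's standard `ast` module. The convention that names starting with `MV_` are metavariables, and the function names `match` and `find_matches`, are assumptions made for illustration only.

```python
import ast

def match(pattern, node, bindings):
    """Return True if `node` matches `pattern`, recording metavariable bindings."""
    # Hypothetical convention: a Name starting with MV_ is a metavariable.
    if isinstance(pattern, ast.Name) and pattern.id.startswith("MV_"):
        if pattern.id in bindings:
            # A repeated metavariable must bind structurally equal subtrees.
            return ast.dump(bindings[pattern.id]) == ast.dump(node)
        bindings[pattern.id] = node
        return True
    if type(pattern) is not type(node):
        return False
    for field in pattern._fields:
        p = getattr(pattern, field, None)
        n = getattr(node, field, None)
        if isinstance(p, ast.AST):
            if not isinstance(n, ast.AST) or not match(p, n, bindings):
                return False
        elif isinstance(p, list):
            if not isinstance(n, list) or len(p) != len(n):
                return False
            for pi, ni in zip(p, n):
                if isinstance(pi, ast.AST):
                    if not isinstance(ni, ast.AST) or not match(pi, ni, bindings):
                        return False
                elif pi != ni:
                    return False
        elif p != n:
            return False
    return True

def find_matches(pattern_src, code_src):
    """Find every expression in `code_src` that matches the pattern expression."""
    pattern = ast.parse(pattern_src, mode="eval").body
    results = []
    for node in ast.walk(ast.parse(code_src)):
        bindings = {}  # fresh bindings for each candidate subtree
        if isinstance(node, ast.expr) and match(pattern, node, bindings):
            results.append({k: ast.unparse(v) for k, v in bindings.items()})
    return results
```

For example, `find_matches("MV_x + 0", "y = (a * b) + 0")` binds `MV_x` to the subtree for `a * b`. The pattern writer only needs to know the surface syntax, which is the property the abstract emphasizes; how that is achieved for arbitrary generated parsers is the paper's contribution and is not shown here.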

Pages: 426-436

Citations: 3

Software Process Analysis Methodology – A Methodology Based on Lessons Learned in Embracing Legacy Software

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00076

M. Leemans, Wil M.P. van der Aalst, M. Brand, R. Schiffelers, L. Lensink

Over the last decades, the complexity of high-tech systems, and of the software systems controlling them, has increased considerably. In practice, it is hard to keep knowledge and documentation of these ever-evolving software systems up to date with their actual realization; we are dealing with legacy software. Clearly, this lack of knowledge, insight, and understanding is increasingly becoming a critical issue. Process mining provides an interesting opportunity to improve understanding and analyze software behavior based on observations of the running system. However, a concrete software process analysis methodology was lacking. This paper 1) discusses a software process analysis case study at ASML, a large high-tech company, and, based on the lessons learned, 2) presents a concrete methodology for analyzing software processes. The presented methodology actively includes the system under analysis and is based on practical experiences in applying process mining on industrial-scale legacy software.

Citations: 6

Are Bug Reports Enough for Text Retrieval-Based Bug Localization?

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00046

Chris Mills, Jevgenija Pantiuchina, Esteban Parra, G. Bavota, S. Haiduc

Text Retrieval (TR) has been widely used to support many software engineering tasks, including bug localization (i.e., the activity of localizing buggy code starting from a bug report). Many studies show TR's effectiveness in lowering the manual effort required to perform this maintenance task; however, the actual usefulness of TR-based bug localization has been questioned in recent studies. These studies discuss (i) potential biases in the experimental design usually adopted to evaluate TR-based bug localization techniques and (ii) their poor performance in the scenario where they are needed most: when the bug report, which serves as the de facto query in most studies, does not contain localization hints (e.g., code snippets, method names, etc.). Fundamentally, these studies raise the question: do bug reports provide sufficient information to perform TR-based localization? In this work, we approach that question from two perspectives. First, we investigate potential biases in the evaluation of TR-based approaches which artificially boost the performance of these techniques, making them appear more successful than they are. Second, we analyze bug report text with and without localization hints using a genetic algorithm to derive a near-optimal query that provides insight into the potential of that bug report for use in TR-based localization. Through this analysis we show that in most cases the bug report vocabulary (i.e., the terms contained in the bug title and description) is all we need to formulate effective queries, making TR-based bug localization successful without supplementary query expansion. Most notably, this also holds when localization hints are completely absent from the bug report. In fact, our results suggest that the next major step in improving TR-based bug localization is the ability to formulate these near-optimal queries.
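As a rough illustration of what using a bug report as a query means, here is a toy Python sketch that ranks source files against the report text with a simple TF-IDF score. It is an assumption-laden stand-in, not the retrieval engine or the genetic algorithm used in the study; the function names, the tokenizer, and the sample files in the usage note are all invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphabetic runs; real TR pipelines typically also split
    # identifiers, remove stop words, and stem (omitted in this sketch).
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]

def rank_files(bug_report, files):
    """Rank `files` (a dict of name -> source text) against the bug report."""
    docs = {name: Counter(tokenize(src)) for name, src in files.items()}
    n_docs = len(docs)
    query = set(tokenize(bug_report))
    scores = {}
    for name, tf in docs.items():
        score = 0.0
        for term in query:
            # document frequency: number of files containing the term
            df = sum(1 for d in docs.values() if term in d)
            if df and tf[term]:
                score += tf[term] * math.log(1 + n_docs / df)
        scores[name] = score
    # highest-scoring file first
    return sorted(scores, key=scores.get, reverse=True)
```

Given a hypothetical corpus such as `{"Parser.java": "class Parser { void parseDate(String date) { } }", "Cache.java": "class Cache { void evict() { } }"}` and the report "Crash when parsing date string", the shared terms `date` and `string` push `Parser.java` to the top. Which report terms end up in the query determines the ranking, which is why the abstract frames query formulation as the key problem.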

Pages: 381-392

Citations: 43

Context-Aware Software Documentation

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00090

Emad Aghajani

Software developers often do not possess the knowledge needed to understand a piece of code at hand, and the lack of code comments and outdated documentation exacerbates the problem. Asking for the help of colleagues, browsing the official documentation, or accessing online resources, such as Stack Overflow, can clearly help in this "code comprehension" activity, which nevertheless remains highly time-consuming and is not always successful. Enhancing this process has been addressed in different studies under the subject of automatic documentation of software artifacts. For example, "recommender systems" have been designed with the goal of retrieving and suggesting relevant pieces of information (e.g., Stack Overflow discussions) for a given piece of code inspected in an IDE. However, these techniques rely on limited contextual information, mainly the source code alone. Our goal is to build a context-aware proactive recommender system supporting the code comprehension process. The system must be able to understand the context, consider the developer's profile, and help her by generating pieces of documentation at whatever granularity is required, e.g., going from summarizing the responsibilities implemented in a subsystem, to explaining how two classes collaborate to implement a functionality, down to documenting a single line of code. Generated documentation will be tailored for the current context (e.g., the task at hand, the developer's background knowledge, the history of interactions). In this paper we present our first steps toward our goal by introducing the ADANA project, a framework which generates fine-grained code comments for a given piece of code.

Pages: 727-731

Citations: 4

Predicting Higher Order Structural Feature Interactions in Variable Systems

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00035

Stefan Fischer, L. Linsbauer, Alexander Egyed, R. Lopez-Herrejon

Robust and effective support for the detection and management of software features and their interactions is crucial for many development tasks but has proven to be an elusive goal despite extensive research on the subject. This is especially challenging for variable systems where multiple variants of a system and their features must be collectively considered. Here, an important issue is the typically large number of feature interactions that can occur in variable systems. We propose a method that computes, from a set of known source code level interactions of n features, the relevant interactions involving n+1 features. Our method is based on the insight that, if a set of features interact, it is much more likely that these features also interact with additional features, as opposed to completely different features interacting. This key insight enables us to drastically prune the space of potential feature interactions to those that will have a true impact at source code level. This substantial space reduction can be leveraged by analysis techniques that are based on feature interactions (e.g., Combinatorial Interaction Testing). Our observation is based on eight variable systems, implemented in Java and C, totaling over nine million LoC, with over seven thousand feature interactions.
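The pruning insight can be sketched as candidate generation: instead of considering every set of n+1 features, only sets that extend a known n-feature interaction by one feature are kept. The Python sketch below shows just that step under stated assumptions; the function names and the feature names in the usage note are hypothetical, and the paper's actual procedure for deciding which candidates are relevant is not reproduced.

```python
from itertools import combinations

def candidate_interactions(known, features):
    """Extend each known n-feature interaction by one additional feature.

    `known` is a set of frozensets (the known interactions);
    `features` is the set of all feature names in the system.
    """
    candidates = set()
    for interaction in known:
        for extra in features - interaction:
            # frozenset | set yields a frozenset, so it stays hashable
            candidates.add(interaction | {extra})
    return candidates

def all_subsets_of_size(features, k):
    """The unpruned search space: every k-feature combination."""
    return {frozenset(c) for c in combinations(sorted(features), k)}
```

With five hypothetical features `{"A", "B", "C", "D", "E"}` and one known pairwise interaction `{"A", "B"}`, the candidate set contains 3 triples, against 10 possible triples without pruning. The gap widens quickly as the number of features grows.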

Pages: 252-263

Citations: 3

RegionDroid: A Tool for Detecting Android Application Repackaging Based on Runtime UI Region Features

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00041

Shengtao Yue, Qingwei Sun, Jun Ma, Xianping Tao, Chang Xu, Jian Lu

With the rapid development of mobile devices, Android applications (apps) are universally used. However, attackers repackage Android apps and release them to the markets for illegal purposes, which brings great threats to the Android ecosystem. To leverage the popularity of original apps, they keep similar software behaviors to confuse app users. Furthermore, repackaged apps can be obfuscated or encrypted to avoid being detected. Besides, hybrid mobile apps, built by combining web technology and native elements, are becoming a preferred choice for developers. The structure of hybrid apps differs considerably from that of native apps, which poses great challenges for repackaging detection. Existing work still has limitations in detecting repackaging of obfuscated and encrypted apps, and few approaches can deal with hybrid apps. In this paper, we propose an approach based on the app UI regions extracted from an app's runtime UI traces. We also implement a tool named RegionDroid based on the approach. We apply RegionDroid to three datasets with a total of 369 apps. It successfully finds all the 98 obfuscated or encrypted repackaged pairs in dataset S1. It also shows good credibility in distinguishing another 114 commercial apps in dataset S2. We also test our approach on dataset S3, which contains 157 hybrid apps, by comparing them pairwise; the false positive rate is 0.016%.

Pages: 323-333

Citations: 5

Publisher's Information

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/icsme.2018.00097

Citations: 0

How Maintainability Issues of Android Apps Evolve

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00042

I. Malavolta, R. Verdecchia, Bojana Filipović, M. Bruntink, P. Lago

Context. Android is the largest mobile platform today, with thousands of apps published and updated in the Google Play store every day. Maintenance is an important factor in the Android app lifecycle, as it allows developers to constantly improve their apps and better tailor them to their user base. Goal. In this paper we investigate the evolution of various maintainability issues along the lifetime of Android apps. Method. We designed and conducted an empirical study on 434 GitHub repositories containing open, real (i.e., published in the Google Play store), and actively maintained Android apps. We statically analyzed 9,945 weekly snapshots of all apps to identify their maintainability issues over time. We also identified maintainability hotspots along the lifetime of Android apps according to how their density of maintainability issues evolves over time. More than 2,000 GitHub commits belonging to identified hotspots have been manually categorized to understand the context in which maintainability hotspots occur. Results. Our results shed light on (i) how often various types of maintainability issues occur over the lifetime of Android apps, (ii) the evolution trends of the density of maintainability issues in Android apps, and (iii) an in-depth characterization of development activities related to maintainability hotspots. Together, these results can help Android developers in (i) better planning code refactoring sessions, (ii) better planning their code review sessions (e.g., steering the assignment of code reviews), and (iii) taking special care of their code quality when performing tasks belonging to activities highly correlated with maintainability issues. We also support researchers by objectively characterizing the state of the practice about maintainability of Android apps. Conclusions. Independently of the type of development activity, maintainability issues grow until they stabilize, but are never fully resolved.

Pages: 334-344

Citations: 18

AudioHighlight: Code Skimming for Blind Programmers

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00030

A. Armaly, Paige Rodeghero, Collin McMillan

Blind programmers use a screen reader to read code aloud. Screen readers force blind programmers to read code sequentially one line at a time. In contrast, sighted programmers are able to skim visually to the most important code areas, assisted by syntax highlighting. However, there is a place where there is a widely adopted approach to skimming a structured document: the web. Modern screen readers employ what is known as a virtual cursor to navigate structural information on webpages such as HTML heading tags. These tags can indicate different sections and subsections in the structure of a page. We harness the existing familiarity of blind computer users with this interface in our approach which we call AudioHighlight. AudioHighlight renders the code inside a web view, either as part of the Eclipse IDE or as a web service. It places HTML heading tags on the structural elements of a source file such as classes, functions and control flow statements. We compare AudioHighlight to the state of the art in code skimming represented by a previous code skimming approach called StructJumper. We also compare to the state of practice in reading code on the web as represented by GitHub. We found that AudioHighlight increased the quality and speed of code comprehension as compared to both approaches.
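
The idea of placing HTML heading tags on structural elements can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Python source as input, uses Python's standard `ast` module to find classes, functions, and control flow statements, and maps nesting depth to heading levels `h1` to `h6`. The names `STRUCTURAL`, `heading_levels`, and `render_html` are hypothetical and chosen only for this example.

```python
import ast
import html

# Node types treated as "structural elements" in this sketch (an assumption,
# not the set used by AudioHighlight itself).
STRUCTURAL = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef,
              ast.If, ast.For, ast.While, ast.Try)


def heading_levels(source: str) -> dict[int, int]:
    """Map each structural element's line number to a heading level."""
    levels: dict[int, int] = {}

    def visit(node: ast.AST, depth: int) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, STRUCTURAL):
                levels[child.lineno] = min(depth, 6)  # HTML only has h1..h6
                visit(child, depth + 1)               # nested elements go one level deeper
            else:
                visit(child, depth)

    visit(ast.parse(source), 1)
    return levels


def render_html(source: str) -> str:
    """Render source so that structural lines become headings, the rest <pre>."""
    levels = heading_levels(source)
    out = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if lineno in levels:
            tag = f"h{levels[lineno]}"
            out.append(f"<{tag}>{html.escape(line.strip())}</{tag}>")
        elif line.strip():
            out.append(f"<pre>{html.escape(line)}</pre>")
    return "\n".join(out)


SAMPLE = '''class Greeter:
    def greet(self, names):
        for name in names:
            if name:
                print("hi", name)
'''

if __name__ == "__main__":
    print(render_html(SAMPLE))
```

With markup of this kind, a screen reader's heading navigation could jump from one class, function, or loop to the next instead of reading every line in sequence, which is the skimming behavior the abstract describes.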

Citations: 17