2018 IEEE International Conference on Software Maintenance and Evolution (ICSME): Latest Papers, Page 2

Studying Permission Related Issues in Android Wearable Apps

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00043

Suhaib Mujahid, Rabe Abdalkareem, Emad Shihab

Wearable devices are becoming increasingly popular; these devices host software known as wearable apps. Wearable apps can be packaged alongside handheld apps, in which case they must be installed on the accompanying device (e.g., a smartphone). This device dependency also makes the two apps tightly coupled. Most importantly, when a wearable app is distributed by embedding it in a handheld app, the Android Wear platform requires the wearable app's permissions to be included in the handheld app as well, which is error-prone. In this paper, we define two permission issues related to wearable apps, namely permission mismatches and superfluous features. To study these issues, we propose a technique to detect permission issues in wearable apps. We implement our technique in a tool called Permlyzer, which automatically detects these permission issues from an app's APK. We run Permlyzer on a dataset of 2,724 apps that have an embedded wearable version and 339 standalone wearable apps. Our results show that I) 6% of wearable apps that request permissions suffer from the permission mismatch problem; II) of the apps that require underlying features, 523 (52.4%) handheld apps and 66 (80.5%) standalone wearable apps have at least one superfluous feature; III) all the studied apps missed a declaration of underlying features for one or more of their permissions, which shows that developers may not know the mapping between the permissions they request and the hardware features. Additionally, in a survey of wearable app developers, all of the developers who responded mentioned that a tool like Permlyzer, which detects permission-related issues, would be useful to them. Our results contribute to the understanding of permission-related issues in wearable apps, in particular by proposing a technique to detect permission mismatches and superfluous features.
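
The permission mismatch described in this abstract can be pictured as a set difference: any permission the embedded wearable app requests that the handheld app does not also declare. The sketch below is only an illustration under that assumption; it is not Permlyzer's implementation, the function name `permission_mismatches` is hypothetical, and it works on plain sets of permission names rather than on an APK.

```python
def permission_mismatches(wearable_perms, handheld_perms):
    """Return the wearable permissions that the handheld app does not declare."""
    return sorted(set(wearable_perms) - set(handheld_perms))


# Illustrative permission sets (assumed input, not extracted from a real APK).
wearable = {"android.permission.BODY_SENSORS", "android.permission.VIBRATE"}
handheld = {"android.permission.VIBRATE", "android.permission.INTERNET"}

missing = permission_mismatches(wearable, handheld)
print(missing)
```

An empty result would mean that every permission requested by the wearable app is mirrored in the handheld app.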

Pages: 345-356

Citations: 8

Clone-Based Variability Management in the Android Ecosystem

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00072

John Businge, Moses Openja, Sarah Nadi, Engineer Bainomugisha, T. Berger

Mobile app developers often need to create variants to account for different customer segments, payment models, or functionalities. A common strategy is to clone (or fork) an existing app and then adapt it to new requirements. This form of reuse has been enhanced with the advent of social-coding platforms such as GitHub, which cultivate more systematic reuse. Different facilities, such as forks, pull requests, and cross-project traceability, support clone-based development. Unfortunately, even though many apps are known to be maintained in many variants, little is known about how practitioners manage variants of mobile apps. We present a study that explores clone-based reuse practices for open-source Android apps. We identified and analyzed families of apps that are maintained together and that exist both on the official app store (Google Play) and on GitHub, allowing us to analyze reuse practices in depth. We mined both repositories to identify app families and to study their characteristics, including their variabilities as well as code-propagation practices and maintainer relationships. We found that app families do indeed exist and that forked app variants fall into the following categories: (i) re-branding and simple customizations, (ii) feature extension, (iii) supporting the mainline app, and (iv) implementation of different but related features. Other notable characteristics of the app families we discovered include: (i) 72.7% of the app families did not perform any form of code propagation, and (ii) 74% of the app families we studied do not have common maintainers.

Pages: 625-634

Citations: 39

Understanding, Debugging, and Optimizing Distributed Software Builds: A Design Study

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00060

Carlene Lebeuf, Elena Voyloshnikova, Kim Herzig, M. Storey

Today's build systems distribute build tasks across thousands of machines, reusing cached build results whenever possible. But despite the sophisticated nature of modern build tools, the core software architecture of the system under build defines the lower bound for how fast the system can compile. Long, consecutive build chains or slow individual build targets can introduce expensive compilation bottlenecks. Further, the growing complexity of both build systems and software systems under build makes comprehending, debugging, and optimizing build performance a significant challenge faced by many software engineers. We present a design study to describe and help mitigate the cognitive challenges faced by software engineers who use modern, cached, and distributed build systems. We characterize the performance analysis process, identify the main stakeholders involved and the key usage scenarios, and elicit important requirements for tool support. We propose an interactive BuildExplorer tool for understanding, optimizing, and debugging cached and distributed build sessions, justifying our design decisions among alternative solutions. Our novel solution is evaluated through usage scenario walkthroughs, iterative deployments of the tool in the field, and a user study.

Pages: 496-507

Citations: 12

Detecting and Predicting Evolution in Spreadsheets - A Case Study in an Energy Network Company

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00074

Bas Jansen, F. Hermans, E. Tazelaar

The use of spreadsheets in industry is widespread, and the information that they provide is often used for decisions. Research has shown that spreadsheets are error-prone, leading to the risk that decisions are made on incorrect information. Software evolution is a well-researched topic, and its results have been shown to support developers in creating better software. Could this also be applied to spreadsheets? Unfortunately, the research on spreadsheet evolution is still limited. Therefore, the aim of this paper is to obtain a better understanding of how spreadsheets evolve over time and whether the results of such a study provide similar benefits for spreadsheets as they do for source code. In this study, we cooperated with Alliander, a large energy network company in the Netherlands. We conducted two case studies on two different sets of spreadsheets, both of which had already been maintained for a period of three years. To have a better understanding of the spreadsheets themselves and the context in which they evolved, we also interviewed the creators of the spreadsheets. We focus on the changes that are made over time in the formulas. Changes in these formulas change the behavior of the spreadsheet and could possibly introduce errors. To effectively analyze these changes, we developed an algorithm that is able to detect and visualize them. Results indicate that studying the evolution of a spreadsheet helps to identify areas in the spreadsheet that are error-prone, likely to change, or that could benefit from refactoring. Furthermore, by analyzing the frequency with which formulas are changed from version to version, it is possible to predict which formulas need to be changed when a new version of the spreadsheet is created.
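
The idea of tracking formula changes from version to version, and of using change frequency as a hint for which formulas are likely to change next, can be sketched as follows. This is not the algorithm from the paper, whose details the abstract does not give; it assumes each spreadsheet version is available as a mapping from cell address to formula text, and the names `diff_formulas` and `change_frequency` are hypothetical.

```python
from collections import Counter


def diff_formulas(old, new):
    """Classify each cell whose formula was added, removed, or changed."""
    changes = {}
    for cell in old.keys() | new.keys():
        before, after = old.get(cell), new.get(cell)
        if before == after:
            continue
        if before is None:
            changes[cell] = "added"
        elif after is None:
            changes[cell] = "removed"
        else:
            changes[cell] = "changed"
    return changes


def change_frequency(versions):
    """Count, per cell, in how many version-to-version steps its formula differed."""
    counts = Counter()
    for old, new in zip(versions, versions[1:]):
        counts.update(diff_formulas(old, new).keys())
    return counts


# Three made-up versions of one worksheet (cell address -> formula).
versions = [
    {"A1": "=SUM(B1:B3)", "A2": "=A1*0.21"},
    {"A1": "=SUM(B1:B4)", "A2": "=A1*0.21", "A3": "=A1+A2"},
    {"A1": "=SUM(B1:B5)", "A3": "=A1+A2"},
]
freq = change_frequency(versions)
print(freq.most_common())
```

Cells at the top of the frequency ranking are candidates for review when the next version is created; comparing formula text directly is a simplification, since moved or copied formulas would need normalization first.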

Pages: 645-654

Citations: 1

Predicting Software Maintainability in Object-Oriented Systems Using Ensemble Techniques

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00088

Hadeel Alsolai

Prediction of the maintainability of classes in object-oriented systems is a significant factor for software success; however, it is a challenging task to achieve. To date, several machine learning models have been applied with variable results and no clear indication of which techniques are more appropriate. With the goal of achieving more consistent results, this paper presents the first set of results in an extensive empirical study designed to evaluate the capability of bagging models to increase prediction accuracy over individual models. The study compares two major machine-learning-based approaches for predicting software maintainability: individual models (regression tree, multilayer perceptron, k-nearest neighbors, and M5Rules) and an ensemble model (bagging), applied to the QUES data set. The results obtained from this study indicate that the k-nearest neighbors model outperformed all other individual models. The bagging ensemble model improved prediction accuracy significantly over almost all individual models, and the bagging ensemble model with k-nearest neighbors as a base model achieved superior prediction accuracy. This paper also provides a description of the planned programme of research, which aims to investigate the performance of advanced (ensemble-based) machine learning models over various datasets.
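
For readers unfamiliar with bagging, the following sketch shows the general technique with a k-nearest neighbors regressor as the base model: each base model is trained on a bootstrap sample of the training data, and the ensemble prediction is the average of the base predictions. It is a plain-Python illustration only; the data points are made up rather than taken from the QUES data set, and the parameter choices (`n_models`, `k`, `seed`) are assumptions, not the study's configuration.

```python
import random
from statistics import mean


def knn_predict(train, x, k=3):
    """Average the targets of the k training rows closest to x (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )[:k]
    return mean(target for _, target in nearest)


def bagging_predict(train, x, n_models=25, k=3, seed=0):
    """Average k-NN predictions made on bootstrap samples of the training data."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) rows with replacement.
        sample = [rng.choice(train) for _ in train]
        predictions.append(knn_predict(sample, x, k))
    return mean(predictions)


# Made-up rows: (two class-level metrics), maintenance effort.
train = [
    ((10.0, 2.0), 12.0),
    ((12.0, 3.0), 15.0),
    ((30.0, 8.0), 40.0),
    ((35.0, 9.0), 46.0),
    ((50.0, 15.0), 70.0),
    ((55.0, 14.0), 75.0),
]
query = (32.0, 8.5)
prediction = bagging_predict(train, query)
print(prediction)
```

Fixing the seed keeps the bootstrap samples, and therefore the prediction, reproducible between runs.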

Pages: 716-721

Citations: 8

Improving API Caveats Accessibility by Mining API Caveats Knowledge Graph

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00028

HongWei Li, Sirui Li, Jiamou Sun, Zhenchang Xing, Xin Peng, Mingwei Liu, Xuejiao Zhao

API documentation provides important knowledge about the functionality and usage of APIs. In this paper, we focus on API caveats that developers should be aware of in order to avoid unintended use of an API. Our formative study of Stack Overflow questions suggests that API caveats are often scattered across multiple API documents and buried in lengthy textual descriptions. These characteristics make API caveats less discoverable. When developers fail to notice API caveats, unexpected programming errors are very likely to result. In this paper, we propose natural language processing (NLP) techniques to extract ten subcategories of API caveat sentences from API documentation and link these sentences to API entities in an API caveats knowledge graph. The API caveats knowledge graph can support information-retrieval-based or entity-centric search of API caveats. As a proof of concept, we construct an API caveats knowledge graph for Android APIs from the API documentation on the Android Developers website. We study the abundance of different subcategories of API caveats and use a sampling method to manually evaluate the quality of the API caveats knowledge graph. We also conduct a user study to validate whether and how the API caveats knowledge graph may improve the accessibility of API caveats in API documentation.

Pages: 183-193

Citations: 78

How do Multiple Pull Requests Change the Same Code: A Study of Competing Pull Requests in GitHub

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00032

Xin Zhang, Yang Chen, Y. Gu, W. Zou, Xiaoyuan Xie, Xiangyang Jia, J. Xuan

GitHub is a widely used collaborative platform for global software development. A pull request plays an important role in bridging code changes with version control. Developers can freely submit pull requests to base branches in parallel and wait for their contributions to be merged. However, several developers may submit pull requests that edit the same lines of code; such pull requests result in a latent collaborative conflict. We refer to pull requests that tend to change the same lines and remain open during an overlapping time period as competing pull requests. In this paper, we conduct a study on 9,476 competing pull requests from 60 Java repositories in GitHub. The data are collected by mining pull requests that were submitted in 2017 to the top Java projects with the most forks. We explore how multiple pull requests change the same code by answering four research questions, covering the distribution of competing pull requests, the involved developers, the changed lines of code, and the impact on pull request integration. Our study shows that competing pull requests do indeed exist in GitHub: in 45 out of 60 repositories, over 31% of pull requests are competing pull requests; 20 repositories have more than 100 groups of competing pull requests, each of which is submitted by over five developers; and in 42 repositories, over 10% of competing pull requests share more than 10 lines of code. Meanwhile, we observe that the attributes of competing pull requests do not have a strong impact on pull request integration, compared with other types of pull requests. Our study provides a preliminary analysis for further research that aims to detect and eliminate conflicts among competing pull requests.
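
The definition of competing pull requests given in this abstract has two conditions: the requests change at least one common line, and their open periods overlap. The sketch below checks both for a pair of requests. It is an illustration under simplifying assumptions, not the study's mining procedure: the `PullRequest` record and `are_competing` function are hypothetical, every request is assumed to have a known closing time, and changed lines are assumed to be already extracted as (file path, line number) pairs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PullRequest:
    number: int
    opened: datetime
    closed: datetime
    changed_lines: frozenset  # (file path, line number) pairs


def are_competing(a: PullRequest, b: PullRequest) -> bool:
    """True if both requests were open at the same time and touch a shared line."""
    overlap_in_time = a.opened <= b.closed and b.opened <= a.closed
    return overlap_in_time and bool(a.changed_lines & b.changed_lines)


# Made-up pull requests for illustration.
pr1 = PullRequest(1, datetime(2017, 3, 1), datetime(2017, 3, 10),
                  frozenset({("src/App.java", 42), ("src/App.java", 43)}))
pr2 = PullRequest(2, datetime(2017, 3, 5), datetime(2017, 3, 12),
                  frozenset({("src/App.java", 43)}))
pr3 = PullRequest(3, datetime(2017, 4, 1), datetime(2017, 4, 2),
                  frozenset({("src/App.java", 43)}))

print(are_competing(pr1, pr2))  # overlapping open periods and a shared line
print(are_competing(pr1, pr3))  # shared line, but never open at the same time
```

Grouping requests into sets of mutually competing pull requests would build on this pairwise check, for example by treating it as an edge relation in a graph.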

Pages: 228-239

Citations: 23

GemChecker: Reporting on the Status of Gems in Ruby on Rails Projects

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00080

Jamie Cleare, Claudia Iacob

Ruby projects rely on gems, i.e., package libraries that provide a variety of features and functions. Once a gem has been installed in a Ruby on Rails application, checking whether it has become out of date or is poorly maintained can only be done manually. This is both error-prone and time-consuming. Out-of-date gems can introduce vulnerabilities that may only become obvious at a later stage. In this paper, we introduce GemChecker, a software tool designed to help Ruby on Rails developers learn the version status of the gems installed in their application. GemChecker is designed to: a) allow queries for the latest version available for a gem, b) summarize the results of checking the versions of all the gems associated with a particular project, and c) support software maintenance tasks by alerting developers to code deprecation in gems used by a particular project, to new versions being released for particular gems, and to gems used by a particular project that are out of date.
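The version-status check described in this abstract can be illustrated with a minimal sketch. It is not GemChecker's implementation: the function names are hypothetical, the version numbers are made up, and the sketch is written in Python rather than Ruby purely for illustration. It assumes plain dotted numeric versions and that the latest versions have already been looked up elsewhere.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '5.2.1' into (5, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def outdated_gems(installed: Dict[str, str], latest: Dict[str, str]) -> List[str]:
    """List gems whose installed version is older than the latest known one."""
    return sorted(
        name
        for name, version in installed.items()
        if name in latest and parse_version(version) < parse_version(latest[name])
    )


# Made-up version data, used only to exercise the comparison.
installed = {"rails": "5.1.6", "puma": "3.12.0"}
latest = {"rails": "5.2.1", "puma": "3.12.0"}

print(outdated_gems(installed, latest))  # ['rails']
```

Pre-release suffixes (for example `1.0.0.rc1`) are not handled here; a real tool would more likely rely on the package manager's own version comparison.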

Citations: 3

Replication Package for "Threats of Aggregating Software Repository Data"

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00082

M. Robillard, M. Nassif, Shane McIntosh

This artifact is a data set generated as part of a study on the threats of aggregating software repository data, which includes information derived from the GitHub repositories of eight open-source projects.

Citations: 0

A Large-Scale Empirical Study on Linguistic Antipatterns Affecting APIs

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Pub Date : 2018-09-01 DOI: 10.1109/ICSME.2018.00012

Emad Aghajani, Csaba Nagy, G. Bavota, Michele Lanza

The concept of monolithic stand-alone software systems developed completely from scratch has become obsolete. Modern systems leverage the abundance of Application Programming Interfaces (APIs) developed by third parties, which on the one hand accelerates development, but on the other hand introduces potentially fragile dependencies on external resources. In this context, the design of an API strongly influences how developers write code that uses it. A wrong design decision, such as a poorly chosen method name, can lead to a steeper learning curve due to misunderstandings, misuse, and eventually bug-prone code in the client projects using the API. It is not uncommon to find APIs with inexpressive or misleading names, possibly lacking appropriate documentation. Such issues can manifest as what the literature defines as Linguistic Antipatterns (LAs), i.e., inconsistencies among the naming, documentation, and implementation of a code entity. While previous studies showed the relevance of LAs for software developers, their impact on (developers of) client projects using APIs affected by LAs has not been investigated. This paper fills this gap by presenting a large-scale study conducted on 1.6k releases of popular Maven libraries, 14k open-source Java projects using these libraries, and 4.4k questions related to the investigated APIs asked on Stack Overflow. In particular, we investigate whether developers of client projects are more likely to introduce bugs when using APIs affected by LAs, and whether these APIs trigger more questions on Stack Overflow than non-affected APIs.

Citations: 14