
2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR): Latest Papers

PathMiner: A Library for Mining of Path-Based Representations of Code
V. Kovalenko, Egor Bogomolov, T. Bryksin, Alberto Bacchelli
One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of path-based representation – an approach that represents a snippet of code as a collection of paths from its syntax tree. Such a representation efficiently captures the structure of code, which, in turn, carries its semantics and other information. Building the path-based representation involves parsing the code and extracting the paths from its syntax tree; these steps add up to a substantial technical job. With no common reusable toolkit for this task, the burden of mining diverts researchers' focus from the essential work and hinders newcomers to the field of machine learning on code. In this paper, we present PathMiner – an open-source library for mining path-based representations of code. PathMiner is fast, flexible, well-tested, and easily extensible to support input code in any common programming language. Preprint [https://doi.org/10.5281/zenodo.2595271]; released tool [https://doi.org/10.5281/zenodo.2595257].
DOI: https://doi.org/10.1109/MSR.2019.00013 | Pages: 13-17 | Published: 2019-05-01
Citations: 39
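The PathMiner abstract above describes representing a snippet as a collection of paths from its syntax tree. The sketch below illustrates that idea on Python source using the standard `ast` module. It is not PathMiner's API: the function names, the `^`/`_` path notation, the choice of terminal node types, and the `max_length` cutoff are all assumptions made for illustration.

```python
import ast
from itertools import combinations

# Assumption: only these node types are treated as terminals (path endpoints).
TERMINALS = (ast.Name, ast.Constant, ast.arg)


def token_of(node):
    """Return the surface token carried by a terminal node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.arg):
        return node.arg
    return repr(node.value)  # ast.Constant


def collect_terminals(root):
    """Return one root-to-terminal list of nodes per terminal, in source order."""
    found = []

    def visit(node, trail):
        trail = trail + [node]
        if isinstance(node, TERMINALS):
            found.append(trail)
            return  # do not descend below a terminal
        for child in ast.iter_child_nodes(node):
            visit(child, trail)

    visit(root, [])
    return found


def path_contexts(source, max_length=8):
    """Return (start_token, path, end_token) triples for every terminal pair."""
    trails = collect_terminals(ast.parse(source))
    contexts = []
    for a, b in combinations(trails, 2):
        # Length of the shared prefix; its last node is the lowest common ancestor.
        k = 0
        while k < min(len(a), len(b)) and a[k] is b[k]:
            k += 1
        up = [type(n).__name__ for n in reversed(a[k:])]
        top = type(a[k - 1]).__name__
        down = [type(n).__name__ for n in b[k:]]
        if len(up) + 1 + len(down) > max_length:
            continue  # hypothetical cutoff to skip very long paths
        path = "^".join(up + [top]) + "".join("_" + d for d in down)
        contexts.append((token_of(a[-1]), path, token_of(b[-1])))
    return contexts


if __name__ == "__main__":
    for context in path_contexts("x = a + 1"):
        print(context)
```

Each triple pairs two tokens with the chain of node types that connects them through their lowest common ancestor, which is the kind of structure a path-based model consumes.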
An Empirical Study of Multiple Names and Email Addresses in OSS Version Control Repositories
Jiaxin Zhu, Jun Wei
Data produced by version control systems are widely used in software research and development. Users of version control data always use the name or email address field to identify the committer or author of a modification. However, developers may use multiple names and email addresses, which makes it difficult to identify distinct developers. In this paper, we sample 450 Git repositories from GitHub to study the multiple names and email addresses of developers. We conservatively estimate the prevalence of this practice and its impact on related measurements. We merge the multiple names and email addresses of a developer using a high-precision method. With the merged identities, we obtain a number of interesting findings, e.g., about 6% of the developers used multiple names or email addresses in more than 60% of the repositories, and they contributed about half of all the commits. Our impact analysis shows that the multiple names and email addresses issue cannot be ignored for basic related measurements, e.g., the number of developers in a repository. Our results could help researchers and practitioners gain a clearer understanding of multiple names and email addresses in practice and improve the accuracy of related measurements.
DOI: https://doi.org/10.1109/MSR.2019.00068 | Pages: 409-420 | Published: 2019-05-01
Citations: 8
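The abstract above on multiple names and email addresses mentions merging a developer's identities but does not say how. As a rough illustration only, the sketch below shows one simple heuristic: treat two (name, email) pairs as the same developer when they share a normalized name or a normalized email, and group them with a union-find structure. The function name, the normalization, and the linking rule are assumptions, not the authors' method; exact-name linking in particular can wrongly merge different people with the same name.

```python
from collections import defaultdict


def merge_identities(commits):
    """Group (name, email) pairs that are linked by a shared name or email.

    Assumption: names and emails are compared after stripping whitespace
    and lowercasing; no fuzzy matching is attempted.
    """
    commits = list(commits)
    parent = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    def union(a, b):
        parent[find(a)] = find(b)

    def name_key(name):
        return ("name", name.strip().lower())

    def email_key(email):
        return ("email", email.strip().lower())

    # Each commit links its name to its email.
    for name, email in commits:
        union(name_key(name), email_key(email))

    groups = defaultdict(set)
    for name, email in commits:
        groups[find(email_key(email))].add((name, email))
    return list(groups.values())


if __name__ == "__main__":
    sample = [
        ("Ann Lee", "ann@x.org"),
        ("Ann Lee", "al@corp.com"),
        ("A. Lee", "al@corp.com"),
        ("Bob", "bob@y.org"),
    ]
    for group in merge_identities(sample):
        print(sorted(group))
```

Counting the returned groups instead of the raw pairs gives a developer count that is less inflated by aliases, which is the kind of basic measurement the abstract says is affected.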
Challenges with Responding to Static Analysis Tool Alerts
Nasif Imtiaz, A. Rahman, Effat Farhana, L. Williams
Static analysis tool alerts can help developers detect potential defects in the code early in the development cycle. However, developers are not always able to respond to the alerts with their preferred action and may turn away from using the tool. In this paper, we qualitatively analyze 280 Stack Overflow (SO) questions regarding static analysis tool alerts to identify the challenges developers face in understanding and responding to these alerts. We find that the most prevalent question on SO is how to ignore and filter alerts, followed by validation of false positives. Our findings confirm prior researchers' findings related to notification communication theory as 44.6% of the SO questions that we analyzed indicate developers face communication challenges.
DOI: https://doi.org/10.1109/MSR.2019.00049 | Pages: 245-249 | Published: 2019-05-01
Citations: 32
Snoring: A Noise in Defect Prediction Datasets
A. Ahluwalia, D. Falessi, M. D. Penta
In order to develop and train defect prediction models, researchers rely on datasets in which a defect is often attributed to the release where the defect itself is discovered. However, in many circumstances, a defect is only discovered several releases after its introduction. This might introduce a bias in the dataset, i.e., treating the intermediate releases as defect-free and the latter as defect-prone. We call this phenomenon "sleeping defects". We call "snoring" the phenomenon where classes are affected by sleeping defects only, which would be treated as defect-free until the defect is discovered. In this paper, we analyze the magnitude of sleeping defects and snoring classes on data from 282 releases of six open source projects from the Apache ecosystem. Our results indicate that 1) on all projects, most of the defects in a project slept for more than 20% of the existing releases, and 2) in the majority of the projects the missing rate is more than 25% even if we remove 50% of releases.
DOI: https://doi.org/10.1109/MSR.2019.00019 | Pages: 63-67 | Published: 2019-05-01
Citations: 13
[Title page i]
DOI: https://doi.org/10.1109/msr.2019.00001 | Published: 2019-05-01
Citations: 0
ConPan: A Tool to Analyze Packages in Software Containers
Ahmed Zerouali, Valerio Cosentino, G. Robles, Jesus M. Gonzalez-Barahona, T. Mens
Deploying software packages and services into containers is a popular software engineering practice that increases portability and reusability. Docker, the most popular containerization technology, helps DevOps practitioners in their daily activities. Despite being successfully and increasingly employed, containers may include buggy and vulnerable packages that put at risk the environments in which the containers have been deployed. Existing quality and security monitoring tools provide only limited support for analyzing Docker containers, thus forcing practitioners to perform additional manual work or develop ad hoc scripts when the analysis goes beyond security purposes. This limitation also affects researchers who want to empirically study the evolution dynamics of Docker containers and their contained packages. To overcome this limitation, we present ConPan, an automated tool to inspect the characteristics of packages in Docker containers, such as their outdatedness and other possible flaws (e.g., bugs and security vulnerabilities). ConPan comes with a CLI and API, and the analysis results can be presented to the user in a variety of formats.
DOI: https://doi.org/10.1109/MSR.2019.00089 | Pages: 592-596 | Published: 2019-05-01
Citations: 6
Automated Software Vulnerability Assessment with Concept Drift
T. H. Le, Bushra Sabir, M. Babar
Software Engineering researchers are increasingly using Natural Language Processing (NLP) techniques to automate Software Vulnerabilities (SVs) assessment using the descriptions in public repositories. However, the existing NLP-based approaches suffer from concept drift. This problem is caused by a lack of proper treatment of new (out-of-vocabulary) terms for the evaluation of unseen SVs over time. To perform automated SVs assessment with concept drift using SVs' descriptions, we propose a systematic approach that combines both character and word features. The proposed approach is used to predict seven Vulnerability Characteristics (VCs). The optimal model of each VC is selected using our customized time-based cross-validation method from a list of eight NLP representations and six well-known Machine Learning models. We have used the proposed approach to conduct large-scale experiments on more than 100,000 SVs in the National Vulnerability Database (NVD). The results show that our approach can effectively tackle the concept drift issue of the SVs' descriptions reported from 2000 to 2018 in NVD even without retraining the model. In addition, our approach performs competitively compared to the existing word-only method. We also investigate how to build compact concept-drift-aware models with much fewer features and give some recommendations on the choice of classifiers and NLP representations for SVs assessment.
DOI: https://doi.org/10.1109/MSR.2019.00063 | Pages: 371-382 | Published: 2019-05-01
Citations: 28
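The concept-drift abstract above selects models with a customized time-based cross-validation method, which the abstract does not spell out. The sketch below shows only the generic idea behind time-based validation: order records chronologically and always validate on data that is newer than the training data, so that terms unseen at training time can appear at validation time. The expanding-window scheme, the function name, and the record layout are assumptions for illustration, not the paper's procedure.

```python
def time_based_splits(records, n_folds=5):
    """Yield (train, validation) lists where validation is strictly later.

    Assumption: each record is a tuple whose first element is a sortable
    timestamp (for example a publication year); the rest is the payload.
    """
    ordered = sorted(records, key=lambda record: record[0])
    fold_size = len(ordered) // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough records for the requested number of folds")
    for i in range(1, n_folds + 1):
        train = ordered[: i * fold_size]                        # everything so far
        validation = ordered[i * fold_size:(i + 1) * fold_size]  # next time slice
        yield train, validation


if __name__ == "__main__":
    # Hypothetical records: (year, description) pairs.
    data = [(2000 + i, "description %d" % i) for i in range(12)]
    for train, validation in time_based_splits(data):
        print(train[-1][0], "->", [year for year, _ in validation])
```

Unlike a shuffled k-fold split, no fold here trains on descriptions published after the ones it is evaluated on.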
Organizing Committee for MSR 2019
DOI: https://doi.org/10.1109/msr.2019.00007 | Published: 2019-05-01
Citations: 0
A Benchmark of Data Loss Bugs for Android Apps
O. Riganelli, M. Mobilio, D. Micucci, L. Mariani
Android apps must be able to deal with both stop events, which require immediately stopping the execution of the app without losing state information, and start events, which require resuming the execution of the app at the same point it was stopped. Support for these kinds of events must be explicitly implemented by developers, who unfortunately often fail to implement the proper logic for saving and restoring the state of an app. As a consequence, apps can lose data when moved to the background and then back to the foreground (e.g., to answer a call) or when the screen is simply rotated. These faults can cause annoying usability issues and unexpected crashes. This paper presents a public benchmark of 110 data loss faults in Android apps that we systematically collected to facilitate research and experimentation with these problems. The benchmark is available on GitLab and includes the faulty apps, the fixed apps (when available), the test cases to automatically reproduce the problems, and additional information that may help researchers in their tasks.
DOI: https://doi.org/10.1109/MSR.2019.00087 | Pages: 582-586 | Published: 2019-05-01
Citations: 7
We Need to Talk About Microservices: an Analysis from the Discussions on StackOverflow
Alan Bandeira, Carlos Alberto Medeiros, M. Paixão, P. Maia
Microservices are a new and rapidly growing architectural model aimed at developing highly scalable software solutions based on independently deployable and evolvable components. Due to its novelty, microservice-related discussions are increasing in Q&A websites, such as StackOverflow (SO). In order to understand what is being discussed by the microservice community, this work has applied mining techniques and topic modelling to a manually-curated dataset of 1,043 microservice-related posts from StackOverflow. As a result, we found that 13.68% of microservice technical posts on SO discuss a single technology: Netflix Eureka. Moreover, buzzwords in the microservice ecosystem, e.g., blue/green deployment, were not identified as relevant subjects of discussion on SO. Finally, we show how a high discussion rate on SO may not reflect the popularity of a certain subject within the microservice community.
DOI: https://doi.org/10.1109/MSR.2019.00051 | Pages: 255-259 | Published: 2019-05-01
Citations: 29