
Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering: Latest Publications

Data structure synthesis
Calvin Loncaric
All mainstream languages ship with libraries implementing lists, maps, sets, trees, and other common data structures. These libraries are sufficient for some use cases, but other applications need specialized data structures with different operations. For such applications, the standard libraries are not enough. I propose to develop techniques to automatically synthesize data structure implementations from high-level specifications. My initial results on a large class of collection data structures demonstrate that this is possible and lend hope to the prospect of general data structure synthesis. Synthesized implementations can save programmer time and improve correctness while matching the performance of handwritten code.
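To make the gap between a high-level specification and a specialized implementation concrete, here is a minimal sketch (the collection, its query, and all names are illustrative assumptions, not the paper's specification language): a store that must answer "open tickets for a given user" can be specified as a set plus a filter query, while an efficient hand-written or synthesized implementation maintains an index keyed by user.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(eq=False)
class Ticket:
    user: str
    open: bool = True

# Specification (informal): state tickets : Set<Ticket>
#                           query open_for(u) = { t in tickets | t.user == u and t.open }
# Hand-specialized implementation: keep an index so open_for avoids scanning,
# the kind of structure a synthesizer would be expected to derive.
class TicketStore:
    def __init__(self):
        self._open_by_user = defaultdict(set)   # index: user -> open tickets

    def add(self, t: Ticket):
        if t.open:
            self._open_by_user[t.user].add(t)

    def close(self, t: Ticket):
        t.open = False
        self._open_by_user[t.user].discard(t)

    def open_for(self, user: str):
        return list(self._open_by_user[user])
```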
{"title":"Data structure synthesis","authors":"Calvin Loncaric","doi":"10.1145/2950290.2983946","DOIUrl":"https://doi.org/10.1145/2950290.2983946","url":null,"abstract":"All mainstream languages ship with libraries implementing lists, maps, sets, trees, and other common data structures. These libraries are sufficient for some use cases, but other applications need specialized data structures with different operations. For such applications, the standard libraries are not enough. I propose to develop techniques to automatically synthesize data structure implementations from high-level specifications. My initial results on a large class of collection data structures demonstrate that this is possible and lend hope to the prospect of general data structure synthesis. Synthesized implementations can save programmer time and improve correctness while matching the performance of handwritten code.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84456427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Automating repetitive code changes using examples
Reudismam Rolim
While adding features, fixing bugs, or refactoring the code, developers may perform repetitive code edits. Although Integrated Development Environments (IDEs) automate some transformations such as renaming, many repetitive edits are performed manually, which is error-prone and time-consuming. To help developers to apply these edits, we propose a technique to perform repetitive edits using examples. The technique receives as input the source code before and after the developer edits some target locations of the change and produces as output the top-ranked program transformation that can be applied to edit the remaining target locations in the codebase. The technique uses a state-of-the-art program synthesis methodology and has three main components: a) a DSL for describing program transformations; b) synthesis algorithms to learn program transformations in this DSL; c) ranking algorithms to select the program transformation with the higher probability of performing the desired repetitive edit. In our preliminary evaluation, in a dataset of 59 repetitive edit cases taken from real C# source code repositories, the technique performed, in 83% of the cases, the intended transformation using only 2.8 examples.
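As a toy illustration of learning a rewrite from a single before/after example (the C#-flavoured edit, the one-hole generalization, and the regex machinery are assumptions made for this sketch; the paper's technique instead searches a transformation DSL and ranks candidates):

```python
import re

# One before/after example of a repetitive edit (C#-flavoured, illustrative only).
before_example = "if (name.Equals(null)) return;"
after_example  = "if (name == null) return;"

def learn_rewrite(before, after, varying="name"):
    # Generalize the single varying token into a hole; a real synthesizer
    # searches a DSL of transformations instead of hard-wiring the hole.
    pattern = re.compile(re.escape(before).replace(re.escape(varying), r"(\w+)"))
    replacement = after.replace(varying, r"\1")
    return pattern, replacement

def apply_rewrite(rule, source):
    pattern, replacement = rule
    return pattern.sub(replacement, source)

rule = learn_rewrite(before_example, after_example)
print(apply_rewrite(rule, "if (owner.Equals(null)) return;"))
# -> if (owner == null) return;
```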
{"title":"Automating repetitive code changes using examples","authors":"Reudismam Rolim","doi":"10.1145/2950290.2983944","DOIUrl":"https://doi.org/10.1145/2950290.2983944","url":null,"abstract":"While adding features, fixing bugs, or refactoring the code, developers may perform repetitive code edits. Although Integrated Development Environments (IDEs) automate some transformations such as renaming, many repetitive edits are performed manually, which is error-prone and time-consuming. To help developers to apply these edits, we propose a technique to perform repetitive edits using examples. The technique receives as input the source code before and after the developer edits some target locations of the change and produces as output the top-ranked program transformation that can be applied to edit the remaining target locations in the codebase. The technique uses a state-of-the-art program synthesis methodology and has three main components: a) a DSL for describing program transformations; b) synthesis algorithms to learn program transformations in this DSL; c) ranking algorithms to select the program transformation with the higher probability of performing the desired repetitive edit. In our preliminary evaluation, in a dataset of 59 repetitive edit cases taken from real C# source code repositories, the technique performed, in 83% of the cases, the intended transformation using only 2.8 examples.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82942519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Automatic trigger generation for end user written rules for home automation
Chandrakana Nandi
To customize the behavior of a smart home, an end user writes rules. When an external event satisfies a rule's trigger, the rule's action executes; for example, when the temperature is above a certain threshold, then window awnings might be extended. End users often write incorrect rules. This paper's technique prevents a certain category of errors in the rules: errors due to too few triggers. It statically analyzes a rule's actions to automatically determine a set of necessary and sufficient triggers. We implemented the technique in a tool called TrigGen and tested it on 96 end-user written rules for openHAB, an open-source home automation platform. It identified that 80% of the rules had fewer triggers than required for correct behavior. The missing triggers could lead to unexpected behavior and security vulnerabilities in a smart home.
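A miniature of the idea behind the analysis (the rule model, item names, and trigger syntax below are invented for illustration and are not openHAB's actual rule syntax): the items a rule's action reads determine the triggers the rule needs, and comparing that set with the declared triggers exposes the missing ones.

```python
# Toy rule model: for each rule, the items its action reads and its declared triggers.
ACTION_READS = {"awning_rule": ["OutdoorTemperature", "WindSpeed"]}
DECLARED_TRIGGERS = {"awning_rule": {"OutdoorTemperature changed"}}

def required_triggers(rule):
    # Every item the action reads can change the rule's outcome, so each
    # needs a corresponding "changed" trigger for the rule to stay in sync.
    return {f"{item} changed" for item in ACTION_READS[rule]}

def missing_triggers(rule):
    return required_triggers(rule) - DECLARED_TRIGGERS[rule]

print(missing_triggers("awning_rule"))   # {'WindSpeed changed'}
```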
{"title":"Automatic trigger generation for end user written rules for home automation","authors":"Chandrakana Nandi","doi":"10.1145/2950290.2983965","DOIUrl":"https://doi.org/10.1145/2950290.2983965","url":null,"abstract":"To customize the behavior of a smart home, an end user writes rules. When an external event satisfies a rule's trigger, the rule's action executes; for example, when the temperature is above a certain threshold, then window awnings might be extended. End users often write incorrect rules. This paper's technique prevents a certain category of errors in the rules: errors due to too few triggers. It statically analyzes a rule's actions to automatically determine a set of necessary and sufficient triggers. We implemented the technique in a tool called TrigGen and tested it on 96 end-user written rules for openHAB, an open-source home automation platform. It identified that 80% of the rules had fewer triggers than required for correct behavior. The missing triggers could lead to unexpected behavior and security vulnerabilities in a smart home.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"71 4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90718884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Building a socio-technical theory of coordination: why and how (outstanding research award)
J. Herbsleb
Research aimed at understanding and addressing coordination breakdowns experienced in global software development (GSD) projects at Lucent Technologies took a path from open-ended qualitative exploratory studies to quantitative studies with a tight focus on a key problem – delay – and its causes. Rather than being directly associated with delay, multi-site work items involved more people than comparable same-site work items, and the number of people was a powerful predictor of delay. To counteract this, we developed and deployed tools and practices to support more effective communication and expertise location. After conducting two case studies of open source development, an extreme form of GSD, we realized that many tools and practices could be effective for multi-site work, but none seemed to work under all conditions. To achieve deeper insight, we developed and tested our Socio-Technical Theory of Coordination (STTC) in which the dependencies among engineering decisions are seen as defining a constraint satisfaction problem that the organization can solve in a variety of ways. I conclude by explaining how we applied these ideas to transparent development environments, then sketch important open research questions.
{"title":"Building a socio-technical theory of coordination: why and how (outstanding research award)","authors":"J. Herbsleb","doi":"10.1145/2950290.2994160","DOIUrl":"https://doi.org/10.1145/2950290.2994160","url":null,"abstract":"Research aimed at understanding and addressing coordination breakdowns experienced in global software development (GSD) projects at Lucent Technologies took a path from open-ended qualitative exploratory studies to quantitative studies with a tight focus on a key problem – delay – and its causes. Rather than being directly associated with delay, multi-site work items involved more people than comparable same-site work items, and the number of people was a powerful predictor of delay. To counteract this, we developed and deployed tools and practices to support more effective communication and expertise location. After conducting two case studies of open source development, an extreme form of GSD, we realized that many tools and practices could be effective for multi-site work, but none seemed to work under all conditions. To achieve deeper insight, we developed and tested our Socio-Technical Theory of Coordination (STTC) in which the dependencies among engineering decisions are seen as defining a constraint satisfaction problem that the organization can solve in a variety of ways. I conclude by explaining how we applied these ideas to transparent development environments, then sketch important open research questions.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"311 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77390588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 16
Automated test input generation for Android: are we really there yet in an industrial case?
Xia Zeng, Dengfeng Li, Wujie Zheng, Fan Xia, Yuetang Deng, Wing Lam, Wei Yang, Tao Xie
Given the ever increasing number of research tools to automatically generate inputs to test Android applications (or simply apps), researchers recently asked the question "Are we there yet?" (in terms of the practicality of the tools). By conducting an empirical study of the various tools, the researchers found that Monkey (the most widely used tool of this category in industrial practices) outperformed all of the research tools that they studied. In this paper, we present two significant extensions of that study. First, we conduct the first industrial case study of applying Monkey against WeChat, a popular messenger app with over 762 million monthly active users, and report the empirical findings on Monkey's limitations in an industrial setting. Second, we develop a new approach to address major limitations of Monkey and accomplish substantial code-coverage improvements over Monkey, along with empirical insights for future enhancements to both Monkey and our approach.
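For reference, Monkey ships with the Android SDK and is driven through adb; a minimal harness looks roughly like the sketch below (the package name, event budget, and seed are placeholders, not the configuration used in the study).

```python
import subprocess

def run_monkey(package="com.example.app", events=5000, seed=42, throttle_ms=200):
    # Send a reproducible stream of pseudo-random UI events to one app.
    cmd = [
        "adb", "shell", "monkey",
        "-p", package,                    # restrict events to this package
        "-s", str(seed),                  # fixed seed for reproducibility
        "--throttle", str(throttle_ms),   # pause between events, in ms
        "-v",                             # verbose logging
        str(events),                      # total number of events to inject
    ]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

if __name__ == "__main__":
    print(run_monkey())
```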
{"title":"Automated test input generation for Android: are we really there yet in an industrial case?","authors":"Xia Zeng, Dengfeng Li, Wujie Zheng, Fan Xia, Yuetang Deng, Wing Lam, Wei Yang, Tao Xie","doi":"10.1145/2950290.2983958","DOIUrl":"https://doi.org/10.1145/2950290.2983958","url":null,"abstract":"Given the ever increasing number of research tools to automatically generate inputs to test Android applications (or simply apps), researchers recently asked the question \"Are we there yet?\" (in terms of the practicality of the tools). By conducting an empirical study of the various tools, the researchers found that Monkey (the most widely used tool of this category in industrial practices) outperformed all of the research tools that they studied. In this paper, we present two significant extensions of that study. First, we conduct the first industrial case study of applying Monkey against WeChat, a popular messenger app with over 762 million monthly active users, and report the empirical findings on Monkey's limitations in an industrial setting. Second, we develop a new approach to address major limitations of Monkey and accomplish substantial code-coverage improvements over Monkey, along with empirical insights for future enhancements to both Monkey and our approach.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"168 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77937731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 83
A discrete-time feedback controller for containerized cloud applications
L. Baresi, Sam Guinea, A. Leva, G. Quattrocchi
Modern Web applications exploit Cloud infrastructures to scale their resources and cope with sudden changes in the workload. While the state of practice is to focus on dynamically adding and removing virtual machines, we advocate that there are strong benefits in containerizing the applications and in scaling the containers. In this paper we present an autoscaling technique that allows containerized applications to scale their resources both at the VM level and at the container level. Furthermore, applications can combine this infrastructural adaptation with platform-level adaptation. The autoscaling is made possible by our planner, which consists of a grey-box discrete-time feedback controller. The work has been validated using two application benchmarks deployed to Amazon EC2. Our experiments show that our planner outperforms Amazon's AutoScaling by 78% on average without containers; and that the introduction of containers allows us to improve by yet another 46% on average.
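The shape of such a control loop, reduced to a generic discrete-time proportional-integral controller over the number of container replicas, might look like the sketch below (the gains, setpoint, and latency samples are illustrative assumptions, not the paper's grey-box controller).

```python
class ReplicaController:
    """Generic discrete-time PI controller: one step per monitoring period."""
    def __init__(self, setpoint_ms=200.0, kp=0.02, ki=0.005):
        self.setpoint = setpoint_ms
        self.kp, self.ki = kp, ki
        self.integral = 0.0

    def step(self, measured_latency_ms, current_replicas):
        error = measured_latency_ms - self.setpoint      # positive error: too slow
        self.integral += error
        delta = self.kp * error + self.ki * self.integral
        return max(1, round(current_replicas + delta))   # never scale below 1 replica

ctrl = ReplicaController()
replicas = 2
for latency in [250, 400, 380, 260, 210, 190]:           # one latency sample per period
    replicas = ctrl.step(latency, replicas)
print(replicas)
```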
{"title":"A discrete-time feedback controller for containerized cloud applications","authors":"L. Baresi, Sam Guinea, A. Leva, G. Quattrocchi","doi":"10.1145/2950290.2950328","DOIUrl":"https://doi.org/10.1145/2950290.2950328","url":null,"abstract":"Modern Web applications exploit Cloud infrastructures to scale their resources and cope with sudden changes in the workload. While the state of practice is to focus on dynamically adding and removing virtual machines, we advocate that there are strong benefits in containerizing the applications and in scaling the containers. In this paper we present an autoscaling technique that allows containerized applications to scale their resources both at the VM level and at the container level. Furthermore, applications can combine this infrastructural adaptation with platform-level adaptation. The autoscaling is made possible by our planner, which consists of a grey-box discrete-time feedback controller. The work has been validated using two application benchmarks deployed to Amazon EC2. Our experiments show that our planner outperforms Amazon's AutoScaling by 78% on average without containers; and that the introduction of containers allows us to improve by yet another 46% on average.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88507996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 74
SMT-based verification of parameterized systems
A. Gurfinkel, Sharon Shoham, Yuri Meshman
It is well known that verification of safety properties of sequential programs is reducible to satisfiability modulo theory of a first-order logic formula, called a verification condition (VC). The reduction is used both in deductive and automated verification, the difference is only in whether the user or the solver provides candidates for inductive invariants. In this paper, we extend the reduction to parameterized systems consisting of arbitrary many copies of a user-specified process, and whose transition relation is definable in first-order logic modulo theory of linear arithmetic and arrays. We show that deciding whether a parameterized system has a universally quantified inductive invariant is reducible to satisfiability of (non-linear) Constraint Horn Clauses (CHC). As a consequence of our reduction, we obtain a new automated procedure for verifying parameterized systems using existing PDR and CHC engines. While the new procedure is applicable to a wide variety of systems, we show that it is a decision procedure for several decidable fragments.
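The flavour of the CHC encoding can be seen on a toy sequential loop (x := 0; while * do x := x + 1; assert x >= 0), using Z3's HORN solver as the back end. This is a plain, non-parameterized sketch assuming the z3-solver Python package, not the paper's reduction.

```python
from z3 import (Ints, Function, IntSort, BoolSort, ForAll, Implies, And,
                BoolVal, SolverFor, sat)

Inv = Function("Inv", IntSort(), BoolSort())   # unknown inductive invariant
x, xp = Ints("x xp")

s = SolverFor("HORN")
s.add(ForAll([x], Implies(x == 0, Inv(x))))                          # initiation
s.add(ForAll([x, xp], Implies(And(Inv(x), xp == x + 1), Inv(xp))))   # consecution
s.add(ForAll([x], Implies(And(Inv(x), x < 0), BoolVal(False))))      # safety

res = s.check()
print(res)                    # sat: some inductive invariant (e.g. x >= 0) exists
if res == sat:
    print(s.model()[Inv])     # the solver's witness invariant
```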
{"title":"SMT-based verification of parameterized systems","authors":"A. Gurfinkel, Sharon Shoham, Yuri Meshman","doi":"10.1145/2950290.2950330","DOIUrl":"https://doi.org/10.1145/2950290.2950330","url":null,"abstract":"It is well known that verification of safety properties of sequential programs is reducible to satisfiability modulo theory of a first-order logic formula, called a verification condition (VC). The reduction is used both in deductive and automated verification, the difference is only in whether the user or the solver provides candidates for inductive invariants. In this paper, we extend the reduction to parameterized systems consisting of arbitrary many copies of a user-specified process, and whose transition relation is definable in first-order logic modulo theory of linear arithmetic and arrays. We show that deciding whether a parameterized system has a universally quantified inductive invariant is reducible to satisfiability of (non-linear) Constraint Horn Clauses (CHC). As a consequence of our reduction, we obtain a new automated procedure for verifying parameterized systems using existing PDR and CHC engines. While the new procedure is applicable to a wide variety of systems, we show that it is a decision procedure for several decidable fragments.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88608340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 32
Effectiveness of code contribution: from patch-based to pull-request-based tools
Jiaxin Zhu, Minghui Zhou, A. Mockus
Code contributions in Free/Libre and Open Source Software projects are controlled to maintain high-quality of software. Alternatives to patch-based code contribution tools such as mailing lists and issue trackers have been developed with the pull request systems being the most visible and widely available on GitHub. Is the code contribution process more effective with pull request systems? To answer that, we quantify the effectiveness via the rates contributions are accepted and ignored, via the time until the first response and final resolution and via the numbers of contributions. To control for the latent variables, our study includes a project that migrated from an issue tracker to the GitHub pull request system and a comparison between projects using mailing lists and pull request systems. Our results show pull request systems to be associated with reduced review times and larger numbers of contributions. However, not all the comparisons indicate substantially better accept or ignore rates in pull request systems. These variations may be most simply explained by the differences in contribution practices the projects employ and may be less affected by the type of tool. Our results clarify the importance of understanding the role of tools in effective management of the broad network of potential contributors and may lead to strategies and practices making the code contribution more satisfying and efficient from both contributors' and maintainers' perspectives.
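The effectiveness measures named above reduce to simple aggregates over per-contribution records; a sketch with an invented record format (not the study's actual data schema) is:

```python
from datetime import timedelta
from statistics import median

# Hypothetical per-contribution records: acceptance plus response/resolution delays.
contributions = [
    {"accepted": True,  "first_response": timedelta(hours=5), "resolution": timedelta(days=2)},
    {"accepted": False, "first_response": None,               "resolution": None},   # ignored
    {"accepted": True,  "first_response": timedelta(hours=1), "resolution": timedelta(days=1)},
]

accept_rate = sum(c["accepted"] for c in contributions) / len(contributions)
ignore_rate = sum(c["first_response"] is None for c in contributions) / len(contributions)
median_first_response_h = median(c["first_response"].total_seconds() / 3600
                                 for c in contributions if c["first_response"])

print(f"accepted {accept_rate:.0%}, ignored {ignore_rate:.0%}, "
      f"median first response {median_first_response_h:.1f} h")
```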
{"title":"Effectiveness of code contribution: from patch-based to pull-request-based tools","authors":"Jiaxin Zhu, Minghui Zhou, A. Mockus","doi":"10.1145/2950290.2950364","DOIUrl":"https://doi.org/10.1145/2950290.2950364","url":null,"abstract":"Code contributions in Free/Libre and Open Source Software projects are controlled to maintain high-quality of software. Alternatives to patch-based code contribution tools such as mailing lists and issue trackers have been developed with the pull request systems being the most visible and widely available on GitHub. Is the code contribution process more effective with pull request systems? To answer that, we quantify the effectiveness via the rates contributions are accepted and ignored, via the time until the first response and final resolution and via the numbers of contributions. To control for the latent variables, our study includes a project that migrated from an issue tracker to the GitHub pull request system and a comparison between projects using mailing lists and pull request systems. Our results show pull request systems to be associated with reduced review times and larger numbers of contributions. However, not all the comparisons indicate substantially better accept or ignore rates in pull request systems. These variations may be most simply explained by the differences in contribution practices the projects employ and may be less affected by the type of tool. Our results clarify the importance of understanding the role of tools in effective management of the broad network of potential contributors and may lead to strategies and practices making the code contribution more satisfying and efficient from both contributors' and maintainers' perspectives.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90371378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 46
T2API: synthesizing API code usage templates from English texts with statistical translation
THANH VAN NGUYEN, Peter C. Rigby, A. Nguyen, Mark Karanfil, T. Nguyen
In this work, we develop T2API, a statistical machine translation-based tool that takes a given English description of a programming task as a query, and synthesizes the API usage template for the task by learning from training data. T2API works in two steps. First, it derives the API elements relevant to the task described in the input by statistically learning from a StackOverflow corpus of text descriptions and corresponding code. To infer those API elements, it also considers the context of the words in the textual input and the context of API elements that often go together in the corpus. The inferred API elements with their relevance scores are ensembled into an API usage by our novel API usage synthesis algorithm that learns the API usages from a large code corpus via a graph-based language model. Importantly, T2API is capable of generating new API usages from smaller, previously-seen usages.
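The first step can be pictured as keyword-to-API retrieval with relevance scores; the sketch below invents a tiny relevance table and a greedy ranking (the paper instead learns these statistically from a StackOverflow corpus and assembles full usages with a graph-based language model).

```python
# Invented relevance table: word -> {API element: weight}.
RELEVANCE = {
    "read": {"FileReader": 0.9, "BufferedReader": 0.8, "Scanner": 0.4},
    "file": {"File": 0.9, "FileReader": 0.7},
    "line": {"BufferedReader.readLine": 0.9},
}

def rank_api_elements(query, top_k=4):
    # Accumulate weights for every API element suggested by any query word.
    scores = {}
    for word in query.lower().split():
        for api, weight in RELEVANCE.get(word, {}).items():
            scores[api] = scores.get(api, 0.0) + weight
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(rank_api_elements("read a file line by line"))
# ['BufferedReader.readLine', 'FileReader', 'File', 'BufferedReader']
```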
{"title":"T2API: synthesizing API code usage templates from English texts with statistical translation","authors":"THANH VAN NGUYEN, Peter C. Rigby, A. Nguyen, Mark Karanfil, T. Nguyen","doi":"10.1145/2950290.2983931","DOIUrl":"https://doi.org/10.1145/2950290.2983931","url":null,"abstract":"In this work, we develop T2API, a statistical machine translation-based tool that takes a given English description of a programming task as a query, and synthesizes the API usage template for the task by learning from training data. T2API works in two steps. First, it derives the API elements relevant to the task described in the input by statistically learning from a StackOverflow corpus of text descriptions and corresponding code. To infer those API elements, it also considers the context of the words in the textual input and the context of API elements that often go together in the corpus. The inferred API elements with their relevance scores are ensembled into an API usage by our novel API usage synthesis algorithm that learns the API usages from a large code corpus via a graph-based language model. Importantly, T2API is capable of generating new API usages from smaller, previously-seen usages.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89211386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 51
Detecting sensitive data disclosure via bi-directional text correlation analysis
Jianjun Huang, X. Zhang, Lin Tan
Traditional sensitive data disclosure analysis faces two challenges: to identify sensitive data that is not generated by specific API calls, and to report the potential disclosures when the disclosed data is recognized as sensitive only after the sink operations. We address these issues by developing BidText, a novel static technique to detect sensitive data disclosures. BidText formulates the problem as a type system, in which variables are typed with the text labels that they encounter (e.g., during key-value pair operations). The type system features a novel bi-directional propagation technique that propagates the variable label sets through forward and backward data-flow. A data disclosure is reported if a parameter at a sink point is typed with a sensitive text label. BidText is evaluated on 10,000 Android apps. It reports 4,406 apps that have sensitive data disclosures, with 4,263 apps having log based disclosures and 1,688 having disclosures due to other sinks such as HTTP requests. Existing techniques can only report 64.0% of what BidText reports. And manual inspection shows that the false positive rate for BidText is 10%.
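A miniature of the bi-directional propagation idea (the data-flow edges, variable names, and label set are invented for illustration): a text label attached at one program point flows both along and against def-use edges until a fixed point, and a disclosure is reported when a sink parameter ends up carrying a sensitive label.

```python
from collections import defaultdict

# Def-use edges of a toy program and the label observed at a key-value operation
# (e.g. params.put("password", tmp)); "http_arg" stands for the sink parameter.
flows = [("password_input", "tmp"), ("tmp", "payload"), ("payload", "http_arg")]
labels = defaultdict(set, {"tmp": {"password"}})
SENSITIVE = {"password", "ssn", "imei"}

def propagate(flows, labels):
    changed = True
    while changed:                                  # iterate to a fixed point
        changed = False
        for src, dst in flows:
            for a, b in ((src, dst), (dst, src)):   # forward and backward edges
                new = labels[a] - labels[b]
                if new:
                    labels[b] |= new
                    changed = True
    return labels

labels = propagate(flows, labels)
if labels["http_arg"] & SENSITIVE:
    print("potential disclosure: a 'password'-labeled value reaches the HTTP sink")
```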
{"title":"Detecting sensitive data disclosure via bi-directional text correlation analysis","authors":"Jianjun Huang, X. Zhang, Lin Tan","doi":"10.1145/2950290.2950348","DOIUrl":"https://doi.org/10.1145/2950290.2950348","url":null,"abstract":"Traditional sensitive data disclosure analysis faces two challenges: to identify sensitive data that is not generated by specific API calls, and to report the potential disclosures when the disclosed data is recognized as sensitive only after the sink operations. We address these issues by developing BidText, a novel static technique to detect sensitive data disclosures. BidText formulates the problem as a type system, in which variables are typed with the text labels that they encounter (e.g., during key-value pair operations). The type system features a novel bi-directional propagation technique that propagates the variable label sets through forward and backward data-flow. A data disclosure is reported if a parameter at a sink point is typed with a sensitive text label. BidText is evaluated on 10,000 Android apps. It reports 4,406 apps that have sensitive data disclosures, with 4,263 apps having log based disclosures and 1,688 having disclosures due to other sinks such as HTTP requests. Existing techniques can only report 64.0% of what BidText reports. And manual inspection shows that the false positive rate for BidText is 10%.","PeriodicalId":20532,"journal":{"name":"Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89309288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 27