Best practices for developers, as encoded in recent programming language designs, recommend the use of immutability whenever practical. However, there is a lack of empirical evidence about the uptake of this advice. Our goal is to understand the usage of immutability by C++ developers in practice. This work investigates how C++ developers use immutability by analyzing their use of the C++ immutability qualifier, const, and by analyzing the code itself. We answer the following broad questions about const usage: 1) do developers actually write non-trivial (more than 3 methods) immutable classes and immutable methods? 2) do developers label their immutable classes and methods? We analyzed 7 medium-to-large open source projects and collected two sources of empirical data: 1) const annotations by developers, indicating an intent to write immutable code; and 2) the results of a simple static analysis which identified easily const-able methods---those that clearly did not mutate state. We estimate that 5% of non-trivial classes (median) are immutable. We found the vast majority of classes do carry immutability labels on methods: surprisingly, developers const-annotate 46% of methods, and we estimate that at least 51% of methods could be const-annotated. Furthermore, developers missed immutability labels on at least 6% of unannotated methods. We provide an in-depth discussion on how developers use const and the results of our analyses.
{"title":"How C++ Developers Use Immutability Declarations: An Empirical Study","authors":"Jon Eyolfson, Patrick Lam","doi":"10.1109/ICSE.2019.00050","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00050","url":null,"abstract":"Best practices for developers, as encoded in recent programming language designs, recommend the use of immutability whenever practical. However, there is a lack of empirical evidence about the uptake of this advice. Our goal is to understand the usage of immutability by C++ developers in practice. This work investigates how C++ developers use immutability by analyzing their use of the C++ immutability qualifier, const, and by analyzing the code itself. We answer the following broad questions about const usage: 1) do developers actually write non-trivial (more than 3 methods) immutable classes and immutable methods? 2) do developers label their immutable classes and methods? We analyzed 7 medium-to-large open source projects and collected two sources of empirical data: 1) const annotations by developers, indicating an intent to write immutable code; and 2) the results of a simple static analysis which identified easily const-able methods---those that clearly did not mutate state. We estimate that 5% of non-trivial classes (median) are immutable. We found the vast majority of classes do carry immutability labels on methods: surprisingly, developers const-annotate 46% of methods, and we estimate that at least 51% of methods could be const-annotated. Furthermore, developers missed immutability labels on at least 6% of unannotated methods. We provide an in-depth discussion on how developers use const and the results of our analyses.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"34 1","pages":"362-372"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77090907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Delplanque, Stéphane Ducasse, G. Polito, A. Black, Anne Etien
Unit tests are a tenant of agile programming methodologies, and are widely used to improve code quality and prevent code regression. A green (passing) test is usually taken as a robust sign that the code under test is valid. However, some green tests contain assertions that are never executed. We call such tests Rotten Green Tests. Rotten Green Tests represent a case worse than a broken test: they report that the code under test is valid, but in fact do not test that validity. We describe an approach to identify rotten green tests by combining simple static and dynamic call-site analyses. Our approach takes into account test helper methods, inherited helpers, and trait compositions, and has been implemented in a tool called DrTest. DrTest reports no false negatives, yet it still reports some false positives due to conditional use or multiple test contexts. Using DrTest we conducted an empirical evaluation of 19,905 real test cases in mature projects of the Pharo ecosystem. The results of the evaluation show that the tool is effective; it detected 294 tests as rotten—green tests that contain assertions that are not executed. Some rotten tests have been “sleeping” in Pharo for at least 5 years.
{"title":"Rotten Green Tests","authors":"J. Delplanque, Stéphane Ducasse, G. Polito, A. Black, Anne Etien","doi":"10.1109/ICSE.2019.00062","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00062","url":null,"abstract":"Unit tests are a tenant of agile programming methodologies, and are widely used to improve code quality and prevent code regression. A green (passing) test is usually taken as a robust sign that the code under test is valid. However, some green tests contain assertions that are never executed. We call such tests Rotten Green Tests. Rotten Green Tests represent a case worse than a broken test: they report that the code under test is valid, but in fact do not test that validity. We describe an approach to identify rotten green tests by combining simple static and dynamic call-site analyses. Our approach takes into account test helper methods, inherited helpers, and trait compositions, and has been implemented in a tool called DrTest. DrTest reports no false negatives, yet it still reports some false positives due to conditional use or multiple test contexts. Using DrTest we conducted an empirical evaluation of 19,905 real test cases in mature projects of the Pharo ecosystem. The results of the evaluation show that the tool is effective; it detected 294 tests as rotten—green tests that contain assertions that are not executed. Some rotten tests have been “sleeping” in Pharo for at least 5 years.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"52 1","pages":"500-511"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82686956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adithya Abraham Philip, Ranjita Bhagwan, Rahul Kumar, C. Maddila, Nachiappan Nagappan
Today, we depend on numerous large-scale services for basic operations such as email. These services, built on the basis of Continuous Integration/Continuous Deployment (CI/CD) processes, are extremely dynamic: developers continuously commit code and introduce new features, functionality and fixes. Hundreds of commits may enter the code-base in a single day. Therefore one of the most time-critical, yet resource-intensive tasks towards ensuring code-quality is effectively testing such large code-bases. This paper presents FastLane, a system that performs data-driven test minimization. FastLane uses light-weight machine-learning models built upon a rich history of test and commit logs to predict test outcomes. Tests for which we predict outcomes need not be explicitly run, thereby saving us precious test-time and resources. Our evaluation on a large-scale email and collaboration platform service shows that our techniques can save 18.04%, i.e., almost a fifth of test-time while obtaining a test outcome accuracy of 99.99%.
{"title":"FastLane: Test Minimization for Rapidly Deployed Large-Scale Online Services","authors":"Adithya Abraham Philip, Ranjita Bhagwan, Rahul Kumar, C. Maddila, Nachiappan Nagappan","doi":"10.1109/ICSE.2019.00054","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00054","url":null,"abstract":"Today, we depend on numerous large-scale services for basic operations such as email. These services, built on the basis of Continuous Integration/Continuous Deployment (CI/CD) processes, are extremely dynamic: developers continuously commit code and introduce new features, functionality and fixes. Hundreds of commits may enter the code-base in a single day. Therefore one of the most time-critical, yet resource-intensive tasks towards ensuring code-quality is effectively testing such large code-bases. This paper presents FastLane, a system that performs data-driven test minimization. FastLane uses light-weight machine-learning models built upon a rich history of test and commit logs to predict test outcomes. Tests for which we predict outcomes need not be explicitly run, thereby saving us precious test-time and resources. Our evaluation on a large-scale email and collaboration platform service shows that our techniques can save 18.04%, i.e., almost a fifth of test-time while obtaining a test outcome accuracy of 99.99%.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"57 10","pages":"408-418"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91483322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Kavaler, Asher Trockman, Bogdan Vasilescu, V. Filkov
Quality assurance automation is essential in modern software development. In practice, this automation is supported by a multitude of tools that fit different needs and require developers to make decisions about which tool to choose in a given context. Data and analytics of the pros and cons can inform these decisions. Yet, in most cases, there is a dearth of empirical evidence on the effectiveness of existing practices and tool choices. We propose a general methodology to model the time- dependent effect of automation tool choice on four outcomes of interest: prevalence of issues, code churn, number of pull requests, and number of contributors, all with a multitude of controls. On a large data set of npm JavaScript projects, we extract the adoption events for popular tools in three task classes: linters, dependency managers, and coverage reporters. Using mixed methods approaches, we study the reasons for the adoptions and compare the adoption effects within each class, and sequential tool adoptions across classes. We find that some tools within each group are associated with more beneficial outcomes than others, providing an empirical perspective for the benefits of each. We also find that the order in which some tools are implemented is associated with varying outcomes.
{"title":"Tool Choice Matters: JavaScript Quality Assurance Tools and Usage Outcomes in GitHub Projects","authors":"David Kavaler, Asher Trockman, Bogdan Vasilescu, V. Filkov","doi":"10.1109/ICSE.2019.00060","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00060","url":null,"abstract":"Quality assurance automation is essential in modern software development. In practice, this automation is supported by a multitude of tools that fit different needs and require developers to make decisions about which tool to choose in a given context. Data and analytics of the pros and cons can inform these decisions. Yet, in most cases, there is a dearth of empirical evidence on the effectiveness of existing practices and tool choices. We propose a general methodology to model the time- dependent effect of automation tool choice on four outcomes of interest: prevalence of issues, code churn, number of pull requests, and number of contributors, all with a multitude of controls. On a large data set of npm JavaScript projects, we extract the adoption events for popular tools in three task classes: linters, dependency managers, and coverage reporters. Using mixed methods approaches, we study the reasons for the adoptions and compare the adoption effects within each class, and sequential tool adoptions across classes. We find that some tools within each group are associated with more beneficial outcomes than others, providing an empirical perspective for the benefits of each. We also find that the order in which some tools are implemented is associated with varying outcomes.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"3 1","pages":"476-487"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87091966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recent works have concluded that software code is more repetitive and predictable, i.e. more natural, than English texts. On re-examination, we find that much of the apparent "naturalness" of source code is due to the presence of language specific syntax, especially separators, such as semi-colons and brackets. For example, separators account for 44% of all tokens in our Java corpus. When we follow the NLP practices of eliminating punctuation (e.g., separators) and stopwords (e.g., keywords), we find that code is still repetitive and predictable, but to a lesser degree than previously thought. We suggest that SyntaxTokens be filtered to reduce noise in code recommenders. Unlike the code written for a particular project, API code usage is similar across projects: a file is opened and closed in the same manner regardless of domain. When we restrict our n-grams to those contained in the Java API, we find that API usages are highly repetitive. Since API calls are common across programs, researchers have made reliable statistical models to recommend sophisticated API call sequences. Sequential n-gram models were developed for natural languages. Code is usually represented by an AST which contains control and data flow, making n-grams models a poor representation of code. Comparing n-grams to statistical graph representations of the same codebase, we find that graphs are more repetitive and contain higherlevel patterns than n-grams. We suggest that future work focus on statistical code graphs models that accurately capture complex coding patterns. Our replication package makes our scripts and data available to future researchers[1].
{"title":"Natural Software Revisited","authors":"Musfiqur Rahman, Dharani Palani, Peter C. Rigby","doi":"10.1109/ICSE.2019.00022","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00022","url":null,"abstract":"Recent works have concluded that software code is more repetitive and predictable, i.e. more natural, than English texts. On re-examination, we find that much of the apparent \"naturalness\" of source code is due to the presence of language specific syntax, especially separators, such as semi-colons and brackets. For example, separators account for 44% of all tokens in our Java corpus. When we follow the NLP practices of eliminating punctuation (e.g., separators) and stopwords (e.g., keywords), we find that code is still repetitive and predictable, but to a lesser degree than previously thought. We suggest that SyntaxTokens be filtered to reduce noise in code recommenders. Unlike the code written for a particular project, API code usage is similar across projects: a file is opened and closed in the same manner regardless of domain. When we restrict our n-grams to those contained in the Java API, we find that API usages are highly repetitive. Since API calls are common across programs, researchers have made reliable statistical models to recommend sophisticated API call sequences. Sequential n-gram models were developed for natural languages. Code is usually represented by an AST which contains control and data flow, making n-grams models a poor representation of code. Comparing n-grams to statistical graph representations of the same codebase, we find that graphs are more repetitive and contain higherlevel patterns than n-grams. We suggest that future work focus on statistical code graphs models that accurately capture complex coding patterns. Our replication package makes our scripts and data available to future researchers[1].","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"22 1","pages":"37-48"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88141835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Open source software communities have demonstrated that they can produce high quality results. The overall success of peer code review, commonly used in open source projects, has likely contributed strongly to this success. Code review is an emotionally loaded practice, with public exposure of reputation and ample opportunities for conflict. We set off to ask why code review works for open source communities, despite this inherent challenge. We interviewed 21 open source contributors from four communities and participated in meetings of ROS community devoted to implementation of the code review process. It appears that the hacker ethic is a key reason behind the success of code review in FOSS communities. It is built around the ethic of passion and the ethic of caring. Furthermore, we observed that tasks of code review are performed with strong intrinsic motivation, supported by many non-material extrinsic motivation mechanisms, such as desire to learn, to grow reputation, or to improve one's positioning on the job market. In the paper, we describe the study design, analyze the collected data and formulate 20 proposals for how what we know about hacker ethics and human and social aspects of code review, could be exploited to improve the effectiveness of the practice in software projects.
{"title":"Why Does Code Review Work for Open Source Software Communities?","authors":"A. Alami, M. Cohn, A. Wąsowski","doi":"10.1109/ICSE.2019.00111","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00111","url":null,"abstract":"Open source software communities have demonstrated that they can produce high quality results. The overall success of peer code review, commonly used in open source projects, has likely contributed strongly to this success. Code review is an emotionally loaded practice, with public exposure of reputation and ample opportunities for conflict. We set off to ask why code review works for open source communities, despite this inherent challenge. We interviewed 21 open source contributors from four communities and participated in meetings of ROS community devoted to implementation of the code review process. It appears that the hacker ethic is a key reason behind the success of code review in FOSS communities. It is built around the ethic of passion and the ethic of caring. Furthermore, we observed that tasks of code review are performed with strong intrinsic motivation, supported by many non-material extrinsic motivation mechanisms, such as desire to learn, to grow reputation, or to improve one's positioning on the job market. In the paper, we describe the study design, analyze the collected data and formulate 20 proposals for how what we know about hacker ethics and human and social aspects of code review, could be exploited to improve the effectiveness of the practice in software projects.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"61 1","pages":"1073-1083"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89448706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Chowdhury, Abram Hindle, R. Kazman, Takumi Shuto, Ken Matsui, Yasutaka Kamei
Energy consumption is a concern in the data-center and at the edge, on mobile devices such as smartphones. Software that consumes too much energy threatens the utility of the end-user's mobile device. Energy consumption is fundamentally a systemic kind of performance and hence it should be addressed at design time via a software architecture that supports it, rather than after release, via some form of refactoring. Unfortunately developers often lack knowledge of what kinds of designs and architectures can help address software energy consumption. In this paper we show that some simple design choices can have significant effects on energy consumption. In particular we examine the Model-View-Controller architectural pattern and demonstrate how converting to Model-View-Presenter with bundling can improve the energy performance of both benchmark systems and real world applications. We show the relationship between energy consumption and bundled and delayed view updates: bundling events in the presenter can often reduce energy consumption by 30%.
{"title":"GreenBundle: An Empirical Study on the Energy Impact of Bundled Processing","authors":"S. Chowdhury, Abram Hindle, R. Kazman, Takumi Shuto, Ken Matsui, Yasutaka Kamei","doi":"10.1109/ICSE.2019.00114","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00114","url":null,"abstract":"Energy consumption is a concern in the data-center and at the edge, on mobile devices such as smartphones. Software that consumes too much energy threatens the utility of the end-user's mobile device. Energy consumption is fundamentally a systemic kind of performance and hence it should be addressed at design time via a software architecture that supports it, rather than after release, via some form of refactoring. Unfortunately developers often lack knowledge of what kinds of designs and architectures can help address software energy consumption. In this paper we show that some simple design choices can have significant effects on energy consumption. In particular we examine the Model-View-Controller architectural pattern and demonstrate how converting to Model-View-Presenter with bundling can improve the energy performance of both benchmark systems and real world applications. We show the relationship between energy consumption and bundled and delayed view updates: bundling events in the presenter can often reduce energy consumption by 30%.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"1 1","pages":"1107-1118"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89931391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Control-CPS software fault localization (SFL, aka bug localization) is of critical importance as bugs may cause major failures, even injuries/deaths. To locate the bugs in control-CPSs, SFL tools often demand many labeled ("correct"/"incorrect") source code execution traces as inputs. To label the correctness of these traces, we must judge the corresponding control-CPS physical trajectories' correctness. However, unlike discrete outputs, the boundaries between correct and incorrect physical trajectories are often vague. The mechanism (aka oracle) to judge the physical trajectories' correctness thus becomes a major challenge. So far, the ad hoc practice of ``human oracles'' is still widely used, whose qualities heavily depend on the human experts' expertise and availability. This paper proposes an oracle based on the well adopted autoregressive system identification (AR-SI). With proven success for controlling black-box physical systems, AR-SI is adapted by us to identify the buggy control-CPS as a black-box. We use this identification result as an oracle to judge the control-CPS's behaviors, and propose a methodology to prepare traces for control-CPS debugging. Comprehensive evaluations on classic control-CPSs with injected real-life and artificial bugs show that our proposed approach significantly outperforms the human oracle approach in SFL accuracy (recall) and latency, and in oracle false positive/negative rates. Our approach also helps discover a new real-life bug in a consumer-grade control-CPS.
{"title":"A System Identification Based Oracle for Control-CPS Software Fault Localization","authors":"Zhijian He, Yao Chen, Enyan Huang, Qixin Wang, Yu Pei, Haidong Yuan","doi":"10.1109/ICSE.2019.00029","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00029","url":null,"abstract":"Control-CPS software fault localization (SFL, aka bug localization) is of critical importance as bugs may cause major failures, even injuries/deaths. To locate the bugs in control-CPSs, SFL tools often demand many labeled (\"correct\"/\"incorrect\") source code execution traces as inputs. To label the correctness of these traces, we must judge the corresponding control-CPS physical trajectories' correctness. However, unlike discrete outputs, the boundaries between correct and incorrect physical trajectories are often vague. The mechanism (aka oracle) to judge the physical trajectories' correctness thus becomes a major challenge. So far, the ad hoc practice of ``human oracles'' is still widely used, whose qualities heavily depend on the human experts' expertise and availability. This paper proposes an oracle based on the well adopted autoregressive system identification (AR-SI). With proven success for controlling black-box physical systems, AR-SI is adapted by us to identify the buggy control-CPS as a black-box. We use this identification result as an oracle to judge the control-CPS's behaviors, and propose a methodology to prepare traces for control-CPS debugging. Comprehensive evaluations on classic control-CPSs with injected real-life and artificial bugs show that our proposed approach significantly outperforms the human oracle approach in SFL accuracy (recall) and latency, and in oracle false positive/negative rates. Our approach also helps discover a new real-life bug in a consumer-grade control-CPS.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"18 1","pages":"116-127"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74044339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
F. Sarker, Bogdan Vasilescu, Kelly Blincoe, V. Filkov
Software developers work on a variety of tasks ranging from the technical, e.g., writing code, to the social, e.g., participating in issue resolution discussions. The amount of work developers perform per week (their work-rate) also varies and depends on project needs and developer schedules. Prior work has shown that while moderate levels of increased technical work and multitasking lead to higher productivity, beyond a certain threshold, they can lead to lowered performance. Here, we study how increases in the short-term work-rate along both the technical and social dimensions are associated with changes in developers' work patterns, in particular communication sentiment, technical productivity, and social productivity. We surveyed active and prolific developers on GitHub to understand the causes and impacts of increased work-rates. Guided by the responses, we developed regression models to study how communication and committing patterns change with increased work-rates and fit those models to large-scale data gathered from traces left by thousands of GitHub developers. From our survey and models, we find that most developers do experience work-rate-increase-related changes in behavior. Most notably, our models show that there is a sizable effect when developers comment much more than their average: the negative sentiment in their comments increases, suggesting an increased level of stress. Our models also show that committing patterns do not change with increased commenting, and vice versa, suggesting that technical and social activities tend not to be multitasked.
{"title":"Socio-Technical Work-Rate Increase Associates With Changes in Work Patterns in Online Projects","authors":"F. Sarker, Bogdan Vasilescu, Kelly Blincoe, V. Filkov","doi":"10.1109/ICSE.2019.00099","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00099","url":null,"abstract":"Software developers work on a variety of tasks ranging from the technical, e.g., writing code, to the social, e.g., participating in issue resolution discussions. The amount of work developers perform per week (their work-rate) also varies and depends on project needs and developer schedules. Prior work has shown that while moderate levels of increased technical work and multitasking lead to higher productivity, beyond a certain threshold, they can lead to lowered performance. Here, we study how increases in the short-term work-rate along both the technical and social dimensions are associated with changes in developers' work patterns, in particular communication sentiment, technical productivity, and social productivity. We surveyed active and prolific developers on GitHub to understand the causes and impacts of increased work-rates. Guided by the responses, we developed regression models to study how communication and committing patterns change with increased work-rates and fit those models to large-scale data gathered from traces left by thousands of GitHub developers. From our survey and models, we find that most developers do experience work-rate-increase-related changes in behavior. Most notably, our models show that there is a sizable effect when developers comment much more than their average: the negative sentiment in their comments increases, suggesting an increased level of stress. Our models also show that committing patterns do not change with increased commenting, and vice versa, suggesting that technical and social activities tend not to be multitasked.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"78 1","pages":"936-947"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85169742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei You, Xuwei Liu, Shiqing Ma, D. Perry, X. Zhang, Bin Liang
Fuzzing is an important technique to detect software bugs and vulnerabilities. It works by mutating a small set of seed inputs to generate a large number of new inputs. Fuzzers' performance often substantially degrades when valid seed inputs are not available. Although existing techniques such as symbolic execution can generate seed inputs from scratch, they have various limitations hindering their applications in real-world complex software. In this paper, we propose a novel fuzzing technique that features the capability of generating valid seed inputs. It piggy-backs on AFL to identify input validity checks and the input fields that have impact on such checks. It further classifies these checks according to their relations to the input. Such classes include arithmetic relation, object offset, data structure length and so on. A multi-goal search algorithm is developed to apply class-specific mutations in order to satisfy inter-dependent checks all together. We evaluate our technique on 20 popular benchmark programs collected from other fuzzing projects and the Google fuzzer test suite, and compare it with existing fuzzers AFL and AFLFast, symbolic execution engines KLEE and S2E, and a hybrid tool Driller that combines fuzzing with symbolic execution. The results show that our technique is highly effective and efficient, out-performing the other tools.
{"title":"SLF: Fuzzing without Valid Seed Inputs","authors":"Wei You, Xuwei Liu, Shiqing Ma, D. Perry, X. Zhang, Bin Liang","doi":"10.1109/ICSE.2019.00080","DOIUrl":"https://doi.org/10.1109/ICSE.2019.00080","url":null,"abstract":"Fuzzing is an important technique to detect software bugs and vulnerabilities. It works by mutating a small set of seed inputs to generate a large number of new inputs. Fuzzers' performance often substantially degrades when valid seed inputs are not available. Although existing techniques such as symbolic execution can generate seed inputs from scratch, they have various limitations hindering their applications in real-world complex software. In this paper, we propose a novel fuzzing technique that features the capability of generating valid seed inputs. It piggy-backs on AFL to identify input validity checks and the input fields that have impact on such checks. It further classifies these checks according to their relations to the input. Such classes include arithmetic relation, object offset, data structure length and so on. A multi-goal search algorithm is developed to apply class-specific mutations in order to satisfy inter-dependent checks all together. We evaluate our technique on 20 popular benchmark programs collected from other fuzzing projects and the Google fuzzer test suite, and compare it with existing fuzzers AFL and AFLFast, symbolic execution engines KLEE and S2E, and a hybrid tool Driller that combines fuzzing with symbolic execution. The results show that our technique is highly effective and efficient, out-performing the other tools.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"25 1","pages":"712-723"},"PeriodicalIF":0.0,"publicationDate":"2019-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88886420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}