An Exploratory Study on Faults in Web API Integration in a Large-Scale Payment Company
J. Aué, M. Aniche, M. Lobbezoo, A. van Deursen
DOI: 10.1145/3183519.3183537

Service-oriented architectures are more popular than ever, and companies and organizations increasingly depend on services offered through Web APIs. The capabilities and complexity of Web APIs differ from service to service, and so the impact of API errors varies. API problem cases related to Adyen's payment service were found to have a direct, considerable impact on API consumer applications. With more than 60,000 daily API errors, the potential impact is enormous. In an effort to reduce the impact of API-related problems, we analyze 2.43 million API error responses to identify the underlying faults. We quantify the occurrence of faults in terms of frequency and of impacted API consumers. We also challenge our quantitative results by means of a survey of 40 API consumers. Our results show that 1) faults in API integration can be grouped into 11 general causes: invalid user input, missing user input, expired request data, invalid request data, missing request data, insufficient permissions, double processing, configuration, missing server data, internal, and third party; 2) most faults can be attributed to invalid or missing request data, and most API consumers seem to be impacted by faults caused by invalid request data and third-party integration; and 3) insufficient guidance on certain aspects of the integration and on how to recover from errors is an important challenge for developers.
{"title":"An Exploratory Study on Faults inWeb API Integration in a Large-Scale Payment Company","authors":"J. Aué, M. Aniche, M. Lobbezoo, A. Deursen","doi":"10.1145/3183519.3183537","DOIUrl":"https://doi.org/10.1145/3183519.3183537","url":null,"abstract":"Service-oriented architectures are more popular than ever, and increasingly companies and organizations depend on services offered through Web APIs. The capabilities and complexity of Web APIs differ from service to service, and therefore the impact of API errors varies. API problem cases related to Adyen's payment service were found to have direct considerable impact on API consumer applications. With more than 60,000 daily API errors, the potential impact is enormous. In an effort to reduce the impact of API related problems, we analyze 2.43 million API error responses to identify the underlying faults. We quantify the occurrence of faults in terms of the frequency and impacted API consumers. We also challenge our quantitative results by means of a survey with 40 API consumers. Our results show that 1) faults in API integration can be grouped into 11 general causes: invalid user input, missing user input, expired request data, invalid request data, missing request data, insufficient permissions, double processing, configuration, missing server data, internal and third party, 2) most faults can be attributed to the invalid or missing request data, and most API consumers seem to be impacted by faults caused by invalid request data and third party integration; and 3) insufficient guidance on certain aspects of the integration and on how to recover from errors is an important challenge to developers.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123403793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving the Definition of Software Development Projects Through Design Thinking Led Collaboration Workshops
Hilary Cinis
DOI: 10.1145/3183519.3183535

Software development projects need clear, agreed goals, a provable value proposition, and key stakeholder commitment. Employing Design Thinking methods to focus the expertise of workshop participants can uncover and support these needs. The Data61 User Experience and Design team leads Design Thinking collaboration workshops as professional practice during engagements with clients. These workshops are run with cross-functional teams working on data/digital platforms before going to market. This talk will present findings from a total of 54 Design Thinking workshops and five 5-day co-design sprints run from September 2016 to October 2017. The talk will cover how the workshops were designed, using initial trials of a proposed workshop approach and ongoing review of workshops and design sprints. The use of our approach will be illustrated with three short case studies from different application domains: e-science for materials and manufacturing, infrastructure sustainability, and agricultural intelligence. Key learnings from our approach include getting internal stakeholder support for workshops, the characteristics of a good format, and how to balance domain expertise with user-centred design practices.
{"title":"Improving the Definition of Software Development Projects Through Design Thinking Led Collaboration Workshops","authors":"Hilary Cinis","doi":"10.1145/3183519.3183535","DOIUrl":"https://doi.org/10.1145/3183519.3183535","url":null,"abstract":"Software development projects need clear agreed goals, a provable value proposition and key stakeholder commitment. Employing Design Thinking methods to focus the expertise of workshop participants can uncover and support these needs. The Data61 User Experience and Design team lead Design Thinking collaboration workshops as professional practice during engagements with clients. These workshops are run with cross-functional teams working on data/digital platforms before going to market. This talk will present findings from a total 54 Design Thinking workshops and 5x5 day co-design sprints run from Sept 2016 - Oct 2017. The talk will cover how the workshops were designed, using initial trials of a proposed workshop approach and ongoing review of workshops and design sprints. The use of our approach will be illustrated using 3 short case studies from different application domains: e-science for materials and manufacturing, infrastructure sustainability, and agricultural intelligence. Key learnings from our approach include getting internal stakeholder support for workshops, characteristics of a good format and how to balance domain expertise with user centred design practises.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124629522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving Model-Based Testing in Automotive Software Engineering
S. Kriebel, Matthias Markthaler, Karin Samira Salman, Timo Greifenberg, S. Hillemacher, Bernhard Rumpe, Christoph Schulze, A. Wortmann, P. Orth, J. Richenhagen
DOI: 10.1145/3183519.3183533
Testing is crucial to successfully engineering reliable automotive software. Manually deriving test cases from ambiguous textual requirements is costly and error-prone. Model-based development can reduce this effort by capturing requirements in structured models from which test cases can be generated. To facilitate automated test case derivation at BMW, we conducted an anonymous survey among its testing practitioners and conceived a model-based improvement of the testing activities. The new model-based test case derivation extends BMW's SMArDT method with automated test generation, which addresses many of the practitioners' challenges uncovered by our study. This can ultimately facilitate quality assurance for automotive software.
{"title":"Improving Model-Based Testing in Automotive Software Engineering","authors":"S. Kriebel, Matthias Markthaler, Karin Samira Salman, Timo Greifenberg, S. Hillemacher, Bernhard Rumpe, Christoph Schulze, A. Wortmann, P. Orth, J. Richenhagen","doi":"10.1145/3183519.3183533","DOIUrl":"https://doi.org/10.1145/3183519.3183533","url":null,"abstract":"Testing is crucial to successfully engineering reliable automotive software. The manual derivation of test cases from ambiguous textual requirements is costly and error-prone. Model-based development can reduce the test case derivation effort by capturing requirements in structured models from which test cases can be generated with reduced effort. To facilitate the automated test case derivation at BMW, we conducted an anonymous survey among its testing practitioners and conceived a model-based improvement of the testing activities. The new model-based test case derivation extends BMW's SMArDT method with automated generation of tests, which addresses many of the practitioners' challenges uncovered through our study. This ultimately can facilitate quality assurance for automotive software.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134175129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Practical Selective Regression Testing with Effective Redundancy in Interleaved Tests
D. Marijan, Marius Liaaen
DOI: 10.1145/3183519.3183532

As software systems evolve over time, the test suites used for checking their correctness typically grow larger. Together with size, test suites tend to grow in redundancy. This is especially problematic for complex, highly-configurable software domains, as growing test suites significantly increase the cost of regression testing. In this paper we present a practical approach for reducing the ineffective redundancy of regression suites in continuous integration testing, where time-efficiency constraints are strict, for highly-configurable software. The main idea of our approach is to combine coverage-based redundancy metrics (test overlap) with the historical fault-detection effectiveness of integration tests, in order to identify ineffective redundancy that can be eliminated from a regression test suite. We first apply and evaluate the approach in the testing of industrial video conferencing software. We further evaluate the approach on a large set of artificial subjects, in terms of fault-detection effectiveness and the timeliness of regression test feedback, comparing the results with an advanced retest-all approach and random test selection. The results show that regression test selection based on coverage and history analysis can: 1) shorten regression test feedback time compared to industry practice (by up to 39%), 2) shorten test feedback time compared to the advanced retest-all approach (by up to 45%) without significantly compromising fault-detection effectiveness (less than 0.5% loss on average), and 3) improve fault-detection effectiveness compared to random selection (by 72% on average).
{"title":"Practical Selective Regression Testing with Effective Redundancy in Interleaved Tests","authors":"D. Marijan, Marius Liaaen","doi":"10.1145/3183519.3183532","DOIUrl":"https://doi.org/10.1145/3183519.3183532","url":null,"abstract":"As software systems evolve and change over time, test suites used for checking the correctness of software typically grow larger. Together with size, test suites tend to grow in redundancy. This is especially problematic for complex highly-configurable software domains, as growing the size of test suites significantly impacts the cost of regression testing. In this paper we present a practical approach for reducing ineffective redundancy of regression suites in continuous integration testing (strict constraints on time-efficiency) for highly-configurable software. The main idea of our approach consists in combining coverage based redundancy metrics (test overlap) with historical fault-detection effectiveness of integration tests, to identify ineffective redundancy that is eliminated from a regression test suite. We first apply and evaluate the approach in testing of industrial video conferencing software. We further evaluate the approach using a large set of artificial subjects, in terms of fault-detection effectiveness and timeliness of regression test feedback. We compare the results with an advanced retest-all approach and random test selection. The results show that regression test selection based on coverage and history analysis can: 1) reduce regression test feedback compared to industry practice (up to 39%), 2) reduce test feedback compared to the advanced retest-all approach (up to 45%) without significantly compromising fault-detection effectiveness (less than 0.5% on average), and 3) improve fault detection effectiveness compared to random selection (72% on average).","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115031553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Java Performance Troubleshooting and Optimization at Alibaba
Fangxi Yin, Denghui Dong, Sanhong Li, Jianmei Guo, K. Chow
DOI: 10.1145/3183519.3183536
Alibaba is building one of the most efficient cloud infrastructures for global online shopping. During the 2017 Double 11 Global Shopping Festival, Alibaba's cloud platform achieved total sales of more than 25 billion dollars and supported peak volumes of 325,000 transactions and 256,000 payments per second. Most of the cloud-based e-commerce transactions were processed by hundreds of thousands of Java applications comprising over a billion lines of code. Achieving comprehensive and efficient performance troubleshooting and optimization for such large-scale online Java applications in production is challenging. We propose new approaches to method profiling and code warmup for Java performance tuning. Our fine-grained, low-overhead method profiler improves the efficiency of Java performance troubleshooting. Moreover, our approach to ahead-of-time code warmup significantly reduces the runtime overhead of the just-in-time compiler under bursty traffic. Our approaches have been implemented in Alibaba JDK (AJDK), a customized version of OpenJDK, and have been rolled out to Alibaba's cloud platform to support critical online business.
{"title":"Java Performance Troubleshooting and Optimization at Alibaba","authors":"Fangxi Yin, Denghui Dong, Sanhong Li, Jianmei Guo, K. Chow","doi":"10.1145/3183519.3183536","DOIUrl":"https://doi.org/10.1145/3183519.3183536","url":null,"abstract":"Alibaba is moving toward one of the most efficient cloud infrastructures for global online shopping. On the 2017 Double 11 Global Shopping Festival, Alibaba's cloud platform achieved total sales of more than 25 billion dollars and supported peak volumes of 325,000 transactions and 256,000 payments per second. Most of the cloud-based e-commerce transactions were processed by hundreds of thousands of Java applications with above a billion lines of code. It is challenging to achieve comprehensive and efficient performance troubleshooting and optimization for large-scale online Java applications in production. We proposed new approaches to method profiling and code warmup for Java performance tuning. Our fine-grained, low-overhead method profiler improves the efficiency of Java performance troubleshooting. Moreover, our approach to ahead-of-time code warmup significantly reduces the runtime overheads of just-in-time compiler to address the bursty traffic. Our approaches have been implemented in Alibaba JDK (AJDK), a customized version of OpenJDK, and have been rolled out to Alibaba's cloud platform to support online critical business.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126920097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Specification-Level MC/DC Criterion in Model-Based Testing of Safety-Critical Systems
S. S. Arefin, H. Hemmati, Howard W. Loewen
DOI: 10.1145/3183519.3183551

Safety-critical software systems in the aviation domain, e.g., UAV autopilot software, need to go through a formal certification process (e.g., the DO-178C standard). One of the main requirements of this certification is having a set of explicit test cases for each software requirement. To achieve this, the DO-178C standard recommends a model-driven approach. For instance, model-based testing (MBT) is recommended in its DO-331 supplement to automatically generate system-level test cases from the requirements provided as specification models. In addition, the DO-178C standard requires a high level of source code coverage, which is typically achieved by a separate set of structural tests. However, the standard allows targeting high code coverage with MBT, provided the applicants justify their plan for achieving high code coverage through model-level testing. In this study, we propose applying the Modified Condition/Decision Coverage (MC/DC) criterion to the specification-level constraints, rather than the standard-recommended all-transition coverage criterion, in order to achieve higher code coverage through MBT. We evaluate our approach in a case study at MicroPilot Inc., our industry collaborator and a UAV manufacturer. We implemented our idea as MC/DC coverage of transition guards in a UML state-machine-based testing tool that was developed in-house. The results show that model-level MC/DC coverage outperforms the typical transition coverage (DO-178C's required MBT coverage criterion) with respect to the source-code-level all-condition/decision coverage criterion by 33%. In addition, compared to the all-transition test suite, our MC/DC test suite detected three new faults and two instances of legacy specifications in the code that are no longer in use.
{"title":"Evaluating Specification-level MC/DC Criterion in Model-Based Testing of Safety Critical Systems","authors":"S. S. Arefin, H. Hemmati, Howard W. Loewen","doi":"10.1145/3183519.3183551","DOIUrl":"https://doi.org/10.1145/3183519.3183551","url":null,"abstract":"Safety-critical software systems in the aviation domain, e.g., a UAV autopilot software, needs to go through a formal process of certification (e.g., DO-178C standard). One of the main requirements for this certification is having a set of explicit test cases for each software requirement. To achieve this, the DO-178C standard recommends using a model-driven approach. For instance, model-based testing (MBT) is recommended in its DO-331 supplement to automatically generate system-level test cases for the requirements provided as the specification models. In addition, the DO-178C standard also requires high level of source code coverage, which typically is achieved by a separate set of structural testing. However, the standard allows targeting high code coverage with MBT, only if the applicants justify their plan on how to achieve high code coverage through model-level testing. In this study, we propose using the Modified Condition and Decision coverage (\"MC/DC\") criterion on the specification-level constraints rather than the standard-recommended \"all transition coverage\" criterion, to achieve higher code coverage through MBT. We evaluate our approach in the context of a case study at MicroPilot Inc., our industry collaborator, which is a UAV producer company. We implemented our idea as an MC/DC coverage on transition guards in a UML state-machine-based testing tool that was developed in-house. The results show that applying model-level MC/DC coverage outperforms the typical transition-coverage (DO-178C's required MBT coverage criterion), with respect to source code-level \"all condition-decision coverage criterion\" by 33%. In addition, our MC/DC test suite detected three new faults and two instances of legacy specification in the code that are no longer in use, compared to the \"all transition\" test suite.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130484202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Smelly Relations: Measuring and Understanding Database Schema Quality
Tushar Sharma, Marios Fragkoulis, Stamatia Rizou, M. Bruntink, D. Spinellis
DOI: 10.1145/3183519.3183529
Context: Databases are an integral element of enterprise applications. Like code, database schemas are prone to smells, i.e., violations of best practices. Objective: We aim to explore database schema quality, its associated characteristics, and their relationships with other software artifacts. Method: We present a catalog of 13 database schema smells and elicit developers' perspectives through a survey. We extract embedded SQL statements and identify database schema smells using DbDeo, a tool we developed. We analyze 2925 production-quality systems (357 industrial and 2568 well-engineered open-source projects) and empirically study the quality characteristics of their database schemas. In total, we analyze 629 million lines of code containing more than 393 thousand SQL statements. Results: We find that the index abuse smell occurs most frequently in database code, that the use of an ORM framework does not make an application immune to database smells, and that some database smells, such as adjacency list, are more prone to occur in industrial projects than in open-source ones. Our co-occurrence analysis shows that whenever the clone table smell is spotted in industrial projects, or the values in attribute definition smell in open-source projects, other database smells are very likely to be found in the same project. Conclusion: Awareness and knowledge of database smells are crucial for developing high-quality software systems and can be enhanced by better tools that help developers identify database smells early.
{"title":"Smelly Relations: Measuring and Understanding Database Schema Quality","authors":"Tushar Sharma, Marios Fragkoulis, Stamatia Rizou, M. Bruntink, D. Spinellis","doi":"10.1145/3183519.3183529","DOIUrl":"https://doi.org/10.1145/3183519.3183529","url":null,"abstract":"Context: Databases are an integral element of enterprise applications. Similarly to code, database schemas are also prone to smells - best practice violations. Objective: We aim to explore database schema quality, associated characteristics and their relationships with other software artifacts. Method: We present a catalog of 13 database schema smells and elicit developers' perspective through a survey. We extract embedded SQL statements and identify database schema smells by employing the DbDeo tool which we developed. We analyze 2925 production-quality systems (357 industrial and 2568 well-engineered open-source projects) and empirically study quality characteristics of their database schemas. In total, we analyze 629 million lines of code containing more than 393 thousand SQL statements. Results: We find that the index abuse smell occurs most frequently in database code, that the use of an ORM framework doesn't immune the application from database smells, and that some database smells, such as adjacency list, are more prone to occur in industrial projects compared to open-source projects. Our co-occurrence analysis shows that whenever the clone table smell in industrial projects and the values in attribute definition smell in open-source projects get spotted, it is very likely to find other database smells in the project. Conclusion: The awareness and knowledge of database smells are crucial for developing high-quality software systems and can be enhanced by the adoption of better tools helping developers to identify database smells early.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116749434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adopting Autonomic Computing Capabilities in Existing Large-Scale Systems
Heng Li, T. Chen, A. Hassan, Mohamed N. Nasser, P. Flora
DOI: 10.1145/3183519.3183544
In current DevOps practice, developers are responsible for the operation and maintenance of software systems. However, the human cost of operation and maintenance grows quickly as the functionality and complexity of software systems increase. Autonomic computing aims to reduce or eliminate such human intervention, yet many existing large systems were not designed with autonomic computing capabilities in mind. Adding autonomic computing capabilities to these existing systems is particularly challenging because of 1) the significant effort required to investigate and refactor the existing code base, 2) the risk of adding complexity, and 3) the difficulty of allocating resources while developers are busy adding core features to the system. In this paper, we share our industrial experience of re-engineering autonomic computing capabilities into an existing large-scale software system. Our autonomic computing capabilities effectively reduce human intervention in performance configuration tuning and significantly improve system performance. In particular, we discuss the challenges that we encountered and the lessons that we learned during this re-engineering process. For example, to minimize the change impact on the original system, we use a variety of approaches (e.g., aspect-oriented programming) to separate the autonomic computing concerns from the original behaviour of the system. We also share how we tested these autonomic computing capabilities under different conditions, which has not been discussed in prior work. As numerous large-scale software systems still require expensive human intervention, we believe our experience provides valuable insights for software practitioners who wish to add autonomic computing capabilities to existing large-scale software systems.
{"title":"Adopting Autonomic Computing Capabilities in Existing Large-Scale Systems","authors":"Heng Li, T. Chen, A. Hassan, Mohamed N. Nasser, P. Flora","doi":"10.1145/3183519.3183544","DOIUrl":"https://doi.org/10.1145/3183519.3183544","url":null,"abstract":"In current DevOps practice, developers are responsible for the operation and maintenance of software systems. However, the human costs for the operation and maintenance grow fast along with the increasing functionality and complexity of software systems. Autonomic computing aims to reduce or eliminate such human intervention. However, there are many existing large systems that did not consider autonomic computing capabilities in their design. Adding autonomic computing capabilities to these existing systems is particularly challenging, because of 1) the significant amount of efforts that are required for investigating and refactoring the existing code base, 2) the risk of adding additional complexity, and 3) the difficulties for allocating resources while developers are busy adding core features to the system. In this paper, we share our industrial experience of re-engineering autonomic computing capabilities to an existing large-scale software system. Our autonomic computing capabilities effectively reduce human intervention on performance configuration tuning and significantly improve system performance. In particular, we discuss the challenges that we encountered and the lessons that we learned during this re-engineering process. For example, in order to minimize the change impact to the original system, we use a variety of approaches (e.g., aspect-oriented programming) to separate the concerns of autonomic computing from the original behaviour of the system. We also share how we tested such autonomic computing capabilities under different conditions, which has never been discussed in prior work. As there are numerous large-scale software systems that still require expensive human intervention, we believe our experience provides valuable insights to software practitioners who wish to add autonomic computing capabilities to these existing large-scale software systems.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125184196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advantages and Disadvantages of a Monolithic Repository: A Case Study at Google
Ciera Jaspan, M. Jorde, Andrea Knight, Caitlin Sadowski, Edward K. Smith, C. Winter, E. Murphy-Hill
DOI: 10.1145/3183519.3183550
Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs using a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple per-project repos. We back up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, to find examples of API usage, and to have dependent code updated automatically as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access-control and stability benefits. In both cases, the related tooling is a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.
{"title":"Advantages and Disadvantages of a Monolithic Repository: A Case Study at Google","authors":"Ciera Jaspan, M. Jorde, Andrea Knight, Caitlin Sadowski, Edward K. Smith, C. Winter, E. Murphy-Hill","doi":"10.1145/3183519.3183550","DOIUrl":"https://doi.org/10.1145/3183519.3183550","url":null,"abstract":"Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs by utilizing a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple, per-project repos. This paper also backs up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, find examples for using an API, and automatically have dependent code updated as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access control and stability benefits. In both cases, the related tooling is also a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133617700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proactive and Pervasive Combinatorial Testing
D. Blue, O. Raz, Rachel Tzoref, Paul Wojciak, Marcel Zalmanovici
DOI: 10.1145/3183519.3183522
Combinatorial testing (CT) is a well-known technique for improving the quality of test plans while reducing testing costs. Traditionally, CT is used by testers at the testing phase to design a test plan based on a manual definition of the test space. In this work, we extend the traditional use of CT to other parts of the development life cycle. We use CT at the early design phase to improve design quality, and again after test cases have been created and executed, in order to find gaps between design and test. For the latter use case we deploy a novel technique for semi-automated definition of the test space, which significantly reduces the effort associated with manual test space definition. We report on our practical experience in applying CT for these use cases to three large and heavily deployed industrial products. We demonstrate the value gained from extending the use of CT by (1) discovering latent design flaws with high potential impact, and (2) correlating CT-uncovered gaps between design and test with field-reported problems.
{"title":"Proactive and Pervasive Combinatorial Testing","authors":"D. Blue, O. Raz, Rachel Tzoref, Paul Wojciak, Marcel Zalmanovici","doi":"10.1145/3183519.3183522","DOIUrl":"https://doi.org/10.1145/3183519.3183522","url":null,"abstract":"Combinatorial testing (CT) is a well-known technique for improving the quality of test plans while reducing testing costs. Traditionally, CT is used by testers at testing phase to design a test plan based on a manual definition of the test space. In this work, we extend the traditional use of CT to other parts of the development life cycle. We use CT at early design phase to improve design quality. We also use CT after test cases have been created and executed, in order to find gaps between design and test. For the latter use case we deploy a novel technique for a semi-automated definition of the test space, which significantly reduces the effort associated with manual test space definition. We report on our practical experience in applying CT for these use cases to three large and heavily deployed industrial products. We demonstrate the value gained from extending the use of CT by (1) discovering latent design flaws with high potential impact, and (2) correlating CT-uncovered gaps between design and test with field reported problems.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131769264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}