An Exploratory Study on Faults in Web API Integration in a Large-Scale Payment Company
J. Aué, M. Aniche, M. Lobbezoo, A. van Deursen
DOI: 10.1145/3183519.3183537

Service-oriented architectures are more popular than ever, and companies and organizations increasingly depend on services offered through Web APIs. The capabilities and complexity of Web APIs differ from service to service, and so the impact of API errors varies. API problem cases related to Adyen's payment service were found to have a direct, considerable impact on API consumer applications. With more than 60,000 daily API errors, the potential impact is enormous. In an effort to reduce the impact of API-related problems, we analyze 2.43 million API error responses to identify the underlying faults. We quantify the occurrence of faults in terms of frequency and of impacted API consumers. We also challenge our quantitative results by means of a survey of 40 API consumers. Our results show that 1) faults in API integration can be grouped into 11 general causes: invalid user input, missing user input, expired request data, invalid request data, missing request data, insufficient permissions, double processing, configuration, missing server data, internal, and third party; 2) most faults can be attributed to invalid or missing request data, and most API consumers seem to be impacted by faults caused by invalid request data and third-party integration; and 3) insufficient guidance on certain aspects of the integration and on how to recover from errors is an important challenge for developers.
{"title":"An Exploratory Study on Faults inWeb API Integration in a Large-Scale Payment Company","authors":"J. Aué, M. Aniche, M. Lobbezoo, A. Deursen","doi":"10.1145/3183519.3183537","DOIUrl":"https://doi.org/10.1145/3183519.3183537","url":null,"abstract":"Service-oriented architectures are more popular than ever, and increasingly companies and organizations depend on services offered through Web APIs. The capabilities and complexity of Web APIs differ from service to service, and therefore the impact of API errors varies. API problem cases related to Adyen's payment service were found to have direct considerable impact on API consumer applications. With more than 60,000 daily API errors, the potential impact is enormous. In an effort to reduce the impact of API related problems, we analyze 2.43 million API error responses to identify the underlying faults. We quantify the occurrence of faults in terms of the frequency and impacted API consumers. We also challenge our quantitative results by means of a survey with 40 API consumers. Our results show that 1) faults in API integration can be grouped into 11 general causes: invalid user input, missing user input, expired request data, invalid request data, missing request data, insufficient permissions, double processing, configuration, missing server data, internal and third party, 2) most faults can be attributed to the invalid or missing request data, and most API consumers seem to be impacted by faults caused by invalid request data and third party integration; and 3) insufficient guidance on certain aspects of the integration and on how to recover from errors is an important challenge to developers.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123403793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving the Definition of Software Development Projects Through Design Thinking Led Collaboration Workshops
Hilary Cinis
DOI: 10.1145/3183519.3183535

Software development projects need clear, agreed goals, a provable value proposition, and key stakeholder commitment. Employing Design Thinking methods to focus the expertise of workshop participants can uncover and support these needs. The Data61 User Experience and Design team leads Design Thinking collaboration workshops as professional practice during engagements with clients. These workshops are run with cross-functional teams working on data/digital platforms before going to market. This talk will present findings from a total of 54 Design Thinking workshops and five 5-day co-design sprints run from September 2016 to October 2017. The talk will cover how the workshops were designed, using initial trials of a proposed workshop approach and ongoing review of workshops and design sprints. The use of our approach will be illustrated with three short case studies from different application domains: e-science for materials and manufacturing, infrastructure sustainability, and agricultural intelligence. Key learnings from our approach include getting internal stakeholder support for workshops, the characteristics of a good format, and how to balance domain expertise with user-centred design practices.
{"title":"Improving the Definition of Software Development Projects Through Design Thinking Led Collaboration Workshops","authors":"Hilary Cinis","doi":"10.1145/3183519.3183535","DOIUrl":"https://doi.org/10.1145/3183519.3183535","url":null,"abstract":"Software development projects need clear agreed goals, a provable value proposition and key stakeholder commitment. Employing Design Thinking methods to focus the expertise of workshop participants can uncover and support these needs. The Data61 User Experience and Design team lead Design Thinking collaboration workshops as professional practice during engagements with clients. These workshops are run with cross-functional teams working on data/digital platforms before going to market. This talk will present findings from a total 54 Design Thinking workshops and 5x5 day co-design sprints run from Sept 2016 - Oct 2017. The talk will cover how the workshops were designed, using initial trials of a proposed workshop approach and ongoing review of workshops and design sprints. The use of our approach will be illustrated using 3 short case studies from different application domains: e-science for materials and manufacturing, infrastructure sustainability, and agricultural intelligence. Key learnings from our approach include getting internal stakeholder support for workshops, characteristics of a good format and how to balance domain expertise with user centred design practises.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124629522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving Model-Based Testing in Automotive Software Engineering
S. Kriebel, Matthias Markthaler, Karin Samira Salman, Timo Greifenberg, S. Hillemacher, Bernhard Rumpe, Christoph Schulze, A. Wortmann, P. Orth, J. Richenhagen
DOI: 10.1145/3183519.3183533
Testing is crucial to successfully engineering reliable automotive software. Manually deriving test cases from ambiguous textual requirements is costly and error-prone. Model-based development can reduce this effort by capturing requirements in structured models from which test cases can be generated. To facilitate automated test case derivation at BMW, we conducted an anonymous survey among its testing practitioners and conceived a model-based improvement of the testing activities. The new model-based test case derivation extends BMW's SMArDT method with automated test generation, which addresses many of the practitioners' challenges uncovered by our study. This can ultimately facilitate quality assurance for automotive software.
{"title":"Improving Model-Based Testing in Automotive Software Engineering","authors":"S. Kriebel, Matthias Markthaler, Karin Samira Salman, Timo Greifenberg, S. Hillemacher, Bernhard Rumpe, Christoph Schulze, A. Wortmann, P. Orth, J. Richenhagen","doi":"10.1145/3183519.3183533","DOIUrl":"https://doi.org/10.1145/3183519.3183533","url":null,"abstract":"Testing is crucial to successfully engineering reliable automotive software. The manual derivation of test cases from ambiguous textual requirements is costly and error-prone. Model-based development can reduce the test case derivation effort by capturing requirements in structured models from which test cases can be generated with reduced effort. To facilitate the automated test case derivation at BMW, we conducted an anonymous survey among its testing practitioners and conceived a model-based improvement of the testing activities. The new model-based test case derivation extends BMW's SMArDT method with automated generation of tests, which addresses many of the practitioners' challenges uncovered through our study. This ultimately can facilitate quality assurance for automotive software.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134175129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Practical Selective Regression Testing with Effective Redundancy in Interleaved Tests
D. Marijan, Marius Liaaen
DOI: 10.1145/3183519.3183532

As software systems evolve over time, the test suites used for checking their correctness typically grow larger. Together with size, test suites tend to grow in redundancy. This is especially problematic for complex, highly-configurable software domains, as growing test suites significantly increase the cost of regression testing. In this paper we present a practical approach for reducing the ineffective redundancy of regression suites in continuous integration testing, where time-efficiency constraints are strict, for highly-configurable software. The main idea of our approach is to combine coverage-based redundancy metrics (test overlap) with the historical fault-detection effectiveness of integration tests, in order to identify ineffective redundancy that can be eliminated from a regression test suite. We first apply and evaluate the approach in the testing of industrial video conferencing software. We further evaluate the approach on a large set of artificial subjects, in terms of fault-detection effectiveness and the timeliness of regression test feedback, comparing the results with an advanced retest-all approach and random test selection. The results show that regression test selection based on coverage and history analysis can: 1) shorten regression test feedback time compared to industry practice (by up to 39%), 2) shorten test feedback time compared to the advanced retest-all approach (by up to 45%) without significantly compromising fault-detection effectiveness (less than 0.5% loss on average), and 3) improve fault-detection effectiveness compared to random selection (by 72% on average).
{"title":"Practical Selective Regression Testing with Effective Redundancy in Interleaved Tests","authors":"D. Marijan, Marius Liaaen","doi":"10.1145/3183519.3183532","DOIUrl":"https://doi.org/10.1145/3183519.3183532","url":null,"abstract":"As software systems evolve and change over time, test suites used for checking the correctness of software typically grow larger. Together with size, test suites tend to grow in redundancy. This is especially problematic for complex highly-configurable software domains, as growing the size of test suites significantly impacts the cost of regression testing. In this paper we present a practical approach for reducing ineffective redundancy of regression suites in continuous integration testing (strict constraints on time-efficiency) for highly-configurable software. The main idea of our approach consists in combining coverage based redundancy metrics (test overlap) with historical fault-detection effectiveness of integration tests, to identify ineffective redundancy that is eliminated from a regression test suite. We first apply and evaluate the approach in testing of industrial video conferencing software. We further evaluate the approach using a large set of artificial subjects, in terms of fault-detection effectiveness and timeliness of regression test feedback. We compare the results with an advanced retest-all approach and random test selection. The results show that regression test selection based on coverage and history analysis can: 1) reduce regression test feedback compared to industry practice (up to 39%), 2) reduce test feedback compared to the advanced retest-all approach (up to 45%) without significantly compromising fault-detection effectiveness (less than 0.5% on average), and 3) improve fault detection effectiveness compared to random selection (72% on average).","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115031553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Java Performance Troubleshooting and Optimization at Alibaba
Fangxi Yin, Denghui Dong, Sanhong Li, Jianmei Guo, K. Chow
DOI: 10.1145/3183519.3183536
Alibaba is building one of the most efficient cloud infrastructures for global online shopping. During the 2017 Double 11 Global Shopping Festival, Alibaba's cloud platform achieved total sales of more than 25 billion dollars and supported peak volumes of 325,000 transactions and 256,000 payments per second. Most of the cloud-based e-commerce transactions were processed by hundreds of thousands of Java applications comprising over a billion lines of code. Achieving comprehensive and efficient performance troubleshooting and optimization for such large-scale online Java applications in production is challenging. We propose new approaches to method profiling and code warmup for Java performance tuning. Our fine-grained, low-overhead method profiler improves the efficiency of Java performance troubleshooting. Moreover, our approach to ahead-of-time code warmup significantly reduces the runtime overhead of the just-in-time compiler under bursty traffic. Our approaches have been implemented in Alibaba JDK (AJDK), a customized version of OpenJDK, and have been rolled out to Alibaba's cloud platform to support critical online business.
{"title":"Java Performance Troubleshooting and Optimization at Alibaba","authors":"Fangxi Yin, Denghui Dong, Sanhong Li, Jianmei Guo, K. Chow","doi":"10.1145/3183519.3183536","DOIUrl":"https://doi.org/10.1145/3183519.3183536","url":null,"abstract":"Alibaba is moving toward one of the most efficient cloud infrastructures for global online shopping. On the 2017 Double 11 Global Shopping Festival, Alibaba's cloud platform achieved total sales of more than 25 billion dollars and supported peak volumes of 325,000 transactions and 256,000 payments per second. Most of the cloud-based e-commerce transactions were processed by hundreds of thousands of Java applications with above a billion lines of code. It is challenging to achieve comprehensive and efficient performance troubleshooting and optimization for large-scale online Java applications in production. We proposed new approaches to method profiling and code warmup for Java performance tuning. Our fine-grained, low-overhead method profiler improves the efficiency of Java performance troubleshooting. Moreover, our approach to ahead-of-time code warmup significantly reduces the runtime overheads of just-in-time compiler to address the bursty traffic. Our approaches have been implemented in Alibaba JDK (AJDK), a customized version of OpenJDK, and have been rolled out to Alibaba's cloud platform to support online critical business.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126920097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Specification-Level MC/DC Criterion in Model-Based Testing of Safety-Critical Systems
S. S. Arefin, H. Hemmati, Howard W. Loewen
DOI: 10.1145/3183519.3183551

Safety-critical software systems in the aviation domain, e.g., UAV autopilot software, need to go through a formal certification process (e.g., the DO-178C standard). One of the main requirements of this certification is having a set of explicit test cases for each software requirement. To achieve this, the DO-178C standard recommends a model-driven approach. For instance, model-based testing (MBT) is recommended in its DO-331 supplement to automatically generate system-level test cases from the requirements provided as specification models. In addition, the DO-178C standard requires a high level of source code coverage, which is typically achieved by a separate set of structural tests. However, the standard allows targeting high code coverage with MBT, provided the applicants justify their plan for achieving high code coverage through model-level testing. In this study, we propose applying the Modified Condition/Decision Coverage (MC/DC) criterion to the specification-level constraints, rather than the standard-recommended all-transition coverage criterion, in order to achieve higher code coverage through MBT. We evaluate our approach in a case study at MicroPilot Inc., our industry collaborator and a UAV manufacturer. We implemented our idea as MC/DC coverage of transition guards in a UML state-machine-based testing tool that was developed in-house. The results show that model-level MC/DC coverage outperforms the typical transition coverage (DO-178C's required MBT coverage criterion) with respect to the source-code-level all-condition/decision coverage criterion by 33%. In addition, compared to the all-transition test suite, our MC/DC test suite detected three new faults and two instances of legacy specifications in the code that are no longer in use.
{"title":"Evaluating Specification-level MC/DC Criterion in Model-Based Testing of Safety Critical Systems","authors":"S. S. Arefin, H. Hemmati, Howard W. Loewen","doi":"10.1145/3183519.3183551","DOIUrl":"https://doi.org/10.1145/3183519.3183551","url":null,"abstract":"Safety-critical software systems in the aviation domain, e.g., a UAV autopilot software, needs to go through a formal process of certification (e.g., DO-178C standard). One of the main requirements for this certification is having a set of explicit test cases for each software requirement. To achieve this, the DO-178C standard recommends using a model-driven approach. For instance, model-based testing (MBT) is recommended in its DO-331 supplement to automatically generate system-level test cases for the requirements provided as the specification models. In addition, the DO-178C standard also requires high level of source code coverage, which typically is achieved by a separate set of structural testing. However, the standard allows targeting high code coverage with MBT, only if the applicants justify their plan on how to achieve high code coverage through model-level testing. In this study, we propose using the Modified Condition and Decision coverage (\"MC/DC\") criterion on the specification-level constraints rather than the standard-recommended \"all transition coverage\" criterion, to achieve higher code coverage through MBT. We evaluate our approach in the context of a case study at MicroPilot Inc., our industry collaborator, which is a UAV producer company. We implemented our idea as an MC/DC coverage on transition guards in a UML state-machine-based testing tool that was developed in-house. The results show that applying model-level MC/DC coverage outperforms the typical transition-coverage (DO-178C's required MBT coverage criterion), with respect to source code-level \"all condition-decision coverage criterion\" by 33%. In addition, our MC/DC test suite detected three new faults and two instances of legacy specification in the code that are no longer in use, compared to the \"all transition\" test suite.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130484202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Smelly Relations: Measuring and Understanding Database Schema Quality
Tushar Sharma, Marios Fragkoulis, Stamatia Rizou, M. Bruntink, D. Spinellis
DOI: 10.1145/3183519.3183529
Context: Databases are an integral element of enterprise applications. Like code, database schemas are prone to smells, i.e., violations of best practices. Objective: We aim to explore database schema quality, its associated characteristics, and their relationships with other software artifacts. Method: We present a catalog of 13 database schema smells and elicit developers' perspectives through a survey. We extract embedded SQL statements and identify database schema smells using DbDeo, a tool we developed. We analyze 2925 production-quality systems (357 industrial and 2568 well-engineered open-source projects) and empirically study the quality characteristics of their database schemas. In total, we analyze 629 million lines of code containing more than 393 thousand SQL statements. Results: We find that the index abuse smell occurs most frequently in database code, that the use of an ORM framework does not make an application immune to database smells, and that some database smells, such as adjacency list, are more prone to occur in industrial projects than in open-source ones. Our co-occurrence analysis shows that whenever the clone table smell is spotted in industrial projects, or the values in attribute definition smell in open-source projects, other database smells are very likely to be found in the same project. Conclusion: Awareness and knowledge of database smells are crucial for developing high-quality software systems and can be enhanced by better tools that help developers identify database smells early.
{"title":"Smelly Relations: Measuring and Understanding Database Schema Quality","authors":"Tushar Sharma, Marios Fragkoulis, Stamatia Rizou, M. Bruntink, D. Spinellis","doi":"10.1145/3183519.3183529","DOIUrl":"https://doi.org/10.1145/3183519.3183529","url":null,"abstract":"Context: Databases are an integral element of enterprise applications. Similarly to code, database schemas are also prone to smells - best practice violations. Objective: We aim to explore database schema quality, associated characteristics and their relationships with other software artifacts. Method: We present a catalog of 13 database schema smells and elicit developers' perspective through a survey. We extract embedded SQL statements and identify database schema smells by employing the DbDeo tool which we developed. We analyze 2925 production-quality systems (357 industrial and 2568 well-engineered open-source projects) and empirically study quality characteristics of their database schemas. In total, we analyze 629 million lines of code containing more than 393 thousand SQL statements. Results: We find that the index abuse smell occurs most frequently in database code, that the use of an ORM framework doesn't immune the application from database smells, and that some database smells, such as adjacency list, are more prone to occur in industrial projects compared to open-source projects. Our co-occurrence analysis shows that whenever the clone table smell in industrial projects and the values in attribute definition smell in open-source projects get spotted, it is very likely to find other database smells in the project. Conclusion: The awareness and knowledge of database smells are crucial for developing high-quality software systems and can be enhanced by the adoption of better tools helping developers to identify database smells early.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116749434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adopting Autonomic Computing Capabilities in Existing Large-Scale Systems
Heng Li, T. Chen, A. Hassan, Mohamed N. Nasser, P. Flora
DOI: 10.1145/3183519.3183544
In current DevOps practice, developers are responsible for the operation and maintenance of software systems. However, the human cost of operation and maintenance grows quickly as the functionality and complexity of software systems increase. Autonomic computing aims to reduce or eliminate such human intervention, yet many existing large systems were not designed with autonomic computing capabilities in mind. Adding autonomic computing capabilities to these existing systems is particularly challenging because of 1) the significant effort required to investigate and refactor the existing code base, 2) the risk of adding complexity, and 3) the difficulty of allocating resources while developers are busy adding core features to the system. In this paper, we share our industrial experience of re-engineering autonomic computing capabilities into an existing large-scale software system. Our autonomic computing capabilities effectively reduce human intervention in performance configuration tuning and significantly improve system performance. In particular, we discuss the challenges that we encountered and the lessons that we learned during this re-engineering process. For example, to minimize the change impact on the original system, we use a variety of approaches (e.g., aspect-oriented programming) to separate the autonomic computing concerns from the original behaviour of the system. We also share how we tested these autonomic computing capabilities under different conditions, which has not been discussed in prior work. As numerous large-scale software systems still require expensive human intervention, we believe our experience provides valuable insights for software practitioners who wish to add autonomic computing capabilities to existing large-scale software systems.
{"title":"Adopting Autonomic Computing Capabilities in Existing Large-Scale Systems","authors":"Heng Li, T. Chen, A. Hassan, Mohamed N. Nasser, P. Flora","doi":"10.1145/3183519.3183544","DOIUrl":"https://doi.org/10.1145/3183519.3183544","url":null,"abstract":"In current DevOps practice, developers are responsible for the operation and maintenance of software systems. However, the human costs for the operation and maintenance grow fast along with the increasing functionality and complexity of software systems. Autonomic computing aims to reduce or eliminate such human intervention. However, there are many existing large systems that did not consider autonomic computing capabilities in their design. Adding autonomic computing capabilities to these existing systems is particularly challenging, because of 1) the significant amount of efforts that are required for investigating and refactoring the existing code base, 2) the risk of adding additional complexity, and 3) the difficulties for allocating resources while developers are busy adding core features to the system. In this paper, we share our industrial experience of re-engineering autonomic computing capabilities to an existing large-scale software system. Our autonomic computing capabilities effectively reduce human intervention on performance configuration tuning and significantly improve system performance. In particular, we discuss the challenges that we encountered and the lessons that we learned during this re-engineering process. For example, in order to minimize the change impact to the original system, we use a variety of approaches (e.g., aspect-oriented programming) to separate the concerns of autonomic computing from the original behaviour of the system. We also share how we tested such autonomic computing capabilities under different conditions, which has never been discussed in prior work. As there are numerous large-scale software systems that still require expensive human intervention, we believe our experience provides valuable insights to software practitioners who wish to add autonomic computing capabilities to these existing large-scale software systems.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125184196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advantages and Disadvantages of a Monolithic Repository: A Case Study at Google
Ciera Jaspan, M. Jorde, Andrea Knight, Caitlin Sadowski, Edward K. Smith, C. Winter, E. Murphy-Hill
DOI: 10.1145/3183519.3183550
Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs using a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple per-project repos. We back up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, to find examples of API usage, and to have dependent code updated automatically as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access-control and stability benefits. In both cases, the related tooling is a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.
{"title":"Advantages and Disadvantages of a Monolithic Repository: A Case Study at Google","authors":"Ciera Jaspan, M. Jorde, Andrea Knight, Caitlin Sadowski, Edward K. Smith, C. Winter, E. Murphy-Hill","doi":"10.1145/3183519.3183550","DOIUrl":"https://doi.org/10.1145/3183519.3183550","url":null,"abstract":"Monolithic source code repositories (repos) are used by several large tech companies, but little is known about their advantages or disadvantages compared to multiple per-project repos. This paper investigates the relative tradeoffs by utilizing a mixed-methods approach. Our primary contribution is a survey of engineers who have experience with both monolithic repos and multiple, per-project repos. This paper also backs up the claims made by these engineers with a large-scale analysis of developer tool logs. Our study finds that the visibility of the codebase is a significant advantage of a monolithic repo: it enables engineers to discover APIs to reuse, find examples for using an API, and automatically have dependent code updated as an API migrates to a new version. Engineers also appreciate the centralization of dependency management in the repo. In contrast, multiple-repository (multi-repo) systems afford engineers more flexibility to select their own toolchains and provide significant access control and stability benefits. In both cases, the related tooling is also a significant factor; engineers favor particular tools and are drawn to repo management systems that support their desired toolchain.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133617700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proactive and Pervasive Combinatorial Testing
D. Blue, O. Raz, Rachel Tzoref, Paul Wojciak, Marcel Zalmanovici
DOI: 10.1145/3183519.3183522
Combinatorial testing (CT) is a well-known technique for improving the quality of test plans while reducing testing costs. Traditionally, CT is used by testers at the testing phase to design a test plan based on a manual definition of the test space. In this work, we extend the traditional use of CT to other parts of the development life cycle. We use CT at the early design phase to improve design quality, and again after test cases have been created and executed, in order to find gaps between design and test. For the latter use case we deploy a novel technique for semi-automated definition of the test space, which significantly reduces the effort associated with manual test space definition. We report on our practical experience in applying CT for these use cases to three large and heavily deployed industrial products. We demonstrate the value gained from extending the use of CT by (1) discovering latent design flaws with high potential impact, and (2) correlating CT-uncovered gaps between design and test with field-reported problems.
{"title":"Proactive and Pervasive Combinatorial Testing","authors":"D. Blue, O. Raz, Rachel Tzoref, Paul Wojciak, Marcel Zalmanovici","doi":"10.1145/3183519.3183522","DOIUrl":"https://doi.org/10.1145/3183519.3183522","url":null,"abstract":"Combinatorial testing (CT) is a well-known technique for improving the quality of test plans while reducing testing costs. Traditionally, CT is used by testers at testing phase to design a test plan based on a manual definition of the test space. In this work, we extend the traditional use of CT to other parts of the development life cycle. We use CT at early design phase to improve design quality. We also use CT after test cases have been created and executed, in order to find gaps between design and test. For the latter use case we deploy a novel technique for a semi-automated definition of the test space, which significantly reduces the effort associated with manual test space definition. We report on our practical experience in applying CT for these use cases to three large and heavily deployed industrial products. We demonstrate the value gained from extending the use of CT by (1) discovering latent design flaws with high potential impact, and (2) correlating CT-uncovered gaps between design and test with field reported problems.","PeriodicalId":445513,"journal":{"name":"2018 IEEE/ACM 40th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131769264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}