International Symposium on Empirical Software Engineering: Latest Articles

Maximising the information gained from an experimental analysis of code inspection and static analysis for concurrent java components

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159761

M. A. Wojcicki, P. Strooper

The results of empirical studies are limited to particular contexts, difficult to generalise and the studies themselves are expensive to perform. Despite these problems, empirical studies in software engineering can be made effective and they are important to both researchers and practitioners. The key to their effectiveness lies in the maximisation of the information that can be gained by examining existing studies, conducting power analyses for an accurate minimum sample size and benefiting from previous studies through replication. This approach was applied in a controlled experiment examining the combination of automated static analysis tools and code inspection in the context of verification and validation (V&V) of concurrent Java components. The combination of these V&V technologies was shown to be cost-effective despite the size of the study, which thus contributes to research in V&V technology evaluation.
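The power analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test or effect size the authors used, so the example below assumes a two-sided comparison of two group means and uses the standard normal approximation; the function name `min_sample_size_per_group` and the default values are illustrative assumptions, not taken from the paper.

```python
import math
from statistics import NormalDist


def min_sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate minimum subjects per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardised difference (Cohen's d); this is an
    assumed input, not a value reported in the abstract.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects require more subjects per group.
for d in (0.2, 0.5, 0.8):
    print(d, min_sample_size_per_group(d))
```

The normal approximation slightly underestimates the sample size that an exact t-test calculation would give, so it is best read as a lower bound when planning a study.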

Citations: 6

A comparative study of attribute weighting heuristics for effort estimation by analogy

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159746

Jingzhou Li, G. Ruhe

Five heuristics for attribute weighting in analogy-based effort estimation are evaluated in this paper. The baseline heuristic involves using all attributes with equal weights. We propose four additional heuristics that use rough set analysis for attribute weighting. These five heuristics are evaluated over five data sets related to software projects. Three of the data sets are publicly available, hence allowing comparison with other methods. The results indicate that three of the rough set analysis based heuristics perform better than the equal weights heuristic. This evaluation is based on an integrated measure of accuracy.
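To show where attribute weights enter analogy-based estimation, here is a minimal sketch. It assumes a weighted Euclidean distance and a mean-of-k-nearest-neighbours estimate; the rough-set-based derivation of the weights proposed in the paper is not reproduced, so the weights are plain inputs. All names and the project data are hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two attribute vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def estimate_effort(target, history, weights, k=2):
    """Mean effort of the k historical projects closest to the target."""
    ranked = sorted(history, key=lambda p: weighted_distance(target, p[0], weights))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical projects: (normalised attributes, effort in person-months).
history = [
    ((0.1, 0.9), 10.0),
    ((0.2, 0.1), 20.0),
    ((0.9, 0.8), 40.0),
]
target = (0.2, 0.85)

# Baseline heuristic: all attributes weighted equally.
equal = estimate_effort(target, history, weights=(1.0, 1.0), k=1)
# Alternative weighting that ignores the second attribute entirely.
first_only = estimate_effort(target, history, weights=(1.0, 0.0), k=1)
print(equal, first_only)
```

Changing the weights changes which past project counts as the closest analogue, which is why the choice of weighting heuristic can affect estimation accuracy.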

Citations: 33

An empirical analysis and comparison of random testing techniques

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159751

Johannes Mayer, Christoph Schneckenburger

Testing with randomly generated test inputs, namely Random Testing, is a strategy that has been applied successfully in a lot of cases. Recently, some new adaptive approaches to the random generation of test cases have been proposed. Whereas there are many comparisons of Random Testing with Partition Testing, a systematic comparison of random testing techniques is still missing. This paper presents an empirical analysis and comparison of all random testing techniques from the field of Adaptive Random Testing (ART). The ART algorithms are compared for effectiveness using the mean F-measure, obtained through simulation and mutation analysis, and the P-measure. An interesting connection between the testing effectiveness measures F-measure and P-measure is described. The spatial distribution of test cases is determined to explain the behavior of the methods and identify possible shortcomings. Besides this, both the theoretical asymptotic runtime and the empirical runtime for each method are given.
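A mean F-measure obtained through simulation can be sketched as follows. The abstract does not list the individual ART algorithms, so this example assumes one widely known variant, fixed-size-candidate-set ART, on a two-dimensional unit-square input domain with a single square failure region. The F-measure is taken here as the number of tests executed up to and including the first failure. Function names, the failure region, and the run counts are illustrative assumptions.

```python
import random


def random_point(rng):
    """Uniformly random test input in the unit square."""
    return (rng.random(), rng.random())


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def next_art_test(executed, rng, candidates=10):
    """FSCS-ART step: choose the candidate farthest from its nearest executed test."""
    if not executed:
        return random_point(rng)
    pool = [random_point(rng) for _ in range(candidates)]
    return max(pool, key=lambda c: min(squared_distance(c, e) for e in executed))


def next_random_test(executed, rng):
    """Pure random testing ignores the execution history."""
    return random_point(rng)


def in_failure_block(p, corner=(0.6, 0.3), side=0.2):
    """Assumed failure region: an axis-aligned square (failure rate 0.04)."""
    return (corner[0] <= p[0] < corner[0] + side
            and corner[1] <= p[1] < corner[1] + side)


def f_measure(next_test, fails, rng, limit=10_000):
    """Number of tests run until the first failure is hit (capped at limit)."""
    executed = []
    for count in range(1, limit + 1):
        test = next_test(executed, rng)
        if fails(test):
            return count
        executed.append(test)
    return limit


def mean_f_measure(next_test, runs=200, seed=1):
    """Average F-measure over independent simulated runs."""
    rng = random.Random(seed)
    return sum(f_measure(next_test, in_failure_block, rng) for _ in range(runs)) / runs


print("random testing:", mean_f_measure(next_random_test))
print("FSCS-ART:      ", mean_f_measure(next_art_test))
```

Each ART step compares every candidate against all previously executed tests, so the selection cost grows with the number of tests already run. That overhead is the reason runtime is reported alongside effectiveness.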

Citations: 73

Evaluating the efficacy of test-driven development: industrial case studies

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159787

Thirumalesh Bhat, Nachiappan Nagappan

This paper discusses software development using the Test Driven Development (TDD) methodology in two different environments (Windows and MSN divisions) at Microsoft. In both these case studies we measure the various context, product and outcome measures to compare and evaluate the efficacy of TDD. We observed a significant increase in quality of the code (greater than two times) for projects developed using TDD compared to similar projects developed in the same organization in a non-TDD fashion. The projects also took at least 15% extra upfront time for writing the tests. Additionally, the unit tests have served as auto documentation for the code when libraries/APIs had to be used as well as for code maintenance.

Citations: 170

A framework for the analysis of software cost estimation accuracy

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159745

Stein Grimstad, M. Jørgensen

Many software companies track and analyze project performance by measuring cost estimation accuracy. A high estimation error is frequently interpreted as poor estimation skills. This is not necessarily a correct interpretation. High estimation error can also be a result of other factors, such as high estimation complexity and insufficient cost control of the project. Through a real-life example we illustrate how the lack of proper estimation error analysis technique can bias analyses of cost estimation accuracy and lead to wrong conclusions. Further, we examine a selection of cost estimation studies, and show that they frequently do not take the necessary actions to ensure meaningful interpretations of estimation error data. Motivated by these results, we propose a general framework that, we believe, will improve analyses of software cost estimation error.

Citations: 63

Distributed versus face-to-face meetings for architecture evaluation: a controlled experiment

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159771

M. Babar, B. Kitchenham, D. R. Jeffery

Scenario-based methods for evaluating software architecture require a large number of stakeholders to be collocated for evaluation sessions. Collocating stakeholders is often an expensive exercise. To reduce expense, we have proposed a framework for supporting software architecture evaluation process using groupware systems. This paper presents a controlled experiment that we conducted to assess the effectiveness of scenario profile construction using distributed meetings. We used a cross-over experiment involving 32 teams of three 3rd and 4th year undergraduate students. We found that the quality of scenarios produced by distributed teams using a groupware tool was significantly better than the quality of scenarios produced by face-to-face teams (p<0.001). However, questionnaires indicated that most participants preferred the face-to-face arrangement (82%) and 60% thought the distributed meetings were less efficient. We conclude that distributed meetings are extremely effective but that tool support must be of a high standard or participants will not find distributed meetings acceptable.

Citations: 11

Analysis of the influence of communication between researchers on experiment replication 研究人员之间的交流对实验复制的影响分析

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159741

S. Vegas, Natalia Juristo Juzgado, A. Moreno, Martín Solari, P. Letelier

The replication of experiments is a key undertaking in SE. Successful replications enable a discipline's body of knowledge to grow, as the results are added to those of earlier replications. However, replication is extremely difficult in SE, primarily because it is difficult to get a setting that is exactly the same as in the original experiment. Consequently, changes have to be made to the experiment to adapt it to the new site. To be able to replicate an experiment, information also has to be transmitted (usually orally and in writing) between the researchers who ran the experiment earlier and the ones who are going to replicate the experiment. This article examines the influence of the type of communication there is between experimenters on how successful a replication is. We have studied three replications of the same experiment in which different types of communication were used.

Citations: 52

Evaluating guidelines for empirical software engineering studies

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159742

B. Kitchenham, H. Al-Kilidar, M. Babar, Michael Berry, Karl Cox, J. Keung, F. Kurniawati, M. Staples, He Zhang, Liming Zhu

Background. Several researchers have criticized the standards of performing and reporting empirical studies in software engineering. In order to address this problem, Andreas Jedlitschka and Dietmar Pfahl have produced reporting guidelines for controlled experiments in software engineering. They pointed out that their guidelines needed evaluation. We agree that guidelines need to be evaluated before they can be widely adopted. If guidelines are flawed, they will cause more problems than they solve. Aim. The aim of this paper is to present the method we used to evaluate the guidelines and report the results of our evaluation exercise. We suggest our evaluation process may be of more general use if reporting guidelines for other types of empirical study are developed. Method. We used perspective-based inspections to perform a theoretical evaluation of the guidelines. A separate inspection was performed for each perspective. The perspectives used were: Researcher, Practitioner/Consultant, Meta-analyst, Replicator, Reviewer and Author. Apart from the Author perspective, the inspections were based on a set of questions derived by brainstorming. The inspection using the Author perspective reviewed each section of the guidelines sequentially. Results. The question-based perspective inspections detected 42 issues where the guidelines would benefit from amendment or clarification and 8 defects. Conclusions. Reporting guidelines need to specify what information goes into what section and avoid excessive duplication. Software engineering researchers need to be cautious about adopting reporting guidelines that differ from those used by other disciplines. The current guidelines need to be revised and the revised guidelines need to be subjected to further theoretical and empirical validation. Perspective-based inspection is a useful validation method but the practitioner/consultant perspective presents difficulties.

Citations: 51

Improving software testing by observing practice

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159773

Ossi Taipale, K. Smolander

The objective of this qualitative study was to understand the complex practice of software testing, and based on this knowledge, to develop process improvement propositions that could concurrently reduce development and testing costs and improve software quality. First, a survey of testing practices was conducted and 26 organizational units (OUs) were interviewed. From this sample, five OUs were further selected for an in-depth case study. The study used grounded theory as its research method and the data was collected from 41 theme-based interviews. The analysis yielded improvement propositions that included enhanced testability of software components, efficient communication and interaction between development and testing, early involvement of testing, and risk-based testing. The connective and central improvement proposition was that testing ought to adapt to the business orientation of the OU. Other propositions were integrated around this central proposition. The results of this study can be used in improving development and testing processes.

Citations: 49

A family of empirical studies to compare informal and optimization-based planning of software releases

International Symposium on Empirical Software Engineering

Pub Date : 2006-09-21 DOI: 10.1145/1159733.1159766

Gengshen Du, J. McElroy, G. Ruhe

Replication of experiments, or performing a series of related studies, aims at attaining a higher level of validity of results. This paper reports on a series of empirical studies devoted to comparing informal release planning with two variants of optimization-based release planning. Two research questions were studied: How does optimization-based release planning compare with informal planning in terms of (i) time to generate release plans, and the feasibility and quality of those plans, and (ii) understanding and confidence of generated solutions and trust in the release planning process. For the family of empirical studies, the paper presents two types of results related to (i) the two research questions to compare the release planning techniques, and (ii) the evolution and lessons learned while conducting the studies.

Citations: 27
