First International Symposium on Empirical Software Engineering and Measurement (ESEM 2007): Latest Papers

Is This Cost Estimate Reliable? -- The Relationship between Homogeneity of Analogues and Estimation Reliability
Naoki Ohsugi, Akito Monden, Nahomi Kikuchi, Mike Barker, Masateru Tsunoda, Takeshi Kakimoto, Ken-ichi Matsumoto
Analogy-based cost estimation provides a useful and intuitive means to support decision making in software project management. It derives the cost estimate for completing a project from information about similar past projects, namely the analogues. While on average this method provides a relatively accurate cost estimate, there remains a possibility of large estimation errors. In this paper, we empirically tested the hypothesis that "using more homogeneous analogues produces a more reliable cost estimate" using a software engineering data repository established by the Software Engineering Center (SEC), Information-technology Promotion Agency, Japan. The test showed that estimation reliability differed considerably between low- and high-homogeneity projects. For instance, the difference was 22.9% (p = 0.021) in the percentage of projects receiving accurate estimates (better than the Median of Magnitude of Relative Error).
DOI: https://doi.org/10.1109/ESEM.2007.31 (published 2007-09-20)
Citations: 10
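The accuracy criterion named in the abstract above, the Magnitude of Relative Error (MRE) and its median (MdMRE), can be sketched as follows. This is a minimal illustration using the standard definition MRE = |actual - estimate| / actual; the function names and the sample effort values are hypothetical and are not taken from the paper or the SEC repository.

```python
from statistics import median


def mre(actual, estimate):
    """Magnitude of Relative Error for one project: |actual - estimate| / actual."""
    if actual <= 0:
        raise ValueError("actual effort must be positive")
    return abs(actual - estimate) / actual


def mdmre(actuals, estimates):
    """Median of the per-project MRE values."""
    return median(mre(a, e) for a, e in zip(actuals, estimates))


def share_accurate(errors, cutoff):
    """Fraction of projects whose MRE is strictly better (lower) than the cutoff."""
    return sum(1 for err in errors if err < cutoff) / len(errors)


# Hypothetical actual and estimated efforts for three projects.
actuals = [100.0, 200.0, 50.0]
estimates = [80.0, 260.0, 50.0]
errors = [mre(a, e) for a, e in zip(actuals, estimates)]
cutoff = mdmre(actuals, estimates)
print(errors, cutoff, share_accurate(errors, cutoff))
```

Comparing `share_accurate` between a low-homogeneity group and a high-homogeneity group, with the overall MdMRE as the cutoff, is one plausible reading of the percentage difference reported in the abstract; the paper itself should be consulted for the exact procedure.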
Cognitive Limits of Software Cost Estimation
Ricardo Valerdi
This paper explores the cognitive limits of estimation in the context of software cost estimation. Two heuristics, representativeness and anchoring, motivate two experiments involving psychology students, engineering students, and engineering practitioners. The first experiment, designed to determine if there is a difference in estimating ability in everyday quantities, demonstrates that the three populations estimate with relatively equal accuracy. The results shed light on the distribution of estimates and the process of subjective judgment. The second experiment, designed to explore abilities for estimating the cost of software-intensive systems given incomplete information, shows that predictions by engineering students and practitioners are within 3-12% of each other. The value of this work is in helping better understand how software engineers make decisions based on limited information. The manifestation of the two heuristics is discussed together with the implications for the development of software cost estimation models in light of the findings from the two experiments.
DOI: https://doi.org/10.1109/ESEM.2007.85 (published 2007-09-20)
Citations: 276
Developing Search Strategies for Detecting Relevant Experiments for Systematic Reviews
Ó. Dieste, O.A.G. Padua
Information retrieval is an important problem in any evidence-based discipline. Although evidence-based software engineering (EBSE) is not immune to this fact, this question has not been examined at length. The goal of this paper is to analyse the optimality of search strategies for use in systematic reviews. We tried out 29 search strategies using different terms and combinations of terms. We evaluated their sensitivity and precision with a view to finding an optimum strategy. From this study of search strategies we were able to analyse trends and weaknesses in terminology use in articles reporting experiments.
DOI: https://doi.org/10.1109/ESEM.2007.39 (published 2007-09-20)
Citations: 94
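The two measures used in the abstract above to compare search strategies can be illustrated with a short sketch. It uses the usual information-retrieval definitions (sensitivity is the share of relevant articles that a search retrieves, precision is the share of retrieved articles that are relevant); the function name and the article identifiers are hypothetical.

```python
def evaluate_search(retrieved, relevant):
    """Return (sensitivity, precision) for one search strategy.

    retrieved: identifiers of the articles the search returned.
    relevant:  identifiers of the articles known to report experiments.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    sensitivity = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return sensitivity, precision


# Hypothetical example: four articles retrieved, three relevant overall.
sensitivity, precision = evaluate_search({"a", "b", "c", "d"}, {"a", "b", "e"})
print(sensitivity, precision)
```

A strategy that retrieves everything reaches a sensitivity of 1.0 at very low precision, which is why the two values need to be weighed together when looking for an optimum strategy.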
Evaluation of Feature Extraction Methods on Software Cost Estimation
Burak Turhan, Onur Kutlubay, A. Bener
This research investigates the effects of linear and non-linear feature extraction methods on the cost estimation performance. We use principal component analysis (PCA) and Isomap for extracting new features from observed ones and evaluate these methods with support vector regression (SVR) on publicly available datasets. Our results for these datasets indicate there is no significant difference between the performances of these linear and non-linear feature extraction methods.
DOI: https://doi.org/10.1109/ESEM.2007.57 (published 2007-09-20)
Citations: 12
Defect Detection Efficiency: Test Case Based vs. Exploratory Testing
Juha Itkonen, M. Mäntylä, C. Lassenius
This paper presents a controlled experiment comparing the defect detection efficiency of exploratory testing (ET) and test case based testing (TCT). While traditional testing literature emphasizes test cases, ET stresses the individual tester's skills during test execution and does not rely upon predesigned test cases. In the experiment, 79 advanced software engineering students performed manual functional testing on an open-source application with actual and seeded defects. Each student participated in two 90-minute controlled sessions, using ET in one and TCT in the other. We found no significant differences in defect detection efficiency between TCT and ET. The distributions of detected defects did not differ significantly regarding technical type, detection difficulty, or severity. However, TCT produced significantly more false defect reports than ET. Surprisingly, our results show no benefit of using predesigned test cases in terms of defect detection efficiency, emphasizing the need for further studies of manual testing.
DOI: https://doi.org/10.1109/ESEM.2007.56 (published 2007-09-20)
Citations: 87
Approaching the ERP Project Cost Estimation Problem: an Experiment
M. Daneva
This poster reports on a solution to ERP project cost estimation and on results from its first experimental application.
DOI: https://doi.org/10.1109/ESEM.2007.72 (published 2007-09-20)
Citations: 21
Tuning anonymity level for assuring high data quality: an empirical study.
G. Canfora, C. A. Visaggio
Preserving data privacy is posing new challenges to software engineering researchers. Current technologies can be too cumbersome, pervasive or costly to be successfully applied in dynamic and complex scenarios where data exchange occurs among a large number of applications. Anonymization techniques seem to be a promising candidate, even if preliminary investigations suggest that they could deteriorate the quality of data. An empirical study has been carried out in order to understand the relationship between the anonymization level and the degradation of data quality.
DOI: https://doi.org/10.1109/ESEM.2007.23 (published 2007-09-20)
Citations: 7
Characterizing Software Architecture Changes: An Initial Study
Byron J. Williams, Jeffrey C. Carver
With today's ever increasing demands on software, developers must produce software that can be changed without the risk of degrading the software architecture. Degraded software architecture is problematic because it makes the system more prone to defects and increases the cost of making future changes. The effects of making changes to software can be difficult to measure. One way to address software changes is to characterize their causes and effects. This paper introduces an initial architecture change characterization scheme created to assist developers in measuring the impact of a change on the architecture of the system. It also presents an initial study conducted to gain insight into the validity of the scheme. The results of this study indicated a favorable view of the viability of the scheme by the subjects, and the scheme increased the ability of novice developers to assess and adequately estimate change effort.
DOI: https://doi.org/10.1109/ESEM.2007.26 (published 2007-09-20)
Citations: 53
Impact Analysis of Missing Values on the Prediction Accuracy of Analogy-based Software Effort Estimation Method AQUA
Jingzhou Li, Ahmed Al-Emran, G. Ruhe
Effort estimation by analogy (EBA) is often confronted with missing values. Our former analogy-based method AQUA is able to tolerate missing values in the data set, but it is unclear how the percentage of missing values impacts the prediction accuracy and whether there is an upper bound on how large this percentage may become while still guaranteeing the applicability of AQUA. This paper investigates these questions through an impact analysis. The impact analysis is conducted for seven data sets of different sizes and with different initial percentages of missing values. The major results are that (i) we confirm the intuition that the more missing values, the poorer the prediction accuracy of AQUA; (ii) there is a quadratic dependency between the prediction accuracy and the percentage of missing values; and (iii) the upper limit of missing values for the applicability of AQUA is determined as 40%. These results are obtained in the context of AQUA. Further analysis is necessary for other ways of applying EBA, such as using different similarity measures or analogy adaptation methods from those used in AQUA. For that purpose, the experimental design in this study can be adapted.
DOI: https://doi.org/10.1109/ESEM.2007.10 (published 2007-09-20)
Citations: 103
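To make the idea of analogy-based estimation that tolerates missing values more concrete, here is a minimal sketch. It is not AQUA: the distance function, the choice of k, and the adaptation rule (a plain mean of the nearest analogues' efforts) are assumptions chosen for brevity, and the feature values are hypothetical and assumed to be already scaled to comparable ranges.

```python
import math


def distance(a, b):
    """Root mean squared difference over the features present in both projects.

    Missing values are represented as None and are simply skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf  # nothing to compare, so treat as maximally dissimilar
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))


def estimate_effort(target, history, k=2):
    """Mean effort of the k most similar past projects.

    history: list of (features, effort) tuples.
    """
    ranked = sorted(history, key=lambda item: distance(target, item[0]))
    nearest = [item for item in ranked[:k] if distance(target, item[0]) != math.inf]
    if not nearest:
        raise ValueError("no comparable past project")
    return sum(effort for _, effort in nearest) / len(nearest)


# Hypothetical past projects; two of them have one missing feature each.
history = [
    ([1.0, None, 3.0], 100.0),
    ([1.0, 2.0, None], 120.0),
    ([9.0, 9.0, 9.0], 900.0),
]
print(estimate_effort([1.0, 2.0, 3.0], history, k=2))
```

With more missing values, fewer feature pairs contribute to each distance, so the ranking of analogues becomes less trustworthy. That is the intuition behind result (i) in the abstract, although the sketch says nothing about the quadratic dependency or the 40% limit reported there.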
Fault-Prone Filtering: Detection of Fault-Prone Modules Using Spam Filtering Technique
O. Mizuno, Shiro Ikami, Shuya Nakaichi, T. Kikuno
Detecting fault-prone modules in source code is important for assuring software quality. Most conventional fault-prone detection approaches have been based on software metrics. Such approaches, however, have difficulties in collecting the metrics and constructing mathematical models based on them. In order to mitigate such difficulties, we propose a novel approach for detecting fault-prone modules using a spam filtering technique. Because of the growing need for spam e-mail detection, spam filtering has developed into a convenient and effective technique for text mining. In our approach, source code modules are treated as text files and fed directly to the spam filter to detect fault-prone modules. To show the usefulness of our approach, we conducted an experiment using the source code repository of a Java-based open source development project. The experimental results show that our approach can classify more than 70% of software modules correctly.
DOI: https://doi.org/10.1109/ESEM.2007.29 (published 2007-09-20)
Citations: 14
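The idea of treating source code modules as text and passing them to a spam filter can be sketched with a small token-based naive Bayes classifier. The abstract does not say which filter the authors used, so naive Bayes is only a stand-in here; the class name, the tokenizer, the labels "faulty" and "clean", and the training snippets are all hypothetical.

```python
import math
import re
from collections import Counter

# Words and numbers become one token each; every other non-space character is its own token.
TOKEN = re.compile(r"\w+|\S")


def tokenize(source):
    return TOKEN.findall(source)


class TokenNaiveBayes:
    """Naive Bayes over source-code tokens with add-one smoothing."""

    LABELS = ("faulty", "clean")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.docs = Counter()

    def train(self, source, label):
        self.counts[label].update(tokenize(source))
        self.docs[label] += 1

    def score(self, source, label):
        # +1 reserves a slot for tokens never seen in training.
        vocab = len(set(self.counts["faulty"]) | set(self.counts["clean"])) + 1
        total = sum(self.counts[label].values())
        # Smoothed class prior, then smoothed per-token likelihoods, in log space.
        log_prob = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
        for tok in tokenize(source):
            log_prob += math.log((self.counts[label][tok] + 1) / (total + vocab))
        return log_prob

    def classify(self, source):
        return max(self.LABELS, key=lambda label: self.score(source, label))


model = TokenNaiveBayes()
model.train("if (ptr == null) { ptr.free(); }", "faulty")
model.train("return total + count;", "clean")
print(model.classify("ptr.free();"))
```

The appeal of this approach, as the abstract notes, is that no metrics need to be collected: the module text itself is the input. A realistic evaluation would train on modules labeled from a project's defect history rather than on two toy snippets.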