Background: Given the dynamic resource allocation schemes offered by cloud computing, effective scheduling algorithms are needed to exploit these benefits. Aim: In this paper, we propose a scheduling algorithm that integrates task grouping, priority awareness and SJF (shortest-job-first) to reduce waiting time and makespan and to maximize resource utilization. Method: Scheduling allocates tasks to the most suitable resources while accounting for dynamic parameters, restrictions and demands such as network constraints, resource processing capability and waiting time. The proposed algorithm combines task grouping, bandwidth-aware prioritization and SJF, with the aim of reducing processing time, waiting time and overhead. In the experiment, tasks are generated using a Gaussian distribution, resources are created using a random distribution, and the CloudSim framework is used to simulate the proposed algorithm under various conditions. The results are then compared with existing algorithms for evaluation. Results: In comparison with existing task grouping algorithms, the proposed algorithm reduces waiting time and processing time significantly (by over 30%). Conclusion: The proposed method effectively minimizes waiting time and processing time and reduces processing cost, achieving high resource utilization with minimal overhead while lessening the influence of the bandwidth bottleneck in communication.
{"title":"An Empirical Investigation on the Simulation of Priority and Shortest-Job-First Scheduling for Cloud-Based Software Systems","authors":"J. Ru, J. Keung","doi":"10.1109/ASWEC.2013.19","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.19","url":null,"abstract":"Background: Given the dynamics in resource allocation schemes offered by cloud computing, effective scheduling algorithms are important to utilize these benefits. Aim: In this paper, we propose a scheduling algorithm integrated with task grouping, priority-aware and SJF (shortest-job-first) to reduce the waiting time and make span, as well as to maximize resource utilization. Method: Scheduling is responsible for allocating the tasks to the best suitable resources with consideration of some dynamic parameters, restrictions and demands, such as network restriction and resource processing capability as well as waiting time. The proposed scheduling algorithm is integrated with task grouping, prioritization of bandwidth awareness and SJF algorithm, which aims at reducing processing time, waiting time and overhead. In the experiment, tasks are generated using Gaussian distribution and resources are created using Random distribution as well as CloudSim framework is used to simulate the proposed algorithm under various conditions. Results are then compared with existing algorithms for evaluation. Results: In comparison with existing task grouping algorithms, results show that the proposed algorithm waiting time and processing time decreased significantly (over 30%). Conclusion: The proposed method effectively minimizes waiting time and processing time and reduces processing cost to achieve optimum resources utilization and minimum overhead, as well as to reduce influence of bandwidth bottleneck in communication.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"697 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124151542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data used in measurement and prediction systems warrants increasingly close scrutiny. In this paper we propose a taxonomy of data quality challenges in empirical software engineering, based on an extensive review of prior research. We consider current assessment techniques for each quality issue and, where available, the mechanisms proposed to address them. Our taxonomy classifies data quality issues into three broad areas: first, characteristics of data that make them unfit for modeling; second, data set characteristics that raise concerns about applying a model built on one data set to another; and third, factors that prevent or limit data accessibility and trust. We identify this last area as being in particular need of further research.
{"title":"A Taxonomy of Data Quality Challenges in Empirical Software Engineering","authors":"M. Bosu, Stephen G. MacDonell","doi":"10.1109/ASWEC.2013.21","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.21","url":null,"abstract":"Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data used in measurement and prediction systems warrants increasingly close scrutiny. In this paper we propose a taxonomy of data quality challenges in empirical software engineering, based on an extensive review of prior research. We consider current assessment techniques for each quality issue and proposed mechanisms to address these issues, where available. Our taxonomy classifies data quality issues into three broad areas: first, characteristics of data that mean they are not fit for modeling, second, data set characteristics that lead to concerns about the suitability of applying a given model to another data set, and third, factors that prevent or limit data accessibility and trust. We identify this latter area as of particular need in terms of further research.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128314427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The volume of software development outsourcing is growing enormously owing to the associated benefits of outsourcing and the limitations of organizations. However, a large number of projects outsourced for software development fail to achieve the anticipated results. In most such cases, the reasons for failure are traced back to the Requirements Engineering (RE) process. Despite this state of affairs, adequate efforts have not been made to prevent the failure of outsourced software development projects due to RE process issues. This research was conducted with the aim of improving the RE process for software development outsourcing by identifying the significant RE practices for this process. The study is based on Sommerville and Sawyer's RE practices for six key areas of the RE process. A survey research method was used to collect data from 108 practitioners belonging to 18 national and multinational organizations engaged in software development outsourcing. After analysing the data, we identified the RE practices that are significant for improving the RE process for software development outsourcing and hence for achieving the desired results.
{"title":"Significant Requirements Engineering Practices for Software Development Outsourcing","authors":"J. Iqbal, R. Ahmad, M. Nasir, M. A. Noor","doi":"10.1109/ASWEC.2013.25","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.25","url":null,"abstract":"The volume of software development outsourcing is growing enormously owing to the associated benefits of outsourcing and limitations of organizations. However, a large number of the projects outsourced for software development are failed to achieve anticipated results. In most of such cases, the reasons for failure are traced back to the Requirements Engineering (RE) process. In spite of this state of affairs, adequate efforts have not been made to avoid the failure of outsourced software development projects due to RE process issues. This research has been conducted with the aim of cultivating the RE process for software development outsourcing by identifying the significant RE practices for this process. The study is based upon Sommerville and Sawyer's RE practices for six key areas of the RE process. A survey research method has been used to collect data from 108 practitioners belonging to 18 national and multinational organizations associated with software development outsourcing. After analysis of the obtained data, we have identified the RE practices which are significant in order to improve the RE process for software development outsourcing and hence to achieve desired results.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132764701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Non-functional requirements (NFRs) such as security, reliability and performance play a crucial role in the development of modern distributed systems. The burden of incorporating NFRs into a system's architecture, as well as the determination of new design-level NFRs, can be greatly eased by the use of a structured approach providing guidance to developers. Such structured approaches, however, require equally structured system characterisations. This is especially important for distributed systems, which are inherently complex and multi-faceted. In this paper we propose a form of characterisation which we term architectural decomposition, and present a multi-level conceptual framework for decomposing distributed software architectures. Using the framework to decompose architectures can help guide the incorporation and, via complementary analysis processes, the determination of NFRs at the architectural level. We describe each of the levels of the framework in turn, propose a complementary analysis process for security based on threat modelling as well as a process for using the framework itself, and demonstrate the utility of our approach via an example derived from a real-life distributed architecture.
{"title":"Decomposing Distributed Software Architectures for the Determination and Incorporation of Security and Other Non-functional Requirements","authors":"Anton V. Uzunov, K. Falkner, E. Fernández","doi":"10.1109/ASWEC.2013.14","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.14","url":null,"abstract":"Non-functional requirements (NFRs) such as security, reliability and performance play a crucial role in the development of modern distributed systems. The burden of incorporating NFRs into a system's architecture, as well the determination of new design-level NFRs, can be greatly eased by the use of a structured approach providing guidance to developers. Such structured approaches, however, require equally structured system characterisations. This is especially important for distributed systems, which are inherently complex and multi-faceted. In this paper we propose a form of characterisation which we term architectural decomposition, and present a multi-level conceptual framework for decomposing distributed software architectures. Using the framework for decomposing architectures can help guide the incorporation and, via complementary analysis processes, the determination of NFRs at the architectural level. We describe each of the levels of the framework in turn, propose a complementary analysis process for security based on threat modelling, as well as a process for using the framework itself, and demonstrate the utility of our approach via an example derived from a real-life distributed architecture.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114890861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Change is one aspect of business that is inevitable. The volatile nature of business requirements is considered one of the main contributors to information technology project failure. One of the key reasons change is difficult to manage is the lack of adequate methods for communicating change from the business to the IT department. In this paper, we present an approach to support change communication and elicitation through a change specification and classification method. The specification framework is based on an onto-terminological concept, a combination of ontology and terminology. It is constructed using an abstraction of the Goal Question Metric approach to represent the linguistic function (terminology), whilst an abstraction of the Resource Description Framework (representing the ontology) is used to elaborate the logical connections between the terms. Through this approach, we are able to describe requirements change using clear guidelines and also provide a common communication medium for both IT and business people.
{"title":"A Method of Specifying and Classifying Requirements Change","authors":"Shalinka Jayatilleke, R. Lai","doi":"10.1109/ASWEC.2013.29","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.29","url":null,"abstract":"Change is one aspect of business that is inevitable. The volatile nature of business requirements is considered one of the main contributors to information technology project failure. One of the key reasons for difficulty in managing change is the lack of adequate methods to communicate change from business to the IT department. In this paper, we present an approach to support change communication and elicitation through a change specification and classification method. The specification framework is based on an onto-terminological concept which is a combination of both ontology and terminology. It is constructed using an abstraction of the Goal Question Metrics representing the linguistic function (terminology) whilst an abstraction of the Resource Description Framework (representing ontology) is used to elaborate the logical connection of the terms. Through this approach, we are able to describe requirements change using clear guidelines and also provide a common communication medium for both IT and business people.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"237 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123041618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic program analysis tools serve many important software engineering tasks such as profiling, debugging, testing, program comprehension, and reverse engineering. Many dynamic analysis tools rely on program instrumentation and are implemented using low-level instrumentation libraries, resulting in tedious and error-prone tool development. The recently released Domain-Specific Language for Instrumentation (DiSL) was designed to boost the productivity of tool developers targeting the Java Virtual Machine without impairing the performance of the resulting tools. DiSL offers high-level programming abstractions designed specifically for the development of instrumentation-based dynamic analysis tools. In this paper, we present a controlled experiment aimed at quantifying the impact of the DiSL programming model and high-level abstractions on the development of dynamic program analysis instrumentations. The experiment results show that, compared with a prevailing, state-of-the-art instrumentation library, DiSL users were able to complete instrumentation development tasks faster and with more correct results.
{"title":"Productive Development of Dynamic Program Analysis Tools with DiSL","authors":"Aibek Sarimbekov, Y. Zheng, Danilo Ansaloni, L. Bulej, L. Marek, Walter Binder, P. Tůma, Zhengwei Qi","doi":"10.1109/ASWEC.2013.12","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.12","url":null,"abstract":"Dynamic program analysis tools serve many important software engineering tasks such as profiling, debugging, testing, program comprehension, and reverse engineering. Many dynamic analysis tools rely on program instrumentation and are implemented using low-level instrumentation libraries, resulting in tedious and error-prone tool development. The recently released Domain-Specific Language for Instrumentation (DiSL) was designed to boost the productivity of tool developers targeting the Java Virtual Machine, without impairing the performance of the resulting tools. DiSL offers high-level programming abstractions especially designed for development of instrumentation-based dynamic analysis tools. In this paper, we present a controlled experiment aimed at quantifying the impact of the DiSL programming model and high-level abstractions on the development of dynamic program analysis instrumentations. The experiment results show that compared with a prevailing, state-of-the-art instrumentation library, the DiSL users were able to complete instrumentation development tasks faster, and with more correct results.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133910478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traceability benchmarks are essential for the evaluation of traceability recovery techniques. This includes the validation of an individual traceability technique itself and the objective comparison of the technique with other traceability techniques. However, it is generally acknowledged that obtaining or building meaningful and robust benchmarks is a real challenge for researchers. In this paper, we describe an approach that enables researchers to establish affordable and robust benchmarks. We have designed rigorous manual identification and verification strategies to determine whether or not a link is correct, and we have developed a formula to calculate the probability of errors in benchmarks. Analysis of the error probability results shows that our approach can produce high-quality benchmarks and that our strategies significantly reduce the error probability in them.
{"title":"Development of Robust Traceability Benchmarks","authors":"Xiaofan Chen, J. Hosking, J. Grundy, R. Amor","doi":"10.1109/ASWEC.2013.26","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.26","url":null,"abstract":"Traceability benchmarks are essential for the evaluation of traceability recovery techniques. This includes the validation of an individual trace ability technique itself and the objective comparison of the technique with other traceability techniques. However, it is generally acknowledged that it is a real challenge for researchers to obtain or build meaningful and robust benchmarks. This is because of the difficulty of obtaining or creating suitable benchmarks. In this paper, we describe an approach to enable researchers to establish affordable and robust benchmarks. We have designed rigorous manual identification and verification strategies to determine whether or not a link is correct. We have developed a formula to calculate the probability of errors in benchmarks. Analysis of error probability results shows that our approach can produce high quality benchmarks, and our strategies significantly reduce error probability in them.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132309036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ownership and related systems impose restrictions on the object graph that can help improve program structure, exploit concurrency and verify software. Such systems rely on the presence of appropriate ownership annotations in the source code. Unfortunately, manually adding ownership annotations to legacy systems is a tedious process. Previous attempts at automatically inferring such ownership systems do not produce modularly checkable annotations (i.e. annotations that allow classes to be checked in isolation), making them difficult to incorporate into day-to-day development. In this paper, we present OwnKit, a system for automatically inferring ownership annotations that are modularly checkable. We describe and evaluate our approach on a number of real-world benchmarks and compare it against an existing system.
{"title":"OwnKit: Inferring Modularly Checkable Ownership Annotations for Java","authors":"Constantine Dymnikov, David J. Pearce, A. Potanin","doi":"10.1109/ASWEC.2013.30","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.30","url":null,"abstract":"Ownership and related systems impose restrictions on the object graph that can help improve program structure, exploit concurrency and verify software. Such systems rely on the presence of appropriate ownership annotations in the source code. Unfortunately, manually adding ownership annotations to legacy systems is a tedious process. Previous attempts at automatically inferring such ownership systems do not produce modularly checkable annotations (i.e. which allow classes to be checked in isolation) making them difficult to incorporate into day-to-day development. In this paper, we present Own Kit - a system for automatically inferring ownership annotations which are modularly checkable. We describe and evaluate our approach on a number of real-world benchmarks and compare against an existing system.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125116413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In model-driven engineering, high-level models of an application are constructed to enable reasoning about functional and non-functional requirements independently of implementation issues and concerns. This reduces maintenance, shortens development time, and permits automated model updates, system model executions, and impact assessment. Within model-driven engineering, multi-modeling integrates models that abstract various aspects of the system, such as its I/O, behavioral, and functional aspects, at different levels of granularity and using various domain-specific modeling languages. An important challenge is to understand the relationships between these models so that multi-model consistency is preserved when changes in one model affect other models in the multi-model. This paper presents a multi-modeling architecture that captures model relationships at the syntactic and semantic levels. We define a taxonomy of change effects that relies on a relationship correspondence meta-model to highlight and trace the impact of changes across various modeling environments. Following the correspondence meta-model and associated change effects, our prototype implementation ensures that multi-model consistency is met and notifies stakeholders of significant changes. Our case study of a submarine tracking system checks multi-model consistency and highlights the impact of changes across system modeling tools that capture its functional and behavioral aspects, among others. Our experiments show the feasibility of our approach while highlighting important challenges.
{"title":"A Model-Driven Approach for Ensuring Change Traceability and Multi-model Consistency","authors":"Claudia Szabo, Yufei Chen","doi":"10.1109/ASWEC.2013.24","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.24","url":null,"abstract":"In model driven engineering, high-level models of an application are constructed to enable reasoning about functional and non-functional requirements independently of implementation issues and concerns. This allows for reduced maintenance, shortens development time, and permits automated model updates, system model executions, and impact assessment. Part of model driven engineering, multi-modeling integrates models that abstract various aspects of the system, such as I/O, behavioral, and functional among others, at different levels of granularity and using various domain specific modeling languages. An important challenge is to understand the relationship between these models towards preserving multi-model consistency as changes in one model affect other models in the multi-model. This paper presents a multi-modeling architecture that captures model relationships at syntactic and semantic levels. We define a taxonomy of change effects that relies on a relationship correspondence meta-model to highlight and trace the impact of changes across various modeling environments. Following the correspondence meta-model and associated change effects, our prototype implementation ensures that multi-model consistency is met and notifies stakeholders of significant changes. Our case study of a submarine tracking system checks multi model consistency and highlights the impact of changes across system modeling tools that capture its functional and behavioral aspects among others. Our experiments show the feasibility of our approach while highlighting important challenges.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129183228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The detection and correction of defects remains among the most time-consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault-prone modules helps to focus testing and inspections, limiting wasted effort and potentially improving detection rates. However, software metrics data are often extremely noisy, with enormous imbalances in the sizes of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved, or at worst comparable, performance relative to earlier approaches on standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort for different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and its performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers, and with our own comprehensive evaluation of these methods.
{"title":"Predicting Fault-Prone Software Modules with Rank Sum Classification","authors":"J. Cahill, J. Hogan, Richard N. Thomas","doi":"10.1109/ASWEC.2013.33","DOIUrl":"https://doi.org/10.1109/ASWEC.2013.33","url":null,"abstract":"The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.","PeriodicalId":394020,"journal":{"name":"2013 22nd Australian Software Engineering Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133959560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}