The Impact of Parameters Optimization in Software Prediction Models
Authors: Asad Ali, C. Gravino
Venue: 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)
Publication date: 2022-08-01
DOI: 10.1109/SEAA56994.2022.00041 (https://doi.org/10.1109/SEAA56994.2022.00041)
Citations: 0
Abstract
Several studies have raised concerns about the performance of estimation techniques when they are used with the default parameters provided by specific development toolkits, e.g., Weka. In this paper, we evaluate the impact of parameter optimization on nine estimation techniques in the Software Development Effort Estimation (SDEE) and Software Fault Prediction (SFP) domains, with the aim of providing more general findings. To this end, we employ three datasets from the SDEE domain (China, Maxwell, Nasa) and three regression-based datasets from the SFP domain (Ant, Xalan, Xerces). For parameter optimization, we consider four algorithms from different families: Grid Search, Random Search, Simulated Annealing, and Bayesian Optimization. The estimation techniques are Support Vector Machine, Random Forest, Classification and Regression Tree, Neural Networks, Averaged Neural Networks, k-Nearest Neighbor, Partial Least Squares, MultiLayer Perceptron, and Gradient Boosting Machine. The results reveal that, on both the SDEE and SFP datasets, seven out of nine estimation techniques require optimization/configuration of at least one parameter. In the majority of cases, the parameters of the employed estimation techniques are sensitive to optimization on specific types of data. Moreover, not all parameters need to be optimized, as some of them are not sensitive to optimization.
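To make the comparison between default and optimized parameters concrete, the following is a minimal sketch, not the authors' experimental setup. It tunes a single parameter (the number of neighbors k of a k-Nearest Neighbor regressor) with Grid Search and Random Search and compares both against an untuned baseline. The synthetic data, the default value k = 1, the search range 1 to 15, the leave-one-out error measure, and all function names are assumptions introduced only for illustration.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Predict the target for x as the mean target of its k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])


def loo_mae(data, k):
    """Leave-one-out mean absolute error of k-NN regression on data."""
    errors = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out the i-th observation
        errors.append(abs(y - knn_predict(train, x, k)))
    return statistics.fmean(errors)


def grid_search(data, candidates):
    """Evaluate every candidate k exhaustively; return (best_k, best_error)."""
    scored = [(k, loo_mae(data, k)) for k in candidates]
    return min(scored, key=lambda pair: pair[1])


def random_search(data, low, high, n_trials, rng):
    """Evaluate n_trials randomly sampled values of k; return (best_k, best_error)."""
    sampled = sorted({rng.randint(low, high) for _ in range(n_trials)})
    return grid_search(data, sampled)


# Synthetic, noisy one-feature regression data (an assumption; not one of the
# SDEE or SFP datasets named in the abstract).
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(60)]
data = [(x, 2.0 * x + rng.gauss(0.0, 5.0)) for x in xs]

default_k = 1  # assumed toolkit default, used only as the untuned baseline
default_err = loo_mae(data, default_k)
grid_k, grid_err = grid_search(data, range(1, 16))
rand_k, rand_err = random_search(data, 1, 15, 5, rng)

print(f"default k={default_k}: MAE={default_err:.3f}")
print(f"grid    k={grid_k}: MAE={grid_err:.3f}")
print(f"random  k={rand_k}: MAE={rand_err:.3f}")
```

Because the grid contains the default value, Grid Search can never score worse than the baseline on this error measure. Random Search evaluates only a subset of the same range, so it trades some accuracy for fewer evaluations. Simulated Annealing and Bayesian Optimization replace the sampling step with guided proposals and are not shown here.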