Behzad Soleimani Neysiani, S. Doostali, S. M. Babamir, Zahra Aminoroaya
2020 11th International Conference on Information and Knowledge Technology (IKT), published 2020-12-22. DOI: 10.1109/IKT51791.2020.9345611 (https://doi.org/10.1109/IKT51791.2020.9345611)
Fast Duplicate Bug Reports Detector Training using Sampling for Dimension Reduction: Using Instance-based Learning for Continuous Query in Real-World
Duplicate bug report detection (DBRD) is a well-known problem in software triage systems such as Bugzilla. For real-world use, where new bug reports are queried continuously, it is vital to keep the internal machine learning (ML) models of DBRD up to date. The training phase of ML algorithms is time-consuming, and its cost depends on the size of the training dataset. Instance-based learning (IbL) is an ML technique that reduces the number of samples in the training dataset so that a model can be retrained quickly as the database grows incrementally. This research introduces a hybrid approach that combines clustering with straightforward sampling to improve both the runtime and the validation performance of DBRD. The approach is evaluated on two bug report datasets, Android and Mozilla Firefox. The experimental evaluation shows acceptable results and an improvement in both runtime and validation performance of DBRD compared with the traditional approach without IbL.
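To make the idea of reducing a training set by clustering plus sampling concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name the clustering algorithm or the sampling rule, so k-means and uniform random sampling of a fixed fraction per cluster are assumptions. The names `kmeans`, `reduce_training_set`, `k`, and `fraction` are hypothetical, and the 2-D points stand in for whatever feature vectors a DBRD system would extract from bug reports.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means over lists of floats; returns one cluster label per point.

    Assumption: k-means is a stand-in; the abstract does not say which
    clustering algorithm the paper uses.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def reduce_training_set(points, k=3, fraction=0.25, seed=0):
    """Return sorted indices of a reduced sample.

    Keeps roughly `fraction` of every cluster (at least one point each), so
    each region of the feature space stays represented in the smaller set.
    """
    labels = kmeans(points, k, seed=seed)
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    rng = random.Random(seed)
    kept = []
    for label in sorted(clusters):
        members = clusters[label]
        n_keep = max(1, round(len(members) * fraction))
        kept.extend(rng.sample(members, n_keep))
    return sorted(kept)


if __name__ == "__main__":
    gen = random.Random(42)
    # Synthetic stand-in data: 40 points around each of three centers.
    data = [
        [cx + gen.gauss(0, 0.3), cy + gen.gauss(0, 0.3)]
        for cx, cy in [(0, 0), (5, 5), (0, 5)]
        for _ in range(40)
    ]
    kept = reduce_training_set(data, k=3, fraction=0.25)
    print(len(data), "samples reduced to", len(kept))
```

A classifier would then be retrained on only the kept indices. Sampling per cluster rather than over the whole dataset is one way to shrink the training set without dropping sparsely populated regions entirely. Whether that matches the paper's hybrid scheme cannot be determined from the abstract alone.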