A data-driven approach based on LDA for identifying duplicate bug report
Jingliang Chen, Zhe Ming, J. Su
IEEE Conf. on Intelligent Systems
DOI: 10.1109/IS.2016.7737385
Citations: 3
Abstract
Marking duplicate bugs in bug report data reduces the effort and cost of software development, maintenance, and evolution. Prior work has used machine learning techniques to mark duplicate bugs, but it relies on incomplete knowledge, which can become ineffective as data volume and complexity grow explosively. To redress this situation, in this paper we discover knowledge from bug report data that leads to high-quality services. Our work is the first to examine the effect of depth of knowledge on quality. Our approach has been applied to APACHE, ECLIPSE, and MOZILLA, covering 1,104,254 bug reports and 26 years of development time. The results show that our approach achieves high accuracy in marking duplicate bugs.