Yingling Li, Xing Che, Yuekai Huang, Junjie Wang, Song Wang, Yawen Wang, Qing Wang
{"title":"A Tale of Two Tasks: Automated Issue Priority Prediction with Deep Multi-task Learning","authors":"Yingling Li, Xing Che, Yuekai Huang, Junjie Wang, Song Wang, Yawen Wang, Qing Wang","doi":"10.1145/3544902.3546257","DOIUrl":null,"url":null,"abstract":"Background. Issues are prevalent, and identifying the correct priority of the reported issues is crucial to reduce the maintenance effort and ensure higher software quality. There are several approaches for the automatic priority prediction, yet they do not fully utilize the related information that might influence the priority assignment. Our observation reveals that there are noticeable correlations between an issue’s priority and its category, e.g., an issue of bug category tends to be assigned with higher priority than an issue of document category. This correlation motivates us to employ multi-task learning to share the knowledge about issue’s category prediction and facilitating priority prediction. Aims. This paper aims at providing an automatic approach for effective issue’s priority prediction, to reduce the burden of the project members and better manage the issues. Method. We propose issue priority prediction approach PRIMA with deep multi-task learning, which takes the issue category prediction as another task to facilitate the information sharing and learning. It consists of three main phases: 1) data preparation and augmentation phase, which allows data sharing beyond single task learning; 2) model construction phase, which designs shared layers to encode the semantics of textual descriptions, and task-specific layers to model two tasks in parallel; it also includes the indicative attributes to better capture an issue’s inherent meaning; 3) model training phase, which enables eavesdropping by shared loss function between two tasks. Results. Evaluations with four large-scale open-source projects show that PRIMA outperforms commonly-used and state-of-the-art baselines, with 32% -55% higher precision, and 28% - 56% higher recall. Compared with single task learning, the performance improvement reaches 18% in precision and 19% in recall. Results from our user study further prove its potential practical value. Conclusions. The proposed approach provides a novel and effective way for issue priority prediction, and sheds light on jointly exploring other issue-management tasks.","PeriodicalId":220679,"journal":{"name":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3544902.3546257","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
Background. Issues are prevalent, and identifying the correct priority of the reported issues is crucial to reduce the maintenance effort and ensure higher software quality. There are several approaches for the automatic priority prediction, yet they do not fully utilize the related information that might influence the priority assignment. Our observation reveals that there are noticeable correlations between an issue’s priority and its category, e.g., an issue of bug category tends to be assigned with higher priority than an issue of document category. This correlation motivates us to employ multi-task learning to share the knowledge about issue’s category prediction and facilitating priority prediction. Aims. This paper aims at providing an automatic approach for effective issue’s priority prediction, to reduce the burden of the project members and better manage the issues. Method. We propose issue priority prediction approach PRIMA with deep multi-task learning, which takes the issue category prediction as another task to facilitate the information sharing and learning. It consists of three main phases: 1) data preparation and augmentation phase, which allows data sharing beyond single task learning; 2) model construction phase, which designs shared layers to encode the semantics of textual descriptions, and task-specific layers to model two tasks in parallel; it also includes the indicative attributes to better capture an issue’s inherent meaning; 3) model training phase, which enables eavesdropping by shared loss function between two tasks. Results. Evaluations with four large-scale open-source projects show that PRIMA outperforms commonly-used and state-of-the-art baselines, with 32% -55% higher precision, and 28% - 56% higher recall. Compared with single task learning, the performance improvement reaches 18% in precision and 19% in recall. Results from our user study further prove its potential practical value. Conclusions. The proposed approach provides a novel and effective way for issue priority prediction, and sheds light on jointly exploring other issue-management tasks.