Bug reports play a critical role in the software development lifecycle by helping developers identify and resolve defects efficiently. However, the quality of bug report titles varies widely, particularly in open-source communities, which complicates bug triage and resolution. Existing approaches, such as iTAPE, treat title generation as a one-sentence summarization task using sequence-to-sequence models. While these methods show promise, they face two major limitations: (1) they treat the entire report as a homogeneous input and ignore its distinct components, and (2) they struggle to handle the variability between template-based and non-template-based reports, often producing suboptimal titles. To address these limitations, we propose TAB, a hybrid framework that combines a Document Component Analyzer (DCA) based on a pre-trained BERT model with a Title Generation Model based on CodeT5. TAB addresses the first limitation by segmenting bug reports into four components (Description, Reproduction, Expected Behavior, and Others) to ensure better alignment between input and output. For the second limitation, TAB takes a two-path approach: for template-based reports, titles are generated directly, while for non-template-based reports, the DCA first extracts key components to improve title relevance and clarity. We evaluate TAB on both template-based and non-template-based bug reports and show that it significantly outperforms existing methods. Specifically, on template-based reports, TAB achieves average improvements of 170.4–389.5% in METEOR, 67.8–190.0% in ROUGE-L, and 65.7–124.5% in chrF(AF) over baseline approaches. On non-template-based reports, TAB shows average improvements of 64% in METEOR, 3.6% in ROUGE-L, and 14.8% in chrF(AF) over the state of the art. These results confirm the robustness of TAB in generating high-quality titles across diverse bug report formats.
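The two-path routing described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: the heading-based template check, the keyword classifier standing in for the BERT-based Document Component Analyzer, the choice to treat every component except Others as "key", and all function names. The title model is passed in as a plain callable standing in for the CodeT5-based generator.

```python
import re
from typing import Callable, Dict, List

# The four components named in the abstract.
COMPONENTS = ("Description", "Reproduction", "Expected Behavior", "Others")

# Hypothetical template headings; real issue templates vary by project.
TEMPLATE_HEADING = re.compile(
    r"^#+\s*(description|steps to reproduce|expected behavior)\b",
    re.IGNORECASE | re.MULTILINE,
)


def is_template_based(report: str) -> bool:
    """Assumed heuristic: at least two template headings means template-based."""
    return len(TEMPLATE_HEADING.findall(report)) >= 2


def classify_sentence(sentence: str) -> str:
    """Keyword stub standing in for the BERT-based component classifier."""
    s = sentence.lower()
    if "expected" in s or "should" in s:
        return "Expected Behavior"
    if any(k in s for k in ("step", "reproduce", "run ", "click")):
        return "Reproduction"
    if any(k in s for k in ("error", "crash", "fail", "exception")):
        return "Description"
    return "Others"


def segment(report: str) -> Dict[str, List[str]]:
    """Split a report into sentences and group them by component."""
    parts: Dict[str, List[str]] = {c: [] for c in COMPONENTS}
    for sentence in re.split(r"(?<=[.!?])\s+|\n+", report):
        sentence = sentence.strip()
        if sentence:
            parts[classify_sentence(sentence)].append(sentence)
    return parts


def build_model_input(report: str) -> str:
    """Template-based reports pass through; others keep only assumed key components."""
    if is_template_based(report):
        return report
    parts = segment(report)
    # Assumption: "key components" means everything except Others.
    kept = [" ".join(parts[c]) for c in COMPONENTS[:3] if parts[c]]
    return " ".join(kept) if kept else report


def generate_title(report: str, title_model: Callable[[str], str]) -> str:
    """Route the report, then hand the result to the title model."""
    return title_model(build_model_input(report))
```

In this sketch, a free-form report such as "The app crashes on startup. Thanks for the great project! It should open the main window." would reach the title model without the middle sentence, since the stub classifier assigns it to Others.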
