Extractive Text Summarization for Snippet Generation on Indonesian Search Engine using Sentence Transformers
Komang Uning Sari Devi, Lya Hulliyyatus Suadaa
2022 International Conference on Data Science and Its Applications (ICoDSA), 2022-07-06
DOI: 10.1109/ICoDSA55874.2022.9862886 (https://doi.org/10.1109/ICoDSA55874.2022.9862886)
Citations: 1
Abstract
Search engine results usually list the titles of retrieved documents together with short document summaries, called snippets, that give a better preview of each document. This research proposes extractive text summarization models to generate snippets. A new dataset for extractive text summarization is constructed from Indonesian thesis documents, with target summaries created manually by selecting important sentences. For snippet generation, we use Lead-3 and TextRank as baselines and propose fine-tuning Sentence Transformers (SBERT). In the evaluation, SBERT generated better summaries than both baselines, scoring 0.545 ROUGE-1, 0.433 ROUGE-2, and 0.474 ROUGE-L.
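To make the Lead-3 baseline and the ROUGE-1 metric concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the regex sentence splitter, the lowercased word tokenization, the use of F1 (rather than recall or precision) for ROUGE-1, and the function names `split_sentences`, `lead_n`, and `rouge_1_f1` are all hypothetical choices that the abstract does not specify.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_n(text, n=3):
    """Lead-n baseline: use the first n sentences of the document as the snippet."""
    return split_sentences(text)[:n]


def rouge_1_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (assumption: lowercased \\w+ tokens)."""
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Calling `" ".join(lead_n(document))` yields a Lead-3 snippet, which can then be scored against a manually selected reference summary with `rouge_1_f1(snippet, reference)`. The reported scores come from the authors' own evaluation setup and should not be expected to match this simplified metric.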