{"title":"Abstractive Text Summarizer: A Comparative Study on Dot Product Attention and Cosine Similarity","authors":"K. S, Naveen L, A. Raj, R. S, A. S","doi":"10.1109/icecct52121.2021.9616710","DOIUrl":null,"url":null,"abstract":"Text summarization is the process of extracting a subset of the document in such a way that the idea conveyed by the passage is understood while omitting peripheral details which do not have any impact on the passage. The aim of this work is to design an abstractive text summarizer using natural language processing that takes as input a newspaper article and provide a summary on that article in about 100 words. The model is designed using a Sequence to Sequence architecture coupled with an attention mechanism so that the model learns to pay attention to important words rather than trying to remember all of them. The model is trained using a dataset containing newspaper articles and their summaries provided by Kaggle. Pre-trained models such as BERT and T5 are also used to generate summaries and evaluate the performance of the proposed model against the pre-trained models. The three models such as Seq-Seq, BERT and T5 are evaluated on four datasets such as BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom datasets. Their rouge scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of dot product. Cosine similarity is found to work better in the case of short summaries while dot product is found to work better for long summaries.","PeriodicalId":155129,"journal":{"name":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","volume":"487 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icecct52121.2021.9616710","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
Text summarization is the process of extracting a subset of a document such that the central idea of the passage is conveyed while peripheral details with no bearing on it are omitted. The aim of this work is to design an abstractive text summarizer using natural language processing that takes a newspaper article as input and produces a summary of that article in about 100 words. The model is built on a Sequence-to-Sequence architecture coupled with an attention mechanism, so that it learns to focus on important words rather than trying to remember all of them. It is trained on a Kaggle dataset of newspaper articles and their summaries. Pre-trained models such as BERT and T5 are also used to generate summaries, and the performance of the proposed model is evaluated against them. The three models, Seq2Seq, BERT and T5, are evaluated on four datasets: BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom. Their ROUGE scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of the dot product. Cosine similarity is found to work better for short summaries, while the dot product works better for long summaries.
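The abstract does not reproduce the implementation, but the two attention scoring functions being compared can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' code: the function names, variable names, and toy dimensions are assumptions. Cosine similarity is simply the dot product normalised by the magnitudes of the two vectors, which bounds every score to [-1, 1].

```python
import numpy as np

def dot_product_scores(query, keys):
    """Standard dot-product attention: one raw score per encoder state."""
    return keys @ query

def cosine_similarity_scores(query, keys, eps=1e-8):
    """Cosine-similarity attention: dot product divided by vector magnitudes."""
    norms = np.linalg.norm(keys, axis=1) * np.linalg.norm(query)
    return (keys @ query) / (norms + eps)

def attention_weights(scores):
    """Softmax over the scores to get a distribution over encoder states."""
    exp = np.exp(scores - scores.max())  # shift the max for numerical stability
    return exp / exp.sum()

# Toy example (hypothetical sizes): one decoder query attending over
# 4 encoder hidden states of dimension 8.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 8))   # encoder hidden states
query = rng.normal(size=8)       # current decoder state

print("dot-product weights:", attention_weights(dot_product_scores(query, keys)))
print("cosine weights:     ", attention_weights(cosine_similarity_scores(query, keys)))
```

Because cosine similarity normalises away vector magnitude, its scores stay in [-1, 1] and the resulting softmax distribution is typically flatter than with unnormalised dot products, whose scores grow with hidden-state magnitude.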