{"title":"Abstractive Text Summarizer: A Comparative Study on Dot Product Attention and Cosine Similarity","authors":"K. S, Naveen L, A. Raj, R. S, A. S","doi":"10.1109/icecct52121.2021.9616710","DOIUrl":null,"url":null,"abstract":"Text summarization is the process of extracting a subset of the document in such a way that the idea conveyed by the passage is understood while omitting peripheral details which do not have any impact on the passage. The aim of this work is to design an abstractive text summarizer using natural language processing that takes as input a newspaper article and provide a summary on that article in about 100 words. The model is designed using a Sequence to Sequence architecture coupled with an attention mechanism so that the model learns to pay attention to important words rather than trying to remember all of them. The model is trained using a dataset containing newspaper articles and their summaries provided by Kaggle. Pre-trained models such as BERT and T5 are also used to generate summaries and evaluate the performance of the proposed model against the pre-trained models. The three models such as Seq-Seq, BERT and T5 are evaluated on four datasets such as BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom datasets. Their rouge scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of dot product. Cosine similarity is found to work better in the case of short summaries while dot product is found to work better for long summaries.","PeriodicalId":155129,"journal":{"name":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","volume":"487 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icecct52121.2021.9616710","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
Text summarization is the process of extracting a subset of a document such that the central idea of the passage is conveyed while peripheral details with no bearing on it are omitted. The aim of this work is to design an abstractive text summarizer using natural language processing that takes a newspaper article as input and produces a summary of that article in about 100 words. The model is built on a Sequence-to-Sequence architecture coupled with an attention mechanism, so that it learns to focus on important words rather than trying to remember all of them. It is trained on a Kaggle dataset of newspaper articles and their summaries. Pre-trained models such as BERT and T5 are also used to generate summaries, and the performance of the proposed model is evaluated against them. The three models, Seq2Seq, BERT and T5, are evaluated on four datasets: BBC-News-Dataset, Amazon food reviews, News-summary and NewsRoom. Their ROUGE scores are analysed to select the ideal algorithm for summarization. The attention mechanism is customised to use cosine similarity instead of the dot product. Cosine similarity is found to work better for short summaries, while the dot product works better for long summaries.
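The abstract does not reproduce the implementation, but the two attention scoring functions being compared can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the authors' code: the function names, variable names, and toy dimensions are assumptions. Cosine similarity is simply the dot product normalised by the magnitudes of the two vectors, which bounds every score to [-1, 1].

```python
import numpy as np

def dot_product_scores(query, keys):
    """Standard dot-product attention: one raw score per encoder state."""
    return keys @ query

def cosine_similarity_scores(query, keys, eps=1e-8):
    """Cosine-similarity attention: dot product divided by vector magnitudes."""
    norms = np.linalg.norm(keys, axis=1) * np.linalg.norm(query)
    return (keys @ query) / (norms + eps)

def attention_weights(scores):
    """Softmax over the scores to get a distribution over encoder states."""
    exp = np.exp(scores - scores.max())  # shift the max for numerical stability
    return exp / exp.sum()

# Toy example (hypothetical sizes): one decoder query attending over
# 4 encoder hidden states of dimension 8.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 8))   # encoder hidden states
query = rng.normal(size=8)       # current decoder state

print("dot-product weights:", attention_weights(dot_product_scores(query, keys)))
print("cosine weights:     ", attention_weights(cosine_similarity_scores(query, keys)))
```

Because cosine similarity normalises away vector magnitude, its scores stay in [-1, 1] and the resulting softmax distribution is typically flatter than with unnormalised dot products, whose scores grow with hidden-state magnitude.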