Abstractive Text Summary with Transformer on Youtube Video Subtitle

International Journal of Emerging Technology and Advanced Engineering Pub Date : 2023-02-04 DOI:10.46338/ijetae0223_01

Juan Lee Atipa, Javin Javin, Fernando Bryan, V. Yesmaya, Rini Wongso

Citations: 0

Abstract

Time constraints are among the most important factors in media consumption. The longer a video is, the harder it is for users to watch it in its entirety. Text summarization offers a way for users to acquire information swiftly and concisely. However, it remains an open question how closely a generated summary approaches the core of the information to be conveyed. This study uses YouTube video subtitles as the data from which a summary of a video's core information is produced. Accordingly, the research focuses on abstractive summarization with several Transformer models, namely T5, BART, and PEGASUS, using a video subtitle dataset to create summaries. The subtitle text serves as the main source of information in the models' learning process, which enhances their ability on this specific summarization task. The models' results are evaluated with ROUGE, specifically ROUGE-1, ROUGE-2, and ROUGE-L.
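To make the evaluation metrics concrete, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed for one candidate summary against one reference. It is a simplified illustration, not the authors' evaluation code: it assumes lowercase whitespace tokenization with no stemming, and the function names (`rouge_n`, `rouge_l`, and the helpers) are hypothetical. Published ROUGE implementations add further preprocessing, so their scores may differ slightly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Combine overlap counts into an F1 score; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Made-up sentences for illustration only; not data from the study.
    generated = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print("ROUGE-1:", rouge_n(generated, reference, 1))
    print("ROUGE-2:", rouge_n(generated, reference, 2))
    print("ROUGE-L:", rouge_l(generated, reference))
```

ROUGE-1 and ROUGE-2 reward shared words and word pairs, while ROUGE-L rewards the longest in-order token sequence the two texts share, which is why the three are commonly reported together for abstractive summaries.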


