Automated Text Summarization and Topic Detection on News Aggregation System Using BART and SVM

Farrel Octavianus, Albert Wihardi, Muhamad Keenan Ario, Derwin Suhartono
2022 International Symposium on Information Technology and Digital Innovation (ISITDI)
Published: 2022-07-27
DOI: 10.1109/ISITDI55734.2022.9944521 (https://doi.org/10.1109/ISITDI55734.2022.9944521)

Abstract

The volume of news published each day is far more than any reader can digest. This paper develops automated text summarization and topic detection for news articles, so that readers can follow summarized news without losing its essential points, and uses both to build a news aggregation system. The system first scrapes news articles from various sources, then applies topic detection and text summarization to each article before displaying it. The methodology is divided into four stages: data gathering, topic detection, text summarization, and system development. The results show that the Support Vector Machine (SVM) performed exceptionally well in topic detection, better than the other supervised learning algorithms evaluated in this research, whereas Bidirectional and Auto-Regressive Transformers (BART), with appropriate parameters, performed relatively well in text summarization. In conclusion, topic detection and automated text summarization can be combined to build a news aggregation system, with SVM and BART each performing well in their respective tasks.
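The flow described in the abstract (scrape, detect the topic, summarize, display) can be sketched roughly as below. This is only an illustrative outline, not the authors' implementation: the names `Article`, `AggregatedItem`, `aggregate`, `keyword_topic`, and `lead_sentence` are hypothetical, and the two model steps are replaced by trivial stand-ins so the example runs with the Python standard library alone. In the system the paper describes, the topic detector would be a trained SVM classifier and the summarizer a BART model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Article:
    """A scraped news article (hypothetical structure)."""
    source: str
    title: str
    body: str


@dataclass
class AggregatedItem:
    """What the aggregator would display for one article."""
    source: str
    title: str
    topic: str
    summary: str


def aggregate(
    articles: Iterable[Article],
    detect_topic: Callable[[str], str],
    summarize: Callable[[str], str],
) -> List[AggregatedItem]:
    """Apply topic detection and summarization to each scraped article."""
    items = []
    for article in articles:
        items.append(
            AggregatedItem(
                source=article.source,
                title=article.title,
                topic=detect_topic(article.body),
                summary=summarize(article.body),
            )
        )
    return items


# Placeholder stand-ins (assumptions) so the sketch runs without any model.
# They are NOT the paper's methods: a keyword lookup replaces the SVM
# classifier, and taking the first sentence replaces BART summarization.
def keyword_topic(text: str) -> str:
    keywords = {"sports": ("match", "league"), "business": ("market", "shares")}
    lowered = text.lower()
    for topic, words in keywords.items():
        if any(word in lowered for word in words):
            return topic
    return "other"


def lead_sentence(text: str) -> str:
    # Naive "summary": keep only the first sentence.
    return text.split(". ")[0].rstrip(".") + "."


# Made-up articles standing in for the scraping step.
scraped = [
    Article("example-source-a", "Cup final",
            "The match ended 2-1. Fans celebrated late."),
    Article("example-source-b", "Stocks slip",
            "Shares fell as the market opened. Analysts expect more."),
]

feed = aggregate(scraped, keyword_topic, lead_sentence)
for item in feed:
    print(f"[{item.topic}] {item.title}: {item.summary}")
```

Passing the two model steps in as plain callables keeps the aggregation loop independent of any particular classifier or summarizer, so a trained SVM pipeline or a BART-based function could be swapped in without touching `aggregate`.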