Machine Learning Approach in Evaluating News Labels Based on Titles: Online Media Case Study

Rezky Yuranda, Tata Sutabri, Delpiah Wahyuningsih
Jurnal Sisfokom, published 2023-11-07
DOI: 10.32736/sisfokom.v12i3.1808 (https://doi.org/10.32736/sisfokom.v12i3.1808)

Abstract

News is a primary source of current information for the public, and as the volume of online information grows, a robust evaluation method is needed to ensure that news is labeled accurately and dependably. This research applies a machine learning approach, using three common classification algorithms (Naive Bayes, SVM, and Random Forest) to evaluate news labels based on their titles. The dataset, obtained from Jakarta AI Research, consists of 10,000 samples covering various news topics. Each algorithm is evaluated on accuracy, precision, recall, and F1-score to give a comprehensive view of its performance. SVM performs best, with an accuracy of 92.92%, followed by Random Forest at 91.21% and Naive Bayes at 89.61%. These findings provide insight into the effectiveness of the machine learning approach in evaluating news labels based on their titles. The study also highlights the importance of considering precision, recall, and F1-score alongside accuracy to obtain a more holistic understanding of each algorithm's performance. Further research is encouraged to include additional classification algorithms and larger, more diverse datasets. Such work can contribute to the development of automated news classification systems with higher accuracy and reliability.
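The four metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say how precision, recall, and F1-score were averaged across news topics, so macro averaging is an assumption here, and the function name `classification_report` and the six example labels are hypothetical rather than taken from the paper or its dataset.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted label received a wrong hit
            fn[truth] += 1  # the true label was missed

    def safe_div(num, den):
        # Define the ratio as 0.0 when a class has no predictions or no samples.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical topic labels for six news titles (illustrative only).
y_true = ["sport", "sport", "finance", "finance", "health", "health"]
y_pred = ["sport", "finance", "finance", "finance", "health", "sport"]
report = classification_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

In this toy example the accuracy and macro recall coincide while precision and F1 differ, which is one reason the abstract's advice to look beyond accuracy matters when topics are predicted unevenly.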