Social Media Sentiment Analysis Using K-Means and Naïve Bayes Algorithm

2018 2nd International Conference on Electrical Engineering and Informatics (ICon EEI) Pub Date: 2018-10-01 DOI: 10.1109/ICon-EEI.2018.8784326

Muhammad Ihsan Zul, F. Yulia, Dini Nurmalasari

Citations: 9

Abstract

Opinions are a major influence on decision making by individuals and organizations. Useful knowledge can be extracted from a collection of opinions and used as a source of information to consider when making decisions. Extracting knowledge from text is known as text mining. Text mining offers many kinds of algorithms for extracting information from collected text, such as K-Means, K-Nearest Neighbors, and Naïve Bayes. One source of opinions is social media, especially Facebook and Twitter, where many people write their opinions on many topics. This large volume of data is difficult to analyze thoroughly. In this paper, a system based on the K-Means and Naïve Bayes algorithms is developed to analyze public opinions or sentiments, with outlier removal added to the analysis. The opinions are taken from Facebook and Twitter. The accuracy of the system is tested 10 times at k different points for each value of k (k = 6, 7, 8, 9, and 10). The combination of K-Means and Naïve Bayes yields lower accuracy than Naïve Bayes alone, although the two are nearly the same. The accuracy of the Naïve Bayes algorithm ranges from 80.526% to 82.500%, while the combination of Naïve Bayes and K-Means achieves 80.323% to 81.523%.
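
The abstract does not say how K-Means, outlier removal, and Naïve Bayes are wired together, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that posts are clustered with K-Means on bag-of-words counts, that a post is treated as an outlier when its distance to its own centroid exceeds 1.5 times the mean distance in that cluster, and that a multinomial Naïve Bayes classifier with add-one smoothing is then trained on the remaining posts. The toy posts, the threshold `factor`, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train_nb(docs, labels):
    """Count class and word frequencies for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(doc):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def vectorize(docs, vocab_list):
    """Bag-of-words count vectors over a fixed vocabulary order."""
    return [[tokenize(d).count(w) for w in vocab_list] for d in docs]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain K-Means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    assign = []
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

def remove_outliers(points, assign, centroids, factor=1.5):
    """Keep indices whose centroid distance is <= factor * cluster mean."""
    dists = [dist(p, centroids[a]) for p, a in zip(points, assign)]
    kept = []
    for i, (d, a) in enumerate(zip(dists, assign)):
        cluster = [dj for dj, aj in zip(dists, assign) if aj == a]
        if d <= factor * (sum(cluster) / len(cluster)):
            kept.append(i)
    return kept

# Hypothetical toy posts; the last one is an unusually repetitive post.
docs = ["good great love", "bad awful hate", "good love happy", "bad hate sad",
        "great happy good", "awful sad bad", "good good good good good great"]
labels = ["pos", "neg", "pos", "neg", "pos", "neg", "neg"]

vocab_list = sorted({t for d in docs for t in tokenize(d)})
points = vectorize(docs, vocab_list)
assign, centroids = kmeans(points, k=2)
kept = remove_outliers(points, assign, centroids)

model = train_nb([docs[i] for i in kept], [labels[i] for i in kept])
print(kept, predict_nb(model, "good great"), predict_nb(model, "bad sad"))
```

On this toy data the repetitive last post lies far from its cluster centroid and is dropped before training. Whether such filtering helps on real Facebook and Twitter data is an empirical question; the abstract reports that the combined approach was slightly less accurate than Naïve Bayes alone.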
