Sentiment Analysis on Twitter Social Media Using the Naive Bayes Classifier with N-Gram Feature Extraction

J-SAKTI (Jurnal Sains Komputer dan Informatika) Pub Date: 2018-09-25 DOI: 10.30645/J-SAKTI.V2I2.83

A. Nugroho


Citations: 20

Abstract



Analisis Sentimen Pada Media Sosial Twitter Menggunakan Naive Bayes Classifier Dengan Ekstrasi Fitur N-Gram

Social media is among the most widely accessed online media in the world. Microblogging services such as Twitter let users write about their experiences or review products, services, public figures, and so on. These posts can be used to gauge opinion or sentiment toward an entity being discussed on social media. This study uses such data to determine public sentiment on the issue of rising electricity tariffs. Opinions are classified into three classes: positive, negative, and neutral. Users often use non-standard abbreviations or spellings, which can complicate processing and reduce the accuracy of classification results. The authors apply text preprocessing to handle these problems. Features are extracted with n-grams, and classification is performed with the Naive Bayes classifier. According to the results, negative sentiment is the most common response to the increase in basic electricity tariffs. Testing with cross-validation and a confusion matrix shows that the Naive Bayes method reaches an accuracy of 89.67% before applying n-grams and 92.00% after applying character n-grams, an increase of 2.33 percentage points. The authors conclude that n-gram feature extraction can increase the accuracy of the Naive Bayes method.
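The abstract does not give implementation details, so the following is only a minimal sketch of the general technique it names: character n-gram features fed to a multinomial Naive Bayes classifier, scored with a confusion matrix. The trigram size (`n=3`), the add-one smoothing, all function and class names, and the toy English sentences are assumptions made for illustration; none of them come from the paper or its dataset, and cross-validation is omitted for brevity.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over character n-grams (add-one smoothing).

    Hypothetical helper class for illustration, not the paper's code.
    """

    def __init__(self, n=3):
        self.n = n  # n-gram length; 3 is an assumed value

    def fit(self, texts, labels):
        # Count documents per class and n-gram occurrences per class.
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(char_ngrams(text, self.n))
        self.vocab = {g for c in self.feature_counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        # N-grams never seen in training are ignored.
        grams = [g for g in char_ngrams(text, self.n) if g in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(class) + sum of log P(n-gram | class)
            score = math.log(doc_count / total_docs)
            score += sum(math.log((counts[g] + 1) / denom) for g in grams)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Share of predictions that fall on the confusion matrix diagonal."""
    correct = sum(c for (t, p), c in matrix.items() if t == p)
    return correct / sum(matrix.values())


if __name__ == "__main__":
    # Invented toy examples; NOT data from the study.
    train_texts = ["tariff hike is terrible", "great news on tariffs",
                   "tariff announcement today"]
    train_labels = ["negative", "positive", "neutral"]
    model = NaiveBayes(n=3).fit(train_texts, train_labels)

    test_texts = ["terrible hike", "great news"]
    test_labels = ["negative", "positive"]
    predictions = [model.predict(t) for t in test_texts]
    print(accuracy(confusion_matrix(test_labels, predictions)))
```

Character n-grams are one plausible way to soften the effect of non-standard spellings and abbreviations, because misspelled words still share most of their substrings with the standard form. Whether the study used exactly this feature weighting or smoothing is not stated in the abstract.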


Source journal

J-SAKTI (Jurnal Sains Komputer dan Informatika)

Self-citation rate

0.00%

Number of articles published