Sentiment analysis using Latent Dirichlet Allocation and topic polarity wordcloud visualization

M. F. A. Bashri, R. Kusumaningrum
Published in: 2017 5th International Conference on Information and Communication Technology (ICoICT)
Publication date: 2017-05-17
DOI: 10.1109/ICOICT.2017.8074651
URL: https://doi.org/10.1109/ICOICT.2017.8074651
Citations: 25

Abstract

Sentiment analysis is the field of study concerned with identifying the sentiment expressed in text. One method for it is Latent Dirichlet Allocation (LDA), which extracts the topics of a set of documents, with each topic represented as a set of words that have different probabilities under that topic. Such output is hard to interpret as text and tables, so a visual representation is needed. One such representation is the wordcloud, which displays words according to their frequency. This research performs sentiment analysis of students' comments about a university, in this case Universitas Diponegoro, using LDA and topic polarity wordcloud visualization. The purpose of the study is to generate the topic polarity wordcloud of the students' comments using the best combination of parameters. The best combination is alpha = 0.1, beta = 0.1, 9 topics, and a threshold of 10^-7, which gives a perplexity of 8.07. This combination produces 3 topics with positive sentiment and 6 topics with negative sentiment. We also compare the proposed method with other algorithms such as Naïve Bayes and Logistic Regression. The final results show that the proposed method outperforms Naïve Bayes and Logistic Regression in terms of F-measure, with scores of 61%, 54%, and 56% for the proposed method, Naïve Bayes, and Logistic Regression, respectively.
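
The LDA step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which inference procedure was used, so collapsed Gibbs sampling is an assumption here, as are the function names (fit_lda, perplexity, top_words) and the toy comments, which are invented for illustration and are not the study's data. Only alpha = 0.1 and beta = 0.1 come from the abstract; the toy corpus uses 2 topics rather than 9 because it is so small. The word probabilities returned by top_words are the kind of weights a topic wordcloud could size its words by.

```python
import math
import random


def fit_lda(docs, n_topics, alpha=0.1, beta=0.1, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (assumed inference method)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates of topic-word (phi) and document-topic (theta).
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def perplexity(docs, vocab, phi, theta):
    """Perplexity of the fitted model on the given documents."""
    word_id = {w: i for i, w in enumerate(vocab)}
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            v = word_id[w]
            log_lik += math.log(sum(theta[d][k] * phi[k][v] for k in range(len(phi))))
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def top_words(vocab, phi, topic, n=5):
    """Highest-probability words of one topic, as (word, probability) pairs."""
    ranked = sorted(zip(vocab, phi[topic]), key=lambda pair: pair[1], reverse=True)
    return ranked[:n]


# Hypothetical tokenized comments, invented for this sketch.
docs = [
    "lecturer helpful friendly clear".split(),
    "library comfortable clean helpful".split(),
    "wifi slow broken parking".split(),
    "parking crowded slow registration".split(),
]
vocab, phi, theta = fit_lda(docs, n_topics=2)
print("perplexity:", perplexity(docs, vocab, phi, theta))
for topic in range(2):
    print(topic, top_words(vocab, phi, topic))
```

In a parameter search like the one the abstract describes, fit_lda would be run for each candidate combination of alpha, beta, and number of topics, and the combination with the lowest perplexity kept. Labelling each topic as positive or negative is a separate step that this sketch does not attempt.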