Dynamic Large Scale Data on Twitter Using Sentiment Analysis and Topic Modeling

A. Alamsyah, Wirawan Rizkika, Ditya Dwi Adhi Nugroho, F. Renaldi, S. Saadah
Published in: 2018 6th International Conference on Information and Communication Technology (ICoICT), 2018-05-01
DOI: 10.1109/ICOICT.2018.8528776
Citations: 19

Abstract

Digital flows now exert a larger impact than before: the world is more connected than ever, and the amount of cross-border bandwidth in use has grown 45-fold since 2005. With massive amounts of data spreading across the net, including social media, speed is one of the most essential factors in business. Companies can use social media as a source from which to extract and analyze customer opinion, and can therefore respond quickly to changing conditions. The main purpose of this research is content analysis: to achieve it, we need to extract the information and summarize the topics it contains. However, the wide variety of available tools, each with its own specific output, makes analyzing content quickly a challenge. We use Naïve Bayes sentiment analysis on a time series, specifically on a daily basis, and topic modeling based on Latent Dirichlet Allocation (LDA) to evaluate the sentiment of the topics as well as to model the topics discussed. This research may help both companies and individuals map public opinion toward a given topic by analyzing the sentiment of the text and building a topic model. Real-time information for determining consumer opinion therefore becomes crucial, and Twitter can serve as one source of real-time information from user-generated content. We pick Uber as the case study, as it is viewed as one of the most favored transportation methods in most parts of the world. The data collection period runs from 10 February 2017 to 28 February 2017, with 1,048,576 tweets collected.
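The daily sentiment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the training sentences, labels, and sample tweets below are invented toy data, and the names `NaiveBayes`, `tokenize`, and `daily_sentiment` are hypothetical. It shows a multinomial Naïve Bayes classifier with add-one smoothing, followed by a per-day count of predicted labels. The LDA topic modeling step is not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a simplifying assumption).
    return [w for w in text.lower().split() if w.isalpha()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def daily_sentiment(model, tweets):
    # tweets: iterable of (date_string, text); returns label counts per day.
    per_day = defaultdict(Counter)
    for day, text in tweets:
        per_day[day][model.predict(text)] += 1
    return {day: dict(counts) for day, counts in sorted(per_day.items())}


# Invented toy training data, for illustration only.
train_texts = [
    "great ride friendly driver",
    "love the fast pickup",
    "terrible driver rude",
    "hate the surge price",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)

# Invented sample tweets with hypothetical dates.
sample_tweets = [
    ("2017-02-10", "great driver"),
    ("2017-02-10", "rude driver terrible"),
    ("2017-02-11", "love the ride"),
]
daily = daily_sentiment(model, sample_tweets)
print(daily)
```

Grouping predictions by date gives one sentiment distribution per day, which is the kind of daily time series the abstract refers to. A real pipeline would need a much larger labeled training set and proper tweet preprocessing.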