Mining Twitter in the Cloud: A Case Study

P. Noordhuis, M. Heijkoop, A. Lazovik

Published in: 2010 IEEE 3rd International Conference on Cloud Computing
Publication date: 2010-07-05
DOI: 10.1109/CLOUD.2010.59 (https://doi.org/10.1109/CLOUD.2010.59)
Citations: 56

Abstract

Mining and analyzing data from social networks can be difficult because of the large amounts of data involved. Such activities are usually very expensive, as they require a lot of computational resources. With the recent success of cloud computing, data analysis is becoming more accessible thanks to easier access to less expensive computational resources. In this work we propose cloud computing services as a possible solution for the analysis of large amounts of data. As a source for a large data set we use Twitter, yielding a graph with 50 million nodes and 1.8 billion edges. We compute PageRank on Twitter’s social graph to investigate whether cloud computing, and Amazon cloud services in particular, can make these tasks more feasible and, as a side effect, whether PageRank provides a good ranking of Twitter users.
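
To make the central computation concrete, the following is a minimal single-machine sketch of PageRank by power iteration over a follower graph. It is not the authors' implementation, which the abstract says ran on Amazon cloud services against 50 million nodes and 1.8 billion edges. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy edge list are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Toy power-iteration PageRank.

    edges: iterable of (follower, followee) pairs, so an edge points
    from the user who follows to the user being followed.
    damping and iterations are illustrative defaults, not values
    taken from the paper.
    """
    edges = list(edges)
    nodes = sorted({user for edge in edges for user in edge})
    out_links = {user: [] for user in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {user: base for user in nodes}
        for src, targets in out_links.items():
            if not targets:
                continue
            # Each user splits their damped rank evenly among followees.
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-user graph: alice and bob follow carol,
    # and carol follows alice.
    toy_edges = [("alice", "carol"), ("bob", "carol"), ("carol", "alice")]
    for user, score in sorted(pagerank(toy_edges).items(), key=lambda kv: -kv[1]):
        print(f"{user}: {score:.4f}")
```

In this toy graph the most-followed user ends up with the highest score and the user nobody follows with the lowest. At the scale described in the abstract the graph would not fit this in-memory approach comfortably, which is the motivation for distributing each iteration across cloud resources.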