Discover Overlapping Topical Regions by Geo-Semantic Clustering of Tweets

Yuta Taniguchi, Daiki Monzen, Lutfiana Sari Ariestien, Daisuke Ikeda
Published in: 2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops, pp. 552-557, 2015-03-24
DOI: https://doi.org/10.1109/WAINA.2015.85
Citations: 3

Abstract

Geotagging is a feature of social media services that attaches geographical location metadata to photos, websites, or messages. Viewed from the opposite direction, geotagging annotates geographical locations with images or text. Summarizing such annotations to uncover topical regions, that is, geographical regions characterized locally by specific topics, is a challenging task, and the resulting knowledge is useful for applications such as location-based advertising. Determining topical regions is not trivial because a topical region's topic and its geographical area depend on each other. In this paper, we aim to discover overlapping topical regions from geotagged text messages (tweets) collected from Twitter. To this end, we employ the Mean Shift clustering algorithm on a vector space that integrates a geographic vector space and a semantic vector space. Running Mean Shift on this integrated space lets us evaluate the geographical density and the semantic density of tweets simultaneously. Our method then determines the region of each cluster detected by Mean Shift by applying kernel density estimation to the clustered tweets in geographical space. Our experiments show that clusters break into several mutually overlapping sub-clusters when we increase the weight of semantic density relative to that of geographical density.
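The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two spaces are integrated or which kernel and bandwidth are used, so everything below is an assumption made for illustration: the concatenation with a scalar `sem_weight`, the Gaussian kernel, the mode-merging threshold, and the toy tweet data are all hypothetical and are not taken from the paper.

```python
import math

def integrate(geo, sem, sem_weight):
    """Concatenate a geographic vector with a weighted semantic vector
    (an assumed way of building the integrated space)."""
    return tuple(geo) + tuple(sem_weight * s for s in sem)

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Mean Shift with a Gaussian kernel; returns one cluster label per point."""
    def shift(x):
        # Kernel-weighted mean of all points, as seen from x.
        weights = [math.exp(-math.dist(x, p) ** 2 / (2 * bandwidth ** 2))
                   for p in points]
        total = sum(weights)
        return tuple(sum(w * p[d] for w, p in zip(weights, points)) / total
                     for d in range(len(x)))

    modes, labels = [], []
    for x in points:
        for _ in range(max_iter):
            nxt = shift(x)
            moved = math.dist(x, nxt)
            x = nxt
            if moved < tol:
                break
        # Points whose modes nearly coincide share a cluster
        # (the bandwidth / 2 threshold is an assumption).
        for i, m in enumerate(modes):
            if math.dist(x, m) < bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(x)
            labels.append(len(modes) - 1)
    return labels

def geo_density(location, cluster_geo, bandwidth):
    """2-D Gaussian kernel density estimate of a cluster's geotags at a location."""
    norm = 2 * math.pi * bandwidth ** 2 * len(cluster_geo)
    return sum(math.exp(-math.dist(location, g) ** 2 / (2 * bandwidth ** 2))
               for g in cluster_geo) / norm

# Hypothetical tweets: six nearby geotags, two topics (one-hot semantic vectors).
geo = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.0), (0.0, 0.05)]
sem = [(1.0, 0.0)] * 3 + [(0.0, 1.0)] * 3

for sem_weight in (0.0, 5.0):
    points = [integrate(g, s, sem_weight) for g, s in zip(geo, sem)]
    print(sem_weight, mean_shift(points, bandwidth=1.0))
```

With the semantic weight at zero the tweets are separated only by location; raising it pulls the two topics apart in the integrated space even though their geotags occupy the same area, which is the kind of overlap the abstract reports. Evaluating `geo_density` for each resulting cluster over a grid of locations and thresholding it would give one possible way to outline each cluster's region.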