Extracting Mobile Behavioral Patterns with the Distant N-Gram Topic Model

K. Farrahi, D. Gática-Pérez
DOI: 10.1109/ISWC.2012.20
Published in: 2012 16th International Symposium on Wearable Computers
Publication date: 2012-06-18
Link: https://doi.org/10.1109/ISWC.2012.20
Citations: 30

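Abstract

Mining patterns of human behavior from large-scale mobile phone data has the potential to help us understand certain phenomena in society. The study of such massive human-centric datasets requires new mathematical models. In this paper, we propose a probabilistic topic model, which we call the distant n-gram topic model (DNTM), to address the problem of learning long-duration human location sequences. The DNTM is based on Latent Dirichlet Allocation (LDA). We define the generative process for the model, derive the inference procedure, and evaluate the model on real mobile data. We consider two real-life human datasets collected from mobile phone location data, the first based on GPS locations and the second on cell tower connections. The DNTM successfully discovers topics on both datasets. Finally, the DNTM is compared to LDA in terms of log-likelihood on unseen data, which demonstrates the predictive power of the model. We find that the DNTM consistently outperforms LDA as the sequence length increases.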
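As a rough illustration of the baseline mentioned in the abstract, the sketch below shows the standard LDA generative step applied to location labels, together with the log-likelihood of an unseen sequence under fixed parameters. It is not the DNTM and not the authors' code. The topic count, location vocabulary size, hyperparameters, and all function names are hypothetical choices made for this example.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_sequence(theta, phi, length, rng):
    """LDA generative step: per token, pick a topic from theta,
    then a location label from that topic's distribution phi[z]."""
    seq = []
    for _ in range(length):
        z = sample_categorical(theta, rng)
        seq.append(sample_categorical(phi[z], rng))
    return seq


def log_likelihood(seq, theta, phi):
    """log p(seq | theta, phi) = sum_n log sum_k theta[k] * phi[k][w_n]."""
    total = 0.0
    for w in seq:
        p = sum(theta[k] * phi[k][w] for k in range(len(theta)))
        total += math.log(p)
    return total


# Hypothetical sizes and symmetric hyperparameters, chosen for illustration only.
rng = random.Random(0)
num_topics, num_locations = 2, 4
alpha = [1.0] * num_topics
beta = [1.0] * num_locations

phi = [sample_dirichlet(beta, rng) for _ in range(num_topics)]  # topic -> location probs
theta = sample_dirichlet(alpha, rng)                            # topic mixture for one day
held_out = generate_sequence(theta, phi, 20, rng)               # stand-in for unseen data
print(log_likelihood(held_out, theta, phi))
```

In a real comparison, `theta` and `phi` would be estimated from training data by an inference procedure, and the log-likelihood would be computed on sequences withheld from training. Here they are sampled only so that the example runs on its own.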