Comparison between LDA & NMF for event-detection from large text stream data

Pranav Suri, N. Roy
Published in: 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), 2017-02-01
DOI: https://doi.org/10.1109/CIACT.2017.7977281
Citations: 22

Abstract

The use of social networks for topic identification has become essential to event detection, especially when the events affect society. Machine learning algorithms and natural language processing techniques have been used extensively for this task. This paper discusses an approach to obtaining meaningful data from Twitter. Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF) are then used to detect topics in the text obtained from Twitter, together with an RSS feed of news headlines. The observed results show that both algorithms perform well in detecting topics from text streams: the results of LDA are more semantically interpretable, while NMF is the faster of the two.
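To make the NMF side of the comparison concrete, the following is a minimal sketch of topic extraction by non-negative matrix factorization. It is not the authors' implementation: the abstract does not describe their preprocessing, library, or parameters. The sketch assumes a raw term-count matrix, the standard multiplicative update rules for the Frobenius objective, and four invented headlines standing in for tweets and RSS items. All function names (`build_term_doc_matrix`, `nmf`, `top_terms`, `frobenius_error`) are hypothetical helpers written for this illustration.

```python
import random

def build_term_doc_matrix(docs):
    """Return (vocab, V) where V[i][j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    V = [[0.0] * len(docs) for _ in vocab]
    for j, d in enumerate(docs):
        for w in d.lower().split():
            V[index[w]][j] += 1.0
    return vocab, V

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (terms x docs) as W (terms x k) times H (k x docs)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start; multiplicative updates keep entries >= 0.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def top_terms(W, vocab, n_top=3):
    """For each topic (column of W), list the n_top highest-weighted terms."""
    topics = []
    for a in range(len(W[0])):
        ranked = sorted(range(len(vocab)), key=lambda i: W[i][a], reverse=True)
        topics.append([vocab[i] for i in ranked[:n_top]])
    return topics

# Invented toy headlines (assumption: not data from the paper).
docs = [
    "earthquake hits city rescue teams deployed",
    "rescue teams search earthquake rubble",
    "election results announced party wins",
    "party leader celebrates election victory",
]
vocab, V = build_term_doc_matrix(docs)
W, H = nmf(V, k=2)
topics = top_terms(W, vocab)
for t, words in enumerate(topics):
    print(f"topic {t}: {' '.join(words)}")
```

Each column of `W` is read as a topic (a weighting over terms) and each column of `H` as the topic mixture of one document. LDA yields an analogous pair of outputs, but as probability distributions fitted by inference rather than by matrix updates, which is consistent with the abstract's observation that NMF is the faster of the two.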