Comparison between LDA & NMF for event-detection from large text stream data

2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT). Pub Date: 2017-02-01. DOI: 10.1109/CIACT.2017.7977281

Pranav Suri, N. Roy

Citations: 22

Abstract

The use of social networks for topic identification has become essential in event detection, especially for events that affect society. Machine learning algorithms and natural language processing techniques have been used extensively for this task. This paper discusses an approach to obtaining meaningful data from Twitter. Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF) are then used to detect topics in this Twitter text, together with an RSS feed of news headlines. The observed results show that both algorithms perform well at detecting topics in text streams; LDA's results are more semantically interpretable, while NMF is the faster of the two.
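The abstract gives no implementation details, so the following is only a minimal sketch of the two techniques it names, not the authors' pipeline. It assumes a tiny invented corpus in place of tweets and headlines, NMF fitted with Lee-Seung multiplicative updates, and LDA fitted with collapsed Gibbs sampling. All function names, the corpus, and the hyperparameters (`k`, `alpha`, `beta`, `iters`) are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy corpus standing in for tweets and RSS headlines.
DOCS = [
    "earthquake hits city rescue teams deployed",
    "rescue teams search city after earthquake",
    "election results announced votes counted",
    "votes counted as election results disputed",
]

def build_counts(docs):
    """Return the vocabulary and a documents-by-terms count matrix."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: j for j, w in enumerate(vocab)}
    rows = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.split():
            rows[i][index[w]] += 1.0
    return vocab, rows

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(r, c)) for c in cols] for r in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """NMF via Lee-Seung multiplicative updates (V ~ W H, entries >= 0)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[x * n / (d + eps) for x, n, d in zip(*rows)]
             for rows in zip(h, num, den)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[x * n / (d + eps) for x, n, d in zip(*rows)]
             for rows in zip(w, num, den)]
    return w, h

def lda_gibbs(docs, vocab, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """LDA via collapsed Gibbs sampling; returns topic-word distributions."""
    rng = random.Random(seed)
    index = {w: j for j, w in enumerate(vocab)}
    tokens = [[index[w] for w in d.split()] for d in docs]
    ndk = [[0] * k for _ in docs]                # topic counts per document
    nkw = [[0] * len(vocab) for _ in range(k)]   # word counts per topic
    nk = [0] * k                                 # total tokens per topic
    z = [[rng.randrange(k) for _ in doc] for doc in tokens]

    def bump(d, w, t, delta):
        ndk[d][t] += delta
        nkw[t][w] += delta
        nk[t] += delta

    for d, doc in enumerate(tokens):
        for i, w in enumerate(doc):
            bump(d, w, z[d][i], 1)
    vb = len(vocab) * beta
    for _ in range(iters):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                bump(d, w, z[d][i], -1)  # remove the current assignment
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vb)
                           for t in range(k)]
                z[d][i] = rng.choices(range(k), weights=weights)[0]
                bump(d, w, z[d][i], 1)
    return [[(c + beta) / (nk[t] + vb) for c in nkw[t]] for t in range(k)]

def top_words(row, vocab, n=3):
    """Return the n highest-weighted vocabulary words for one topic row."""
    order = sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)
    return [vocab[j] for j in order[:n]]

vocab, V = build_counts(DOCS)
W, H = nmf(V, k=2)                 # rows of H act as NMF topics
phi = lda_gibbs(DOCS, vocab, k=2)  # rows of phi are LDA topics
for name, topics in (("NMF", H), ("LDA", phi)):
    for t, row in enumerate(topics):
        print(name, "topic", t, top_words(row, vocab))
```

In this sketch each row of `H` is an unnormalised non-negative weight vector over the vocabulary, whereas each row of `phi` is a probability distribution over words. A corpus this small says nothing about the interpretability or speed differences the paper reports; it only illustrates the form of the two outputs.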


