Single-Pass On-Line Event Detection in Twitter Streams

Xingfa Qiu, Qiaosha Zou, C. Richard Shi
DOI: 10.1145/3457682.3457762 (https://doi.org/10.1145/3457682.3457762)
Published in: 2021 13th International Conference on Machine Learning and Computing
Publication date: 2021-02-26
Citations: 2

Abstract

A large volume of information appears on social media every second. Breaking news often surfaces on social media well before it reaches traditional news outlets. Event detection over social media data streams gathers this scattered information and reveals which events are being widely discussed online. An event is often modeled as a cluster of documents that discuss the same subject. Traditional event detection methods perform poorly on social media because of the sheer volume of data and its irregular language. In this paper, we propose a simple yet efficient event detection method for social media. An event is represented by a sequence of keywords extracted from social media posts. We use a single-pass incremental clustering method with a trained encoder that maps documents and events into the same semantic space, which supports the similarity calculation between them. We treat the similarity calculation between a tweet and an event as a matching process and construct a relevance matching dataset of tweet-event pairs. We fine-tune a BERT (Bidirectional Encoder Representations from Transformers) model on the matching dataset to obtain a suitable semantic encoder. The keywords representing an event are updated dynamically to capture how the event develops. Our method achieves 0.86 NMI (Normalized Mutual Information), 0.69 ARI (Adjusted Rand Index), and 0.70 F1-score on a public Twitter dataset, outperforming the baseline methods.
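
The single-pass clustering loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a bag-of-words cosine similarity stands in for the fine-tuned BERT encoder, keywords are chosen by raw term frequency (the abstract does not say how they are extracted), and the names `Event`, `single_pass_cluster`, `threshold`, and `top_k` are hypothetical, with values picked only for the toy data.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "is", "and", "for", "at", "this"}


def tokenize(text):
    """Lowercase, keep alphanumeric tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Event:
    """An event summarised by its current top-k keywords (hypothetical design)."""

    def __init__(self, tokens, top_k=5):
        self.counts = Counter(tokens)
        self.top_k = top_k

    def keywords(self):
        # Keywords are recomputed on demand, so they change as tweets are added.
        return [w for w, _ in self.counts.most_common(self.top_k)]

    def vector(self):
        # Stand-in for encoding the keyword sequence with a trained encoder.
        return Counter(self.keywords())

    def add(self, tokens):
        self.counts.update(tokens)


def single_pass_cluster(tweets, threshold=0.3, top_k=5):
    """Assign each tweet, in arrival order, to the best event or open a new one."""
    events, labels = [], []
    for text in tweets:
        tokens = tokenize(text)
        vec = Counter(set(tokens))  # binary bag of words for the tweet
        best_idx, best_sim = -1, 0.0
        for i, event in enumerate(events):
            sim = cosine(vec, event.vector())
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= threshold:
            events[best_idx].add(tokens)  # the event's keywords may shift here
            labels.append(best_idx)
        else:
            events.append(Event(tokens, top_k))
            labels.append(len(events) - 1)
    return labels, events


TWEETS = [
    "Earthquake hits Tokyo this morning",
    "Strong earthquake in Tokyo, buildings shaking",
    "Apple unveils new iPhone at launch event",
    "New iPhone launch: Apple shows camera upgrades",
    "Tokyo earthquake: trains suspended",
]

labels, events = single_pass_cluster(TWEETS)
for i, event in enumerate(events):
    print(i, event.keywords())
```

Each tweet is seen exactly once and is compared only against the current event summaries, which is what keeps the procedure single-pass. Swapping `cosine` and `Event.vector` for a learned encoder would bring the sketch closer to the matching-based similarity the abstract describes.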