A novel approach for open domain event schema discovery from twitter

Assia Mezhar, M. Ramdani, A. Mzabi
Published in: 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA)
Publication date: 2015-10-01
DOI: 10.1109/SITA.2015.7358413 (https://doi.org/10.1109/SITA.2015.7358413)
Citations: 4

Abstract

Open-domain event extraction (OEE) is a recently introduced type of event extraction that extracts, aggregates, and categorizes important events. It does so without any domain-specific guidance such as special training data or extraction rules. Because OEE is domain-independent, it helps end users manage unstructured data easily: it helps them create complex queries or explore a new domain when the structure of the events is unknown. We can help the user by generating a simplified relational schema that describes the extracted events in any given domain. For systems that extract events within a narrow domain, the schema is easily specified in advance. The events and types extracted with OEE, however, do not come with full schema information: we cannot know in advance what schema is appropriate for each discovered type. In this paper, we introduce a novel approach to schema discovery for open-domain event extraction, based on probabilistic generative models and in particular LinkLDA. The approach aims to develop an algorithm that automatically derives high-quality smart schemas from the extracted events. To evaluate the quality of our results, we will carry out experiments on a set of events extracted from Twitter, the most up-to-date stream of current events.
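To make the idea of a "simplified relational schema" concrete, the following is a minimal sketch, not the method of the paper. It assumes that events have already been extracted and assigned a type, and it replaces the LinkLDA-based discovery with a plain frequency heuristic: a slot is kept in a type's schema when it appears in a sufficient share of that type's events. The event types, slot names, sample data, and the `min_support` threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical input: events already extracted and typed.
# Type labels and slot names are illustrative assumptions, not from the paper.
events = [
    {"type": "SPORTS", "args": {"phrase": "beat", "entity": "Lakers", "date": "2015-03-01"}},
    {"type": "SPORTS", "args": {"phrase": "won", "entity": "Celtics", "date": "2015-03-02", "score": "101-99"}},
    {"type": "SPORTS", "args": {"phrase": "lost", "entity": "Bulls", "date": "2015-03-02"}},
    {"type": "PRODUCT_RELEASE", "args": {"phrase": "launch", "entity": "iPhone", "date": "2015-09-09"}},
    {"type": "PRODUCT_RELEASE", "args": {"phrase": "unveil", "entity": "Surface", "location": "New York"}},
]


def derive_schemas(events, min_support=0.6):
    """For each event type, keep the slots present in at least `min_support`
    of that type's events. This is a frequency heuristic, not LinkLDA."""
    totals = Counter()                   # number of events per type
    slot_counts = defaultdict(Counter)   # per type: how often each slot occurs
    for event in events:
        totals[event["type"]] += 1
        slot_counts[event["type"]].update(event["args"].keys())

    schemas = {}
    for event_type, total in totals.items():
        schemas[event_type] = sorted(
            slot
            for slot, count in slot_counts[event_type].items()
            if count / total >= min_support
        )
    return schemas


def to_relation(event_type, slots):
    """Render one schema as a relation signature such as TYPE(a, b)."""
    return f"{event_type}({', '.join(slots)})"


schemas = derive_schemas(events)
for event_type, slots in schemas.items():
    print(to_relation(event_type, slots))
```

On this toy data, rarely filled slots such as `score` and `location` fall below the threshold and are dropped, which is the sense in which the resulting schema is simplified. A generative model such as LinkLDA would additionally have to infer the event types themselves, which this sketch takes as given.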