从海量文本数据构建结构化异构网络:扩展摘要

Jiawei Han
{"title":"从海量文本数据构建结构化异构网络:扩展摘要","authors":"Jiawei Han","doi":"10.1145/3068943.3068944","DOIUrl":null,"url":null,"abstract":"Network data analytics is important, powerful, and exciting. How big role may network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenges on big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and network data analytics will play an essential role in transforming data to knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract\",\"authors\":\"Jiawei Han\",\"doi\":\"10.1145/3068943.3068944\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Network data analytics is important, powerful, and exciting. How big role may network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenges on big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and network data analytics will play an essential role in transforming data to knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.\",\"PeriodicalId\":345682,\"journal\":{\"name\":\"Proceedings of the 2nd International Workshop on Network Data Analytics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-05-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd International Workshop on Network Data Analytics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3068943.3068944\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on Network Data Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3068943.3068944","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

网络数据分析是重要的、强大的和令人兴奋的。网络数据分析在现实世界中可能扮演多大的角色?许多现实世界的数据都是非结构化的,以自然语言文本的形式存在。大数据研究面临的一大挑战是开发有效且可扩展的方法,将如此庞大的文本数据转化为可操作的知识。为了将这些海量的非结构化、富文本但相互关联的数据转化为知识,我们提出了数据到网络到知识(D2N2K)范式,即首先将数据转化为相对结构化的异构信息网络,然后对这些富文本和富结构的异构网络进行挖掘,生成有用的知识。我们认为,这种范式代表了一个有前途的方向,网络数据分析将在将数据转化为知识方面发挥重要作用。然而,这个游戏的一个关键瓶颈是从文本数据中挖掘结构。本文介绍了从海量文本数据中挖掘结构和构建结构化异构信息网络的有效方法的最新进展。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract
Network data analytics is important, powerful, and exciting. How big role may network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenges on big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and network data analytics will play an essential role in transforming data to knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
SCAN-XP: Parallel Structural Graph Clustering Algorithm on Intel Xeon Phi Coprocessors Proceedings of the 2nd International Workshop on Network Data Analytics Graph Mining to Characterize Competition for Employment Using Graphical Features To Improve Demographic Prediction From Smart Phone Data Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1