Semantic Oriented Text Clustering Based on RDF

Soukaina Fatimi, Chama El Saili, L. Alaoui
2020 International Conference on Intelligent Systems and Computer Vision (ISCV), published 2020-06-01
DOI: https://doi.org/10.1109/ISCV49265.2020.9204133

Abstract

Text clustering aims to find groups of related documents in a collection, which makes the collection easier to use. Researchers have implemented text clustering with various methods, including agglomerative, divisive, and itemset-based clustering. Most of these approaches ignore the semantic relationships between words and treat documents only as bags of unrelated words. Our work aims to take the semantics of text phrases into account in the clustering task so that documents can be exploited more fully. The Semantic Web offers many valuable techniques for making meaningful use of documents, and our goal is to take full advantage of them. We use the Resource Description Framework (RDF) to represent textual data as triples, which provide a semantic representation of the data on which the clustering process is based, with the aim of a more efficient clustering system. In addition, building on the clustering process, we incorporate other techniques such as ontology representation using RDF, RDF Schema (RDFS), and the Web Ontology Language (OWL) to manipulate and extract meaningful information. In this paper, we propose a framework for semantic-oriented text clustering based on RDF by means of a semantic similarity measure, and we highlight the benefits of using Semantic Web techniques in clustering, topic modeling, and information extraction based on questioning, reasoning, and inference.
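
To make the idea of clustering over triples concrete, the following is a minimal sketch and not the authors' method. The abstract does not specify the semantic similarity measure, so a Jaccard overlap of triple sets stands in for it here as an assumption. The single-link agglomerative merge, the `threshold` parameter, the function names, and the sample triples are likewise hypothetical and chosen only for illustration.

```python
from itertools import combinations

# A triple is (subject, predicate, object); a document is a set of triples.
Triple = tuple[str, str, str]


def triple_similarity(a: set[Triple], b: set[Triple]) -> float:
    """Jaccard overlap of two triple sets (an assumed stand-in measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(docs: dict[str, set[Triple]], threshold: float) -> list[set[str]]:
    """Single-link agglomerative clustering over triple-set similarity.

    Starts with one cluster per document and keeps merging any two clusters
    that contain a pair of documents with similarity >= threshold.
    """
    clusters = [{name} for name in docs]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            link = max(
                triple_similarity(docs[x], docs[y])
                for x in clusters[i]
                for y in clusters[j]
            )
            if link >= threshold:
                clusters[i] |= clusters[j]  # merge cluster j into cluster i
                del clusters[j]             # safe because i < j
                merged = True
                break                       # restart with the new cluster list
    return clusters


# Hypothetical documents, already converted to triples by some upstream step.
docs = {
    "d1": {("RDF", "represents", "text"), ("clustering", "groups", "documents")},
    "d2": {("RDF", "represents", "text"), ("OWL", "describes", "ontology")},
    "d3": {("camera", "captures", "image")},
}
result = cluster(docs, threshold=0.3)
print(result)
```

In this toy input, `d1` and `d2` share one of three distinct triples, so they are merged at a threshold of 0.3, while `d3` shares nothing and stays alone. A measure that also credits related but non-identical triples, for example through an ontology, would be closer in spirit to what the abstract describes.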