Clustering of text documents by projective dimension of subspaces using PART neural network

R. Krakovsky, I. Mokris
Published in: 2012 7th IEEE International Symposium on Applied Computational Intelligence and Informatics (SACI)
Publication date: 2012-05-24
DOI: 10.1109/SACI.2012.6250002 (https://doi.org/10.1109/SACI.2012.6250002)
Citations: 2

Abstract

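This paper addresses the clustering of text documents with neural networks. Text documents are represented with the Vector Space (VS) model, which describes the document collection by a VS matrix X. Clustering in the high-dimensional space of X is computationally expensive, so the space must be reduced. Our approach reduces the text document space by decomposing the multidimensional space of X through projection into subspaces. The subspaces are created with the Projective Adaptive Resonance Theory (PART) neural network, which both reduces the multidimensional document space and clusters the documents. The efficiency of subspace-based clustering depends on the properties of PART, so its parameters must be set appropriately. With suitable settings of PART's distance and vigilance parameters, it is possible to find the clusters and their centers in the projective dimensions of the subspaces, and to create an outlier cluster for noisy data sets. Applied to text document clustering, the PART neural network can readily discover the intrinsic clusters in the document sets used.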
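To make the roles of the VS matrix and of the distance and vigilance parameters concrete, the following is a minimal Python sketch. It is not the PART network described in the paper: it is a simplified, one-pass, leader-style procedure written only for illustration, and the function names (`build_vs_matrix`, `matching_dims`, `cluster_by_subspace`), the use of raw term counts, and the running-mean center update are all assumptions. A row joins a cluster when at least `rho` of the cluster's retained dimensions lie within `sigma` of the cluster center; rows that end up alone are reported as outliers.

```python
from collections import Counter


def build_vs_matrix(docs):
    """Build a term-count VS matrix X (rows = documents, columns = terms).

    Whitespace tokenization and raw counts are illustrative assumptions.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    X = []
    for tokens in tokenized:
        counts = Counter(tokens)
        X.append([counts[term] for term in vocab])
    return vocab, X


def matching_dims(x, center, sigma):
    """Indices of dimensions where x lies within sigma of the center."""
    return [i for i, (a, b) in enumerate(zip(x, center)) if abs(a - b) <= sigma]


def cluster_by_subspace(X, sigma, rho):
    """Simplified leader-style subspace clustering (NOT the full PART network).

    sigma plays the role of a distance parameter, rho of a vigilance
    parameter (minimum number of matching dimensions).
    """
    clusters = []
    for idx, x in enumerate(X):
        for c in clusters:
            # Only dimensions the cluster still retains can count as a match.
            dims = [i for i in matching_dims(x, c["center"], sigma) if i in c["dims"]]
            if len(dims) >= rho:
                c["dims"] = dims
                c["members"].append(idx)
                n = len(c["members"])
                # Running mean of the member vectors.
                c["center"] = [m + (a - m) / n for m, a in zip(c["center"], x)]
                break
        else:
            # No cluster passed the vigilance test: start a new one.
            clusters.append({
                "center": [float(v) for v in x],
                "dims": list(range(len(x))),
                "members": [idx],
            })
    outliers = [c["members"][0] for c in clusters if len(c["members"]) == 1]
    kept = [c for c in clusters if len(c["members"]) > 1]
    return kept, outliers
```

Calling `cluster_by_subspace(X, sigma, rho)` on the matrix returned by `build_vs_matrix` gives the retained clusters (each with its center, member rows, and surviving dimensions) and the indices of the outlier rows. Because sparse term-count rows agree on many zero entries, a practical setup would likely need term weighting or a higher `rho`; that tuning is outside this sketch.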