Improvised fuzzy clustering using name entity recognition and natural language processing

2017 1st International Conference on Intelligent Systems and Information Management (ICISIM). Pub Date: 2017-10-01. DOI: 10.1109/ICISIM.2017.8122161

K. Pole, Vishakha R. Mote


Citations: 2

Abstract



The World Wide Web has come to be regarded as the most important information store of recent years, and it keeps expanding as new technologies appear. As the number of documents on the web multiplies, search engines become less effective: most of the results retrieved for a query are unrelated to what the user was looking for. Web documents are varied and loosely structured, with intricate relationships inside a single document and links to other documents. More precise clustering methods are therefore needed to detect and name latent concepts consistently and to track their significance in context. This article presents a diffuse linguistic topological space together with a diffuse (fuzzy) clustering algorithm for discovering the contextual concepts of web documents. The main objective of this research is the clustering algorithm itself and the discovery of latent semantics within a diffuse linguistic body of text. The scope of application can also be extended to areas such as data mining, bioinformatics, content control, and information gathering. It is also observed that retrieved documents usually belong to one research topic that is distinctly different from other topics, so web content can be grouped into a hierarchy of topics based on diffuse linguistic measures. The web data and files that make up a document are complex in nature: there are complex links within a single web document, and there may be complex relationships with other documents. The strong interactions between the terms of the documents reveal only vague and somewhat ambiguous concepts. In our case study, however, the proposed algorithm extracts the features of web documents using so-called random hypothetical field methods and creates a diffuse linguistic topology according to the associations between attributes.
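The abstract does not spell out the algorithm. As a rough illustration of what fuzzy (soft) clustering of documents means, the sketch below runs standard fuzzy c-means over normalized term-frequency vectors. It is not the authors' method: the named-entity recognition step and the linguistic topology are left out, and the function names (`tf_vectors`, `fuzzy_c_means`), the sample documents, and the farthest-point initialization are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Map each document to a unit-length term-frequency vector (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [float(counts[term]) for term in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def _memberships(points, centers, m):
    """Standard fuzzy c-means membership update; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists])
    return rows


def _centers(points, u, m):
    """Each center is the mean of all points, weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for k in range(len(u[0])):
        weights = [row[k] ** m for row in u]
        total = sum(weights)
        centers.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
        )
    return centers


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Fuzzy c-means with a deterministic farthest-point start (an assumption of this sketch)."""
    centers = [list(points[0])]
    while len(centers) < n_clusters:
        # Pick the point whose nearest chosen center is farthest away.
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))
    for _ in range(n_iter):
        u = _memberships(points, centers, m)
        centers = _centers(points, u, m)
    return centers, _memberships(points, centers, m)


# Illustrative documents, not data from the paper.
docs = [
    "fuzzy clustering of web documents",
    "fuzzy clustering groups web documents",
    "gene expression in protein sequences",
    "protein sequences and gene expression",
]
vocab, vectors = tf_vectors(docs)
centers, memberships = fuzzy_c_means(vectors, n_clusters=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(x, 3) for x in row])
```

Unlike hard clustering, every document receives a degree of membership in every cluster, which is one way to represent the vague and overlapping concepts the abstract describes.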
