Chinese Document Information Processing Model Based on Random Walk Algorithm

Q3 Medicine Koomesh Pub Date : 2018-08-01 DOI:10.1109/I-SMAC.2018.8653683

Xiao Xian, Zhenhui Yue

Journal: Koomesh, volume 78 1, pages 779-783. Publication type: Journal Article. Open access: no. Platform: Semanticscholar. URL: https://doi.org/10.1109/I-SMAC.2018.8653683

Citations: 1

Abstract

Chinese Document Information Processing Model Based on Random Walk Algorithm

This paper studies a Chinese document information processing model based on the random walk algorithm. Because processing Chinese information is both complex and distinctive, Chinese search engine technology needs improvement, and Chinese search engines cannot simply copy foreign technology. By studying and analyzing expertise specific to Chinese, the needed Chinese information can be located accurately in a vast information base. Dictionary learning and sparse representation, combined with a random walk model, are introduced into character recognition to address pen-written characters and noise in fax characters. A novel analytic framework is presented to support the methodology. The recognition method requires no preprocessing operations such as character binarization or thinning, and it needs only one feature and one classifier. Compared with current recognition methods that fuse multiple features and multiple cascaded classifiers, the proposed method has low complexity. Experimental tests also indicate the robustness of the proposed model.
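The abstract does not say how the random walk enters the model, so the following is only a generic illustration of the technique it names, not the authors' method. It is a minimal sketch of a random walk with restart on a small graph, computed by power iteration. The function name `random_walk_with_restart`, the `restart` probability of 0.15, and the toy four-node graph are all assumptions made for this example.

```python
def random_walk_with_restart(adjacency, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Return visiting probabilities of a walk that restarts at node `seed`.

    `adjacency` is a square list of lists of non-negative edge weights.
    All names and defaults here are illustrative assumptions.
    """
    n = len(adjacency)

    # Row-normalize the adjacency matrix into transition probabilities.
    transition = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            # A node with no outgoing edges sends the walker back to the seed.
            transition.append([1.0 if j == seed else 0.0 for j in range(n)])
        else:
            transition.append([w / total for w in row])

    # Start with all probability mass on the seed node.
    probs = [1.0 if i == seed else 0.0 for i in range(n)]

    for _ in range(max_iter):
        nxt = [0.0] * n
        # With probability (1 - restart), follow an outgoing edge.
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * probs[i] * transition[i][j]
        # With probability `restart`, jump back to the seed.
        nxt[seed] += restart

        delta = sum(abs(a - b) for a, b in zip(nxt, probs))
        probs = nxt
        if delta < tol:
            break
    return probs


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
    graph = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    for node, score in enumerate(random_walk_with_restart(graph, seed=0)):
        print(node, round(score, 4))
```

The resulting scores sum to one and can be read as a relevance ranking of nodes with respect to the seed. In a document or character setting the nodes might stand for documents, terms, or dictionary atoms, but that mapping is a guess and is not confirmed by the abstract.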

Source journal

Koomesh, Medicine-Medicine (all)

CiteScore: 0.80

Self-citation rate: 0.00%

Review time: 24 weeks