Design of a language-independent parallel string matching unit for NLP

2003 IEEE International Workshop on Computer Architectures for Machine Perception. Pub Date: 2003-05-12. DOI: 10.1109/CAMP.2003.1598159

V. S. Murty, P. C. Reghu Raj, S. Raman

Citations: 2

Abstract

In natural language processing applications, string matching is the main time-consuming operation because of the large size of the lexicon. Data dependence is minimal in string matching operations, which makes them ideal for parallelization. Dedicated string matching hardware that uses memory interleaving and parallel processing techniques can relieve the host CPU of this burden, thereby making the system suitable for real-time applications. This paper reports the FPGA design of such a system with m parallel matching units. The time complexity of the proposed algorithm is O(log₂ n), where n is the total number of lexical entries. This has been achieved by a proper selection of the value of m. A special memory organization technique, which reduces the storage space by nearly 70%, has been adopted for storing lexical entries. The techniques used for matching and storing lexical entries make the system language independent.
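
The abstract does not spell out the matching algorithm, so the following is only a software sketch of the general idea it names: a sorted lexicon interleaved across m memory banks, with one matching unit searching each bank independently. The function names (build_banks, search_bank, parallel_lookup), the round-robin interleaving rule, and the use of binary search inside each bank are all assumptions made for illustration and are not taken from the paper. The thread pool merely stands in for the m hardware units.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def build_banks(lexicon, m):
    """Interleave a sorted lexicon over m banks (assumed rule: entry i -> bank i mod m)."""
    entries = sorted(lexicon)
    # Slicing a sorted list with a stride keeps every bank sorted.
    return [entries[k::m] for k in range(m)]


def search_bank(bank, word):
    """Binary search inside one bank; returns the local index, or -1 if absent."""
    i = bisect_left(bank, word)
    return i if i < len(bank) and bank[i] == word else -1


def parallel_lookup(banks, word):
    """Query all banks concurrently; returns (bank number, local index) or None."""
    with ThreadPoolExecutor(max_workers=len(banks)) as pool:
        results = list(pool.map(lambda bank: search_bank(bank, word), banks))
    for k, idx in enumerate(results):
        if idx >= 0:
            return k, idx
    return None


if __name__ == "__main__":
    banks = build_banks(["cat", "dog", "ant", "bee", "eel"], m=2)
    print(parallel_lookup(banks, "dog"))
    print(parallel_lookup(banks, "fox"))
```

Because the banks share no state, each search runs without waiting on the others, which is the minimal data dependence the abstract refers to. The comparison works on whole strings and makes no assumption about the alphabet, so the same sketch applies to a lexicon in any language.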



