Patterns of syntactic trees for parsing Arabic texts

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010). Pub Date: 2010-09-30. DOI: 10.1109/NLPKE.2010.5587791

Fériel Ben Fraj Trabelsi, C. Zribi, M. Ahmed

Citations: 2

Abstract

To parse Arabic texts, we have chosen a machine learning approach that learns from an Arabic Treebank. The knowledge contained in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the syntactic components of the Arabic language. They are layered and rich in both structure and context, and they serve as an information source that guides the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At each step, the parser assigns to the target word the patterns most likely to represent it in its context. It then joins the selected patterns with those collected in the previous steps to construct the representative syntactic tree(s) of the whole sentence. Preliminary tests yielded an accuracy of 84.78% and an f-score of 77.52%.
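The word-by-word procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Pattern` fields, the probabilities, the use of only the left-hand tag as context, the beam pruning, and the "join" step (here a plain concatenation of tree fragments) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical tree pattern; every field is an illustrative assumption."""
    tag: str       # tag of the word the pattern covers
    left_tag: str  # tag expected to the left ("" means sentence start)
    tree: str      # bracketed tree fragment
    prob: float    # made-up relative frequency, not taken from any treebank


def candidate_patterns(tag, left_tag, patterns, k=2):
    """Return the k most probable patterns matching a tag and its left context."""
    matches = [p for p in patterns if p.tag == tag and p.left_tag == left_tag]
    return sorted(matches, key=lambda p: p.prob, reverse=True)[:k]


def progressive_parse(tags, patterns, k=2, beam=3):
    """Run one step per word: select patterns, then join them to earlier analyses."""
    analyses = [([], 1.0)]  # each analysis: (tree fragments so far, score)
    left = ""
    for tag in tags:
        selected = candidate_patterns(tag, left, patterns, k)
        if not selected:
            return []  # no pattern covers this word in this context
        # Join each selected pattern with every partial analysis kept so far.
        analyses = [
            (frags + [p.tree], score * p.prob)
            for frags, score in analyses
            for p in selected
        ]
        # Keep only the best-scoring partial analyses (assumed pruning step).
        analyses.sort(key=lambda a: a[1], reverse=True)
        analyses = analyses[:beam]
        left = tag
    return analyses


# Toy pattern set with invented tags, trees, and probabilities.
PATTERNS = [
    Pattern("VERB", "", "(VP (V _))", 0.7),
    Pattern("VERB", "", "(S (VP (V _)))", 0.3),
    Pattern("NOUN", "VERB", "(NP (N _))", 0.9),
]

if __name__ == "__main__":
    for fragments, score in progressive_parse(["VERB", "NOUN"], PATTERNS):
        print(round(score, 2), fragments)
```

In this sketch a two-word input triggers exactly two steps, matching the abstract's statement that the number of stages equals the number of words. A real system would merge fragments into a single tree rather than collect them in a list.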
