Discourse Segmentation of German Texts

J. Lang. Technol. Comput. Linguistics. Pub Date: 2015-07-01. DOI: 10.21248/jlcl.30.2015.196

Wladimir Sidorenko, A. Peldszus, Manfred Stede

Citations: 16

Abstract

This paper addresses the problem of segmenting German texts into minimal discourse units, as needed, for example, in RST-based discourse parsing. We discuss relevant variants of the problem, introduce the design of our annotation guidelines, and provide the results of an extensive inter-annotator agreement study of the corpus. We then report on experiments with three automatic classifiers that rely on the output of state-of-the-art parsers and use different amounts and kinds of syntactic knowledge: constituent parsing versus dependency parsing, and tree-structure classification versus sequence labeling. Finally, we compare our approaches with recent discourse segmentation methods proposed for English.
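To make the sequence-labeling view and the agreement study more concrete, the following minimal sketch treats segmentation as a per-token boundary decision and compares two annotators with Cohen's kappa. It is an illustration only: the abstract does not say which agreement measure or label scheme the authors used, so the function names `boundary_labels` and `cohen_kappa`, the 0/1 label scheme, and the example sentence are all assumptions.

```python
from collections import Counter


def boundary_labels(segments):
    """Turn a segmentation (list of token lists) into one label per token.

    Assumed label scheme: 1 if the token opens a segment, 0 otherwise.
    """
    labels = []
    for seg in segments:
        if not seg:  # ignore empty segments
            continue
        labels.extend([1] + [0] * (len(seg) - 1))
    return labels


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical example: annotator A splits before "weil", annotator B does not.
annotator_a = boundary_labels([["Er", "blieb", "zu", "Hause", ","],
                               ["weil", "es", "regnete", "."]])
annotator_b = boundary_labels([["Er", "blieb", "zu", "Hause", ",",
                                "weil", "es", "regnete", "."]])
print(cohen_kappa(annotator_a, annotator_b))
```

Under this framing, a sequence labeler would predict the same 0/1 label for every token, so system output and human annotation can be scored with the same function.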


