Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Applying Rhetorical Structure Theory to Student Essays for Providing Automated Writing Feedback

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Pub Date : 2019 DOI: 10.18653/v1/W19-2720

Shiyan Jiang, K. Yang, Chandrakumari Suvarna, Pooja Casula, Mingtong Zhang, C. Rosé

We present a package of annotation resources, including an annotation guideline, a flowchart, and an Intelligent Tutoring System for training human annotators. These resources can be used to apply Rhetorical Structure Theory (RST) to essays written by students in K-12 schools. Furthermore, we highlight the great potential of using RST to provide automated feedback for improving writing quality across genres.

Citations: 8

Annotating Shallow Discourse Relations in Twitter Conversations

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Pub Date : 2019 DOI: 10.18653/v1/W19-2707

Tatjana Scheffler, Berfin Aktas, Debopam Das, Manfred Stede

We introduce our pilot study applying PDTB-style annotation to Twitter conversations. Lexically grounded coherence annotation for Twitter threads will enable detailed investigations of the discourse structure of conversations on social media. Here, we present our corpus of 185 threads and its annotation, including an inter-annotator agreement study. We discuss our observations on how Twitter discourse differs from written news text with respect to discourse connectives and relations. We confirm our hypothesis that discourse relations in written social media conversations are expressed differently than in (news) text. We find that in Twitter, connective arguments are frequently not full syntactic clauses, and that a few general connectives expressing EXPANSION and CONTINGENCY make up the majority of the explicit relations in our data.

Citations: 5

Multilingual segmentation based on neural networks and pre-trained word embeddings

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Pub Date : 2019 DOI: 10.18653/v1/W19-2716

Mikel Iruskieta, K. Bengoetxea, Aitziber Atutxa Salazar, A. D. Ilarraza

The DISRPT 2019 workshop organized a shared task aiming to identify discourse segments across formalisms and languages. Elementary Discourse Units (EDUs) are quite similar across different theories. Segmentation is the very first stage of rhetorical annotation. Still, each annotation project has adopted several decisions with consequences not only for the annotation of the relational discourse structure but also for the segmentation stage. In this shared task, we employed pre-trained word embeddings and a neural network (BiLSTM+CRF) to perform the segmentation. We report F1 results for 6 languages: Basque (0.853), English (0.919), French (0.907), German (0.913), Portuguese (0.926) and Spanish (0.868 and 0.769). Finally, we also carried out an error analysis based on clause typology for Basque and Spanish, in order to understand the performance of the segmenter.
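
As a rough illustration of how an F1 score for segmentation can be computed, the sketch below treats segmentation as token-level labelling, where 1 marks a token that begins a discourse unit and 0 marks any other token. The function name `boundary_f1`, the label encoding, and the toy data are assumptions made for this example; they are not taken from the paper or from the shared task's official scorer.

```python
def boundary_f1(gold, pred):
    """F1 over the positive (segment-start) class for two equal-length 0/1 label lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # Count true positives, false positives and false negatives for label 1.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: nine tokens, three gold segment starts.
gold = [1, 0, 0, 1, 0, 0, 0, 1, 0]
pred = [1, 0, 0, 0, 0, 1, 0, 1, 0]
score = boundary_f1(gold, pred)
print(round(score, 3))
```

Scoring only the segment-start class keeps the metric from being inflated by the many non-boundary tokens, which is why plain accuracy is usually a poor fit for this kind of task.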

Citations: 5

Multi-lingual and Cross-genre Discourse Unit Segmentation

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Pub Date : 2019 DOI: 10.18653/v1/W19-2714

Peter Bourgonje, Robin Schäfer

We describe a series of experiments applied to data sets from different languages and genres annotated for coherence relations according to different theoretical frameworks. Specifically, we investigate the feasibility of a unified (theory-neutral) approach toward discourse segmentation, a process which divides a text into minimal discourse units that are involved in a coherence relation. We apply a RandomForest and an LSTM-based approach to all data sets, and we improve over a simple baseline assuming simple sentence or clause-like segmentation. Performance, however, varies a lot depending on language and, more importantly, genre, with F-scores ranging from 73.00 to 94.47.
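
A sentence-based baseline of the kind mentioned above could look roughly like the following sketch, which marks the first token of every sentence as the start of a discourse unit and every other token as unit-internal. The function name `sentence_baseline` and the 0/1 label encoding are assumptions for illustration only; the abstract does not specify how the authors' baseline was implemented.

```python
def sentence_baseline(sentences):
    """Label the first token of each sentence 1 (segment start) and all others 0.

    `sentences` is a list of sentences, each given as a list of token strings.
    """
    labels = []
    for sentence in sentences:
        if not sentence:
            continue  # skip empty sentences rather than emit a stray label
        labels.append(1)
        labels.extend([0] * (len(sentence) - 1))
    return labels


# Hypothetical input: two pre-tokenized sentences.
example = [["I", "left"], ["It", "rained", "."]]
baseline_labels = sentence_baseline(example)
print(baseline_labels)
```

Such a baseline never predicts a boundary inside a sentence, so it misses every clause-level unit. That limitation is what a learned segmenter is expected to improve on.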

Citations: 7

Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

Pub Date : 2019 DOI: 10.18653/v1/W19-2701

Amir Zeldes, Debopam Das, E. Maziero, Juliano D. Antonio, Mikel Iruskieta

This overview summarizes the main contributions of the accepted papers at the 2019 workshop on Discourse Relation Parsing and Treebanking (DISRPT 2019). Co-located with NAACL 2019 in Minneapolis, the workshop’s aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. In addition to an invited talk, eighteen papers outlined below were presented, four of which were submitted as part of a shared task on elementary discourse unit segmentation and connective detection.

Citations: 5