Shiyan Jiang, K. Yang, Chandrakumari Suvarna, Pooja Casula, Mingtong Zhang, C. Rosé
We present a package of annotation resources, including annotation guideline, flowchart, and an Intelligent Tutoring System for training human annotators. These resources can be used to apply Rhetorical Structure Theory (RST) to essays written by students in K-12 schools. Furthermore, we highlight the great potential of using RST to provide automated feedback for improving writing quality across genres.
{"title":"Applying Rhetorical Structure Theory to Student Essays for Providing Automated Writing Feedback","authors":"Shiyan Jiang, K. Yang, Chandrakumari Suvarna, Pooja Casula, Mingtong Zhang, C. Rosé","doi":"10.18653/v1/W19-2720","DOIUrl":"https://doi.org/10.18653/v1/W19-2720","url":null,"abstract":"We present a package of annotation resources, including annotation guideline, flowchart, and an Intelligent Tutoring System for training human annotators. These resources can be used to apply Rhetorical Structure Theory (RST) to essays written by students in K-12 schools. Furthermore, we highlight the great potential of using RST to provide automated feedback for improving writing quality across genres.","PeriodicalId":243254,"journal":{"name":"Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132837400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tatjana Scheffler, Berfin Aktas, Debopam Das, Manfred Stede
We introduce our pilot study applying PDTB-style annotation to Twitter conversations. Lexically grounded coherence annotation for Twitter threads will enable detailed investigations of the discourse structure of conversations on social media. Here, we present our corpus of 185 threads and annotation, including an inter-annotator agreement study. We discuss our observations as to how Twitter discourses differ from written news text wrt. discourse connectives and relations. We confirm our hypothesis that discourse relations in written social media conversations are expressed differently than in (news) text. We find that in Twitter, connective arguments frequently are not full syntactic clauses, and that a few general connectives expressing EXPANSION and CONTINGENCY make up the majority of the explicit relations in our data.
{"title":"Annotating Shallow Discourse Relations in Twitter Conversations","authors":"Tatjana Scheffler, Berfin Aktas, Debopam Das, Manfred Stede","doi":"10.18653/v1/W19-2707","DOIUrl":"https://doi.org/10.18653/v1/W19-2707","url":null,"abstract":"We introduce our pilot study applying PDTB-style annotation to Twitter conversations. Lexically grounded coherence annotation for Twitter threads will enable detailed investigations of the discourse structure of conversations on social media. Here, we present our corpus of 185 threads and annotation, including an inter-annotator agreement study. We discuss our observations as to how Twitter discourses differ from written news text wrt. discourse connectives and relations. We confirm our hypothesis that discourse relations in written social media conversations are expressed differently than in (news) text. We find that in Twitter, connective arguments frequently are not full syntactic clauses, and that a few general connectives expressing EXPANSION and CONTINGENCY make up the majority of the explicit relations in our data.","PeriodicalId":243254,"journal":{"name":"Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122565124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mikel Iruskieta, K. Bengoetxea, Aitziber Atutxa Salazar, A. D. Ilarraza
The DISPRT 2019 workshop has organized a shared task aiming to identify cross-formalism and multilingual discourse segments. Elementary Discourse Units (EDUs) are quite similar across different theories. Segmentation is the very first stage on the way of rhetorical annotation. Still, each annotation project adopted several decisions with consequences not only on the annotation of the relational discourse structure but also at the segmentation stage. In this shared task, we have employed pre-trained word embeddings, neural networks (BiLSTM+CRF) to perform the segmentation. We report F1 results for 6 languages: Basque (0.853), English (0.919), French (0.907), German (0.913), Portuguese (0.926) and Spanish (0.868 and 0.769). Finally, we also pursued an error analysis based on clause typology for Basque and Spanish, in order to understand the performance of the segmenter.
{"title":"Multilingual segmentation based on neural networks and pre-trained word embeddings","authors":"Mikel Iruskieta, K. Bengoetxea, Aitziber Atutxa Salazar, A. D. Ilarraza","doi":"10.18653/v1/W19-2716","DOIUrl":"https://doi.org/10.18653/v1/W19-2716","url":null,"abstract":"The DISPRT 2019 workshop has organized a shared task aiming to identify cross-formalism and multilingual discourse segments. Elementary Discourse Units (EDUs) are quite similar across different theories. Segmentation is the very first stage on the way of rhetorical annotation. Still, each annotation project adopted several decisions with consequences not only on the annotation of the relational discourse structure but also at the segmentation stage. In this shared task, we have employed pre-trained word embeddings, neural networks (BiLSTM+CRF) to perform the segmentation. We report F1 results for 6 languages: Basque (0.853), English (0.919), French (0.907), German (0.913), Portuguese (0.926) and Spanish (0.868 and 0.769). Finally, we also pursued an error analysis based on clause typology for Basque and Spanish, in order to understand the performance of the segmenter.","PeriodicalId":243254,"journal":{"name":"Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115477831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We describe a series of experiments applied to data sets from different languages and genres annotated for coherence relations according to different theoretical frameworks. Specifically, we investigate the feasibility of a unified (theory-neutral) approach toward discourse segmentation; a process which divides a text into minimal discourse units that are involved in s coherence relation. We apply a RandomForest and an LSTM based approach for all data sets, and we improve over a simple baseline assuming simple sentence or clause-like segmentation. Performance however varies a lot depending on language, and more importantly genre, with f-scores ranging from 73.00 to 94.47.
{"title":"Multi-lingual and Cross-genre Discourse Unit Segmentation","authors":"Peter Bourgonje, Robin Schäfer","doi":"10.18653/V1/W19-2714","DOIUrl":"https://doi.org/10.18653/V1/W19-2714","url":null,"abstract":"We describe a series of experiments applied to data sets from different languages and genres annotated for coherence relations according to different theoretical frameworks. Specifically, we investigate the feasibility of a unified (theory-neutral) approach toward discourse segmentation; a process which divides a text into minimal discourse units that are involved in s coherence relation. We apply a RandomForest and an LSTM based approach for all data sets, and we improve over a simple baseline assuming simple sentence or clause-like segmentation. Performance however varies a lot depending on language, and more importantly genre, with f-scores ranging from 73.00 to 94.47.","PeriodicalId":243254,"journal":{"name":"Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133693009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amir Zeldes, Debopam Das, E. Maziero, Juliano D. Antonio, Mikel Iruskieta
This overview summarizes the main contributions of the accepted papers at the 2019 workshop on Discourse Relation Parsing and Treebanking (DISRPT 2019). Co-located with NAACL 2019 in Minneapolis, the workshop’s aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. In addition to an invited talk, eighteen papers outlined below were presented, four of which were submitted as part of a shared task on elementary discourse unit segmentation and connective detection.
{"title":"Introduction to Discourse Relation Parsing and Treebanking (DISRPT): 7th Workshop on Rhetorical Structure Theory and Related Formalisms","authors":"Amir Zeldes, Debopam Das, E. Maziero, Juliano D. Antonio, Mikel Iruskieta","doi":"10.18653/v1/W19-2701","DOIUrl":"https://doi.org/10.18653/v1/W19-2701","url":null,"abstract":"This overview summarizes the main contributions of the accepted papers at the 2019 workshop on Discourse Relation Parsing and Treebanking (DISRPT 2019). Co-located with NAACL 2019 in Minneapolis, the workshop’s aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. In addition to an invited talk, eighteen papers outlined below were presented, four of which were submitted as part of a shared task on elementary discourse unit segmentation and connective detection.","PeriodicalId":243254,"journal":{"name":"Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019","volume":"9 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113980300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}