A Neural Approach to Discourse Relation Signal Detection
Amir Zeldes, Yang Janet Liu
Dialogue and Discourse 13(1), pp. 1-33. Published 2020-01-08.
DOI: 10.5087/dad.2020.201 (https://doi.org/10.5087/dad.2020.201)
Citations: 5
Abstract
Previous data-driven work investigating the types and distributions of discourse
relation signals, including discourse markers such as 'however' or phrases such as 'as a
result', has focused on the relative frequencies of signal words within and outside text
from each discourse relation. Such approaches do not allow us to quantify the signaling
strength of individual instances of a signal on a scale (e.g. more or less
discourse-relevant instances of 'and'), to assess the distribution of ambiguity for
signals, or to identify words that hinder discourse relation identification in context
('anti-signals' or 'distractors'). In this paper we present a data-driven approach to
signal detection using a distantly supervised neural network and develop a metric, Δs
(or 'delta-softmax'), to quantify signaling strength. Ranging between -1 and 1 and
relying on recent advances in contextualized word embeddings, the metric represents
each word's positive or negative contribution to the identifiability of a relation in
specific instances in context. Based on an English corpus annotated for discourse
relations using Rhetorical Structure Theory and signal type annotations anchored to
specific tokens, our analysis examines the reliability of the metric, the places where
it overlaps with and differs from human judgments, and the implications for identifying
features that neural models may need in order to perform better on automatic discourse
relation classification.
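
The abstract does not spell out how Δs is computed. The following is a minimal sketch under one assumption: that Δs for a word is the softmax probability of the correct relation on the full input minus the same probability when that word is masked. Because it is a difference of two probabilities, the value falls between -1 and 1; positive values mark signals and negative values mark distractors. The names `toy_classifier`, `delta_softmax`, and `MASK`, as well as the relation labels and weights, are hypothetical stand-ins and not the authors' model.

```python
import math
from typing import Callable, Dict, List, Sequence

MASK = "<unk>"  # hypothetical placeholder token used for masking


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into a probability distribution over relations."""
    top = max(scores.values())
    exps = {rel: math.exp(s - top) for rel, s in scores.items()}
    total = sum(exps.values())
    return {rel: e / total for rel, e in exps.items()}


def toy_classifier(tokens: Sequence[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained relation classifier.

    Sums hand-picked cue weights per relation, then applies softmax.
    """
    cue_weights = {
        "however": {"contrast": 2.0},
        "because": {"cause": 2.0},
        "and": {"joint": 0.5},
    }
    scores = {"contrast": 0.0, "cause": 0.0, "joint": 0.0}
    for tok in tokens:
        for rel, weight in cue_weights.get(tok.lower(), {}).items():
            scores[rel] += weight
    return softmax(scores)


def delta_softmax(
    tokens: Sequence[str],
    gold: str,
    classify: Callable[[Sequence[str]], Dict[str, float]] = toy_classifier,
) -> List[float]:
    """Per-token change in P(gold relation) caused by masking that token."""
    base = classify(tokens)[gold]
    deltas = []
    for i in range(len(tokens)):
        masked = list(tokens[:i]) + [MASK] + list(tokens[i + 1:])
        deltas.append(base - classify(masked)[gold])
    return deltas


signal_scores = delta_softmax(["however", ",", "it", "rained"], "contrast")
distractor_scores = delta_softmax(["because", "however"], "contrast")
```

In this toy setup, masking 'however' lowers the probability of the contrast relation, so its score is positive, while masking 'because' raises it, so 'because' receives a negative score and behaves like a distractor for that relation.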
About the journal:
D&D seeks previously unpublished, high-quality articles on the analysis of discourse and dialogue that contain:
- experimental and/or theoretical studies related to the construction, representation, and maintenance of (linguistic) context
- linguistic analysis of phenomena characteristic of discourse and/or dialogue (including, but not limited to: reference and anaphora, presupposition and accommodation, topicality and salience, implicature, discourse structure and rhetorical relations, discourse markers and particles, the semantics and pragmatics of dialogue acts, questions, imperatives, non-sentential utterances, intonation, and meta-communicative phenomena such as repair and grounding)
- experimental and/or theoretical studies of agents' information states and their dynamics in conversational interaction
- new analytical frameworks that advance theoretical studies of discourse and dialogue
- research on systems performing coreference resolution, discourse structure parsing, event and temporal structure, and reference resolution in multimodal communication
- experimental and/or theoretical results yielding new insight into non-linguistic interaction in communication
- work on natural language understanding (including spoken language understanding), dialogue management, reasoning, and natural language generation (including text-to-speech) in dialogue systems
- work related to the design and engineering of dialogue systems (including, but not limited to: evaluation, usability design and testing, rapid application deployment, embodied agents, affect detection, mixed-initiative, adaptation, and user modeling)
- extremely well-written surveys of existing work
Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers on discourse and dialogue and its associated fields, including computer scientists, linguists, psychologists, philosophers, roboticists, and sociologists.