{"title":"Discovering Rhetoric Agreement between a Request and Response","authors":"Boris A. Galitsky","doi":"10.5087/dad.2017.208","DOIUrl":null,"url":null,"abstract":"To support a natural flow of a conversation between humans and automated agents, rhetoric structures of each message has to be analyzed. We classify a pair of paragraphs of text as appropriate for one to follow another, or inappropriate, based on both topic and communicative discourse considerations. To represent a multi-sentence message with respect to how it should follow a previous message in a conversation or dialogue, we build an extension of a discourse tree for it. Extended discourse tree is based on a discourse tree for RST relations with labels for communicative actions, and also additional arcs for anaphora and ontology-based relations for entities. We refer to such trees as Communicative Discourse Trees (CDTs). We explore syntactic and discourse features that are indicative of correct vs incorrect request-response or question-answer pairs. Two learning frameworks are used to recognize such correct pairs: deterministic, nearest-neighbor learning of CDTs as graphs, and a tree kernel learning of CDTs, where a feature space of all CDT sub-trees is subject to SVM learning. We form the positive training set from the correct pairs obtained from Yahoo Answers, social network, corporate conversations including Enron emails, customer complaints and interviews by journalists. The corresponding negative training set is artificially created by attaching responses for different, inappropriate requests that include relevant keywords. The evaluation showed that it is possible to recognize valid pairs in 70% of cases in the domains of weak request-response agreement and 80% of cases in the domains of strong agreement, which is essential to support automated conversations. These accuracies are comparable with the benchmark task of classification of discourse trees themselves as valid or invalid, and also with classification of multi-sentence answers in factoid question-answering systems. The applicability of proposed machinery to the problem of chatbots, social chats and programming via NL is demonstrated. We conclude that learning rhetoric structures in the form of CDTs is the key source of data to support answering complex questions, chatbots and dialogue management.","PeriodicalId":37604,"journal":{"name":"Dialogue and Discourse","volume":"18 1","pages":"167-205"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Dialogue and Discourse","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5087/dad.2017.208","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Arts and Humanities","Score":null,"Total":0}
引用次数: 14
Abstract
To support a natural flow of a conversation between humans and automated agents, rhetoric structures of each message has to be analyzed. We classify a pair of paragraphs of text as appropriate for one to follow another, or inappropriate, based on both topic and communicative discourse considerations. To represent a multi-sentence message with respect to how it should follow a previous message in a conversation or dialogue, we build an extension of a discourse tree for it. Extended discourse tree is based on a discourse tree for RST relations with labels for communicative actions, and also additional arcs for anaphora and ontology-based relations for entities. We refer to such trees as Communicative Discourse Trees (CDTs). We explore syntactic and discourse features that are indicative of correct vs incorrect request-response or question-answer pairs. Two learning frameworks are used to recognize such correct pairs: deterministic, nearest-neighbor learning of CDTs as graphs, and a tree kernel learning of CDTs, where a feature space of all CDT sub-trees is subject to SVM learning. We form the positive training set from the correct pairs obtained from Yahoo Answers, social network, corporate conversations including Enron emails, customer complaints and interviews by journalists. The corresponding negative training set is artificially created by attaching responses for different, inappropriate requests that include relevant keywords. The evaluation showed that it is possible to recognize valid pairs in 70% of cases in the domains of weak request-response agreement and 80% of cases in the domains of strong agreement, which is essential to support automated conversations. These accuracies are comparable with the benchmark task of classification of discourse trees themselves as valid or invalid, and also with classification of multi-sentence answers in factoid question-answering systems. The applicability of proposed machinery to the problem of chatbots, social chats and programming via NL is demonstrated. We conclude that learning rhetoric structures in the form of CDTs is the key source of data to support answering complex questions, chatbots and dialogue management.
期刊介绍:
D&D seeks previously unpublished, high quality articles on the analysis of discourse and dialogue that contain -experimental and/or theoretical studies related to the construction, representation, and maintenance of (linguistic) context -linguistic analysis of phenomena characteristic of discourse and/or dialogue (including, but not limited to: reference and anaphora, presupposition and accommodation, topicality and salience, implicature, ---discourse structure and rhetorical relations, discourse markers and particles, the semantics and -pragmatics of dialogue acts, questions, imperatives, non-sentential utterances, intonation, and meta--communicative phenomena such as repair and grounding) -experimental and/or theoretical studies of agents'' information states and their dynamics in conversational interaction -new analytical frameworks that advance theoretical studies of discourse and dialogue -research on systems performing coreference resolution, discourse structure parsing, event and temporal -structure, and reference resolution in multimodal communication -experimental and/or theoretical results yielding new insight into non-linguistic interaction in -communication -work on natural language understanding (including spoken language understanding), dialogue management, -reasoning, and natural language generation (including text-to-speech) in dialogue systems -work related to the design and engineering of dialogue systems (including, but not limited to: -evaluation, usability design and testing, rapid application deployment, embodied agents, affect detection, -mixed-initiative, adaptation, and user modeling). -extremely well-written surveys of existing work. Highest priority is given to research reports that are specifically written for a multidisciplinary audience. The audience is primarily researchers on discourse and dialogue and its associated fields, including computer scientists, linguists, psychologists, philosophers, roboticists, sociologists.