Authors: R. Prasad, M. Walker
Venue: SIGDIAL Workshop
Published: 2002-07-11
DOI: https://doi.org/10.3115/1118121.1118142
Citations: 28
Training a Dialogue Act Tagger for Human-Human and Human-Computer Travel Dialogues
While dialogue acts provide a useful schema for characterizing dialogue behaviors in human-computer and human-human dialogues, their utility is limited by the huge effort involved in hand-labelling dialogues with a dialogue act labelling scheme. In this work, we examine whether it is possible to fully automate the tagging task with the goal of enabling rapid creation of corpora for evaluating spoken dialogue systems and comparing them to human-human dialogues. We report results for training and testing an automatic classifier to label the information provider's utterances in spoken human-computer and human-human dialogues with DATE (Dialogue Act Tagging for Evaluation) dialogue act tags. We train and test the DATE tagger on various combinations of the DARPA Communicator June-2000 and October-2001 human-computer corpora, and the CMU human-human corpus in the travel planning domain. Our results show that we can achieve high accuracies on the human-computer data, and surprisingly, that the human-computer data improves accuracy on the human-human data, when only small amounts of human-human training data are available.
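
The experimental setup described above (train a tagger on one corpus or on a combination of corpora, then measure accuracy on held-out utterances) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the learning algorithm or features, so a small bag-of-words naive Bayes classifier stands in for the tagger; the tag names `request_info` and `present_info` are placeholders rather than the actual DATE tag set; and the utterances are invented toy data, not drawn from the Communicator or CMU corpora.

```python
import math
from collections import Counter, defaultdict


def tokenize(utterance):
    """Lowercase and split on whitespace (an assumed, minimal feature set)."""
    return utterance.lower().split()


def train(examples):
    """Fit a multinomial naive Bayes model on (utterance, tag) pairs."""
    tag_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for utterance, tag in examples:
        tag_counts[tag] += 1
        for word in tokenize(utterance):
            word_counts[tag][word] += 1
            vocab.add(word)
    return tag_counts, word_counts, vocab


def predict(model, utterance):
    """Return the tag with the highest add-one-smoothed log probability."""
    tag_counts, word_counts, vocab = model
    total = sum(tag_counts.values())
    best_tag, best_score = None, -math.inf
    for tag in sorted(tag_counts):  # sorted for deterministic tie-breaking
        score = math.log(tag_counts[tag] / total)
        denom = sum(word_counts[tag].values()) + len(vocab)
        for word in tokenize(utterance):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[tag][word] + 1) / denom)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


def accuracy(model, examples):
    """Fraction of examples whose predicted tag matches the gold tag."""
    correct = sum(predict(model, u) == t for u, t in examples)
    return correct / len(examples)


# Hypothetical toy data standing in for the real corpora.
human_computer = [
    ("what city are you leaving from", "request_info"),
    ("what date would you like to travel", "request_info"),
    ("i found a flight leaving at nine", "present_info"),
    ("the fare is two hundred dollars", "present_info"),
]
human_human = [
    ("where are you leaving from", "request_info"),
    ("there is a flight at noon", "present_info"),
]
held_out = [
    ("what time would you like to leave", "request_info"),
    ("the flight is at ten", "present_info"),
]

# Compare a tagger trained on the small human-human set alone
# with one trained on human-computer plus human-human data.
hh_only = train(human_human)
combined = train(human_computer + human_human)
print("human-human only:", accuracy(hh_only, held_out))
print("combined:", accuracy(combined, held_out))
```

The toy data is far too small to reproduce or test the paper's finding. The sketch only shows the shape of the comparison: the same held-out set is scored once with a model trained on the small human-human sample and once with a model that also sees human-computer data.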