Beyond Adjacency Pairs: Hierarchical Clustering of Long Sequences for Human-Machine Dialogues

M. Maitreyee
Proceedings of the First Workshop on Computational Approaches to Discourse
DOI: https://doi.org/10.18653/v1/2020.codi-1.2
Citations: 1

Abstract

This work proposes a framework to predict sequences in dialogues, using turn-based syntactic features and dialogue control functions. Syntactic features were extracted using dependency parsing, while dialogue control functions were manually labelled. These features were transformed using tf-idf and word embeddings; feature selection was done using Principal Component Analysis (PCA). We ran experiments on six combinations of features to predict sequences with Hierarchical Agglomerative Clustering. An analysis of the clustering results indicates that using word embeddings and syntactic features significantly improved the results.
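
The pipeline described in the abstract (tf-idf transformation, PCA, then Hierarchical Agglomerative Clustering) can be illustrated with a minimal sketch. This is not the author's implementation: it assumes scikit-learn, Ward linkage, two principal components, and two clusters, none of which are stated in the abstract. The toy turns, the tag names such as `dcf_turn_take`, and the helper `cluster_turns` are hypothetical, and the word-embedding variant is omitted.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy turns (not data from the paper): each string joins
# dependency labels for a turn with one dialogue control function tag.
turns = [
    "nsubj root dobj dcf_turn_take",
    "nsubj root dobj dcf_turn_take",
    "nsubj root dobj det dcf_turn_take",
    "intj root dcf_feedback",
    "intj root dcf_feedback",
    "intj root punct dcf_feedback",
]


def cluster_turns(texts, n_components=2, n_clusters=2):
    """Return one cluster label per turn (illustrative helper)."""
    # Treat every whitespace-separated label as a token and weight it by tf-idf.
    tfidf = TfidfVectorizer(token_pattern=r"\S+").fit_transform(texts)
    # PCA needs a dense array, so the sparse tf-idf matrix is converted first.
    reduced = PCA(n_components=n_components).fit_transform(tfidf.toarray())
    # Agglomerative clustering; Ward linkage is an assumption made here.
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    return model.fit_predict(reduced)


labels = cluster_turns(turns)
print(labels)
```

On this toy input, the first three turns and the last three turns share most of their labels, so they should fall into two separate clusters. Which numeric label each cluster receives is arbitrary.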