Variational Autoencoding Dialogue Sub-Structures Using a Novel Hierarchical Annotation Schema
Maitreyee Tewari, Michele Persiani
2020 6th IEEE Congress on Information Science and Technology (CiSt), published 2020-06-05
DOI: https://doi.org/10.1109/CiSt49399.2021.9357245
This work presents a novel method for extracting sub-structures from dialogues in four genres: human-human task-driven, human-human chit-chat, human-machine task-driven, and human-machine chit-chat. The model is built on a novel semi-supervised annotation schema that covers syntactic features, communicative functions, dialogue policy, sequence expansion, and sender information. The resulting labels are transformed into tuples of three, four, and five segments, which serve as features for sequence-to-sequence variational autoencoders that learn the sub-structures of these dialogue genres. The results analyse the latent space of generic sub-structures after decomposition with PCA and ICA and show an increase in silhouette scores for clustering of the latent space.
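The label-to-tuple step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which annotation layers make up the three- and four-segment tuples, so the layer order in `LAYERS`, the helper names `to_tuples` and `encode`, and all label values in the example dialogue are assumptions made for illustration only, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical layer order: the abstract names these five annotation layers
# but does not specify which ones form the three- and four-segment tuples.
LAYERS = [
    "sender",
    "communicative_function",
    "dialogue_policy",
    "sequence_expansion",
    "syntactic_feature",
]


def to_tuples(dialogue: List[Dict[str, str]], size: int) -> List[Tuple[str, ...]]:
    """Map each annotated utterance to a tuple of its first `size` layer labels."""
    if not 3 <= size <= len(LAYERS):
        raise ValueError("size must be 3, 4 or 5")
    return [tuple(utt[layer] for layer in LAYERS[:size]) for utt in dialogue]


def encode(tuples: List[Tuple[str, ...]]) -> Tuple[List[int], Dict[Tuple[str, ...], int]]:
    """Assign each distinct tuple an integer id, in order of first appearance."""
    vocab: Dict[Tuple[str, ...], int] = {}
    ids = [vocab.setdefault(t, len(vocab)) for t in tuples]
    return ids, vocab


# Invented example annotations (not taken from the paper's data).
dialogue = [
    {"sender": "user", "communicative_function": "question",
     "dialogue_policy": "request", "sequence_expansion": "first_pair_part",
     "syntactic_feature": "interrogative"},
    {"sender": "system", "communicative_function": "answer",
     "dialogue_policy": "inform", "sequence_expansion": "second_pair_part",
     "syntactic_feature": "declarative"},
    {"sender": "user", "communicative_function": "question",
     "dialogue_policy": "request", "sequence_expansion": "first_pair_part",
     "syntactic_feature": "interrogative"},
]

three_segment = to_tuples(dialogue, 3)
five_segment_ids, five_segment_vocab = encode(to_tuples(dialogue, 5))
```

The integer id sequence produced by `encode` is the kind of input a sequence-to-sequence model could consume; the autoencoder itself and the PCA/ICA analysis of its latent space are outside the scope of this sketch.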