Semi-autonomous data enrichment based on cross-task labelling of missing targets for holistic speech analysis

Yue Zhang, Yuxiang Zhou, Jie Shen, Björn Schuller
{"title":"基于缺失目标跨任务标记的半自主数据充实,用于整体语音分析","authors":"Yue Zhang, Yuxiang Zhou, Jie Shen, Björn Schuller","doi":"10.1109/ICASSP.2016.7472847","DOIUrl":null,"url":null,"abstract":"In this work, we propose a novel approach for large-scale data enrichment, with the aim to address a major shortcoming of current research in computational paralinguistics, namely, looking at speaker attributes in isolation although strong interdependencies between them exist. The scarcity of multi-target databases, in which instances are labelled for different kinds of speaker characteristics, compounds this problem. The core idea of our work is to join existing data resources into one single holistic database with a multi-dimensional label space by using semi-supervised learning techniques to predict missing labels. In the proposed new Cross-Task Labelling (CTL) method, a model is first trained on the labelled training set of the selected databases for each individual task. Then, the trained classifiers are used for the crosslabelling of databases among each other. To exemplify the effectiveness of the `CTL' method, we evaluated it for likability, personality, and emotion recognition as representative tasks from the INTERSPEECH Computational Paralinguistics ChallengE (ComParE) series. The results show that `CTL' lays the foundation for holistic speech analysis by semi-autonomously annotating the existing databases, and expanding the multi-target label space at the same time, while achieving higher accuracy as the baseline performance of the challenges.","PeriodicalId":165321,"journal":{"name":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"139 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Semi-autonomous data enrichment based on cross-task labelling of missing targets for holistic speech analysis\",\"authors\":\"Yue Zhang, Yuxiang Zhou, Jie Shen, Björn Schuller\",\"doi\":\"10.1109/ICASSP.2016.7472847\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this work, we propose a novel approach for large-scale data enrichment, with the aim to address a major shortcoming of current research in computational paralinguistics, namely, looking at speaker attributes in isolation although strong interdependencies between them exist. The scarcity of multi-target databases, in which instances are labelled for different kinds of speaker characteristics, compounds this problem. The core idea of our work is to join existing data resources into one single holistic database with a multi-dimensional label space by using semi-supervised learning techniques to predict missing labels. In the proposed new Cross-Task Labelling (CTL) method, a model is first trained on the labelled training set of the selected databases for each individual task. Then, the trained classifiers are used for the crosslabelling of databases among each other. To exemplify the effectiveness of the `CTL' method, we evaluated it for likability, personality, and emotion recognition as representative tasks from the INTERSPEECH Computational Paralinguistics ChallengE (ComParE) series. 
The results show that `CTL' lays the foundation for holistic speech analysis by semi-autonomously annotating the existing databases, and expanding the multi-target label space at the same time, while achieving higher accuracy as the baseline performance of the challenges.\",\"PeriodicalId\":165321,\"journal\":{\"name\":\"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"volume\":\"139 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-03-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.2016.7472847\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2016.7472847","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

Abstract

In this work, we propose a novel approach for large-scale data enrichment, with the aim of addressing a major shortcoming of current research in computational paralinguistics, namely, looking at speaker attributes in isolation although strong interdependencies exist between them. The scarcity of multi-target databases, in which instances are labelled for different kinds of speaker characteristics, compounds this problem. The core idea of our work is to join existing data resources into one single holistic database with a multi-dimensional label space, using semi-supervised learning techniques to predict missing labels. In the proposed Cross-Task Labelling (CTL) method, a model is first trained on the labelled training set of the selected database for each individual task. Then, the trained classifiers are used to cross-label the databases among each other. To exemplify the effectiveness of the CTL method, we evaluated it on likability, personality, and emotion recognition as representative tasks from the INTERSPEECH Computational Paralinguistics ChallengE (ComParE) series. The results show that CTL lays the foundation for holistic speech analysis by semi-autonomously annotating the existing databases and expanding the multi-target label space at the same time, while achieving higher accuracy than the baseline performance of the challenges.
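
The abstract describes CTL in two steps: train one classifier per task on its own labelled corpus, then apply each classifier to the other corpora to predict the labels they are missing. The following is a minimal sketch of that idea, not the authors' implementation: the corpus names, the random placeholder features, and the choice of a linear SVM pipeline are illustrative assumptions.

```python
# Sketch of Cross-Task Labelling (CTL): per-task training, then cross-corpus
# label prediction to build a multi-target label space over the joined data.
# Feature extraction and classifier choice are assumptions for illustration.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Hypothetical corpora: each provides feature vectors X and gold labels y
# for its own task only (placeholder random data stands in for real features).
corpora = {
    "likability":  {"X": np.random.randn(200, 130), "y": np.random.randint(0, 2, 200)},
    "personality": {"X": np.random.randn(300, 130), "y": np.random.randint(0, 2, 300)},
    "emotion":     {"X": np.random.randn(250, 130), "y": np.random.randint(0, 2, 250)},
}

# Step 1: train one model per task on that task's labelled training set.
models = {
    task: make_pipeline(StandardScaler(), LinearSVC()).fit(data["X"], data["y"])
    for task, data in corpora.items()
}

# Step 2: cross-label -- every corpus receives predicted labels for the tasks
# it was not originally annotated for, expanding the multi-target label space.
enriched = {}
for target_corpus, data in corpora.items():
    labels = {target_corpus: data["y"]}              # original gold labels
    for task, model in models.items():
        if task != target_corpus:
            labels[task] = model.predict(data["X"])  # semi-autonomous labels
    enriched[target_corpus] = labels
```

In this sketch, `enriched` holds, for every corpus, its original gold labels plus predicted labels for the remaining tasks, which is the joined multi-dimensional label space the abstract refers to.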