Comparing speaker independent and speaker adapted classification for word prominence detection

2016 IEEE Spoken Language Technology Workshop (SLT) Pub Date : 2016-12-01 DOI:10.1109/SLT.2016.7846271

Andrea Schnall, M. Heckmann

Citations: 3

Abstract

Prosodic cues are an important part of human communication. One of these cues is word prominence, which is used, for example, to highlight important information. Because individual speakers express prominence in different ways, it is not easily extracted and incorporated into a dialog system. As a consequence, prominence has so far played only a marginal role in human-machine communication. In this paper, we compare deep neural networks (DNNs) and support vector machines (SVMs) trained speaker-independently against SVM classification using a speaker adaptation method we recently developed. This adaptation method is based on the radial basis function of the SVM with a Gaussian regularization, which is derived from feature-space maximum likelihood linear regression (fMLLR). With this adaptation, we can notably reduce the problem of speaker variations. We present detailed evaluations of the methods and discuss advantages and shortcomings of the proposed approaches for word prominence detection.

