Structure-constrained distribution matching using quadratic programming and its application to pronunciation evaluation

Y. Qiao, Masayuki Suzuki, N. Minematsu, K. Hirose
The First Asian Conference on Pattern Recognition, published 2011-11-01
DOI: 10.1109/ACPR.2011.6166673 (https://doi.org/10.1109/ACPR.2011.6166673)
Citations: 0

Abstract

In previous work, we proposed a structural representation of speech that is robust to speaker differences because of its transformation-invariant property. There, we compared two speech structures by calculating the distance between two structural vectors, each composed of the lengths of a structure's edges. However, this distance cannot yield matching scores directly related to the individual events (nodes) of the two structures. Instead of comparing structural vectors directly, this paper uses structures as constraints for optimal pattern matching. We derive the objective functions and constraint functions for the optimization. Under the assumptions of Gaussian distributions and shared covariance matrices, we show that this optimization problem can be reduced to a quadratically constrained quadratic programming problem. To alleviate the problem of overly strong invariance, we use a subspace decomposition method and perform the optimization in each subspace. We evaluate the proposed method on the task of assessing the quality of students' English pronunciation. Experimental results show that the proposed method achieves higher correlations with teachers' manual scores than the methods it is compared against.
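The baseline comparison of structural vectors described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the edge-length measure, so the Bhattacharyya distance between Gaussians with a shared covariance is assumed here, and the function names, dimensions, and random data are all hypothetical. The sketch only shows why such a structure is transformation-invariant: edge lengths do not change when every event undergoes the same invertible affine map.

```python
import numpy as np

def bhattacharyya_shared_cov(mu_a, mu_b, cov):
    """Bhattacharyya distance between two Gaussians that share one covariance.

    With a shared covariance the log-determinant term vanishes, leaving
    (1/8) * (mu_a - mu_b)^T cov^{-1} (mu_a - mu_b).
    """
    diff = mu_a - mu_b
    return 0.125 * diff @ np.linalg.solve(cov, diff)

def structure_vector(means, cov):
    """Edge lengths between every pair of events (nodes), in a fixed order."""
    n = len(means)
    return np.array([
        bhattacharyya_shared_cov(means[i], means[j], cov)
        for i in range(n) for j in range(i + 1, n)
    ])

def structure_distance(edges_a, edges_b):
    """Root-mean-square difference between two structural vectors (assumed metric)."""
    return float(np.sqrt(np.mean((edges_a - edges_b) ** 2)))

# Hypothetical data: 4 events in a 3-dimensional feature space.
rng = np.random.default_rng(0)
dim, n_events = 3, 4
means = rng.normal(size=(n_events, dim))
root = rng.normal(size=(dim, dim))
cov = root @ root.T + np.eye(dim)  # symmetric positive definite

# Apply one invertible affine map x -> A x + b to every event,
# standing in for a speaker difference.
A = rng.normal(size=(dim, dim)) + 3.0 * np.eye(dim)
b = rng.normal(size=dim)
means_t = means @ A.T + b
cov_t = A @ cov @ A.T

edges = structure_vector(means, cov)
edges_t = structure_vector(means_t, cov_t)
invariance_gap = structure_distance(edges, edges_t)
print(invariance_gap)  # numerically close to zero
```

Because the two structural vectors coincide, this distance says nothing about how well any individual node matches, which is the limitation that motivates using the structure as a constraint rather than as the quantity being compared.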