Evaluating target utterance identification method using practical free conversation

Naoto Kosaka, Yumi Wakita
Published in: 2020 IEEE 2nd International Conference on Artificial Intelligence in Engineering and Technology (IICAIET), 2020-09-26
DOI: 10.1109/IICAIET49801.2020.9257852 (https://doi.org/10.1109/IICAIET49801.2020.9257852)

Abstract

We are developing a conversation support system for public community spaces. Our premise is that supporting elderly people's active lives by assisting human-to-human conversation is more effective than providing a speech dialogue system. To use a conversation support system in an actual restaurant or lounge environment, the conversation of the target speakers near the microphone must be separated from the ambient noise. We previously proposed a method for distinguishing utterances spoken near a microphone from those spoken far from it, using the standard deviation of the fundamental frequency (SD-F0) and the standard deviation of the speech power level (SD-SP) of each utterance. In this paper, we evaluate the effectiveness of this identification method on actual free conversation using a Support Vector Machine (SVM). The precision for utterances near the microphone is 87.8%. This suggests that identification based on the standard deviations of the fundamental frequency and speech power can be effective in real environments. However, performance depends on utterance length, on the stability of the F0 values in the portion of the utterance that exceeds the threshold, and on the position of the microphones. In future work, the method should be evaluated with more speakers and more varied situations in order to define a suitable system specification.
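
As an illustration of the two per-utterance features and the reported metric, the following is a minimal sketch, not the authors' implementation. It assumes that F0 and power have already been estimated frame by frame, that unvoiced frames are marked with an F0 of 0, and that the population standard deviation is used; the abstract specifies none of these details. The function names `utterance_features` and `precision` are hypothetical. The SVM itself is omitted; the resulting (SD-F0, SD-SP) pairs would be the input to any standard SVM classifier.

```python
from statistics import pstdev


def utterance_features(f0_hz, power_db):
    """Return (SD-F0, SD-SP) for a single utterance.

    f0_hz:    per-frame F0 estimates in Hz; 0 marks an unvoiced frame
              (an assumed convention, not taken from the paper).
    power_db: per-frame speech power levels in dB.
    """
    # Only voiced frames carry a meaningful F0 value.
    voiced = [f for f in f0_hz if f]
    # Population standard deviation is an assumption; 0.0 is returned
    # when there are no frames to measure.
    sd_f0 = pstdev(voiced) if voiced else 0.0
    sd_sp = pstdev(power_db) if power_db else 0.0
    return sd_f0, sd_sp


def precision(y_true, y_pred, positive=1):
    """Fraction of utterances predicted as `positive` that truly are.

    Here `positive` would stand for "spoken near the microphone".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

For example, `utterance_features([100.0, 0.0, 120.0, 110.0], [60.0, 62.0, 64.0])` ignores the unvoiced frame and returns the spread of the three voiced F0 values together with the spread of the three power values.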