Recognising and retrieving the meaning of Thirukkural from speech utterances

B. Bharathi, G. Sridevi, G. J. Varshitha
Published in: 2017 Fourth International Conference on Signal Processing, Communication and Networking (ICSCN)
Publication date: 2017-03-01
DOI: 10.1109/ICSCN.2017.8085704 (https://doi.org/10.1109/ICSCN.2017.8085704)

Abstract

Tamil is one of the world's oldest languages and has a rich body of literature. The Indian state of Tamil Nadu and Sri Lanka both have large populations of native Tamil speakers. The Thirukkural is a classical work of Tamil Sangam literature written by the famous Tamil poet Thiruvalluvar. A kural is a very short Tamil poetic form consisting of two lines. The Thirukkural carries many important messages about the moral and ethical values that everyone should follow. Up to now, speech recognition has not been applied to this work. This paper proposes a system that recognizes a kural from a speech utterance and retrieves its meaning. It does so by extracting MFCC (mel-frequency cepstral coefficient) feature vectors from the input speech (the kural) and building acoustic models with Gaussian Mixture Models (GMMs). This speaker-independent system aims to convert the spoken kural into text and display its meaning in Tamil along with the chapter number, chapter name, and kural number. The system also synthesizes the meaning as speech (text-to-speech). This should help students and visually challenged people learn the Thirukkural interactively, and encourage more people to take an interest in it. For the experiments, a corpus of 2000 kural samples was collected from 25 people for one chapter of the Thirukkural, "Seynandri Aridhal". The system was evaluated by building models with different numbers of mixture components and retrieving the meaning. Models with 128 mixture components gave 100% accurate results.
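The abstract gives no implementation details, so the following is only a minimal sketch of how GMM-based recognition of this kind can work, not the authors' code. It assumes one diagonal-covariance GMM per kural, trained with EM, and picks the kural whose model gives the highest average per-frame log-likelihood. Synthetic 3-dimensional vectors stand in for real MFCC frames, and the function names, labels, and the component count (2 rather than the 128 reported above) are all hypothetical choices made for illustration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def component_logs(x, gmm):
    """Per-component log(weight * density) for one frame."""
    weights, means, variances = gmm
    return [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def train_gmm(frames, n_components, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise means from randomly chosen frames, with unit variances.
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, (weights, means, variances))
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / len(frames)
            means[k] = [
                sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                for d in range(dim)
            ]
            variances[k] = [
                max(var_floor,
                    sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, frames)) / nk)
                for d in range(dim)
            ]
    return weights, means, variances


def classify(frames, models):
    """Return the label whose GMM gives the highest mean log-likelihood."""
    scores = {
        label: sum(logsumexp(component_logs(x, gmm)) for x in frames) / len(frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get)


def synthetic_frames(center, n, rng, spread=0.5):
    """Toy stand-in for MFCC frames: Gaussian noise around a center."""
    return [[rng.gauss(c, spread) for c in center] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical labels; real models would be trained on recorded kurals.
    models = {
        "kural_A": train_gmm(synthetic_frames([0.0, 0.0, 0.0], 100, rng), 2),
        "kural_B": train_gmm(synthetic_frames([3.0, 3.0, 3.0], 100, rng), 2),
    }
    utterance = synthetic_frames([3.0, 3.0, 3.0], 30, rng)
    print(classify(utterance, models))
```

In a real system the recognized label would then index a table holding the kural text, its meaning in Tamil, and the chapter details, and the toy frames would be replaced by MFCC vectors computed from audio.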