Recognising and retrieving the meaning of Thirukkural from speech utterances

2017 Fourth International Conference on Signal Processing, Communication and Networking (ICSCN). Pub Date: 2017-03-01. DOI: 10.1109/ICSCN.2017.8085704

B. Bharathi, G. Sridevi, G. J. Varshitha

Abstract

Tamil is one of the oldest languages in the world and has a rich body of literature. Tamil Nadu in India, and Sri Lanka, have large populations of native Tamil speakers. The Thirukkural is a classic work of Tamil Sangam literature written by the poet Thiruvalluvar. A kural is a very short Tamil poetic form consisting of two lines. The Thirukkural conveys many important messages about the moral and ethical values everyone should follow. Until now, speech recognition has not been applied to this literature. This paper proposes a system that recognizes a Thirukkural from a speech utterance and retrieves its meaning. It does so by extracting MFCC feature vectors from the input speech (the kural) and building acoustic models with Gaussian Mixture Models (GMMs). This speaker-independent system aims to convert the spoken Thirukkural into text and display its meaning in Tamil along with the chapter number, chapter name and kural number. The system also synthesizes the meaning of the Thirukkural as speech (text to speech). This will help students and visually challenged people learn the Thirukkural interactively, and should encourage more people to take an interest in it. Experiments used a corpus of 2000 kural samples collected from 25 speakers for one chapter of the Thirukkural, “Seynandri Aridhal”. The system was evaluated by building models with different numbers of mixture components and retrieving the meaning. The models with 128 mixture components gave 100% accurate results.
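The recognition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one diagonal-covariance GMM per kural, trained by EM, and a maximum-likelihood decision over the models, none of which the abstract specifies. MFCC extraction is not shown; synthetic 13-dimensional frames stand in for MFCC vectors, and the labels `kural_A` and `kural_B`, the function names, and the four-component models are hypothetical (the paper reports its best results with 128 components).

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def fit_gmm(X, n_components, n_iter=30, seed=0):
    """Fit a diagonal-covariance GMM to frames X (N, D) with plain EM."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_p = log_gauss_diag(X, means, variances) + np.log(weights)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = resp.T @ X / nk[:, None]
        variances = np.maximum(resp.T @ X ** 2 / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    weights, means, variances = gmm
    log_p = log_gauss_diag(X, means, variances) + np.log(weights)
    return float(np.mean(logsumexp(log_p, axis=1)))

def recognise(X, models):
    """Return the label whose GMM scores the utterance highest."""
    return max(models, key=lambda label: avg_log_likelihood(X, models[label]))

# Synthetic stand-ins for MFCC frames (assumption: 13 coefficients per frame).
rng = np.random.default_rng(42)
train = {
    "kural_A": rng.normal(0.0, 1.0, size=(400, 13)),
    "kural_B": rng.normal(2.0, 1.0, size=(400, 13)),
}
models = {label: fit_gmm(frames, n_components=4) for label, frames in train.items()}

test_utterance_a = rng.normal(0.0, 1.0, size=(100, 13))
test_utterance_b = rng.normal(2.0, 1.0, size=(100, 13))
print(recognise(test_utterance_a, models), recognise(test_utterance_b, models))
```

In a full system, the recognised label would then index a lookup table holding the chapter number, chapter name, kural number and Tamil meaning, which is the retrieval step the abstract describes.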


