Recognising and retrieving the meaning of Thirukkural from speech utterances
Authors: B. Bharathi, G. Sridevi, G. J. Varshitha
Published in: 2017 Fourth International Conference on Signal Processing, Communication and Networking (ICSCN), 2017-03-01
DOI: 10.1109/ICSCN.2017.8085704 (https://doi.org/10.1109/ICSCN.2017.8085704)
Citations: 0
Abstract
Tamil is one of the oldest languages in the world and has a rich body of literature. Tamil Nadu in India and Sri Lanka both have large indigenous Tamil-speaking populations. The Thirukkural is a classical work of Tamil Sangam literature, written by the renowned Tamil poet Thiruvalluvar. A kural is a very short Tamil poetic form consisting of two lines. The Thirukkural carries many important messages about the moral and ethical values everyone should follow. Until now, speech recognition has not been applied to this literature. This paper proposes a system that recognizes a spoken kural and retrieves its meaning. The system extracts mel-frequency cepstral coefficient (MFCC) feature vectors from the input speech (a kural) and builds acoustic models using Gaussian mixture models (GMMs). This speaker-independent system aims to convert the spoken kural into text and display its meaning in Tamil along with the chapter number, chapter name, and kural number. It also synthesizes the meaning as speech (text to speech). This lets students and visually challenged people learn the Thirukkural interactively, and should help encourage more people to take an interest in it. Experiments used a corpus of 2000 kural samples collected from 25 speakers for one chapter of the Thirukkural, “Seynandri Aridhal”. The system was evaluated by building models with different numbers of mixture components and retrieving the meaning. Models with 128 mixture components gave 100% accurate results.
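The recognition step described in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per kural and a maximum average log-likelihood decision rule; the abstract does not state either detail, so both are assumptions. The 2-D feature vectors, the 2-component models, the labels kural_A and kural_B, and the placeholder meanings are all hypothetical stand-ins for real MFCC frames, trained models of up to 128 components, and the actual text.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, gmm):
    """Log density of a GMM at x; gmm is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(log_gmm(f, gmm) for f in frames) / len(frames)

def recognise(frames, models):
    """Return the label of the model that best explains the frames."""
    return max(models, key=lambda label: avg_log_likelihood(frames, models[label]))

# Hypothetical toy models: two components each, 2-D features (not real MFCCs).
MODELS = {
    "kural_A": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "kural_B": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])],
}

# Hypothetical lookup table standing in for the stored meanings.
MEANINGS = {
    "kural_A": "<meaning of kural A>",
    "kural_B": "<meaning of kural B>",
}

if __name__ == "__main__":
    utterance = [[0.2, 0.1], [0.9, 1.1], [0.5, 0.4]]  # toy frames
    label = recognise(utterance, MODELS)
    print(label, MEANINGS[label])
```

In a full system the toy frames would be replaced by MFCC vectors extracted from the recorded kural, and the model parameters would be estimated from the training corpus rather than written by hand.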