Purrai: A Deep Neural Network based Approach to Interpret Domestic Cat Language
Weilin Sun, V. Lu, Aaron Truong, Hermione Bossolina, Yuan Lu
2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 622-627. Published 2021-12-01.
DOI: 10.1109/ICMLA52953.2021.00104 (https://doi.org/10.1109/ICMLA52953.2021.00104)
Abstract
Understanding and communicating with domestic cats has long fascinated humans, yet it is considered a difficult task even for phonetics experts. In this paper, we present our approach to this problem: Purrai, a neural-network-based machine learning platform for interpreting cat language. Our framework consists of two parts. First, we build a comprehensive cat voice dataset that is 3.7x larger than any existing publicly available dataset [1]. To improve accuracy, we use several techniques to ensure labeling quality, including rule-based labeling, cross-validation, cosine distance, and outlier detection. Second, we design a two-stage neural network to interpret what cats express through sequences of multiple sounds, which we call sentences. The first stage is a modification of Google's VGGish architecture [2] [3], a Convolutional Neural Network (CNN), and classifies nine primary cat sounds. The second stage takes the probability outputs of a sequence of first-stage sound classifications and determines the emotional meaning of a cat sentence. Our first-stage architecture achieves top-1 and top-2 accuracies of 74.1% and 92.1%, respectively, improving on the state-of-the-art approach (64.9% and 83.4%) [4]. Our sentence-based model achieves an accuracy of 81.1% for emotion prediction.
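The top-1 and top-2 figures quoted for the first stage can be read as follows: a prediction counts as a top-k hit when the true sound class is among the k classes with the highest predicted probability. The sketch below is a minimal illustration of that metric in plain Python, not the authors' evaluation code. The function name `top_k_accuracy` and the toy probabilities are hypothetical, and the toy data uses three classes rather than the nine sound classes described in the abstract.

```python
def top_k_accuracy(probabilities, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for scores, true_label in zip(probabilities, labels):
        # Rank class indices from highest to lowest predicted probability.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy data: 4 sounds, 3 classes (illustration only).
toy_probs = [
    [0.7, 0.2, 0.1],  # true class 0: hit at k=1
    [0.3, 0.5, 0.2],  # true class 0: hit only at k=2
    [0.1, 0.3, 0.6],  # true class 2: hit at k=1
    [0.5, 0.1, 0.4],  # true class 1: miss at k=1 and k=2
]
toy_labels = [0, 0, 2, 1]

top1 = top_k_accuracy(toy_probs, toy_labels, k=1)
top2 = top_k_accuracy(toy_probs, toy_labels, k=2)
print(top1, top2)  # prints "0.5 0.75"
```

Top-2 accuracy is always at least as high as top-1 on the same predictions, since every top-1 hit is also a top-2 hit. This is consistent with the reported 92.1% versus 74.1%.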