Purrai: A Deep Neural Network based Approach to Interpret Domestic Cat Language

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). Pub Date: 2021-12-01. DOI: 10.1109/ICMLA52953.2021.00104

Weilin Sun, V. Lu, Aaron Truong, Hermione Bossolina, Yuan Lu

Pages: 622-627


Abstract



Being able to understand and communicate with domestic cats has always been fascinating to humans, although it is considered a difficult task even for phonetics experts. In this paper, we present our approach to this problem: Purrai, a neural-network-based machine learning platform to interpret cat language. Our framework consists of two parts. First, we build a comprehensively constructed cat voice dataset that is 3.7x larger than any existing publicly available dataset [1]. To improve accuracy, we also use several techniques to ensure labeling quality, including rule-based labeling, cross-validation, cosine distance, and outlier detection. Second, we design a two-stage neural network structure to interpret what cats express in the context of multiple sounds, called sentences. The first stage is a modification of Google's VGGish architecture [2], [3], a Convolutional Neural Network (CNN) architecture, and focuses on the classification of nine primary cat sounds. The second stage takes the probability outputs of a sequence of sound classifications from the first stage and determines the emotional meaning of a cat sentence. Our first-stage architecture achieves top-1 and top-2 accuracies of 74.1% and 92.1%, better than those of the state-of-the-art approach: 64.9% and 83.4% [4]. Our sentence-based AI model achieves an accuracy of 81.1% for emotion prediction.
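The top-1 and top-2 figures quoted for the first stage are top-k accuracies: a clip counts as correct when its true sound class is among the k classes the classifier scores highest. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `top_k_accuracy`, the three-class layout, and the toy scores are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence


def top_k_accuracy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Return the fraction of samples whose true label is among the
    k classes with the highest predicted probability."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(probs, labels):
        # Class indices ordered from highest to lowest score, cut to k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in ranked:
            hits += 1
    return hits / len(labels)


# Toy scores for three clips over three hypothetical sound classes
# (made-up numbers, not data from the paper). The true class is 0 each time.
toy_probs = [
    [0.70, 0.20, 0.10],  # class 0 ranked first: hit at k=1 and k=2
    [0.35, 0.40, 0.25],  # class 0 ranked second: hit only at k=2
    [0.10, 0.30, 0.60],  # class 0 ranked third: miss at k=1 and k=2
]
toy_labels = [0, 0, 0]

top1 = top_k_accuracy(toy_probs, toy_labels, k=1)
top2 = top_k_accuracy(toy_probs, toy_labels, k=2)
print(f"top-1 = {top1:.3f}, top-2 = {top2:.3f}")
```

Top-2 accuracy can never be lower than top-1 on the same predictions, which is consistent with the gap between the two figures reported in the abstract.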
