{"title":"利用推文数据和权重扰动分析对字符级 CNN 进行性能评估","authors":"Kazuteru Miyazaki, Masaaki Ida","doi":"10.1007/s10015-024-00944-9","DOIUrl":null,"url":null,"abstract":"<div><p>Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data. CLCNN is used as a more versatile tool. For natural language recognition, after decomposing a sentence into character units, each unit is converted into a corresponding character code (e.g., Unicode values) and the code is input into the CLCNN network. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNN using tweet data. In particular, we focus on the effect of the number of units on performance using the following two types of data; one is a real and public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known collection of tests for multi-label problems. Results of experiments conducted by varying the number of units in the all-coupled layer confirmed the agreement of the results with the theorem introduced in the Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, in the NTCIR-13 MedWeb task, we analyze two kinds of experiments, the effects of kernel size and weight perturbation. The results of the difference in the kernel size suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate the possibility of relationship between the numbers of degrees of freedom and network parameters.</p></div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations\",\"authors\":\"Kazuteru Miyazaki, Masaaki Ida\",\"doi\":\"10.1007/s10015-024-00944-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data. CLCNN is used as a more versatile tool. For natural language recognition, after decomposing a sentence into character units, each unit is converted into a corresponding character code (e.g., Unicode values) and the code is input into the CLCNN network. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNN using tweet data. In particular, we focus on the effect of the number of units on performance using the following two types of data; one is a real and public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known collection of tests for multi-label problems. Results of experiments conducted by varying the number of units in the all-coupled layer confirmed the agreement of the results with the theorem introduced in the Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, in the NTCIR-13 MedWeb task, we analyze two kinds of experiments, the effects of kernel size and weight perturbation. The results of the difference in the kernel size suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate the possibility of relationship between the numbers of degrees of freedom and network parameters.</p></div>\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2024-04-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10015-024-00944-9\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s10015-024-00944-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
字符级卷积神经网络(CLCNN)通常用于对文本数据进行分类。CLCNN 是一种用途更为广泛的工具。在自然语言识别中,将句子分解为字符单元后,将每个单元转换为相应的字符代码(如 Unicode 值),然后将代码输入 CLCNN 网络。因此,句子可以像图像一样处理。此前,我们曾将 CLCNN 用于验证一所大学的文凭和/或课程政策是否编写得当。在本研究中,我们使用推文数据对 CLCNN 的有效性进行了实验验证。具体而言,我们使用以下两种数据重点研究了单元数对性能的影响:一种是关于手机声誉的真实公开推文数据集,另一种是 NTCIR-13 MedWeb 任务,该任务由伪推文数据组成,是众所周知的多标签问题测试集合。通过改变全耦合层中的单元数量进行的实验结果证实,实验结果与阿马里在其著作《阿马里在数学科学中的新发展:信息几何》(Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses.SAIENSU-SHA Co., 2014)。此外,在 NTCIR-13 MedWeb 任务中,我们分析了两种实验,即核大小和权重扰动的影响。内核大小差异的结果表明,句子理解存在最佳内核大小。对卷积层和池化层的扰动结果表明,自由度数与网络参数之间可能存在关系。
Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations
Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data. CLCNN is used as a more versatile tool. For natural language recognition, after decomposing a sentence into character units, each unit is converted into a corresponding character code (e.g., Unicode values) and the code is input into the CLCNN network. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNN using tweet data. In particular, we focus on the effect of the number of units on performance using the following two types of data; one is a real and public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known collection of tests for multi-label problems. Results of experiments conducted by varying the number of units in the all-coupled layer confirmed the agreement of the results with the theorem introduced in the Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, in the NTCIR-13 MedWeb task, we analyze two kinds of experiments, the effects of kernel size and weight perturbation. The results of the difference in the kernel size suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate the possibility of relationship between the numbers of degrees of freedom and network parameters.