Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations

Pub Date : 2024-04-05 DOI:10.1007/s10015-024-00944-9
Kazuteru Miyazaki, Masaaki Ida
{"title":"Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations","authors":"Kazuteru Miyazaki,&nbsp;Masaaki Ida","doi":"10.1007/s10015-024-00944-9","DOIUrl":null,"url":null,"abstract":"<div><p>Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data. CLCNN is used as a more versatile tool. For natural language recognition, after decomposing a sentence into character units, each unit is converted into a corresponding character code (e.g., Unicode values) and the code is input into the CLCNN network. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNN using tweet data. In particular, we focus on the effect of the number of units on performance using the following two types of data; one is a real and public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known collection of tests for multi-label problems. Results of experiments conducted by varying the number of units in the all-coupled layer confirmed the agreement of the results with the theorem introduced in the Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior &amp; Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, in the NTCIR-13 MedWeb task, we analyze two kinds of experiments, the effects of kernel size and weight perturbation. The results of the difference in the kernel size suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate the possibility of relationship between the numbers of degrees of freedom and network parameters.</p></div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s10015-024-00944-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data. CLCNN is used as a more versatile tool. For natural language recognition, after decomposing a sentence into character units, each unit is converted into a corresponding character code (e.g., Unicode values) and the code is input into the CLCNN network. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNN using tweet data. In particular, we focus on the effect of the number of units on performance using the following two types of data; one is a real and public tweet dataset on the reputation of a cell phone, and the other is the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known collection of tests for multi-label problems. Results of experiments conducted by varying the number of units in the all-coupled layer confirmed the agreement of the results with the theorem introduced in the Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, in the NTCIR-13 MedWeb task, we analyze two kinds of experiments, the effects of kernel size and weight perturbation. The results of the difference in the kernel size suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate the possibility of relationship between the numbers of degrees of freedom and network parameters.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
利用推文数据和权重扰动分析对字符级 CNN 进行性能评估
字符级卷积神经网络(CLCNN)通常用于对文本数据进行分类。CLCNN 是一种用途更为广泛的工具。在自然语言识别中,将句子分解为字符单元后,将每个单元转换为相应的字符代码(如 Unicode 值),然后将代码输入 CLCNN 网络。因此,句子可以像图像一样处理。此前,我们曾将 CLCNN 用于验证一所大学的文凭和/或课程政策是否编写得当。在本研究中,我们使用推文数据对 CLCNN 的有效性进行了实验验证。具体而言,我们使用以下两种数据重点研究了单元数对性能的影响:一种是关于手机声誉的真实公开推文数据集,另一种是 NTCIR-13 MedWeb 任务,该任务由伪推文数据组成,是众所周知的多标签问题测试集合。通过改变全耦合层中的单元数量进行的实验结果证实,实验结果与阿马里在其著作《阿马里在数学科学中的新发展:信息几何》(Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses.SAIENSU-SHA Co., 2014)。此外,在 NTCIR-13 MedWeb 任务中,我们分析了两种实验,即核大小和权重扰动的影响。内核大小差异的结果表明,句子理解存在最佳内核大小。对卷积层和池化层的扰动结果表明,自由度数与网络参数之间可能存在关系。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1