Research on Chinese-English Hybrid Rhetorical Question Recognition Model and Corpus Construction of Intelligent Web Text

Y. Zu
Published in: 2022 IEEE 4th Eurasia Conference on IOT, Communication and Engineering (ECICE), 2022-10-28
DOI: 10.1109/ECICE55674.2022.10042881 (https://doi.org/10.1109/ECICE55674.2022.10042881)

Abstract

With the growing popularity of English in China, rhetorical questions that mix Chinese and English have become a common form of expression. These mixed Chinese-English rhetorical questions carry rich emotional overtones, so identifying them correctly can improve the results of sentiment analysis and other tasks. Using semi-supervised learning and active learning methods, a semi-automatic collection procedure is proposed to construct a Chinese-English rhetorical question corpus from web text. Based on this corpus, the characteristics of mixed Chinese-English rhetorical questions are analyzed in terms of semantic features, positional features, and syntactic path features, and a rhetorical question recognition experiment is carried out. The experimental results show that a recognition model trained on the corpus constructed from online texts performs well, with accuracy, recall, and F1 values all above 90%. The results also verify the effectiveness of syntactic path features and positional features in identifying rhetorical questions.
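The abstract does not spell out how semi-supervised learning and active learning are combined, so the following is only a minimal sketch of one common arrangement, not the author's procedure. It assumes a classifier that returns a probability that a sentence is a rhetorical question: confident predictions are accepted as automatic labels (self-training), and the least certain sentences are sent to a human annotator (uncertainty sampling). The function name `split_pool`, the threshold `confident`, the annotation `budget`, and the toy scores are all hypothetical.

```python
from typing import Callable, List, Tuple


def split_pool(
    sentences: List[str],
    score: Callable[[str], float],
    confident: float = 0.9,
    budget: int = 2,
) -> Tuple[List[Tuple[str, bool]], List[str], List[str]]:
    """Partition an unlabeled pool into auto-labeled, to-annotate, and remaining.

    score(s) is assumed to be the model's probability that s is rhetorical.
    """
    auto: List[Tuple[str, bool]] = []
    uncertain: List[Tuple[str, float]] = []
    for s in sentences:
        p = score(s)
        if p >= confident:
            auto.append((s, True))        # pseudo-label: rhetorical
        elif p <= 1 - confident:
            auto.append((s, False))       # pseudo-label: not rhetorical
        else:
            uncertain.append((s, p))
    # Uncertainty sampling: probabilities closest to 0.5 are queried first.
    uncertain.sort(key=lambda sp: abs(sp[1] - 0.5))
    to_annotate = [s for s, _ in uncertain[:budget]]
    remaining = [s for s, _ in uncertain[budget:]]
    return auto, to_annotate, remaining


# Hypothetical scores standing in for a trained model's output.
toy_scores = {
    "难道这也算 good idea?": 0.95,
    "What time is the meeting?": 0.05,
    "你 really 觉得这样可以吗?": 0.55,
    "这不是很 obvious 吗?": 0.70,
    "Is this the right 地址?": 0.35,
}

auto, to_annotate, remaining = split_pool(list(toy_scores), toy_scores.get)
print("auto-labeled:", auto)
print("send to annotator:", to_annotate)
print("left in pool:", remaining)
```

In a full loop, the auto-labeled and human-labeled sentences would be added to the training set, the model retrained, and the remaining pool rescored; how many rounds to run and where to set the threshold are design choices the abstract leaves open.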