Construction of Narrative Text Component Recognition Corpus

Feng Zhang, Yingqi Han, Jiong Wang, Jie Liu
Published in: 2022 IEEE 5th International Conference on Computer and Communication Engineering Technology (CCET)
Publication date: 2022-08-19
DOI: 10.1109/CCET55412.2022.9906339 (https://doi.org/10.1109/CCET55412.2022.9906339)

Abstract

Textual structure analysis is an important part of Automated Essay Scoring (AES) and an important research direction in Natural Language Processing. Research on narrative text structure in China remains underdeveloped, and one of the main reasons is the lack of data available for research. To address this problem, this paper proposes and constructs a corpus for identifying the textual components of narrative essays. The paper divides the text structure of narrative essays into components and, on that basis, annotates 3,024 articles containing 21,128 sentences in total. The corpus is built by combining manual annotation with automatic annotation by a model, and the paper reports a statistical analysis of the distribution of the corpus content and of the consistency of the annotation. In the experiment, text component recognition achieves an F1 score of 80.75%. The work provides basic data for AES research.
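The abstract reports two quantities without saying how they were computed: the consistency of the annotation and an F1 score for component recognition. The sketch below shows one common way such figures are obtained for sentence-level labels, assuming Cohen's kappa for agreement between two annotators and macro-averaged F1 for recognition. Both choices are assumptions, as are the label names ("opening", "event", "ending"), which are hypothetical and not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' labels for the same sentences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of sentences given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    all_labels = set(labels_a) | set(labels_b)
    expected = sum(counts_a[l] * counts_b[l] for l in all_labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("label sequences must be non-empty and equal in length")
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical component labels for four sentences of one essay.
annotator_a = ["opening", "event", "event", "ending"]
annotator_b = ["opening", "event", "ending", "ending"]

print(f"kappa = {cohen_kappa(annotator_a, annotator_b):.4f}")
print(f"macro-F1 = {macro_f1(annotator_a, annotator_b):.4f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when a few component types dominate the corpus. Macro averaging weights every component type equally; a micro-averaged or weighted F1 would give a different number on imbalanced data, so the 80.75% figure should be read alongside the paper's own definition.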