A Comparative Study of Cross-Sentence Features for Named Entity Recognition

2023 2nd International Conference on Innovations and Development of Information Technologies and Robotics (IDITR). Pub Date: 2023-05-01. DOI: 10.1109/IDITR57726.2023.10145820

Sheng-Fu Wang, Jing Huang, Baohua Zhang, Jia Li



Abstract

Recently, a growing number of Named Entity Recognition (NER) methods use cross-sentence features (also known as contexts), rather than single-sentence information alone, to improve model performance. As far as we know, most NER models exploit the preceding and following sentences to capture cross-sentence features. In general, current NER studies focus only on the model architecture in order to capture better token representations, and there is no in-depth exploration of how to better model cross-sentence features. In this paper, based on the span classification model, we investigate the effect of cross-sentence features under different settings. Specifically, we evaluate the impact of context stitching, context window size, context window padding, and the classifier token of the pre-trained language model (PLM) on model performance. Comparative experimental results show that appropriate incorporation of document-level contexts can considerably improve the NER metrics. Furthermore, we find that several factors can be used to improve the performance of NER models: (1) use domain-specific PLMs, but not classifier tokens; (2) use only preceding contexts for generic text, and random contexts for specialized text; (3) truncate overly long contexts when the context window is small, and preserve sentence integrity when the window is large; (4) set the context window size to about 200 for the base-size PLM.
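The abstract does not spell out how a context window is assembled, so the following is only a minimal sketch of one plausible reading of findings (2) and (3): preceding sentences are stitched in front of the target sentence until a token budget is used up, and a sentence that does not fit is either dropped whole or truncated. The function name `build_window`, the `keep_whole_sentences` flag, and the use of pre-split word tokens (rather than PLM subword tokens) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def build_window(
    sentences: List[List[str]],
    idx: int,
    window: int,
    keep_whole_sentences: bool = True,
) -> Tuple[List[str], int, int]:
    """Stitch preceding sentences in front of sentences[idx].

    Returns the stitched tokens plus the [start, end) offsets of the
    target sentence inside them. Hypothetical helper, for illustration.
    """
    target = sentences[idx]
    budget = window - len(target)  # tokens left for context
    context: List[str] = []

    # Walk backwards, starting from the sentence nearest to the target.
    for sent in reversed(sentences[:idx]):
        if budget <= 0:
            break
        if len(sent) <= budget:
            context = sent + context
            budget -= len(sent)
        elif keep_whole_sentences:
            break  # preserve sentence integrity: skip a partial sentence
        else:
            # Truncate: keep only the tokens closest to the target.
            context = sent[-budget:] + context
            budget = 0

    start = len(context)
    return context + target, start, start + len(target)


doc = [
    ["Alice", "works", "at", "Acme", "."],
    ["She", "joined", "in", "2019", "."],
    ["Her", "manager", "is", "Bob", "."],
]
whole = build_window(doc, 2, window=12, keep_whole_sentences=True)
cut = build_window(doc, 2, window=12, keep_whole_sentences=False)
```

In this toy document, `whole` keeps one full preceding sentence and leaves two budget slots unused, while `cut` fills the window by keeping the last two tokens of the first sentence. A target sentence longer than the window is returned unchanged with no context; a real pipeline would also need to reserve room for the PLM's special tokens.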
