A Comparative Study of Cross-Sentence Features for Named Entity Recognition

2023 2nd International Conference on Innovations and Development of Information Technologies and Robotics (IDITR). Pub Date: 2023-05-01. DOI: 10.1109/IDITR57726.2023.10145820

Sheng-Fu Wang, Jing Huang, Baohua Zhang, Jia Li

Citations: 0

Abstract

A growing number of Named Entity Recognition (NER) methods use cross-sentence features (also called contexts), rather than single-sentence information alone, to improve model performance. To our knowledge, most NER models capture these features from the preceding and following sentences. Current NER studies generally focus on model architecture to obtain better token representations, and how best to model cross-sentence features has not been explored in depth. In this paper, we use a span classification model to investigate the effect of cross-sentence features under different settings. Specifically, we evaluate how context stitching, context window size, context window padding, and the classifier token of the pre-trained language model (PLM) affect model performance. Comparative experiments show that appropriately incorporating document-level contexts can considerably improve NER metrics. We further identify several practices that improve NER performance: (1) use domain-specific PLMs, but not classifier tokens; (2) use only preceding contexts for generic text, and random contexts for specialized text; (3) truncate overly long contexts when the context window is small, and preserve sentence integrity when the window is large; (4) set the context window size to about 200 for a base-size PLM.
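As a rough illustration of findings (2) to (4), the sketch below stitches preceding sentences onto a target sentence until a fixed window is filled. It is not the authors' implementation: the function name `build_context`, the `truncate` flag, the use of pre-tokenized sentences, and the assumption that the window of about 200 is measured in tokens are all hypothetical choices made for this example.

```python
from typing import List


def build_context(sentences: List[List[str]], idx: int,
                  window: int = 200, truncate: bool = True) -> List[str]:
    """Prepend preceding sentences to sentences[idx] within a token budget.

    Hypothetical helper: `window` is assumed to be counted in tokens.
    truncate=True  cuts the farthest context sentence to fill the window.
    truncate=False adds whole sentences only (preserves sentence integrity).
    """
    target = sentences[idx]
    budget = window - len(target)  # tokens left for context
    context: List[str] = []
    # Walk backwards, starting with the sentence just before the target.
    for sent in reversed(sentences[:idx]):
        if budget <= 0:
            break
        if len(sent) <= budget:
            # The whole sentence fits: stitch it in front of the context.
            context = sent + context
            budget -= len(sent)
        elif truncate:
            # Keep only the tokens closest to the target sentence.
            context = sent[-budget:] + context
            budget = 0
        else:
            # Sentence does not fit and must stay whole: stop here.
            break
    return context + target


doc = [["a", "b", "c"], ["d", "e"], ["f", "g"]]
print(build_context(doc, 2, window=5, truncate=True))
print(build_context(doc, 2, window=5, truncate=False))
```

With a small window, `truncate=True` fills the budget exactly by cutting the farthest sentence, while `truncate=False` stops at the last sentence boundary that fits. Only preceding context is used here, which corresponds to the setting the abstract recommends for generic text; the random-context setting for specialized text is not covered.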
