A Comparative Study of Cross-Sentence Features for Named Entity Recognition
Authors: Sheng-Fu Wang, Jing Huang, Baohua Zhang, Jia Li
Venue: 2023 2nd International Conference on Innovations and Development of Information Technologies and Robotics (IDITR)
Publication date: 2023-05-01
DOI: https://doi.org/10.1109/IDITR57726.2023.10145820
Abstract
A growing number of Named Entity Recognition (NER) methods use cross-sentence features (also known as contexts), rather than single-sentence information alone, to improve model performance. To our knowledge, most NER models capture cross-sentence features from the preceding and following sentences. Current NER studies generally focus on model architecture as the way to obtain better token representations, and there has been no in-depth exploration of how best to model cross-sentence features. In this paper, we use a span classification model to investigate the effect of cross-sentence features under different settings. Specifically, we evaluate how context stitching, context window size, context window padding, and the classifier token of the pre-trained language model (PLM) affect model performance. Comparative experiments show that appropriately incorporating document-level contexts can considerably improve NER metrics. We also identify several practices that improve NER performance: (1) use domain-specific PLMs, but not classifier tokens; (2) use only preceding contexts for generic text, and random contexts for specialized text; (3) truncate overly long contexts when the context window is small, and preserve sentence integrity when the window is large; (4) set the context window size to about 200 for a base-size PLM.
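
To make the window-construction choices in the abstract concrete, the following is a minimal sketch of how a context window around a target sentence might be assembled. It is not the authors' implementation. The function names, the whitespace-level tokens (a PLM would use subword tokens), the even split of the budget when following context is enabled, and the reading of "truncate" versus "preserve sentence integrity" are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def _fill(candidates: Iterable[List[str]], budget: int,
          keep_whole: bool, keep_tail: bool) -> List[List[str]]:
    """Collect context sentences, nearest to the target first, within a token budget."""
    pieces: List[List[str]] = []
    for sent in candidates:
        if budget <= 0:
            break
        if len(sent) > budget:
            if keep_whole:
                break  # assumed meaning of "preserve sentence integrity"
            # Assumed meaning of "truncate": keep the tokens closest to the target.
            sent = sent[-budget:] if keep_tail else sent[:budget]
        pieces.append(sent)
        budget -= len(sent)
    return pieces


def build_context_window(sentences: List[List[str]], target_idx: int,
                         window_size: int = 200, use_following: bool = False,
                         keep_whole_sentences: bool = True
                         ) -> Tuple[List[str], int, int]:
    """Return (window_tokens, start, end); the target sentence is window_tokens[start:end].

    Hypothetical helper: the target sentence itself is never truncated,
    even if it is longer than window_size.
    """
    target = sentences[target_idx]
    budget = max(window_size - len(target), 0)
    # Assumption: split the budget evenly when following context is used.
    right_budget = budget // 2 if use_following else 0
    left_budget = budget - right_budget

    left_pieces = _fill(reversed(sentences[:target_idx]), left_budget,
                        keep_whole_sentences, keep_tail=True)
    right_pieces = _fill(sentences[target_idx + 1:], right_budget,
                         keep_whole_sentences, keep_tail=False)

    # left_pieces is nearest-first, so reverse it to restore document order.
    left = [tok for sent in reversed(left_pieces) for tok in sent]
    right = [tok for sent in right_pieces for tok in sent]
    start = len(left)
    return left + target + right, start, start + len(target)


doc = [
    ["Alice", "works", "at", "Acme", "."],
    ["She", "joined", "in", "2020", "."],
    ["Acme", "is", "based", "in", "Paris", "."],
]
# Small window, preceding context only, truncation allowed.
window, start, end = build_context_window(
    doc, target_idx=1, window_size=8, keep_whole_sentences=False)
print(window, start, end)
```

With a small window and truncation enabled, the call above keeps only the last three tokens of the preceding sentence. With `keep_whole_sentences=True` and the same window size, the preceding sentence does not fit and the window contains the target sentence alone.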