Improving the performance of Transformer Context Encoders for NER
Avi Chawla, Nidhi Mulay, Vikas Bishnoi, Gaurav Dhama
2021 IEEE 24th International Conference on Information Fusion (FUSION), November 2021
DOI: 10.23919/fusion49465.2021.9627061 (https://doi.org/10.23919/fusion49465.2021.9627061)
Citations: 1
Abstract
Large Transformer-based models have provided state-of-the-art results on a variety of Natural Language Processing (NLP) tasks. While these Transformer models perform exceptionally well across a wide range of NLP tasks, their use in sequence labeling has remained limited. Although pretrained Transformer models such as BERT and XLNet have been successfully employed as input representations, the Transformer is still rarely used as the context encoder for sequence labeling, and most recent works still rely on recurrent architectures in that role. In this paper, we compare the performance of Transformer and recurrent architectures as context encoders on the Named Entity Recognition (NER) task. We modify the character-level representation module used in previously proposed NER models in the literature and show how this modification improves the NER model's performance. We also explore data augmentation as a method for further enhancing performance. Experimental results on three NER datasets show that our proposed techniques, using the Transformer encoder, establish a new state of the art over previously proposed models in the literature while using only non-contextualized embeddings.
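To make the described setup concrete, below is a minimal PyTorch sketch of an NER tagger of this general shape: non-contextualized word embeddings concatenated with a character-level representation, a Transformer encoder serving as the context encoder, and a per-token tag classifier. This is an illustrative assumption of the architecture family the abstract refers to, not the paper's exact configuration; all module choices (a CNN for the character module), dimensions, and names are placeholders, and the paper's data augmentation and any decoder layer are omitted.

```python
import torch
import torch.nn as nn


class CharCNN(nn.Module):
    """Character-level representation: embed characters, convolve, max-pool per word."""

    def __init__(self, n_chars: int, char_dim: int = 30, out_dim: int = 50, kernel: int = 3):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, out_dim, kernel_size=kernel, padding=kernel // 2)

    def forward(self, chars: torch.Tensor) -> torch.Tensor:
        # chars: (batch, seq_len, max_word_len) of character ids
        b, t, w = chars.shape
        x = self.embed(chars.view(b * t, w))      # (b*t, w, char_dim)
        x = self.conv(x.transpose(1, 2))          # (b*t, out_dim, w)
        x = x.max(dim=-1).values                  # max-pool over characters
        return x.view(b, t, -1)                   # (batch, seq_len, out_dim)


class TransformerNERTagger(nn.Module):
    """Word + character features -> Transformer context encoder -> per-token tag scores."""

    def __init__(self, n_words: int, n_chars: int, n_tags: int,
                 word_dim: int = 100, char_out: int = 50,
                 n_heads: int = 5, n_layers: int = 2):
        super().__init__()
        # Non-contextualized word embeddings (e.g. GloVe-sized), as in the abstract.
        self.word_embed = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_repr = CharCNN(n_chars, out_dim=char_out)
        d_model = word_dim + char_out
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=4 * d_model, batch_first=True)
        self.context_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.tag_head = nn.Linear(d_model, n_tags)  # emission scores; a CRF could sit on top

    def forward(self, words, chars, pad_mask=None):
        x = torch.cat([self.word_embed(words), self.char_repr(chars)], dim=-1)
        h = self.context_encoder(x, src_key_padding_mask=pad_mask)
        return self.tag_head(h)                    # (batch, seq_len, n_tags)


# Toy usage: 2 sentences, 6 tokens each, words padded to 10 characters.
model = TransformerNERTagger(n_words=5000, n_chars=100, n_tags=9)
words = torch.randint(1, 5000, (2, 6))
chars = torch.randint(1, 100, (2, 6, 10))
print(model(words, chars).shape)  # torch.Size([2, 6, 9])
```

Swapping the `nn.TransformerEncoder` for a BiLSTM of matching hidden size would give the recurrent context-encoder baseline that the paper compares against.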