Improving the performance of Transformer Context Encoders for NER

Avi Chawla, Nidhi Mulay, Vikas Bishnoi, Gaurav Dhama
{"title":"Improving the performance of Transformer Context Encoders for NER","authors":"Avi Chawla, Nidhi Mulay, Vikas Bishnoi, Gaurav Dhama","doi":"10.23919/fusion49465.2021.9627061","DOIUrl":null,"url":null,"abstract":"Large Transformer based models have provided state-of-the-art results on a variety of Natural Language Processing (NLP) tasks. While these Transformer models perform exceptionally well on a wide range of NLP tasks, their usage in Sequence Labeling has been mostly muted. Although pretrained Transformer models such as BERT and XLNet have been successfully employed as input representation, the use of the Transformer model as a context encoder for sequence labeling is still minimal, and most recent works still use recurrent architecture as the context encoder. In this paper, we compare the performance of the Transformer and Recurrent architecture as context encoders on the Named Entity Recognition (NER) task. We vary the character-level representation module from the previously proposed NER models in literature and show how the modification can improve the NER model’s performance. We also explore data augmentation as a method for enhancing their performance. Experimental results on three NER datasets show that our proposed techniques established a new state-of-the-art using the Transformer Encoder over the previously proposed models in the literature using only non-contextualized embeddings.","PeriodicalId":226850,"journal":{"name":"2021 IEEE 24th International Conference on Information Fusion (FUSION)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 24th International Conference on Information Fusion (FUSION)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/fusion49465.2021.9627061","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Large Transformer-based models have achieved state-of-the-art results on a variety of Natural Language Processing (NLP) tasks. While these models perform exceptionally well across a wide range of NLP tasks, their adoption for sequence labeling has been limited. Although pretrained Transformer models such as BERT and XLNet have been successfully employed as input representations, the Transformer is still rarely used as the context encoder for sequence labeling, and most recent works still rely on recurrent architectures in that role. In this paper, we compare Transformer and recurrent architectures as context encoders on the Named Entity Recognition (NER) task. We modify the character-level representation module used in previously proposed NER models and show how this modification improves the model's performance. We also explore data augmentation as a way to further enhance performance. Experimental results on three NER datasets show that our techniques, using a Transformer encoder, establish a new state of the art among models in the literature that use only non-contextualized embeddings.
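To make the architecture described above concrete, here is a minimal PyTorch sketch of an NER tagger in this style: non-contextualized word embeddings concatenated with a character-level CNN representation, fed to a Transformer encoder acting as the context encoder, with a per-token linear classifier on top. This is not the authors' exact implementation; all names and sizes (CharCNN, TransformerNER, d_model=128, the learned positional embedding, and so on) are illustrative assumptions.

```python
import torch
import torch.nn as nn


class CharCNN(nn.Module):
    """Character-level module: embed each word's characters, convolve,
    and max-pool over character positions to get one vector per word."""

    def __init__(self, n_chars: int, char_dim: int = 30, n_filters: int = 30):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)

    def forward(self, chars: torch.Tensor) -> torch.Tensor:
        # chars: (batch, seq_len, max_word_len) character ids
        b, s, w = chars.shape
        x = self.embed(chars.view(b * s, w))   # (b*s, w, char_dim)
        x = self.conv(x.transpose(1, 2))       # (b*s, n_filters, w)
        x = x.max(dim=2).values                # max-pool over characters
        return x.view(b, s, -1)                # (batch, seq_len, n_filters)


class TransformerNER(nn.Module):
    """Transformer context encoder over word + character representations."""

    def __init__(self, n_words: int, n_chars: int, n_tags: int,
                 word_dim: int = 100, char_feats: int = 30,
                 d_model: int = 128, n_heads: int = 4, n_layers: int = 2,
                 max_len: int = 512):
        super().__init__()
        self.word_embed = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_cnn = CharCNN(n_chars, n_filters=char_feats)
        self.proj = nn.Linear(word_dim + char_feats, d_model)
        self.pos_embed = nn.Embedding(max_len, d_model)  # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.classifier = nn.Linear(d_model, n_tags)

    def forward(self, words, chars, pad_mask=None):
        # words: (batch, seq_len); chars: (batch, seq_len, max_word_len)
        # pad_mask: (batch, seq_len) bool, True at padding positions
        x = torch.cat([self.word_embed(words), self.char_cnn(chars)], dim=-1)
        x = self.proj(x)
        pos = torch.arange(words.size(1), device=words.device)
        h = self.encoder(x + self.pos_embed(pos),
                         src_key_padding_mask=pad_mask)
        return self.classifier(h)              # (batch, seq_len, n_tags)
```

NER systems in this line of work often place a CRF layer over the encoder instead of an independent per-token classifier; the abstract does not specify the decoder, so the linear head here is simply a placeholder for whichever decoding layer is used.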