A Transformer-Based Longer Entity Attention Model for Chinese Named Entity Recognition in Aerospace
Shuai Gong, Xiong Xiong, Yunfei Liu, Shengyang Li, Anqi Liu
2022 5th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), April 2022. DOI: 10.1109/AEMCSE55572.2022.00077
Abstract
Chinese aerospace knowledge includes many long entities, such as professional terms, equipment names, and cabinets. However, current Named Entity Recognition (NER) algorithms typically treat longer and shorter entities uniformly. In this paper, a Longer Entity Attention (LEA) model based on the Transformer is proposed. After the Transformer encoding layer, LEA integrates sentence tags, sets a threshold on entity length, and further processes the hidden-layer features of entities longer than that threshold, strengthening the model's ability to recognize longer entities. In addition, we construct an Aerospace Chinese NER dataset (ACNE) containing rich entity categories and domain knowledge. Experimental results demonstrate that LEA outperforms previous state-of-the-art models on ACNE and yields a significant improvement on longer entities in every threshold range on both the OntoNotes 5.0 and ACNE datasets.
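For readers who want a concrete picture of the thresholded long-entity path sketched in the abstract, the following is a minimal PyTorch sketch. It is not the authors' implementation: the class name `LongEntityAttention`, the default `length_threshold`, and the choice of an extra self-attention block over the span features are illustrative assumptions based only on the description above (encode the sentence, then apply additional processing to hidden states of entities whose length exceeds a threshold).

```python
import torch
import torch.nn as nn


class LongEntityAttention(nn.Module):
    """Hypothetical post-encoder module: re-process the hidden states of
    entity spans longer than a threshold, leaving short entities untouched."""

    def __init__(self, hidden_dim: int = 768, length_threshold: int = 3):
        super().__init__()
        self.length_threshold = length_threshold
        # Extra self-attention applied only to long-entity token features
        # (an assumed stand-in for the paper's long-entity processing step).
        self.span_attn = nn.MultiheadAttention(
            hidden_dim, num_heads=4, batch_first=True
        )
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, hidden: torch.Tensor, entity_spans):
        # hidden: (batch, seq_len, hidden_dim) from the Transformer encoder.
        # entity_spans: iterable of (batch_idx, start, end), end exclusive.
        out = hidden.clone()
        for b, start, end in entity_spans:
            if end - start <= self.length_threshold:
                continue  # short entities keep the encoder features as-is
            span = hidden[b, start:end].unsqueeze(0)  # (1, span_len, dim)
            refined, _ = self.span_attn(span, span, span)
            # Residual connection plus layer norm over the refined span.
            out[b, start:end] = self.norm(span + refined).squeeze(0)
        return out


# Toy usage: refine one 6-token entity inside a 12-token sentence.
lea = LongEntityAttention(hidden_dim=768, length_threshold=3)
hidden = torch.randn(1, 12, 768)
refined = lea(hidden, entity_spans=[(0, 2, 8)])
```

In this reading, the threshold cleanly separates the two regimes the abstract contrasts: spans at or below `length_threshold` pass through unchanged, while longer spans get a dedicated refinement pass, which is one plausible way to bias a model toward recognizing long aerospace terms.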