ELCA：通过上下文关联加强中文命名实体识别的边界定位

IF 0.8 4区计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Intelligent Data Analysis Pub Date : 2024-07-17 DOI:10.3233/ida-230383

Yizhao Wang, Shun Mao, Yuncheng Jiang

{"title":"ELCA：通过上下文关联加强中文命名实体识别的边界定位","authors":"Yizhao Wang, Shun Mao, Yuncheng Jiang","doi":"10.3233/ida-230383","DOIUrl":null,"url":null,"abstract":"Named Entity Recognition (NER) is a fundamental task that aids in the completion of other tasks such as text understanding, information retrieval and question answering in Natural Language Processing (NLP). In recent years, the use of a mix of character-word structure and dictionary information forChinese NER has been demonstrated to be effective. As a representative of hybrid models, Lattice-LSTM has obtained better benchmarking results in several publicly available Chinese NER datasets. However, Lattice-LSTM does not address the issue of long-distance entities or the detection of several entities with the same character. At the same time, the ambiguity of entity boundary information also leads to a decrease in the accuracy of embedding NER. This paper proposes ELCA: Enhanced Boundary Location for Chinese Named Entity Recognition Via Contextual Association, a method that solves the problem of long-distance dependent entities by using sentence-level position information. At the same time, it uses adaptive word convolution to overcome the problem of several entities sharing the same character. ELCA achieves the state-of-the-art outcomes in Chinese Word Segmentation and Chinese NER.","PeriodicalId":50355,"journal":{"name":"Intelligent Data Analysis","volume":"23 1","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2024-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ELCA: Enhanced boundary location for Chinese named entity recognition via contextual association\",\"authors\":\"Yizhao Wang, Shun Mao, Yuncheng Jiang\",\"doi\":\"10.3233/ida-230383\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Named Entity Recognition (NER) is a fundamental task that aids in the completion of other tasks such as text understanding, information retrieval and question answering in Natural Language Processing (NLP). In recent years, the use of a mix of character-word structure and dictionary information forChinese NER has been demonstrated to be effective. As a representative of hybrid models, Lattice-LSTM has obtained better benchmarking results in several publicly available Chinese NER datasets. However, Lattice-LSTM does not address the issue of long-distance entities or the detection of several entities with the same character. At the same time, the ambiguity of entity boundary information also leads to a decrease in the accuracy of embedding NER. This paper proposes ELCA: Enhanced Boundary Location for Chinese Named Entity Recognition Via Contextual Association, a method that solves the problem of long-distance dependent entities by using sentence-level position information. At the same time, it uses adaptive word convolution to overcome the problem of several entities sharing the same character. ELCA achieves the state-of-the-art outcomes in Chinese Word Segmentation and Chinese NER.\",\"PeriodicalId\":50355,\"journal\":{\"name\":\"Intelligent Data Analysis\",\"volume\":\"23 1\",\"pages\":\"\"},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2024-07-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent Data Analysis\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.3233/ida-230383\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Data Analysis","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.3233/ida-230383","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

命名实体识别（NER）是一项基本任务，有助于完成自然语言处理（NLP）中的文本理解、信息检索和问题解答等其他任务。近年来，在中文 NER 中混合使用字词结构和词典信息已被证明是有效的。作为混合模型的代表，Lattice-LSTM 在几个公开的中文 NER 数据集中取得了较好的基准结果。然而，Lattice-LSTM 并没有解决远距离实体或多个同字实体的检测问题。同时，实体边界信息的模糊性也会导致嵌入 NER 的准确率下降。本文提出的 ELCA：通过上下文关联增强中文命名实体识别的边界定位，是一种利用句子级位置信息解决长距离依存实体问题的方法。同时，它还利用自适应词卷积克服了多个实体共享同一字符的问题。ELCA 在中文分词和中文近义词识别方面取得了最先进的成果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

ELCA: Enhanced boundary location for Chinese named entity recognition via contextual association

Named Entity Recognition (NER) is a fundamental task that aids in the completion of other tasks such as text understanding, information retrieval and question answering in Natural Language Processing (NLP). In recent years, the use of a mix of character-word structure and dictionary information forChinese NER has been demonstrated to be effective. As a representative of hybrid models, Lattice-LSTM has obtained better benchmarking results in several publicly available Chinese NER datasets. However, Lattice-LSTM does not address the issue of long-distance entities or the detection of several entities with the same character. At the same time, the ambiguity of entity boundary information also leads to a decrease in the accuracy of embedding NER. This paper proposes ELCA: Enhanced Boundary Location for Chinese Named Entity Recognition Via Contextual Association, a method that solves the problem of long-distance dependent entities by using sentence-level position information. At the same time, it uses adaptive word convolution to overcome the problem of several entities sharing the same character. ELCA achieves the state-of-the-art outcomes in Chinese Word Segmentation and Chinese NER.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Intelligent Data Analysis 工程技术-计算机：人工智能

CiteScore

2.20

自引率

5.90%

发文量

审稿时长

3.3 months

期刊介绍： Intelligent Data Analysis provides a forum for the examination of issues related to the research and applications of Artificial Intelligence techniques in data analysis across a variety of disciplines. These techniques include (but are not limited to): all areas of data visualization, data pre-processing (fusion, editing, transformation, filtering, sampling), data engineering, database mining techniques, tools and applications, use of domain knowledge in data analysis, big data applications, evolutionary algorithms, machine learning, neural nets, fuzzy logic, statistical pattern recognition, knowledge filtering, and post-processing. In particular, papers are preferred that discuss development of new AI related data analysis architectures, methodologies, and techniques and their applications to various domains.