Joint entity and relation extraction with fusion of multi-feature semantics

Journal of Intelligent Information Systems | IF 3.4 | Zone 3 (Computer Science) | JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) | Pub Date: 2024-08-01 | DOI: 10.1007/s10844-024-00871-y

Ting Wang, Wenjie Yang, Tao Wu, Chuan Yang, Jiaying Liang, Hongyang Wang, Jia Li, Dong Xiang, Zheng Zhou


Citations: 0

Abstract



Entity relation extraction is a key technology for extracting structured information from unstructured text and serves as the foundation for building large-scale knowledge graphs. Current joint entity and relation extraction methods primarily focus on improving the recognition of overlapping triplets to enhance overall model performance. However, these methods still face numerous challenges in managing intra-triplet and inter-triplet interactions, expanding the breadth of semantic encoding, and reducing information redundancy during extraction. These issues make it difficult to achieve satisfactory performance on both normal and overlapping triplets. To address these challenges, this study proposes a comprehensive prediction network that incorporates multi-feature semantic fusion. We develop a semantic fusion module that integrates entity mask embedding sequences, which strengthen connections between entities, with context embedding sequences, which provide richer semantic information, in order to enhance inter-triplet interactions and expand semantic encoding. A parallel decoder then generates a set of triplets simultaneously, improving the interaction between them. Additionally, we use an entity mask sequence to finely prune these triplets, optimizing the final set. Experimental results on the publicly available NYT and WebNLG datasets demonstrate that, with BERT as the encoder, our model outperforms the baseline model in terms of accuracy and F1 score.
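The abstract does not give implementation details, so the following is only a minimal Python sketch of two ideas it mentions: pruning a predicted set of triplets with an entity mask sequence, and scoring the result with F1. The `Triplet` type, the span-based pruning rule (keep a triplet only if both its head and tail spans lie entirely on tokens flagged in the mask), and the exact-match scoring are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Triplet(NamedTuple):
    """Hypothetical triplet: token spans are (start, end), end inclusive."""
    head: Tuple[int, int]
    relation: str
    tail: Tuple[int, int]


def span_is_entity(span: Tuple[int, int], entity_mask: Sequence[int]) -> bool:
    """True if the span is in range and every token in it is flagged as entity."""
    start, end = span
    in_range = 0 <= start <= end < len(entity_mask)
    return in_range and all(entity_mask[start:end + 1])


def prune(triplets: Sequence[Triplet], entity_mask: Sequence[int]) -> List[Triplet]:
    """Assumed pruning rule: drop duplicates and triplets whose head or tail
    span is not fully covered by the entity mask."""
    seen = set()
    kept = []
    for t in triplets:
        if t in seen:
            continue
        if span_is_entity(t.head, entity_mask) and span_is_entity(t.tail, entity_mask):
            seen.add(t)
            kept.append(t)
    return kept


def precision_recall_f1(predicted: Sequence[Triplet],
                        gold: Sequence[Triplet]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over sets of triplets."""
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy sentence: "Paris is the capital of France" (token indices 0..5).
entity_mask = [1, 0, 0, 0, 0, 1]  # assumed: 1 marks an entity token
predicted = [
    Triplet((0, 0), "capital_of", (5, 5)),
    Triplet((0, 0), "capital_of", (5, 5)),  # duplicate from the decoder
    Triplet((0, 0), "located_in", (3, 3)),  # tail is not an entity token
]
gold = [Triplet((0, 0), "capital_of", (5, 5))]

before = precision_recall_f1(predicted, gold)
pruned = prune(predicted, entity_mask)
after = precision_recall_f1(pruned, gold)
print(before, after)
```

In this toy case the pruning step removes the duplicate and the triplet whose tail falls on a non-entity token, so precision rises while recall is unchanged. A real system would derive the mask and the candidate triplets from model outputs rather than hand-written lists.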


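Source journal

Journal of Intelligent Information Systems (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 7.20
Self-citation rate: 11.80%
Publication volume: (value missing)
Review time: 6-12 weeks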

About the journal: The mission of the Journal of Intelligent Information Systems: Integrating Artificial Intelligence and Database Technologies is to foster and present research and development results focused on the integration of artificial intelligence and database technologies to create next-generation information systems - Intelligent Information Systems. These new information systems embody knowledge that allows them to exhibit intelligent behavior, cooperate with users and other systems in problem solving, discovery, access, retrieval and manipulation of a wide variety of multimedia data and knowledge, and reason under uncertainty. Increasingly, knowledge-directed inference processes are being used to: discover knowledge from large data collections, provide cooperative support to users in complex query formulation and refinement, access, retrieve, store and manage large collections of multimedia data and knowledge, integrate information from multiple heterogeneous data and knowledge sources, and reason about information under uncertain conditions. Multimedia and hypermedia information systems now operate on a global scale over the Internet, and new tools and techniques are needed to manage these dynamic and evolving information spaces. The Journal of Intelligent Information Systems provides a forum wherein academics, researchers and practitioners may publish high-quality, original and state-of-the-art papers describing theoretical aspects, systems architectures, analysis and design tools and techniques, and implementation experiences in intelligent information systems. The categories of papers published by JIIS include: research papers, invited papers, meetings, workshop and conference announcements and reports, survey and tutorial articles, and book reviews. Short articles describing open problems or their solutions are also welcome.