Joint Extraction of Entities and Relations Based on Hybrid Feature Representations
Xing Kun-peng, Xue Yang, Kong De-yan, Dong Wei, Ji Zhen-yan
Volume 12: Innovative and Smart Nuclear Power Plant Design, published 2022-08-08
DOI: 10.1115/icone29-93152 (https://doi.org/10.1115/icone29-93152)
Abstract
Although fine-tuning pre-trained models has achieved great success in named entity recognition and relation extraction, real-world texts often contain many triples with nested entities and overlapping relations. Existing work focuses on the overlapping triple problem, in which multiple relational triples in the same sentence share the same entity. In this work, we introduce a joint entity-relation extraction framework based on hybrid feature representations. The framework consists of five main parts: hybrid feature representation construction, a bidirectional LSTM encoder, a head entity recognition module, an entity type classification module, and a relation-tail entity recognition module. First, we fuse character-level and word-level vector representations via a max-pooling operation to enrich the text features. Second, the hybrid feature representation is fed into a bidirectional LSTM to capture the correlation between characters and entities. Third, the head entity recognition module employs two identical binary classifiers to detect the start and end positions of entities separately. Fourth, the entity type classification module applies softmax and filters out candidates classified as the non-entity type. Finally, we treat relation-tail entity recognition as a machine reading comprehension task to resolve the entity overlap problem: the combination of a head entity and a relation serves as a query for retrieving possible tail entities from the text. The framework handles the polysemy problem efficiently, considerably improves knowledge extraction efficiency, and accurately extracts overlapping triples from domain texts with complicated relationships.
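
Three of the steps described above (max-pooling fusion, start/end span decoding, and query construction) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the one-to-one alignment of character and word vectors, the 0.5 threshold, the nearest-end pairing heuristic, the `[SEP]` query template, and the sample entity and relation names are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import List, Tuple

Vector = List[float]


def fuse_max_pool(char_vecs: List[Vector], word_vecs: List[Vector]) -> List[Vector]:
    """Element-wise max of each character vector and its aligned word vector.

    Assumption: word_vecs[i] is the vector of the word that contains
    character i, and both vectors have the same dimension.
    """
    if len(char_vecs) != len(word_vecs):
        raise ValueError("char_vecs and word_vecs must be aligned one-to-one")
    return [
        [max(c, w) for c, w in zip(cv, wv)]
        for cv, wv in zip(char_vecs, word_vecs)
    ]


def decode_spans(
    start_probs: List[float], end_probs: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Turn the outputs of two binary classifiers into (start, end) spans.

    Assumed heuristic: each position whose start probability exceeds the
    threshold is paired with the nearest end position at or after it.
    """
    ends = [j for j, p in enumerate(end_probs) if p > threshold]
    spans = []
    for i, p in enumerate(start_probs):
        if p <= threshold:
            continue
        following = [j for j in ends if j >= i]
        if following:
            spans.append((i, following[0]))
    return spans


def build_queries(head: str, relations: List[str]) -> List[str]:
    """One query per (head entity, relation) pair; the template is hypothetical."""
    return [f"{head} [SEP] {rel}" for rel in relations]


if __name__ == "__main__":
    # Toy 2-dimensional vectors for a two-character input.
    print(fuse_max_pool([[0.1, 0.9], [0.4, 0.2]], [[0.5, 0.3], [0.5, 0.3]]))
    # Toy per-token probabilities from the start and end classifiers.
    print(decode_spans([0.9, 0.1, 0.8, 0.2], [0.2, 0.7, 0.1, 0.9]))
    # Hypothetical head entity and relation names.
    print(build_queries("reactor coolant pump", ["part_of", "located_in"]))
```

Because every query is built from one head entity and one relation, the same head entity can be paired with several relations and answered independently, which is how a reading-comprehension formulation can return overlapping triples.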