Biomedical relation extraction method based on ensemble learning and attention mechanism.

BMC Bioinformatics | Pub Date: 2024-10-18 | DOI: 10.1186/s12859-024-05951-y | IF 3.3 | CAS Zone 3 (Biology) | JCR Q2 (Biochemical Research Methods)

Yaxun Jia, Haoyang Wang, Zhu Yuan, Lian Zhu, Zuo-Lin Xiang

BMC Bioinformatics 25(1), 333 (2024). Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11488084/pdf/

Citations: 0

Abstract

Background: Relation extraction (RE) plays a crucial role in biomedical research as it is essential for uncovering complex semantic relationships between entities in textual data. Given the significance of RE in biomedical informatics and the increasing volume of literature, there is an urgent need for advanced computational models capable of accurately and efficiently extracting these relationships on a large scale.

Results: This paper proposes SARE, a novel approach that combines stacking, an ensemble learning technique, with attention mechanisms to improve biomedical relation extraction. By leveraging multiple pre-trained models, SARE demonstrates improved adaptability and robustness across diverse domains. The attention mechanisms enable the model to capture and use key information in the text more accurately. Compared with the original BERT variant and the domain-specific PubMedBERT model, SARE improved performance by 4.8, 8.7, and 0.8 percentage points on the PPI, DDI, and ChemProt datasets, respectively.
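The abstract does not describe SARE's architecture in detail, so the following is only a minimal sketch of the general idea of weighting several base models' predictions with attention before combining them. Everything in it is an assumption made for illustration: the function names `softmax` and `attention_combine`, the hand-set `query` vector, and the toy probabilities are hypothetical and are not taken from the paper or its repository. It shows the combination step only; in a real stacking setup the meta-level parameters (here, `query`) would be learned from held-out base-model predictions, and that training step is omitted.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_combine(
    base_probs: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Combine base-model class distributions with scaled dot-product attention.

    base_probs: one class-probability vector per base model (equal lengths).
    query: hypothetical meta-level vector used to score each base model;
           fixed by hand here, learned in a real system.
    Returns (combined class distribution, attention weight per base model).
    """
    dim = len(query)
    # Score each base model by the scaled dot product of its output and the query.
    scores = [
        sum(p * q for p, q in zip(probs, query)) / math.sqrt(dim)
        for probs in base_probs
    ]
    weights = softmax(scores)
    n_classes = len(base_probs[0])
    # The attention-weighted average of the base distributions is the ensemble output.
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, base_probs))
        for c in range(n_classes)
    ]
    return combined, weights


# Toy example: three hypothetical base models, two classes
# (index 0 = "interaction", index 1 = "no relation").
toy_base_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
toy_query = [1.0, 0.0]  # assumption: favours models confident in class 0
combined, weights = attention_combine(toy_base_probs, toy_query)
print("attention weights:", weights)
print("combined distribution:", combined)
```

Because the attention weights sum to one and each base output is a probability distribution, the combined output is also a valid distribution. How SARE actually applies attention and stacking should be checked against the paper and the linked repository.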

Conclusions: SARE offers a promising solution for improving the accuracy and efficiency of relation extraction tasks in biomedical research, facilitating advancements in biomedical informatics. The results suggest that combining ensemble learning with attention mechanisms is effective for extracting complex relationships from biomedical texts. Our code and data are publicly available at https://github.com/GS233/Biomedical.


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70

Self-citation rate: 3.30%

Articles published: 506

Review time: 4.3 months

About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.