Distantly Supervised Biomedical Relation Extraction via Negative Learning and Noisy Student Self-Training

IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6; Zone 3, Biology; JCR Q2, Biochemical Research Methods) Pub Date: 2024-06-11 DOI: 10.1109/TCBB.2024.3412174

Yuanfei Dai, Bin Zhang, Shiping Wang


Citations: 0

Abstract



Distantly Supervised Biomedical Relation Extraction Via Negative Learning and Noisy Student Self-Training.

Biomedical relation extraction aims to identify underlying relationships among entities, such as gene associations and drug interactions, within biomedical texts. Despite advancements in relation extraction in general knowledge domains, the scarcity of labeled training data remains a significant challenge in the biomedical field. This paper provides a novel approach for biomedical relation extraction that leverages a noisy student self-training strategy combined with negative learning. This method addresses the challenge of data insufficiency by utilizing distantly supervised data to generate high-quality labeled samples. Negative learning, as opposed to traditional positive learning, offers a more robust mechanism to discern and relabel noisy samples, preventing model overfitting. The integration of these techniques ensures enhanced noise reduction and relabeling capabilities, leading to improved performance even with noisy datasets. Experimental results demonstrate the effectiveness of the proposed framework in mitigating the impact of noisy data and outperforming existing benchmarks.
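The abstract contrasts negative learning with traditional positive learning but does not give the loss used in the paper. As a rough illustration only, the sketch below uses the commonly cited formulation of negative learning: instead of training the model toward a possibly wrong label ("this sample is class k"), it trains the model away from a randomly drawn complementary label ("this sample is not class c"). The function names, the uniform sampling of the complementary label, and the three-class toy example are assumptions for illustration, not details taken from the paper.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def positive_loss(probs, label):
    """Standard cross-entropy: push p[label] toward 1."""
    return -math.log(max(probs[label], 1e-12))


def negative_loss(probs, complementary_label):
    """Negative learning (assumed form): push p[complementary_label] toward 0."""
    return -math.log(max(1.0 - probs[complementary_label], 1e-12))


def sample_complementary_label(num_classes, given_label, rng):
    """Draw uniformly from every class except the (possibly noisy) given label."""
    choice = rng.randrange(num_classes - 1)
    return choice if choice < given_label else choice + 1


# Toy example: three hypothetical relation classes, noisy label = class 0.
probs = softmax([2.0, 0.5, -1.0])
rng = random.Random(0)
comp = sample_complementary_label(3, given_label=0, rng=rng)
loss = negative_loss(probs, comp)
```

The intuition behind the design is that a randomly chosen complementary label is far more likely to be correct ("not this class") than a distantly supervised label is to be correct ("this class"), so the negative signal is less likely to reinforce labeling noise.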


Source journal

IEEE/ACM Transactions on Computational Biology and Bioinformatics (Engineering & Technology: Computer Science, Interdisciplinary Applications)

CiteScore: 7.50

Self-citation rate: 6.70%

Articles published: 479

Review time: 3 months

Journal description: IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical, and computational methods that are central to bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results obtained from the use of these methods, programs, and databases; and the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system.