Bridget T. McInnes, Jiawei Tang, Darshini Mahendran, Mai H. Nguyen
arXiv - QuanBio - Molecular Networks, 2024-05-28. DOI: arxiv-2405.18605 (https://doi.org/arxiv-2405.18605)
BioBERT-based Deep Learning and Merged ChemProt-DrugProt for Enhanced Biomedical Relation Extraction
This paper presents a methodology for enhancing relation extraction from
biomedical texts, focusing specifically on chemical-gene interactions.
Leveraging the BioBERT model and a multi-layer fully connected network
architecture, our approach integrates the ChemProt and DrugProt datasets using
a novel merging strategy. Through extensive experimentation, we demonstrate
significant performance improvements, particularly in CPR groups shared between
the datasets. The findings underscore the importance of dataset merging in
augmenting sample counts and improving model accuracy. Moreover, the study
highlights the potential of automated information extraction in biomedical
research and clinical practice.
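
The abstract does not spell out the merging strategy, so the following is only a minimal sketch of one plausible approach: relabel DrugProt examples into ChemProt-style CPR groups, drop examples with no shared group, and remove exact duplicates. The `Example` type, the `merge_datasets` function, the `@CHEM$`/`@GENE$` placeholders, and the `DRUGPROT_TO_CPR` mapping are all hypothetical names and values introduced for illustration; they are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str   # sentence with the chemical and gene mentions marked
    label: str  # relation label in the source dataset's own scheme


# Hypothetical mapping from DrugProt-style relation names to ChemProt-style
# CPR groups. Illustrative only; the paper's actual mapping may differ.
DRUGPROT_TO_CPR = {
    "ACTIVATOR": "CPR:3",
    "INHIBITOR": "CPR:4",
    "AGONIST": "CPR:5",
    "ANTAGONIST": "CPR:6",
}


def merge_datasets(chemprot, drugprot, mapping=DRUGPROT_TO_CPR):
    """Merge two example lists into one list labelled with CPR groups.

    DrugProt examples whose label has no entry in `mapping` are skipped,
    and exact (text, label) duplicates are kept only once.
    """
    merged, seen = [], set()
    relabelled = (
        Example(ex.text, mapping[ex.label])
        for ex in drugprot
        if ex.label in mapping
    )
    for ex in list(chemprot) + list(relabelled):
        if ex not in seen:
            seen.add(ex)
            merged.append(ex)
    return merged


# Toy data, invented for this sketch.
chemprot = [
    Example("@CHEM$ inhibits @GENE$.", "CPR:4"),
    Example("@CHEM$ activates @GENE$.", "CPR:3"),
]
drugprot = [
    Example("@CHEM$ inhibits @GENE$.", "INHIBITOR"),         # duplicate after relabelling
    Example("@CHEM$ blocks @GENE$ activity.", "INHIBITOR"),  # new CPR:4 sample
    Example("@CHEM$ is part of @GENE$.", "PART-OF"),         # unmapped, skipped
]

merged = merge_datasets(chemprot, drugprot)
counts = Counter(ex.label for ex in merged)
print(counts)
```

In this toy run, merging adds one sample to the shared CPR:4 group while leaving CPR:3 unchanged, which mirrors the abstract's point that merging raises sample counts mainly in groups common to both datasets.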