Protocol for: A Simple, Accessible, Literature-based Drug Repurposing Pipeline

Maximin Lange, Eoin Gogarty, Meredith Martyn, Philip Braude, Feras Fayez, Ben Carter
medRxiv - Health Informatics. Published 2024-07-19. DOI: 10.1101/2024.07.18.24310641 (https://doi.org/10.1101/2024.07.18.24310641)

Abstract

We will develop a novel approach to drug repurposing that combines Natural Language Processing (NLP) and Literature-Based Discovery (LBD) techniques. The result is a simplified, accessible drug repurposing pipeline that uses Word2Vec embeddings trained on PubMed abstracts to identify existing medications that are candidates for repurposing. We present the approach in the context of antipsychotics, but it could be repeated for any available medication. The research is structured in three stages:
1. Identification of candidate medications using the Word2Vec algorithm trained on scientific literature.
2. Empirical testing of the identified candidates using a large hospital dataset to explore protective effects against disease onset.
3. Validation of the findings using a second, independent dataset to assess generalizability.
This method addresses limitations of current machine learning-based drug repurposing approaches, including lack of external validation and limited accessibility. By leveraging Word2Vec's ability to capture semantic relationships between words, the study aims to uncover hidden connections in the medical literature that may lead to novel therapeutic discoveries. The protocol emphasizes transparency and reproducibility, using publicly available electronic health record (EHR) databases for validation. The approach can yield tangible results even for researchers with limited machine learning expertise, bridging the gap between the biomedical and information systems communities.
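To make the first stage more concrete, the sketch below shows one way candidate terms could be ranked once word embeddings are available: each candidate is scored by cosine similarity to the average vector of a few seed terms. This is an illustration under stated assumptions, not the protocol's implementation. The abstract does not specify the ranking rule, so the centroid-plus-cosine approach, the function names, and the term names are all hypothetical, and the three-dimensional vectors are hand-made toy values rather than Word2Vec output trained on PubMed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_candidates(embeddings, seed_terms, top_n=3):
    """Rank non-seed terms by similarity to the centroid of the seed terms.

    Assumption: centroid-plus-cosine ranking is one plausible rule;
    the source abstract does not state how candidates are scored.
    """
    query = centroid([embeddings[term] for term in seed_terms])
    scored = [
        (term, cosine(vector, query))
        for term, vector in embeddings.items()
        if term not in seed_terms
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hand-made toy vectors with placeholder names (NOT real embeddings or drugs).
toy_embeddings = {
    "seed_drug_a": [0.9, 0.1, 0.0],
    "seed_drug_b": [0.8, 0.2, 0.1],
    "candidate_x": [0.7, 0.3, 0.1],
    "candidate_y": [0.1, 0.9, 0.2],
    "candidate_z": [0.0, 0.1, 0.9],
}

ranking = rank_candidates(toy_embeddings, ["seed_drug_a", "seed_drug_b"])
for term, score in ranking:
    print(f"{term}: {score:.3f}")
```

In a real run the toy dictionary would be replaced by vectors looked up from a trained embedding model, and the top-ranked terms would only be hypotheses to carry into the empirical testing and validation stages.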