
Bioinformatics: Latest Articles

MMCL-CDR: Enhancing Cancer Drug Response Prediction with Multi-Omics and Morphology Images Contrastive Representation Learning
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-09, DOI: 10.1093/bioinformatics/btad734
Yang Li, Zihou Guo, Xin Gao, Guohua Wang
Motivation: Cancer is a complex disease that results in a significant number of global fatalities. Treatment strategies can vary among patients, even if they have the same type of cancer. The application of precision medicine in cancer shows promise for treating different types of cancer, reducing healthcare expenses, and improving recovery rates. To achieve personalized cancer treatment, machine learning models have been developed to predict drug responses based on tumor and drug characteristics. However, current studies focus either on constructing homogeneous networks from a single data source or on heterogeneous networks from multi-omics data. While multi-omics data have shown potential in predicting drug responses in cancer cell lines, there is still a lack of research that effectively utilizes insights from different modalities. Furthermore, effectively utilizing the multi-modal knowledge of cancer cell lines is challenging because of the heterogeneity inherent in these modalities. Results: To address these challenges, we introduce MMCL-CDR, a multi-modal approach for cancer drug response prediction that integrates copy number variation, gene expression, morphology images of cell lines, and the chemical structure of drugs. The objective of MMCL-CDR is to align cancer cell lines across different data modalities by learning cell line representations from omic and image data, and to combine them with structural drug representations to enhance the prediction of Cancer Drug Responses (CDR). We have carried out comprehensive experiments and show that our model significantly outperforms other state-of-the-art methods in CDR prediction. The experimental results also show that the model can learn more accurate cell line representations by integrating multi-omics and morphological data from cell lines, thereby improving the accuracy of CDR prediction. In addition, the ablation study and qualitative analysis confirm the effectiveness of each part of our proposed model. Finally, MMCL-CDR opens up a new dimension for cancer drug response prediction through multimodal contrastive learning, pioneering a novel approach that integrates multi-omics and multi-modal drug and cell line modeling. Availability and Implementation: MMCL-CDR is available at https://github.com/catly/MMCL-CDR
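The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss, under the assumption that each cell line has one omics embedding and one image embedding. The function names, the temperature value, and the toy vectors are illustrative and are not taken from the MMCL-CDR code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(omics_emb, image_emb, temperature=0.1):
    """Average InfoNCE loss; row i of each list describes the same cell line.

    The matching image embedding is the positive, all other rows are negatives.
    The temperature is an assumed hyperparameter, not a published setting.
    """
    n = len(omics_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(omics_emb[i], image_emb[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n

# Toy check: correctly paired modalities should give a lower loss than swapped ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

Minimizing a loss of this kind pulls the two views of the same cell line together and pushes views of different cell lines apart, which is one way to read "aligning cell lines across modalities".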
Citations: 0
OBMeta: a comprehensive web server to analyze and validate gut microbial features and biomarkers for obesity-associated metabolic diseases
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-09, DOI: 10.1093/bioinformatics/btad715
Cuifang Xu, Jiating Huang, Yongqiang Gao, Weixing Zhao, Yiqi Shen, Feihong Luo, Gang Yu, Feng Zhu, Yan Ni
Motivation: Gut dysbiosis is closely associated with obesity and related metabolic diseases, including type 2 diabetes (T2D) and non-alcoholic fatty liver disease (NAFLD). Gut microbial features and biomarkers have been investigated in a growing number of studies, but the findings require further validation because of the limited sample size and the various confounding factors that may affect microbial composition in a single study. So far, the field lacks a comprehensive bioinformatics pipeline that provides automated statistical analysis and simultaneously integrates multiple independent studies for cross-validation. Results: OBMeta aims to streamline standard metagenomics data analysis, from diversity analysis, comparative analysis, and functional analysis to co-abundance network analysis. In addition, a curated database has been established with a total of 90 public research projects, covering three different phenotypes (obesity, T2D, and NAFLD) and more than five different intervention strategies (exercise, diet, probiotics, medication, and surgery). With OBMeta, users can not only analyze their own research projects but also search and match public datasets for cross-validation. Moreover, OBMeta provides cross-phenotype and cross-intervention advanced validation that maximally supports preliminary findings from an individual study. To summarize, OBMeta is a comprehensive web server to analyze and validate gut microbial features and biomarkers for obesity-associated metabolic diseases. Availability: OBMeta is freely available at http://obmeta.met-bioinformatics.cn/. Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
CytoCopasi: A Chemical Systems Biology Target and Drug Discovery Visual Data Analytics Platform
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-09, DOI: 10.1093/bioinformatics/btad745
Hikmet Emre Kaya, Kevin J Naidoo
Motivation: Target discovery and drug evaluation for diseases with complex mechanisms call for a streamlined chemical systems analysis platform. Currently available tools lack an emphasis on reaction kinetics, access to relevant databases, and algorithms to visualize perturbations on a chemical scale that provide quantitative details as well as streamlined visual data analytics functionality. Results: CytoCopasi, a Maven-based application for Cytoscape that combines the chemical systems analysis features of COPASI with the visualization and database access tools of Cytoscape and its plugin applications, has been developed. The diverse functionality of CytoCopasi is presented through ab initio model construction and model construction via the pathway and parameter databases KEGG and BRENDA. The comparative systems biology visualization analysis toolset is illustrated through a drug competence study on the cancerous RAF/MEK/ERK pathway. Availability: The COPASI files, simulation data, native libraries, and the manual are available at https://github.com/scientificomputing/CytoCopasi. Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
MaxCLK: discovery of cancer driver genes via maximal clique and information entropy of modules
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-09, DOI: 10.1093/bioinformatics/btad737
Jian Liu, Fubin Ma, Yongdi Zhu, Naiqian Zhang, Lingming Kong, Jia Mi, Haiyan Cong, Rui Gao, Mingyi Wang, Yusen Zhang
Motivation: Cancer is caused by the accumulation of somatic mutations in multiple pathways, in which driver mutations typically show high coverage and high exclusivity across patients. Identifying cancer driver genes plays a pivotal role in understanding the mechanisms of oncogenesis and treatment. Results: Here, we introduce MaxCLK, an algorithm for identifying cancer driver genes, which was developed through an integrated analysis of somatic mutation data and protein–protein interaction (PPI) networks and further improved by an information entropy (IE) index. Tested on pan-cancer and single-cancer data, MaxCLK outperformed other existing methods with higher accuracy. For the pan-cancer analysis, we predicted 154 driver genes and 787 driver modules. The analysis of co-occurrence and exclusivity between modules and pathways reveals the correlation of their combinations. Overall, our study has deepened the understanding of driver mechanisms in PPI topology and found novel driver genes. Availability: The source code for MaxCLK is freely available at https://github.com/ShandongUniversityMasterMa/MaxCLK-main. Supplementary information: Supplementary data are available at Bioinformatics online.
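To make "coverage" and "exclusivity" concrete, here is a minimal sketch under assumed definitions: coverage is the fraction of patients with at least one mutated gene in a module, and exclusivity is the fraction of covered patients with exactly one. The abstract does not define MaxCLK's IE index, so the Shannon entropy below is only a generic illustration, and the patient data and gene module are invented toy values.

```python
import math

def coverage_and_exclusivity(mutations, module):
    """mutations: dict mapping patient -> set of mutated genes; module: set of genes.

    Assumed definitions (not taken from the MaxCLK paper):
    coverage    = share of patients with >= 1 mutated gene in the module
    exclusivity = share of covered patients with exactly 1 mutated gene in it
    """
    hits = [len(genes & module) for genes in mutations.values()]
    covered = sum(1 for h in hits if h >= 1)
    exclusive = sum(1 for h in hits if h == 1)
    coverage = covered / len(hits)
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity

def shannon_entropy(counts):
    """Shannon entropy (bits) of a list of non-negative counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Toy cohort, invented for illustration only.
patients = {
    "P1": {"KRAS"},
    "P2": {"BRAF"},
    "P3": {"NRAS", "TP53"},
    "P4": {"KRAS", "BRAF"},
    "P5": {"TP53"},
}
module = {"KRAS", "BRAF", "NRAS"}

cov, excl = coverage_and_exclusivity(patients, module)
# Mutation counts per module gene: KRAS 2, BRAF 2, NRAS 1.
per_gene = [sum(1 for g in patients.values() if gene in g) for gene in sorted(module)]
entropy = shannon_entropy(per_gene)
print(cov, excl, round(entropy, 3))  # 0.8 0.75 1.522
```

A module that scores high on both measures contains genes that together explain most patients while rarely being mutated in the same patient, which is the pattern the abstract attributes to driver mutations.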
Citations: 0
EPIC-TRACE: predicting TCR binding to unseen epitopes using attention and contextualized embeddings
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-09, DOI: 10.1093/bioinformatics/btad743
Dani Korpela, Emmi Jokinen, Alexandru Dumitrescu, Jani Huuhtanen, Satu Mustjoki, Harri Lähdesmäki
Motivation: T cells play an essential role in the adaptive immune system's fight against pathogens and cancer, but may also give rise to autoimmune diseases. The recognition of a peptide-MHC (pMHC) complex by a T cell receptor (TCR) is required to elicit an immune response. Many machine learning models have been developed to predict this binding, but generalizing predictions to pMHCs outside the training data remains challenging. Results: We have developed a new machine learning model that utilizes information about the TCR from both α and β chains, the epitope sequence, and the MHC. Our method uses ProtBERT embeddings for the amino acid sequences of both chains and the epitope, as well as convolution and multi-head attention architectures. We show the importance of each input feature as well as the benefit of including epitopes with only a few TCRs in the training data. We evaluate our model on existing databases and show that it compares favorably against other state-of-the-art models. Code availability: https://github.com/DaniTheOrange/EPIC-TRACE Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
Drug repositioning with adaptive graph convolutional networks
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-08, DOI: 10.1093/bioinformatics/btad748
Xinliang Sun, Xiao Jia, Zhangli Lu, Jing Tang, Min Li
Motivation: Drug repositioning is an effective strategy to identify new indications for existing drugs, providing the quickest possible transition from bench to bedside. With the rapid development of deep learning, graph convolutional networks (GCNs) have been widely adopted for drug repositioning tasks. However, prior GCN-based methods have limitations in deeply integrating node features and topological structures, which may hinder the capability of GCNs. Results: In this study, we propose an adaptive graph convolutional network approach, termed AdaDR, for drug repositioning by deeply integrating node features and topological structures. Distinct from conventional graph convolutional networks, AdaDR models the interaction between node features and topological structures with an adaptive graph convolution operation, which enhances the expressive power of the model. Concretely, AdaDR simultaneously extracts embeddings from node features and topological structures and then uses an attention mechanism to learn adaptive importance weights for the embeddings. Experimental results show that AdaDR achieves better performance than multiple baselines for drug repositioning. Moreover, the case study offers exploratory analyses for finding novel drug-disease associations. Availability and implementation: The implementation of AdaDR and the preprocessed data are available at https://github.com/xinliangSun/AdaDR. Supplementary information: Supplementary data are available at Bioinformatics online.
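The idea of learning "adaptive importance weights" for two embeddings can be sketched as a softmax attention over the feature-based and topology-based views of one node. This is a minimal illustration, not AdaDR's actual layer: the scoring by a dot product with a query vector, the function names, and the toy values are all assumptions, and in a trained model the query would be a learned parameter.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(feature_emb, topology_emb, query):
    """Fuse two embeddings of the same node with attention weights.

    Each embedding is scored by its dot product with `query` (hypothetical;
    it stands in for a learned attention vector), and the scores are
    normalized so the two weights sum to one.
    """
    scores = [
        sum(q * x for q, x in zip(query, feature_emb)),
        sum(q * x for q, x in zip(query, topology_emb)),
    ]
    w_feat, w_topo = softmax(scores)
    fused = [w_feat * f + w_topo * t for f, t in zip(feature_emb, topology_emb)]
    return fused, (w_feat, w_topo)

# Toy node: the query favors the feature-based embedding.
fused, weights = attention_fuse([1.0, 0.0], [0.0, 1.0], query=[2.0, 0.0])
print([round(w, 3) for w in weights])  # [0.881, 0.119]
```

Because the weights are computed per node, a model of this kind can lean on attribute similarity for some drugs or diseases and on network neighborhood for others.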
Citations: 0
Bibliometric analysis of neuroscience publications quantifies the impact of data sharing
IF 5.8, Zone 3 (Biology), Q1 Mathematics, Pub Date: 2023-12-08, DOI: 10.1093/bioinformatics/btad746
Herve Emissah, Bengt Ljungquist, Giorgio A Ascoli
Summary: Neural morphology, the branching geometry of brain cells, is an essential cellular substrate of nervous system function and pathology. Despite the accelerating production of digital reconstructions of neural morphology, the public accessibility of data remains a core issue in neuroscience. Deficiencies in the availability of existing data create redundancy of research efforts and limit synergy. We carried out a comprehensive bibliometric analysis of neural morphology publications to quantify the impact of data sharing in the neuroscience community. Our findings demonstrate that sharing digital reconstructions of neural morphology via NeuroMorpho.Org leads to a significant increase in citations of the original article, thus directly benefiting authors. The rate of data reuse remains constant for at least 16 years after sharing (the whole period analyzed), altogether nearly doubling the peer-reviewed discoveries in the field. Furthermore, the recent availability of larger and more numerous datasets has fostered integrative applications, which accrue on average twice the citations of re-analyses of individual datasets. We also released an open-source citation tracking web service that allows researchers to monitor reuse of their datasets in independent peer-reviewed reports. These results and tools can facilitate the recognition of shared data reuse in merit evaluations and funding decisions. Availability and Implementation: The application is available at http://cng-nmo-dev3.orc.gmu.edu:8181/. The source code is available at https://github.com/HerveEmissah/nmo-authors-app and https://github.com/HerveEmissah/nmo-bibliometric-analysis. Supplementary information: Supplementary data are available at Bioinformatics online.
Citations: 0
Correction to: GIL: a python package for designing custom indexing primers
IF 5.8 Zone 3 Biology Q1 Mathematics Pub Date: 2023-12-01 DOI: 10.1093/bioinformatics/btad735
Citations: 0
SCORPIO: a utility for defining and classifying mutation constellations of virus genomes.
IF 5.8 Zone 3 Biology Q1 Mathematics Pub Date: 2023-10-03 DOI: 10.1093/bioinformatics/btad575
Rachel Colquhoun, Ben Jackson, Áine O'Toole, Andrew Rambaut

Summary: Scorpio provides a set of command line utilities for classifying, haplotyping, and defining constellations of mutations for an aligned set of genome sequences. It was developed to enable exploration and classification of variants of concern within the SARS-CoV-2 pandemic, but can be applied more generally to other species.

Availability and implementation: Scorpio is an open-source project distributed under the GNU GPL version 3 license. Source code and binaries are available at https://github.com/cov-lineages/scorpio, and binaries are also available from Bioconda. SARS-CoV-2 specific definitions can be installed as a separate dependency from https://github.com/cov-lineages/constellations.

Citations: 0
A machine learning-based quantitative model (LogBB_Pred) to predict the blood-brain barrier permeability (logBB value) of drug compounds.
IF 5.8 Zone 3 Biology Q1 Mathematics Pub Date: 2023-10-03 DOI: 10.1093/bioinformatics/btad577
Bilal Shaker, Jingyu Lee, Yunhyeok Lee, Myeong-Sang Yu, Hyang-Mi Lee, Eunee Lee, Hoon-Chul Kang, Kwang-Seok Oh, Hyung Wook Kim, Dokyun Na

Motivation: Efficient assessment of the blood-brain barrier (BBB) penetration ability of a drug compound is one of the major hurdles in central nervous system drug discovery since experimental methods are costly and time-consuming. To advance and elevate the success rate of neurotherapeutic drug discovery, it is essential to develop an accurate computational quantitative model to determine the absolute logBB value (a logarithmic ratio of the concentration of a drug in the brain to its concentration in the blood) of a drug candidate.
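
The logBB definition given in the Motivation can be made concrete with a short calculation. The sketch below assumes a base-10 logarithm (the usual convention for logBB) and concentrations expressed in the same units; the function name `log_bb` and the example concentrations are illustrative and not taken from the paper.

```python
import math


def log_bb(conc_brain: float, conc_blood: float) -> float:
    """Return log10(C_brain / C_blood).

    Both concentrations must be positive and in the same units.
    The name and signature are hypothetical, for illustration only.
    """
    if conc_brain <= 0 or conc_blood <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_brain / conc_blood)


if __name__ == "__main__":
    # Hypothetical values: brain concentration is one tenth of blood concentration.
    print(log_bb(2.0, 20.0))
    # Equal concentrations give a logBB of zero.
    print(log_bb(5.0, 5.0))
```

A negative value means the compound is less concentrated in the brain than in the blood; a positive value means the opposite.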

Results: Here, we developed a quantitative model (LogBB_Pred) capable of predicting the logBB value of a query compound. The model achieved an R² of 0.61 on an independent test dataset and outperformed other publicly available quantitative models. When compared with the available qualitative (classification) models, which only classify whether a compound is BBB-permeable or not, our model matched the accuracy (0.85) of the best qualitative model and far outperformed the other qualitative models (accuracies between 0.64 and 0.70). For further evaluation, our model, the other quantitative models, and the qualitative models were evaluated on a real-world central nervous system drug screening library. Our model showed an accuracy of 0.97, while the other models showed accuracies in the range of 0.29-0.83. Consequently, our model can accurately classify BBB-permeable compounds as well as predict the absolute logBB values of drug candidates.
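
The Results compare a regression model with classifiers by accuracy, which implies turning predicted logBB values into permeable / non-permeable labels. The sketch below shows one way such an evaluation could be set up. The cutoff of -1, the helper names, and the toy data are assumptions for illustration; the abstract does not state the threshold or data the authors used.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_at_cutoff(
    y_true: Sequence[float], y_pred: Sequence[float], cutoff: float = -1.0
) -> float:
    """Fraction of compounds whose measured and predicted logBB fall on the
    same side of the cutoff. The default cutoff is an assumed value."""
    hits = sum((t >= cutoff) == (p >= cutoff) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


if __name__ == "__main__":
    # Toy logBB values, invented for this example.
    measured = [0.5, -0.2, -1.5, -2.0]
    predicted = [0.3, -1.2, -1.4, -1.8]
    print("R^2:", r_squared(measured, predicted))
    print("accuracy:", accuracy_at_cutoff(measured, predicted))  # 0.75
```

Reporting both numbers is what allows a single quantitative model to be placed alongside classification-only models.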

Availability and implementation: Web server is freely available on the web at http://ssbio.cau.ac.kr/software/logbb_pred/. The data used in this study are available to download at http://ssbio.cau.ac.kr/software/logbb_pred/dataset.zip.

Citations: 1