Journal of Bioinformatics and Computational Biology: Latest Articles


Gtie-Rt: A comprehensive graph learning model for predicting drugs targeting metabolic pathways in human.

IF 0.9 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2024-06-01 Epub Date: 2024-07-20 DOI: 10.1142/S0219720024500100

Hayat Ali Shah, Juan Liu, Zhihui Yang

Drugs often target specific metabolic pathways to produce a therapeutic effect. However, these pathways are complex and interconnected, making it challenging to predict a drug's potential effects on an organism's overall metabolism. Mapping drugs to the metabolic pathways they target can provide a more complete understanding of a drug's metabolic effects and help to identify potential drug-drug interactions. In this study, we propose a hybrid machine learning model, Graph Transformer Integrated Encoder (GTIE-RT), for mapping drugs to their target metabolic pathways in humans. The proposed model is a composite of a Graph Convolution Network (GCN) and a transformer encoder, which provide the graph embedding and the attention mechanism. The output of the transformer encoder is then fed into an Extremely Randomized Trees classifier to predict target metabolic pathways. Evaluation of GTIE-RT on a drug dataset shows strong performance, including accuracy (>95%), recall (>92%), precision (>93%) and F1-score (>92%). Compared to other variants and machine learning methods, GTIE-RT consistently shows more reliable results.
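As a rough illustration of the graph-embedding stage mentioned in the abstract, the sketch below applies one generic graph-convolution step, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), in plain Python. This is the widely used normalized-adjacency form of a GCN layer, not the authors' implementation; the function name `gcn_layer`, the toy graph, and the weights are all hypothetical, and the transformer encoder and Extremely Randomized Trees stages are not shown.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One generic GCN propagation step (illustrative; not the GTIE-RT code)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features: A_hat = A + I.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees of A_hat, used for symmetric normalization.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in = len(features[0])
    n_out = len(weights[0])
    # Aggregate neighbor features: norm @ features.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    # Linear transform followed by ReLU: max(0, agg @ weights).
    return [[max(0.0, sum(agg[i][f] * weights[f][h] for f in range(n_in)))
             for h in range(n_out)] for i in range(n)]

# Hypothetical toy input: two bonded atoms, one feature each, two output units.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0, -1.0]]
embedding = gcn_layer(adjacency, features, weights)
```

In a pipeline of the kind the abstract describes, node embeddings like these would be passed on to the attention stage and then pooled into a per-drug vector for the tree-based classifier; those steps are omitted here.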



Citations: 0

Construction of transcript regulation mechanism prediction models based on binding motif environment of transcription factor AoXlnR in Aspergillus oryzae.

IF 0.9 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2024-06-01 DOI: 10.1142/S0219720024500173

Hiroya Oka, Takaaki Kojima, Ryuji Kato, Kunio Ihara, Hideo Nakano

DNA-binding transcription factors (TFs) play a central role in transcriptional regulation, mainly by binding specifically to target sites on the genome and regulating the expression of downstream genes. A comprehensive analysis of the function of these TFs will therefore lead to an understanding of various biological mechanisms. However, the functions of TFs in vivo are diverse and complicated, and the binding sites identified on the genome are not necessarily involved in regulating downstream gene expression. In this study, we investigated whether DNA structural information around a TF binding site can be used to predict whether that site is involved in regulating the expression of genes located downstream of it. Specifically, we calculated structural parameters based on the DNA shape around the DNA binding motif located upstream of genes whose expression is directly regulated by a single TF, AoXlnR from Aspergillus oryzae, and showed that the presence or absence of expression regulation can be predicted from the sequence information with high accuracy ([Formula: see text]-1.0) by machine learning incorporating these parameters.



Citations: 0

NDMNN: A novel deep residual network based MNN method to remove batch effects from scRNA-seq data.

IF 0.9 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2024-06-01 Epub Date: 2024-07-20 DOI: 10.1142/S021972002450015X

Yupeng Ma, Yongzhen Pei

The rapid development of single-cell RNA sequencing (scRNA-seq) technology has generated vast amounts of data. However, these data often exhibit batch effects due to various factors such as different time points, experimental personnel, and instruments used, which can obscure the biological differences in the data itself. Based on the characteristics of scRNA-seq data, we designed a dense deep residual network model, referred to as NDnetwork. Subsequently, we combined the NDnetwork model with the MNN method to correct batch effects in scRNA-seq data, and named it the NDMNN method. Comprehensive experimental results demonstrate that the NDMNN method outperforms existing commonly used methods for correcting batch effects in scRNA-seq data. As the scale of single-cell sequencing continues to expand, we believe that NDMNN will be a valuable tool for researchers in the biological community for correcting batch effects in their studies. The source code and experimental results of the NDMNN method can be found at https://github.com/mustang-hub/NDMNN.
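The abstract does not expand "MNN"; in the batch-correction literature it usually stands for mutual nearest neighbors, and the sketch below assumes that reading. It finds pairs of cells, one from each batch, that are in each other's k-nearest-neighbor sets, which is the pairing step such methods build on. The function names and the toy 2-D "expression" points are hypothetical, and this is not the NDMNN code from the linked repository.

```python
import math

def nearest_neighbors(source, target, k):
    """For each point in source, return the index set of its k nearest points in target."""
    result = []
    for p in source:
        order = sorted(range(len(target)), key=lambda j: math.dist(p, target[j]))
        result.append(set(order[:k]))
    return result

def mutual_nearest_pairs(batch_a, batch_b, k=1):
    """Return (i, j) pairs where a[i] and b[j] are each among the other's k nearest neighbors."""
    a_to_b = nearest_neighbors(batch_a, batch_b, k)
    b_to_a = nearest_neighbors(batch_b, batch_a, k)
    return [(i, j)
            for i in range(len(batch_a))
            for j in sorted(a_to_b[i])
            if i in b_to_a[j]]

# Hypothetical toy batches: two cells match across batches, one in each batch does not.
batch_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
batch_b = [(0.5, 0.2), (5.3, 4.9), (20.0, 20.0)]
pairs = mutual_nearest_pairs(batch_a, batch_b, k=1)
```

Requiring the neighbor relation to hold in both directions is what keeps a cell type present in only one batch (here the third cell of each batch) from being paired with an unrelated cell.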


Citations: 0

How much can ChatGPT really help computational biologists in programming?

IF 1 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2024-04-01 Epub Date: 2024-05-22 DOI: 10.1142/S021972002471001X

Chowdhury Rafeed Rahman, Limsoon Wong

ChatGPT, a recently developed product by OpenAI, is successfully leaving its mark as a multi-purpose natural-language-based chatbot. In this paper, we are more interested in analyzing its potential in the field of computational biology. A major share of the work done by computational biologists these days involves coding up bioinformatics algorithms, analyzing data, creating pipelining scripts and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in these aspects, with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) fewer coding resources, (2) more sensitivity and bias issues (it deals with medical data), and (3) a greater need for coding assistance (people from diverse backgrounds come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring, and pipelining using ChatGPT from the perspective of computational biologists.


Citations: 0

Learning Long- and Short-term Dependencies for Improving Drug-Target Binding Affinity Prediction using Transformer and Edge Contraction Pooling

IF 1 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-12-15 DOI: 10.1142/s0219720023500300

Min Gao, Shaohua Jiang, Weibin Ding, Ting Xu, Zhijian Lyu

Citations: 0

Predictive Recognition of DNA-binding proteins based on Pre-trained Language Model BERT

IF 1 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-12-08 DOI: 10.1142/s0219720023500282

Yue Ma, Yongzhen Pei, Changguo Li

Citations: 0

Imputation for single-cell RNA-seq data with non-negative matrix factorization and transfer learning

IF 1 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-12-08 DOI: 10.1142/s0219720023500294

Jiadi Zhu, Youlong Yang

Citations: 0

Algorithms for the Uniqueness of the Longest Common Subsequence.

IF 0.9 Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-12-01 Epub Date: 2024-01-10 DOI: 10.1142/S0219720023500270

Yue Wang

Given several number sequences, determining the longest common subsequence is a classical problem in computer science. This problem has applications in bioinformatics, especially in determining transposable genes. Nevertheless, related works only consider how to find one longest common subsequence. In this paper, we consider how to determine the uniqueness of the longest common subsequence. If there are multiple longest common subsequences, we also determine which numbers appear in all/some/none of the longest common subsequences. We focus on four scenarios: (1) linear sequences without duplicated numbers; (2) circular sequences without duplicated numbers; (3) linear sequences with duplicated numbers; (4) circular sequences with duplicated numbers. We develop corresponding algorithms and apply them to gene sequencing data.
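To make the uniqueness question concrete, here is a minimal sketch for the simplest case only: two linear sequences without duplicated numbers (a restricted form of scenario 1). It uses the textbook dynamic program for the longest common subsequence, extended with a count of distinct longest common subsequences; the LCS is unique exactly when the count is 1. This is a standard technique chosen for illustration, not necessarily the algorithm developed in the paper, and the function name is hypothetical.

```python
def lcs_length_and_count(a, b):
    """Return (LCS length, number of distinct LCSs) for two duplicate-free sequences."""
    if len(set(a)) != len(a) or len(set(b)) != len(b):
        raise ValueError("this sketch assumes no duplicated numbers within a sequence")
    n, m = len(a), len(b)
    # length[i][j]: LCS length of a[:i] and b[:j].
    length = [[0] * (m + 1) for _ in range(n + 1)]
    # count[i][j]: number of distinct LCSs of a[:i] and b[:j]
    # (the empty sequence counts as one LCS when the length is 0).
    count = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # Without duplicates, every LCS of the prefixes must end with this match.
                length[i][j] = length[i - 1][j - 1] + 1
                count[i][j] = count[i - 1][j - 1]
            else:
                best = max(length[i - 1][j], length[i][j - 1])
                total = 0
                if length[i - 1][j] == best:
                    total += count[i - 1][j]
                if length[i][j - 1] == best:
                    total += count[i][j - 1]
                if length[i - 1][j - 1] == best:
                    # Subsequences reachable through both neighbors were counted twice.
                    total -= count[i - 1][j - 1]
                length[i][j] = best
                count[i][j] = total
    return length[n][m], count[n][m]

# [1, 3] and [2, 3] are both longest, so the LCS is not unique here.
ambiguous = lcs_length_and_count([1, 2, 3], [2, 1, 3])
# Only [1, 2] is longest here, so the LCS is unique.
unique = lcs_length_and_count([1, 2, 3], [3, 1, 2])
```

The duplicate-free restriction matters: when a number can repeat, the matching branch no longer captures every longest common subsequence, and circular sequences additionally require considering rotations, which this sketch does not attempt.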


Citations: 0

Small groups in multidimensional feature space: two examples of supervised two-group classification from biomedicine

Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-11-07 DOI: 10.1142/s0219720023500257

Dmitriy Karpenko, Aleksei Bigildeev

Citations: 0

CNV-FB: A Feature bagging strategy-based approach to detect copy number variants from NGS data

Zone 4 (Biology) Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Pub Date : 2023-11-07 DOI: 10.1142/s0219720023500269

Chengyou Li, Shiqiang Fan, Haiyong Zhao, Xiaotong Liu

Citations: 0
