
Journal of Bioinformatics and Computational Biology: Latest Articles

Transforming OMIC features for classification using siamese convolutional networks.
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-06-01, Epub Date: 2022-07-09, DOI: 10.1142/S0219720022500135
Qian Wang, Meiyu Duan, Yusi Fan, Shuai Liu, Yanjiao Ren, Lan Huang, Fengfeng Zhou

Modern biotechnologies have generated huge amounts of OMIC data, among which transcriptomes and methylomes are two major types. Transcriptomes measure the expression levels of all transcripts, while methylomes depict cytosine methylation levels across a genome. Both OMIC data types can be generated by arrays or sequencing. Some studies deliver many more features per sample (the number of features is denoted as p) than the number n of samples in a cohort, which induces the "large p small n" paradigm. This study focused on the OMIC classification problem under the "large p small n" paradigm. A Siamese convolutional network was used to transform the OMIC features into a new space with minimized intra-class distances and maximized inter-class distances between samples. The proposed feature engineering algorithm, SiaCo, was comprehensively evaluated using both transcriptome and methylome datasets. The experimental data showed that the features generated by SiaCo (SiaCo features) improved classification accuracies for binary classification problems and also achieved improvements on the independent test dataset. The individual SiaCo features did not show better inter-class discrimination power than the original OMIC features. This may be because the Siamese convolutional network optimized the collective performance of the SiaCo features rather than the discrimination power of individual features. The inherent transformation nature of the Siamese twin network also makes the SiaCo features lack interpretability. The source code of SiaCo is freely available at http://www.healthinformaticslab.org/supp/resources.php.
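The abstract does not say which loss function SiaCo optimizes. As a minimal sketch, assuming the contrastive loss that is commonly used with Siamese networks, the stated goal of small intra-class and large inter-class distances could be expressed as below. The function name `contrastive_loss`, the `margin` value, and the toy embeddings are illustrative assumptions and are not taken from the paper.

```python
import math

def contrastive_loss(pairs, margin=1.0):
    """Mean contrastive loss over pairs of embedded samples.

    pairs:  iterable of (emb_a, emb_b, same_class), where emb_a and emb_b are
            the outputs of the two twin branches for one pair of samples.
    margin: hypothetical margin; different-class pairs that are farther apart
            than this contribute no loss.
    """
    pairs = list(pairs)
    total = 0.0
    for emb_a, emb_b, same_class in pairs:
        dist = math.dist(emb_a, emb_b)  # Euclidean distance in the new space
        if same_class:
            # Same-class pairs are pulled together: penalize squared distance.
            total += dist ** 2
        else:
            # Different-class pairs are pushed apart until they exceed the margin.
            total += max(margin - dist, 0.0) ** 2
    return total / len(pairs)

# Toy pairs (made up): a same-class pair 0.5 apart and a
# different-class pair 2.0 apart, which is already beyond the margin.
toy_pairs = [
    ((0.0, 0.0), (0.5, 0.0), True),
    ((0.0, 0.0), (2.0, 0.0), False),
]
loss = contrastive_loss(toy_pairs)
```

Because the loss is defined on pairs of whole feature vectors, it rewards the joint geometry of the embedding rather than any single coordinate, which is in line with the abstract's remark that individual SiaCo features are not more discriminative on their own.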

Citations: 0
Prediction model for synergistic anti-tumor multi-compound combinations from traditional Chinese medicine based on extreme gradient boosting, targets and gene expression data.
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-06-01, DOI: 10.1142/S0219720022500160
Mengqiu Sun, Shengnan She, Hengwei Chen, Jiaxi Cheng, Wei Ji, Dan Wang, Chunlai Feng

Traditional Chinese medicine (TCM) is characterized by synergistic therapeutic effects involving multiple compounds and targets, which offer potential new therapies for complex cancer conditions. However, the main contributors and the underlying mechanisms of synergistic TCM cancer therapies remain largely undetermined. Machine learning now provides a new approach to identifying synergistic compound combinations among the complex components of TCM. In this study, a prediction model based on the extreme gradient boosting (XGBoost) algorithm was constructed by integrating gene expression data of different cancer cell lines, target information of natural compounds, and drug response data. Radix Paeoniae Rubra (RPR) was selected as a model herbal sample to evaluate the reliability of the constructed model. The optimal XGBoost prediction model performed well on the test dataset, with a mean squared error (MSE) of 0.66, a mean absolute error (MAE) of 0.61, and a root mean squared error (RMSE) of 0.81. The superior synergistic anti-tumor combinations D15 (Paeonol + Ethyl gallate) and D13 (Paeoniflorin + Paeonol) were successfully predicted from RPR and experimentally validated on MCF-7 cells. Moreover, the D13 combination could act as a main contributor to the synergistic anti-proliferative activity in the compatibility of RPR and Cortex Moutan (CM). Our XGBoost model could be a reliable tool for the efficient prediction of synergistic anti-tumor multi-compound combinations from TCM.
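The three error measures quoted above are closely related: RMSE is the square root of MSE, so the reported values of 0.66 and 0.81 are consistent with each other. The sketch below computes all three from scratch. The helper name `regression_errors` and the toy target and prediction values are made up for illustration and are not data from the study.

```python
import math

def regression_errors(y_true, y_pred):
    """Return (MSE, MAE, RMSE) for paired true and predicted values."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n   # mean squared error
    mae = sum(abs(r) for r in residuals) / n  # mean absolute error
    rmse = math.sqrt(mse)                     # root mean squared error
    return mse, mae, rmse

# Toy example with invented numbers.
mse, mae, rmse = regression_errors([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.0, 5.0])

# Consistency check on the values reported in the abstract.
rmse_from_reported_mse = round(math.sqrt(0.66), 2)
```

MAE is never larger than RMSE, and a wide gap between the two points to a few large errors dominating the squared term.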

Citations: 0
Automated analysis of karyotype images.
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-06-01, Epub Date: 2022-07-07, DOI: 10.1142/S0219720022500111
Ensieh Khazaei, Ala Emrany, Mostafa Tavassolipour, Foroozandeh Mahjoubi, Ahmad Ebrahimi, Seyed Abolfazl Motahari

Karyotyping is a genetic test used to detect chromosomal defects. In a karyotype test, an image of the chromosomes is captured during cell division. The captured images are then analyzed by cytogeneticists to detect possible chromosomal defects. In this paper, we propose an automated pipeline for the analysis of karyotype images. Karyotype image analysis has three main steps: image enhancement, image segmentation, and chromosome classification. We propose a novel chromosome segmentation algorithm to decompose overlapped chromosomes. We also propose a CNN-based classifier that outperforms all existing classifiers. Our classifier is trained on a dataset of about 162,000 human chromosome images. We also introduce a novel post-processing algorithm that improves the classification results. The success rate of our segmentation algorithm is 95%. In addition, our experimental results show that the accuracy of our classifier for human chromosomes is 92.63%, and that our novel post-processing algorithm increases the classification accuracy to 94%.

Citations: 0
Prediction of nucleosome dynamic interval based on long-short-term memory network (LSTM)
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-05-21, DOI: 10.1142/S0219720022500093
Jianli Liu, D. Zhou, Wen Jin
Nucleosome localization is a dynamic process that consists of nucleosome dynamic intervals (NDIs). We preprocessed nucleosome sequence data as time series data (TSD) and developed a long short-term memory (LSTM) network model trained on the TSD (the LSTM-TSD model), using iterative training and feature learning, that predicts NDIs with high accuracy. The sensitivity (Sn), specificity (Sp), accuracy (Acc), and Matthews correlation coefficient (MCC) of the obtained LSTM model are 91.88%, 92.72%, 92.30%, and 84.61%, respectively. The LSTM model could precisely predict the NDIs of yeast 16 chromosome. The NDIs contain 90.29% of nucleosome core DNA and 91.20% of nucleosome central sites, indicating that the NDIs have high confidence. We found that the binding sites of transcriptional proteins and other proteins lie outside the NDIs rather than within them. These results are important for the analysis of nucleosome localization and gene transcriptional regulation.
Citations: 0
Denoising of scanning electron microscope images for biological ultrastructure enhancement
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-04-23, DOI: 10.1142/S021972002250007X
Sheng Chang, Lijun Shen, Linlin Li, Xi Chen, Hua Han
Scanning electron microscopy (SEM) is of great significance for analyzing ultrastructure. However, because of the requirements on data throughput and electron dose for biological samples during imaging, SEM images of biological samples are often dominated by noise, which severely affects the observation of ultrastructure. Therefore, it is necessary to analyze and establish a noise model for SEM and to propose an effective denoising algorithm that preserves the ultrastructure. We first investigated the noise sources of SEM images and introduced a signal-related SEM noise model. Then, we validated the noise model through experiments designed with standard samples to reflect the relation between real signal intensity and noise. Based on the SEM noise model and the traditional variance-stabilization denoising strategy, we proposed a novel two-stage denoising method. In the first stage, variance stabilization, our VS-Net separates the signal-dependent noise from the signal in the SEM image. In the second stage, denoising, our D-Net adopts the U-Net structure combined with an attention mechanism to achieve efficient noise removal. Compared with other existing denoising methods for SEM images, our proposed method is more competitive in objective evaluation and visual quality. Source code is available on GitHub (https://github.com/VictorCSheng/VSID-Net).
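In the paper, variance stabilization is performed by a learned network (VS-Net), whose details are not given in the abstract. As a point of reference for the traditional strategy it builds on, the sketch below shows the classical Anscombe transform, which assumes purely Poisson-distributed counts. That assumption is ours, and the SEM noise model in the paper may well differ. The function names and sample counts are illustrative.

```python
import math

def anscombe(count):
    """Anscombe transform: maps a Poisson count to a value whose noise
    variance is approximately 1, regardless of the signal level."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)

def inverse_anscombe(value):
    """Plain algebraic inverse of the transform. It is known to be biased
    at low counts; unbiased inverses exist but are not shown here."""
    return (value / 2.0) ** 2 - 3.0 / 8.0

# Invented pixel counts covering dark to bright regions.
counts = [0, 4, 25, 100]
stabilized = [anscombe(c) for c in counts]
restored = [inverse_anscombe(s) for s in stabilized]
```

In a classical two-stage pipeline, a denoiser designed for constant-variance Gaussian noise would be applied to `stabilized` before inverting the transform.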
Citations: 0
Quantitative structure-activity relationship modeling reveals the minimal sequence requirement and amino acid preference of sirtuin-1's deacetylation substrates in diabetes mellitus
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-04-21, DOI: 10.1142/S0219720022500081
X. Shao, W. Kong, Y. Li, S. Zhang
Sirtuin 1 (SIRT1) is a nicotinamide adenine dinucleotide (NAD+)-dependent deacetylase that is involved in multiple glucose metabolism pathways and plays an important role in the pathogenesis of diabetes mellitus (DM). The enzyme specifically recognizes peptide segments of its deacetylation substrates that contain a central acetyl-lysine residue and a number of flanking amino acids. In this study, we used a quantitative structure-activity relationship (QSAR) strategy to ascertain the minimal sequence requirement (MSR) around the central acetyl-lysine residue of SIRT1 substrate-recognition sites, as well as the amino acid preference (AAP) at different residues of the MSR window. This would improve our understanding of SIRT1 substrate specificity at the molecular level and help to rationally design substrate-mimicking peptidic agents against DM that competitively target the SIRT1 active site. A large-scale dataset containing 6801 13-mer acetyl-lysine peptides (and their SIRT1-catalyzed deacetylation activities) was compiled to train 10 QSAR regression models, developed by systematically combining two machine learning methods (PLS and SVM) with five amino acid descriptors (DPPS, T-scale, MolSurf, [Formula: see text]-score, and FASGAI). The two best QSAR models (PLS+FASGAI and SVM+DPPS) were then employed to statistically examine the contribution of residue positions to the deacetylation activity of acetyl-lysine peptide substrates, revealing that the MSR can be represented by 5-mer acetyl-lysine peptides that match the consensus motif X[Formula: see text]X[Formula: see text]X[Formula: see text](AcK)0X[Formula: see text]. Structural analysis found that the X[Formula: see text] and (AcK)0 residues are tightly packed against the enzyme active site and confer both stability and specificity on the enzyme-substrate complex, whereas the X[Formula: see text], X[Formula: see text] and X[Formula: see text] residues are partially exposed to solvent but can also effectively stabilize the complex. Subsequently, a systematic deacetylation activity change profile (SDACP) was created based on QSAR modeling, from which the AAP for each residue position of the MSR was derived. With the profile, we were able to rationally design an SDACP combinatorial library with promising deacetylation activity, from which nine MSR acetyl-lysine peptides, together with two known SIRT1 acetyl-lysine peptide substrates as controls, were tested using a SIRT1 deacetylation assay. The designed peptides exhibit comparable or even higher activity than the controls, although the former are considerably shorter than the latter.
Citations: 0
DeepBtoD: Improved RNA-binding proteins prediction via integrated deep learning
IF 1, Zone 4 (Biology), Q3 Computer Science, Pub Date: 2022-04-21, DOI: 10.1142/S0219720022500068
Xiuquan Du, Xiu-juan Zhao, Yanping Zhang
RNA-binding proteins (RBPs) play crucial roles in various cellular processes such as alternative splicing and gene regulation. Therefore, the analysis and identification of RBPs is an essential issue. However, although many computational methods have been developed for predicting RBPs, few studies simultaneously consider local and global information from the perspective of the RNA sequence. To address this challenge, we present a novel method called DeepBtoD, which predicts RBPs directly from RNA sequences. First, a [Formula: see text]-BtoD encoding is designed, which takes into account the composition of [Formula: see text]-nucleotides and their relative positions and forms a local module. Second, we designed a multi-scale convolutional module with an embedded self-attention mechanism, the ms-focusCNN, which is used to learn more effective, diverse, and discriminative high-level features. Finally, global information is used to supplement the local modules through ensemble learning to predict whether the target RNA binds to RBPs. Preliminary results on 24 independent test datasets show that our proposed method can classify RBPs with an area under the curve (AUC) of 0.933. Remarkably, DeepBtoD shows competitive results compared with seven state-of-the-art methods, suggesting that RBPs can be recognized well by integrating local [Formula: see text]-BtoD and global information from RNA sequences alone. Hence, our integrative method may help to improve the power of RBP prediction, which might be particularly useful for modeling protein-nucleic acid interactions in systems biology studies. Our DeepBtoD server can be accessed at http://175.27.228.227/DeepBtoD/.
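The area under the ROC curve quoted above has a simple rank interpretation: it is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative and unrelated to the DeepBtoD data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for binding (positive), 0 for non-binding (negative).
    scores: predicted scores, where higher means more likely positive.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Invented predictions for six sequences.
auc = roc_auc([1, 1, 0, 0, 1, 0], [0.9, 0.8, 0.7, 0.3, 0.6, 0.6])
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; for large test sets a sort-based computation gives the same value in O(n log n).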
Citations: 1
RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families
IF 1 Zone 4 Biology Q3 Computer Science Pub Date: 2022-04-14 DOI: 10.1142/S0219720022400029
Qingting Wei, Hong Zou, Cuncong Zhong, Jianfeng Xu
High-quality multiple sequence alignments (MSAs) can provide insights into the architecture and function of protein families. Existing MSA tools often generate results inconsistent with the biological distribution of conserved regions because they position amino acid residues and gaps by symbol alone. We propose RPfam, a refiner towards curated-like MSAs for modeling the protein families in the Pfam database. RPfam refines automatic alignments by scoring alignments with the PFASUM matrix, restricting realignment to badly aligned blocks, optimizing the block scores by dynamic programming, and running refinements iteratively using the simulated annealing algorithm. Experiments show that RPfam effectively refined the alignments produced by the MSA tools ClustalO and Muscle, with reference to the curated seed alignments of the Pfam protein families. In particular, RPfam improved the quality of the ClustalO alignments by 4.4% and of the Muscle alignments by 2.8% on the gp32 DNA binding protein-like family. A Supplementary Table is available at http://www.worldscinet.com/jbcb/.
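The scoring step named in the abstract can be illustrated with a minimal sketch. It is not the RPfam implementation: the `pair_score` function is a toy stand-in for a PFASUM lookup, and the match, mismatch, and gap values, like all function names here, are assumptions chosen for illustration. The sketch computes a sum-of-pairs score per column, which is one common way to flag badly aligned blocks.

```python
from itertools import combinations

GAP = "-"


def pair_score(a, b, match=2, mismatch=-1, gap=-2):
    """Toy residue-pair score; a hypothetical stand-in for a PFASUM lookup."""
    if a == GAP and b == GAP:
        return 0  # two gaps in the same column carry no evidence
    if a == GAP or b == GAP:
        return gap
    return match if a == b else mismatch


def column_scores(alignment):
    """Return the sum-of-pairs score of every column of an alignment."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        sum(pair_score(a, b) for a, b in combinations(column, 2))
        for column in zip(*alignment)
    ]


def alignment_score(alignment):
    """Total score of the alignment: the sum of its column scores."""
    return sum(column_scores(alignment))
```

Columns with low scores are natural candidates for restricted realignment; how RPfam actually delimits blocks is not stated in the abstract.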
Citations: 0
Analysis to determine the effect of mutations on binding to small chemical molecules
IF 1 Zone 4 Biology Q3 Computer Science Pub Date: 2022-04-14 DOI: 10.1142/S0219720022400030
T. Koshlan, K. Kulikov
In this paper, the authors present and describe in detail an original, software-implemented numerical methodology for determining the effect of mutations on binding to small chemical molecules. The examples considered are gefitinib, AMPPNP, CO-1686, ASP8273, and erlotinib binding with the EGFR protein, and imatinib binding with PPARgamma. The numerical approach also makes it possible to determine the stability of a molecular complex consisting of a protein and a small chemical molecule. A description of the software package that implements the algorithm is given on the website https://binomlabs.com/.
Citations: 0
Clinical drug response prediction from preclinical cancer cell lines by logistic matrix factorization approach.
IF 1 Zone 4 Biology Q3 Computer Science Pub Date: 2022-04-01 Epub Date: 2021-12-17 DOI: 10.1142/S0219720021500359
Akram Emdadi, Changiz Eslahchi

Predicting tumor drug response from cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. The availability of diverse knowledge about cell lines and drugs makes it easier to predict patient response to drugs from data obtained in preclinical models. This paper proposes TCLMF, a model for predicting drug response in tumor samples that is trained on preclinical samples and based on logistic matrix factorization. The TCLMF model uses gene expression profiles, tissue type information, the chemical structure of drugs, and drug sensitivity (IC50) data from cancer cell lines. We train the drug response model on preclinical data from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and then use it to predict the drug sensitivity of samples from The Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying informative features of cell lines and drugs in order to calculate the probability that a tumor sample is sensitive to a drug. The closest cell line neighbours of each tumor sample are determined using a measure of similarity between tumor samples and cell lines defined in this study. The drug response for a new tumor is then calculated by averaging the low-rank features obtained from its neighboring cell lines. We compare the TCLMF model with previously proposed methods using two databases and two approaches to testing the model's performance. In the first approach, we study 12 drugs with sufficient known clinical drug response that were considered by previous methods. For 7 of the 12 drugs, TCLMF significantly distinguishes patients who are resistant to the drug from patients who are sensitive to it. In the second approach, the methods are converted to classification models using a threshold, and the results are compared. The results demonstrate that the TCLMF method provides accurate predictions compared with the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF's logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available at https://github.com/emdadi/TCLMF.
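The prediction step described above (average the latent vectors of the nearest cell lines, then apply a logistic function) can be sketched as follows. This is a hedged illustration, not the TCLMF code from the repository: the function names, the choice of `k`, and the absence of bias terms are all assumptions, and the latent vectors are taken as already learned.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def nearest_neighbors(similarities, k):
    """Indices of the k cell lines most similar to the tumor sample."""
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i], reverse=True)
    return order[:k]


def tumor_latent_vector(cell_line_vectors, similarities, k):
    """Average the latent vectors of the k nearest cell lines (assumed scheme)."""
    idx = nearest_neighbors(similarities, k)
    dim = len(cell_line_vectors[0])
    return [sum(cell_line_vectors[i][d] for i in idx) / len(idx)
            for d in range(dim)]


def sensitivity_probability(tumor_vec, drug_vec):
    """Probability of sensitivity: sigmoid of the latent dot product."""
    return sigmoid(sum(t * d for t, d in zip(tumor_vec, drug_vec)))
```

A threshold on `sensitivity_probability` would turn the score into a sensitive/resistant label, matching the second evaluation approach in the abstract; the threshold value itself is not given there.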

Citations: 0