Journal of Bioinformatics and Computational Biology: Latest Articles, Page 7

Drug synergy model for malignant diseases using deep learning.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500142

Pooja Rani, Kamlesh Dutta, Vijay Kumar

Drug synergy has emerged as a viable treatment option for malignancy. Compared with single-drug doses, synergistic drug combinations reduce toxicity, improve therapeutic efficacy, and overcome drug resistance. They have therefore attracted significant interest from academia and pharmaceutical organizations. Because the combinatorial search space is enormous, it is impossible to experimentally validate every conceivable combination for synergistic interaction. With advances in artificial intelligence, computational techniques are being used to identify synergistic drug combinations; however, prior literature has focused on treating certain malignancies. As a result, high-order drug combinations have been given little consideration. Here, DrugSymby, a novel deep-learning model, is proposed for predicting drug combinations. To this end, data were collected from datasets that include information on anti-cancer drugs, gene expression profiles of malignant cell lines, and screening data against a wide range of malignant cell lines. The proposed model was developed using these data and achieved an F1-score of 0.98, a recall of 0.99, and a precision of 0.98. Evaluation of the DrugSymby model on drug combination screening data from the NCI-ALMANAC dataset indicates that drug combination prediction is effective. The proposed model will be used to determine the most successful synergistic drug combinations and to increase the possibilities of exploring new drug combinations.
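As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall. The sketch below only recomputes it from the two values quoted in the abstract; the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract.
reported_precision = 0.98
reported_recall = 0.99

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.98, matching the reported F1-score
```

The recomputed value agrees with the reported F1-score to two decimal places, so the three figures are mutually consistent.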


Vol. 21, No. 3, Article 2350014.

Citations: 0

Integrating temporal and spatial variabilities for identifying ion binding proteins in phage.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500105

Hongliang Zou, Zizheng Yu, Zhijian Yin

Recent studies have reported that ion binding proteins (IBPs) in phage play a key role in developing drugs to treat diseases caused by drug-resistant bacteria. Accurate identification of IBPs is therefore an urgent task and is beneficial for understanding their biological functions. To address this issue, a new computational model was developed in this study to identify IBPs. First, we used physicochemical (PC) properties and Pearson's correlation coefficient (PCC) to represent protein sequences, and temporal and spatial variabilities were employed to extract features. Next, a similarity network fusion algorithm was employed to capture the correlation characteristics between these two kinds of features. Then, a feature selection method called F-score was used to remove redundant and irrelevant information. Finally, the retained features were fed into a support vector machine (SVM) to discriminate IBPs from non-IBPs. Experimental results showed that the proposed method significantly improves classification performance compared with the state-of-the-art approach. The Matlab code and dataset used in this study are available at https://figshare.com/articles/online_resource/iIBP-TSV/21779567 for academic use.
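The abstract does not define its F-score, so the following is only a sketch under the assumption that it refers to the commonly used two-class F-score for feature ranking: between-class separation of the feature means divided by the sum of the within-class sample variances. The function name, the toy data, and the zero-denominator convention are all illustrative; the authors' released code is in Matlab and may differ.

```python
from statistics import mean, variance


def f_scores(pos_samples, neg_samples):
    """Return one F-score per feature column.

    pos_samples / neg_samples: lists of equal-length feature vectors.
    A larger score means the feature separates the two classes better.
    """
    n_features = len(pos_samples[0])
    scores = []
    for k in range(n_features):
        pos_col = [row[k] for row in pos_samples]
        neg_col = [row[k] for row in neg_samples]
        overall = mean(pos_col + neg_col)
        numerator = (mean(pos_col) - overall) ** 2 + (mean(neg_col) - overall) ** 2
        # statistics.variance is the sample variance (divides by n - 1).
        denominator = variance(pos_col) + variance(neg_col)
        # Assumed convention: a constant feature gets score 0.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Toy data: feature 0 separates the classes, feature 1 does not.
ibp_like = [[1.0, 5.0], [1.2, 1.0], [0.8, 3.0]]
non_ibp_like = [[3.0, 2.0], [3.2, 4.0], [2.8, 3.0]]

scores = f_scores(ibp_like, non_ibp_like)
ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
print(ranked)  # feature indices from most to least discriminative
```

In a pipeline like the one described, the top-ranked features would then be passed to the SVM; how many features to keep is a tuning choice that the abstract does not specify.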


Vol. 21, No. 3, Article 2350010.

Citations: 0

Identification of a seven autophagy-related gene pairs signature for the diagnosis of colorectal cancer using the RankComp algorithm.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500129

Qi-Shi Song, Hai-Jun Wu, Qian Lin, Yu-Kai Tang

Using the colorectal cancer microarray gene expression data series (GSE) GSE10972 and GSE74602 together with 222 autophagy-related genes, the differential signature between colorectal cancer and paracancerous tissues was analyzed with the RankComp algorithm, yielding a signature of seven autophagy-related reversal gene pairs with stable relative expression orderings (REOs). Scoring based on these gene pairs significantly distinguished colorectal cancer samples from adjacent noncancerous samples, with an average accuracy of 97.5% in the two training sets and 90.25% in four independent validation sets (GSE21510, GSE37182, GSE33126, and GSE18105). Scoring based on these gene pairs also correctly identified 99.85% of colorectal cancer samples in seven other independent datasets containing a total of 1406 colorectal cancer samples.
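The idea of scoring a sample by relative expression orderings can be sketched as a within-sample vote over gene pairs. This is an illustration only: the gene names are placeholders, three pairs are used instead of the paper's seven, and the majority-vote cut-off is an assumption, since the abstract does not state the scoring rule.

```python
# Hypothetical pairs (a, b): in noncancerous tissue gene a is assumed to be
# expressed above gene b; a cancer sample is assumed to reverse that order.
GENE_PAIRS = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D"), ("GENE_E", "GENE_F")]


def reo_score(expression, gene_pairs):
    """Count the pairs whose ordering is reversed (a below b) in this sample."""
    return sum(1 for a, b in gene_pairs if expression[a] < expression[b])


def classify(expression, gene_pairs, threshold=None):
    """Label a sample by majority vote; the cut-off is an assumed default."""
    if threshold is None:
        threshold = len(gene_pairs) / 2
    if reo_score(expression, gene_pairs) > threshold:
        return "cancer"
    return "non-cancer"


# Made-up expression values for two samples.
tumor_like = {"GENE_A": 2.1, "GENE_B": 7.4, "GENE_C": 1.0,
              "GENE_D": 3.3, "GENE_E": 5.0, "GENE_F": 4.2}
normal_like = {"GENE_A": 8.0, "GENE_B": 3.0, "GENE_C": 6.0,
               "GENE_D": 2.0, "GENE_E": 4.0, "GENE_F": 4.5}

print(classify(tumor_like, GENE_PAIRS))   # cancer
print(classify(normal_like, GENE_PAIRS))  # non-cancer
```

Because only the ordering of two genes within one sample matters, a score of this kind does not depend on cross-sample normalization, which is the usual motivation for REO-based signatures.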


Citations: 0

Overlapping group screening for binary cancer classification with TCGA high-dimensional genomic data.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500130

Jie-Huei Wang, Yi-Hau Chen

Precision medicine has become a global trend in medical development, in which cancer diagnosis plays an important role. With accurate cancer diagnosis, patients can be given appropriate treatments that improve their survival. Since disease development involves complex interplay among multiple factors such as gene-gene interactions, cancer classification based on microarray gene expression profiling data is expected to be effective and has therefore attracted extensive attention in computational biology and medicine. However, building a diagnostic model from genomic data raises several problems, including the high-dimensional feature space and feature contamination. In this paper, we propose using the overlapping group screening (OGS) approach to build an accurate cancer diagnosis model and to predict, within the logistic regression framework, the probability that a patient falls into a given disease classification category. The proposal integrates gene pathway information into the procedure for identifying genes and gene-gene interactions associated with the classification of cancer outcome groups. We conduct a series of simulation studies comparing the predictive accuracy of the proposed method for cancer diagnosis with that of some existing machine learning methods and find that the proposed method performs better. We apply the proposed method to genomic data from The Cancer Genome Atlas for lung adenocarcinoma (LUAD), liver hepatocellular carcinoma (LHC), and thyroid carcinoma (THCA) to establish accurate cancer diagnosis models.
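To make the logistic regression framework concrete, the sketch below computes a class probability from main gene effects plus gene-gene interaction terms. It shows only the prediction step; the OGS screening that decides which genes and interactions enter the model is not reproduced, and every gene name and coefficient here is made up for illustration.

```python
import math


def predict_probability(x, intercept, main_effects, interactions):
    """Logistic model with main effects and pairwise interaction terms.

    x: dict gene -> expression value
    main_effects: dict gene -> coefficient
    interactions: dict (gene_j, gene_k) -> coefficient for x[j] * x[k]
    """
    eta = intercept
    eta += sum(beta * x[gene] for gene, beta in main_effects.items())
    eta += sum(gamma * x[j] * x[k] for (j, k), gamma in interactions.items())
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical fitted coefficients and one patient's expression values.
main_effects = {"g1": 0.8, "g2": -0.5}
interactions = {("g1", "g2"): 1.2}
patient = {"g1": 1.0, "g2": 2.0}

p = predict_probability(patient, intercept=-1.0,
                        main_effects=main_effects, interactions=interactions)
label = "class 1" if p >= 0.5 else "class 0"
print(label)
```

With p genes there are p(p-1)/2 candidate interaction terms, which is why a screening step that uses pathway groupings to narrow the candidates is attractive in high-dimensional settings.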


Vol. 21, No. 3, Article 2350013.

Citations: 0

The mechanism accounting for DNA damage strength modulation of p53 dynamical properties.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500117

Aiqing Ma, Xianhua Dai

p53 protein levels exhibit a series of pulses in response to DNA double-stranded breaks (DSBs). However, the mechanism by which damage strength regulates the physical parameters of p53 pulses remains to be elucidated. This paper established two mathematical models describing the mechanism of p53 dynamics in response to DSBs; the two models reproduce many results observed in experiments. Based on the models, numerical analysis suggested that the interval between pulses increases as the damage strength decreases, and we proposed that the p53 dynamical system responding to DSBs is frequency-modulated. Next, we found that ATM positive self-feedback can account for the system characteristic that pulse amplitude is independent of damage strength. In addition, the pulse interval is negatively correlated with apoptosis: the greater the damage strength, the shorter the pulse interval, the faster p53 accumulates, and the more susceptible cells are to apoptosis. These findings advance our understanding of the mechanism of the p53 dynamical response and give new insights for experiments that probe the dynamics of p53 signaling.


Vol. 21, No. 3, Article 2350011.

Citations: 0

Expansin gene family database: A comprehensive bioinformatics resource for plant expansin multigene family.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-06-01 DOI: 10.1142/S0219720023500154

Büşra Özkan Kök, Yasemin Celik Altunoglu, Ali Burak Öncül, Abdulkadir Karaci, Mehmet Cengiz Baloglu

Expansins, plant cell wall loosening proteins associated with cell growth, have been identified as a multigene family. Plant expansin proteins are an important family that functions in cell growth and in many developmental processes, including wall relaxation, fruit softening, abscission, seed germination, mycorrhiza and root nodule formation, biotic and abiotic stress resistance, pollen tube invasion of the stigma, and organogenesis. In addition, increasing the efficiency of expansin genes in plants is thought to play a significant role, especially in the production of secondary bioethanol. Studies of expansin genes show that they form a significant gene family in the cell wall expansion mechanism. Therefore, understanding the efficacy of expansin genes is of great importance. Considering the importance of this multigene family, we aimed to create a comprehensive database of plant expansin proteins and their properties. The expansin gene family database provides comprehensive online data for expansin gene family members in plants. We have designed a new publicly accessible website that includes expansin gene family members in 70 plants and their features, including gene, coding, and peptide sequences, chromosomal location, amino acid length, molecular weight, stability, conserved motif and domain structure, and predicted three-dimensional architecture. Furthermore, a deep learning system was developed to detect unknown genes belonging to the expansin gene family. In addition, we provide BLAST searches within the website through a connection to the NCBI BLAST site in the tools section. Thus, the expansin gene family database is a useful resource that gives researchers simultaneous access to all datasets through a user-friendly interface. Our server can be reached freely at the following link (http://www.expansingenefamily.com/).


Vol. 21, No. 3, Article 2350015.

Citations: 0

Rearrangement distance with reversals, indels, and moves in intergenic regions on signed and unsigned permutations.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-04-01 DOI: 10.1142/S0219720023500099

Klairton Lima Brito, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Ulisses Dias, Zanoni Dias

Genome rearrangement events are widely used to estimate a minimum-size sequence of mutations capable of transforming one genome into another. The length of this sequence is called the distance, and determining it is the main goal in genome rearrangement distance problems. Problems in the genome rearrangement field differ in the set of rearrangement events allowed and in the genome representation. In this work, we consider the scenario where the genomes share the same set of genes, gene orientation is known or unknown, and intergenic regions (structures between a pair of genes and at the extremities of the genome) are taken into account. We use two models: the first allows only conservative events (reversals and moves), and the second also includes non-conservative events (insertions and deletions) in the intergenic regions. We show that both models result in NP-hard problems whether gene orientation is known or unknown. When the orientation of genes is known, we present a 2-approximation algorithm for both models. When this information is unavailable, we propose a 4-approximation algorithm for both models.
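To illustrate what a conservative event looks like when intergenic regions are tracked, the sketch below applies one signed reversal to a genome stored as a list of signed genes plus a list of intergenic sizes. The cut-position formulation (cutting the two flanking regions at offsets `x` and `y`) is one common way to model such a reversal and is an assumption here, as are the function name and the 0-based indexing; the paper's exact definitions may differ.

```python
def intergenic_reversal(genes, regions, i, j, x, y):
    """Reverse genes[i..j] (inclusive), flipping signs, on a signed genome.

    genes:   n signed integers.
    regions: n + 1 intergenic sizes; regions[k] lies just before genes[k],
             and regions[n] follows the last gene.
    x, y:    cut offsets inside regions[i] and regions[j + 1].
    """
    n = len(genes)
    if len(regions) != n + 1:
        raise ValueError("need exactly one more region than genes")
    if not 0 <= i <= j < n:
        raise ValueError("invalid segment")
    if not (0 <= x <= regions[i] and 0 <= y <= regions[j + 1]):
        raise ValueError("cut offsets must fall inside the flanking regions")

    segment = [-g for g in reversed(genes[i:j + 1])]
    new_genes = genes[:i] + segment + genes[j + 1:]

    inner = regions[i + 1:j + 1]                      # regions inside the segment
    left = x + y                                      # new region before the segment
    right = (regions[i] - x) + (regions[j + 1] - y)   # new region after it
    new_regions = regions[:i] + [left] + inner[::-1] + [right] + regions[j + 2:]
    return new_genes, new_regions


genes = [1, 2, 3, 4]
regions = [3, 2, 4, 6, 5]
new_genes, new_regions = intergenic_reversal(genes, regions, i=1, j=2, x=1, y=4)
print(new_genes)    # [1, -3, -2, 4]
print(new_regions)  # [3, 5, 4, 3, 5]
```

The total intergenic length is 20 before and after the operation, which is what "conservative" means here: material is redistributed between regions but none is inserted or deleted.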


Vol. 21, No. 2, Article 2350009.

Citations: 0

Obstacles to effective model deployment in healthcare.

IF 1 | Zone 4 (Biology) | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-04-01 DOI: 10.1142/S0219720023710014

Wei Xin Chan, Limsoon Wong

Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe a key underlying cause behind most obstacles - the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities in clinical data are unreported because they are deemed to be irrelevant, or due to privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of biases, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of biases and heterogeneities in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.


Vol. 21, No. 2, Article 2371001.

Citations: 1

Integrated in silico-in vitro rational design of oncogenic EGFR-derived specific monoclonal antibody-binding peptide mimotopes.

IF 1, Zone 4 Biology, Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-04-01 DOI: 10.1142/S0219720023500075

Ke Chen, Lili Ge, Guorui Liu

Human epidermal growth factor receptor (EGFR) is strongly associated with malignant proliferation; it has been established as an attractive therapeutic target in diverse cancers and is used as a significant biomarker for tumor diagnosis. Over the past decades, a variety of monoclonal antibodies (mAbs) have been successfully developed to specifically recognize the third subdomain (TSD) of the EGFR extracellular domain. Here, the crystal structures of the EGFR TSD in complex with its cognate mAbs were systematically examined and compared, revealing a consistent binding mode shared by these mAbs. The recognition site is located on the [Formula: see text]-sheet surface of the TSD ladder architecture, where several hotspot residues were identified that confer both stability and specificity to the recognition and account for about half of the total binding potency of the mAbs to the TSD. A number of linear peptide mimotopes were rationally designed to mimic these TSD hotspot residues in different orientations and/or in different head-to-tail manners using an orthogonal threading-through-strand (OTTS) strategy. These peptides, however, are intrinsically disordered in the free state and thus cannot maintain a native hotspot-like conformation. A chemical stapling strategy was therefore employed to constrain the free peptides into a double-stranded conformation by introducing a disulfide bond across the two strand arms of the peptide mimotopes. Both empirical scoring and [Formula: see text]fluorescence assay agreed that stapling effectively improves the interaction potency of the OTTS-designed peptide mimotopes to different mAbs, with binding affinity increased by [Formula: see text]-fold. Conformational analysis revealed that the stapled cyclic peptide mimotopes can spontaneously fold into a double-stranded conformation that threads through all the hotspot residues on the TSD [Formula: see text]-sheet surface and exhibits a binding mode to the mAbs consistent with that of the TSD hotspot site.

Ke Chen, Lili Ge, Guorui Liu. Integrated in silico-in vitro rational design of oncogenic EGFR-derived specific monoclonal antibody-binding peptide mimotopes. Journal of Bioinformatics and Computational Biology 21(2), 2350007. Published 2023-04-01. https://doi.org/10.1142/S0219720023500075

Citations: 1

Numerical study of chronic hepatitis B infection using Marchuk-Petrov model.

IF 1, Zone 4 Biology, Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY

Journal of Bioinformatics and Computational Biology

Pub Date : 2023-04-01 DOI: 10.1142/S0219720023400012

Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov

In this work, we briefly describe the technology we developed for computing periodic solutions of time-delay systems and discuss the results of computing periodic solutions of the Marchuk-Petrov model with parameter values corresponding to hepatitis B infection. We identified the regions of the model parameter space in which oscillatory dynamics in the form of periodic solutions exist. These solutions can be interpreted as active forms of chronic hepatitis B. The period and amplitude of the oscillatory solutions were traced along the parameter that determines the efficacy of antigen presentation by macrophages to T- and B-lymphocytes in the model. The oscillatory regimes are characterized by enhanced destruction of hepatocytes as a consequence of immunopathology and by a temporary reduction of the viral load to values that can be a prerequisite for the spontaneous recovery observed in chronic HBV infection. Our study is a first step in a systematic analysis of chronic HBV infection using the Marchuk-Petrov model of the antiviral immune response.
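Tracing the period and amplitude of oscillatory solutions along a parameter can be illustrated on a much smaller delay system. The sketch below is not the Marchuk-Petrov model and not the authors' method. As a stand-in it integrates Hutchinson's delayed logistic equation x'(t) = r x(t) (1 - x(t - tau)) with a fixed-step explicit Euler scheme, and the function names, step size, time horizon, and parameter values are all illustrative assumptions. For this equation the steady state x = 1 loses stability at r * tau = pi / 2, so the scan should show a decaying regime for small r and sustained oscillations for larger r.

```python
def simulate(r, tau=1.0, dt=0.01, t_end=200.0, x0=0.5):
    """Explicit Euler for x'(t) = r*x(t)*(1 - x(t - tau)) with constant history x0."""
    lag = round(tau / dt)            # delay expressed in time steps
    n_steps = round(t_end / dt)
    xs = [x0] * (lag + 1)            # history on the interval [-tau, 0]
    for _ in range(n_steps):
        x_now, x_delayed = xs[-1], xs[-1 - lag]
        xs.append(x_now + dt * r * x_now * (1.0 - x_delayed))
    return xs[lag:]                  # samples at t = 0, dt, 2*dt, ...


def amplitude_and_period(xs, dt, level=1.0, tol=1e-3):
    """Estimate peak-to-peak amplitude and period from the second half of a run.

    The period is the mean spacing of upward crossings of `level`;
    it is reported as None when the solution has settled to a steady state.
    """
    tail = xs[len(xs) // 2:]         # discard the transient
    amplitude = max(tail) - min(tail)
    if amplitude < tol:
        return amplitude, None
    ups = [i for i in range(1, len(tail)) if tail[i - 1] < level <= tail[i]]
    if len(ups) < 2:
        return amplitude, None
    return amplitude, (ups[-1] - ups[0]) * dt / (len(ups) - 1)


# Scan the growth-rate parameter r at fixed delay tau = 1 (illustrative values).
for r in (1.2, 1.8, 2.0, 2.2):
    amp, period = amplitude_and_period(simulate(r), dt=0.01)
    label = "steady state" if period is None else f"period ~ {period:.2f}"
    print(f"r = {r:.1f}: amplitude = {amp:.3g}, {label}")
```

Forward integration of this kind only reaches stable periodic solutions and requires a long transient to be discarded, which is one reason dedicated techniques for computing periodic solutions of delay systems are of interest.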

Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov. Numerical study of chronic hepatitis B infection using Marchuk-Petrov model. Journal of Bioinformatics and Computational Biology 21(2), 2340001. Published 2023-04-01. https://doi.org/10.1142/S0219720023400012

Citations: 0