
Journal of Bioinformatics and Computational Biology: Latest Articles

ThermalProGAN: A sequence-based thermally stable protein generator trained using unpaired data.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-02-01 | DOI: 10.1142/S0219720023500087
Hui-Ling Huang, Chong-Heng Weng, Torbjörn E M Nordling, Yi-Fan Liou

Motivation: The synthesis of proteins with novel desired properties is challenging but sought after by industry and academia. The dominant approach induces point mutations by trial and error, assisted by structural information or by predictive models built with paired data that are difficult to collect. This study proposes a sequence-based unpaired-sample of novel protein inventor (SUNI) to build ThermalProGAN for generating thermally stable proteins based on sequence information.

Results: ThermalProGAN can mutate the input sequence substantially, changing a median of 32 residues. A known normal protein, 1RG0, was used to generate a thermally stable form by mutating 51 residues. Superimposing the two structures shows high similarity, indicating that the basic function would be conserved. Eighty-four molecular dynamics simulations of 1RG0 and the COVID-19 vaccine candidates, with a total simulation time of 840 ns, indicate that the thermal stability increased.

Conclusion: This proof of concept demonstrated that transfer of a desired protein property from one set of proteins is feasible. Availability and implementation: The source code of ThermalProGAN can be freely accessed at https://github.com/markliou/ThermalProGAN/ with an MIT license. The website is https://thermalprogan.markliou.tw:433. Supplementary information: Supplementary data are available on GitHub.
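The mutation counts reported in the Results (a median of 32 residues, 51 for 1RG0) can be illustrated with a small sketch. It assumes the input and generated sequences are already aligned to the same length and counts substitutions only; the abstract does not say how the authors counted mutations, and the function name and toy sequences below are invented for illustration.

```python
from statistics import median


def count_mutations(original: str, generated: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(original) != len(generated):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(original, generated))


# Toy sequence pairs, invented for illustration (not data from the paper).
pairs = [
    ("MKTAYIAKQR", "MKTAYLAKQR"),  # 1 substitution
    ("MKTAYIAKQR", "MRTAYLAKQK"),  # 3 substitutions
    ("MKTAYIAKQR", "MKSAYIGKQR"),  # 2 substitutions
]

counts = [count_mutations(a, b) for a, b in pairs]
median_mutations = median(counts)
print(counts, median_mutations)
```

Sequences of different lengths would need an alignment step first, which this sketch deliberately leaves out.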

Citations: 0
Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2023-02-01 | DOI: 10.1142/S0219720023500051
Wilson Wen Bin Goh, Weijia Kong, Limsoon Wong

Some prediction methods rank their predictions by probability, while others do not rank them and instead support them with p-values. This disparity makes direct cross-comparison of the two kinds of methods difficult. In particular, approaches such as the Bayes Factor upper Bound (BFB) for p-value conversion may not make correct assumptions for this kind of cross-comparison. Here, using a well-established case study on renal cancer proteomics and in the context of missing protein prediction, we demonstrate how to compare these two kinds of prediction methods using two different strategies. The first strategy is based on false discovery rate (FDR) estimation, which does not make the same naïve assumptions as BFB conversions. The second strategy is a powerful approach which we colloquially call "home ground testing". Both strategies perform better than BFB conversions. Thus, we recommend comparing prediction methods by standardization to a common performance benchmark such as a global FDR. Where this is not possible, we recommend reciprocal "home ground testing".
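As a rough illustration of the two kinds of conversion discussed above, the sketch below computes the commonly used Bayes factor bound, 1 / (-e · p · ln p) for p < 1/e, and Benjamini-Hochberg adjusted p-values. Benjamini-Hochberg is used here only as a familiar FDR procedure; the abstract does not say which FDR estimator the authors used, and the function names are assumptions.

```python
import math


def bayes_factor_bound(p: float) -> float:
    """Upper bound on the Bayes factor favouring the alternative hypothesis.

    Uses the bound 1 / (-e * p * ln p), which is valid only for 0 < p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound is defined only for 0 < p < 1/e")
    return 1.0 / (-math.e * p * math.log(p))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(round(bayes_factor_bound(0.05), 3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A p-value of 0.05 maps to a bound of only about 2.5, which shows why converted p-values and ranked probabilities are not directly comparable.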

Citations: 1
A network-based dynamic criterion for identifying prediction and early diagnosis biomarkers of complex diseases.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/S0219720022500275
Xin Huang, Benzhe Su, Xingyu Wang, Yang Zhou, Xinyu He, Bing Liu

Lung adenocarcinoma (LUAD) seriously threatens human health and generally results from dysfunction of relevant module molecules, which dynamically change with time and conditions, rather than that of an individual molecule. In this study, a novel network construction algorithm for identifying early warning network signals (IEWNS) is proposed for improving the performance of LUAD early diagnosis. To this end, we theoretically derived a dynamic criterion, namely, the relationship of variation (RV), to construct dynamic networks. RV infers correlation [Formula: see text] statistics to measure dynamic changes in molecular relationships during the process of disease development. Based on the dynamic networks constructed by IEWNS, network warning signals used to represent the occurrence of LUAD deterioration can be defined without human intervention. IEWNS was employed to perform a comprehensive analysis of gene expression profiles of LUAD from The Cancer Genome Atlas (TCGA) database and the Gene Expression Omnibus (GEO) database. The experimental results suggest that the potential biomarkers selected by IEWNS can facilitate a better understanding of pathogenetic mechanisms and help to achieve effective early diagnosis of LUAD. In conclusion, IEWNS provides novel insight into the initiation and progression of LUAD and helps to define prospective biomarkers for assessing disease deterioration.

Citations: 0
Author Index Volume 20 (2022).
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/S0219720022990013
Citations: 0
Quantification of the presence of enzymes in gelatin zymography using the Gini index.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/S0219720022500251
Adriana Laura López Lobato, Martha Lorena Avendaño Garrido, Héctor Gabriel Acosta Mesa, Clara Luz Sampieri, Víctor Hugo Sandoval Lozano

Gel zymography quantifies the activity of certain enzymes in tumor processes. These enzymes are widely used in medical diagnosis. To analyze them, experts classify the zymography spots into classes according to their tonalities. This classification is done by visual analysis, which makes it a subjective process. This work proposes a methodology to carry out this classification using an unsupervised learning algorithm applied to the images, denoted the GI algorithm. The experiments in this paper suggest that the methodology could be a tool that bioinformatics scientists can trust for the desired classification, since it provides a quantitative indicator for ordering the enzymatic activity of the spots in a zymography.
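For readers unfamiliar with the Gini index named in the title, the following is a minimal sketch of the standard Gini coefficient applied to a list of non-negative pixel intensities. It is not the paper's GI algorithm, whose details the abstract does not give; the function name and sample values are illustrative assumptions.

```python
def gini_index(values):
    """Standard Gini coefficient of non-negative values.

    Returns 0.0 for perfectly uniform values; approaches 1 as the total
    concentrates in a single element.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Invented intensity samples: a uniform patch and a patch with one bright pixel.
uniform_patch = [5, 5, 5, 5]
concentrated_patch = [0, 0, 0, 10]
print(gini_index(uniform_patch), gini_index(concentrated_patch))
```

A higher value means intensity is concentrated in fewer pixels, which is one way a scalar index can order spots without visual judgment.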

Citations: 1
Author Index Volume 20 (2022).
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/s0219749922990015
Citations: 0
Feedback-AVPGAN: Feedback-guided generative adversarial network for generating antiviral peptides.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/S0219720022500263
Kano Hasegawa, Yoshitaka Moriwaki, Tohru Terada, Cao Wei, Kentaro Shimizu

In this study, we propose Feedback-AVPGAN, a system that aims to computationally generate novel antiviral peptides (AVPs). This system relies on the key premise of the Generative Adversarial Network (GAN) model and the Feedback method. GAN, a generative modeling approach that uses deep learning methods, comprises a generator and a discriminator. The generator is used to generate peptides; the generated peptides are fed to the discriminator to distinguish between AVPs and non-AVPs. The original GAN design uses actual data to train the discriminator. However, not many AVPs have been experimentally obtained. To solve this problem, we used the Feedback method to allow the discriminator to learn from existing as well as generated synthetic data. We implemented this method using a classifier module that classifies each peptide sequence generated by the GAN generator as AVP or non-AVP. The classifier uses the transformer network and achieves high classification accuracy. This mechanism enables the efficient generation of peptides with a high probability of exhibiting antiviral activity. Using the Feedback method, we evaluated various algorithms and their performance. Moreover, we modeled the structure of the generated peptides using AlphaFold2 and identified peptides with physicochemical properties and structures similar to those of known AVPs, although with different sequences.
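The feedback loop described above can be sketched in plain Python. Everything here is a stand-in: `toy_generator` replaces the GAN generator with random sampling, `toy_classifier` replaces the transformer classifier with an arbitrary residue-counting rule, and routing rejected peptides into a negative pool is an assumption. The sketch only shows how classifier output can grow the discriminator's training pools; it is not the authors' implementation.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_generator(rng, length=12):
    """Stand-in for the GAN generator: samples a random peptide."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def toy_classifier(peptide):
    """Stand-in for the classifier: fraction of K/R/W residues.

    Hypothetical scoring rule, not a real antiviral-activity predictor.
    """
    return sum(peptide.count(aa) for aa in "KRW") / len(peptide)


def feedback_round(rng, positives, negatives, n_samples=200, threshold=0.3):
    """Generate peptides and add each one to the matching training pool."""
    for _ in range(n_samples):
        peptide = toy_generator(rng)
        if toy_classifier(peptide) >= threshold:
            positives.append(peptide)   # treated as synthetic "AVP" data
        else:
            negatives.append(peptide)   # treated as synthetic "non-AVP" data
    return positives, negatives


rng = random.Random(0)
# One invented seed peptide standing in for the scarce experimental AVPs.
positives, negatives = feedback_round(rng, ["KWKLFKKIGAVL"], [])
print(len(positives), len(negatives))
```

In a real system the discriminator would be retrained on the enlarged pools after each round, which is what lets it learn despite few experimentally confirmed AVPs.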

Citations: 1
Accounting for treatment during the development or validation of prediction models.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-12-01 | DOI: 10.1142/S0219720022710019
Wei Xin Chan, Limsoon Wong
Clinical prediction models are widely used to predict adverse outcomes in patients, and are often employed to guide clinical decision-making. Clinical data typically consist of patients who received different treatments. Many prediction modeling studies fail to account for differences in patient treatment appropriately, which results in the development of prediction models that show poor accuracy and generalizability. In this paper, we list the most common methods used to handle patient treatments and discuss certain caveats associated with each method. We believe that proper handling of differences in patient treatment is crucial for the development of accurate and generalizable models. As different treatment strategies are employed for different diseases, the best approach to properly handle differences in patient treatment is specific to each individual situation. We use the Ma-Spore acute lymphoblastic leukemia data set as a case study to demonstrate the complexities associated with differences in patient treatment, and offer suggestions on incorporating treatment information during evaluation of prediction models. In clinical data, patients are typically treated on a case by case basis, with unique cases occurring more frequently than expected. Hence, there are many subtleties to consider during the analysis and evaluation of clinical prediction models.
Citations: 1
Behavioral dynamics of bacteriophage gene regulatory networks.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-10-01 | DOI: 10.1142/S0219720022500214
Gatis Melkus, Karlis Cerans, Karlis Freivalds, Lelde Lace, Darta Zajakina, Juris Viksna

We present hybrid system-based gene regulatory network models for lambda, HK022, and Mu bacteriophages together with dynamics analysis of the modeled networks. The proposed lambda phage model LPH2 is based on earlier work and incorporates more recent biological assumptions about the underlying gene regulatory mechanism; the HK022 and Mu phage models are new. All three models provide accurate representations of experimentally observed lytic and lysogenic behavioral cycles. Importantly, the models also imply that lysis and lysogeny are the only stable behaviors that can occur in the modeled networks. In addition, the models make it possible to derive switching conditions that irrevocably lead to either the lytic or the lysogenic behavioral cycle, as well as constraints that are required for their biological feasibility. For the LPH2 model, the feasibility constraints place two mutually independent requirements on the comparative order of cro and cI protein binding site affinities. The HK022 model, while broadly similar, does not require any of these constraints. The biologically very different lysis-lysogeny switching mechanism of Mu phage is also accurately reproduced by its model. In general, the results show that the hybrid system model (HSM) framework can be successfully applied to modeling small ([Formula: see text] gene) regulatory networks and used for comprehensive analysis of model dynamics and stable behavior regions.

Citations: 0
The impact of simulation time in predicting binding free energies using end-point approaches.
IF 1 | Zone 4, Biology | Q4 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2022-10-01 | DOI: 10.1142/S021972002250024X
Babak Sokouti, Siavoush Dastmalchi, Maryam Hamzeh-Mivehroud

The profound impact of in silico studies on a fast-paced drug discovery pipeline is undeniable for the pharmaceutical community. The rational design of novel drug candidates requires optimizing their different aspects prior to synthesis and biological evaluation. Predicting the affinity of small ligands for a target of interest, in order to rank-order potential ligands, is one of the most routinely used steps in virtual screening. Here, end-point methods were employed for binding free energy estimation, with a focus on evaluating the effect of simulation time. A set of human aldose reductase inhibitors was selected for molecular dynamics (MD)-based binding free energy calculations. A total of 100 ns of MD simulation was conducted for the ligand-receptor complexes, followed by prediction of binding free energies using the MM/PB(GB)SA and LIE approaches at different simulation times. The results revealed that a maximum of 30 ns of simulation time is sufficient for determining binding affinities, as inferred from the steady trend of squared correlation values (R²) between experimental and predicted ΔG as a function of MD simulation time. In conclusion, the MM/PB(GB)SA algorithms performed well in binding affinity prediction compared with the LIE approach. The results provide new insights for large-scale application of such predictions at an affordable computational cost.
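The convergence check described above, tracking R² between experimental and predicted binding free energies as simulation time grows, can be sketched as follows. All numbers are invented placeholders rather than values from the study, and the function and variable names are assumptions.

```python
def pearson_r2(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Placeholder experimental binding free energies (kcal/mol), invented.
experimental_dg = [-9.1, -8.4, -7.9, -7.2, -6.5]

# Placeholder predictions after different amounts of MD time (ns), invented.
predicted_dg_by_time = {
    10: [-30.2, -33.0, -27.5, -29.0, -24.1],
    30: [-34.0, -31.5, -29.8, -27.0, -25.2],
    100: [-34.3, -31.2, -30.1, -26.8, -25.0],
}

for time_ns, predicted in sorted(predicted_dg_by_time.items()):
    print(time_ns, round(pearson_r2(experimental_dg, predicted), 3))
```

If R² stops changing beyond some simulation length, longer runs add cost without improving the ranking, which is the kind of plateau the abstract reports at about 30 ns.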

Journal of Bioinformatics and Computational Biology, Vol. 20, No. 5, 2250024. Published 2022-10-01. Journal Article. https://doi.org/10.1142/S021972002250024X
Citations: 1