Pub Date: 2023-08-01; Epub Date: 2023-07-28; DOI: 10.1142/S0219720023500166
Maria Waldl, Thomas Spicher, Ronny Lorenz, Irene K Beckmann, Ivo L Hofacker, Sarah Von Löhneysen, Peter F Stadler
Most of the functional RNA elements located within large transcripts are local. Local folding therefore serves as a practically useful approximation to global structure prediction. Because RNA secondary structure prediction is sensitive to the exact definition of sequence ends, accuracy can be increased by averaging local structure predictions over multiple, overlapping sequence windows. These averages can be computed efficiently by dynamic programming. Here we revisit the local folding problem, present a concise mathematical formalization that generalizes previous approaches, and show that correct Boltzmann samples can be obtained by local stochastic backtracing in McCaskill's algorithms but not from local folding recursions. Corresponding new features are implemented in the ViennaRNA package to improve its support for local folding. Applications include the computation of maximum expected accuracy structures from RNAplfold data and a mutual information measure to quantify the sensitivity of individual sequence positions.
Title: Local RNA folding revisited. Journal of Bioinformatics and Computational Biology 21(4): 2350016.
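The window averaging described in this abstract can be illustrated with a minimal Python sketch. It is not the ViennaRNA or RNAplfold implementation: `fold_window` is a hypothetical stand-in for any routine that returns base-pair probabilities for a subsequence, and the numbers in `TOY_TABLE` are made up purely to show how the same pair can receive different probabilities depending on where the window ends fall.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple

PairProbs = Dict[Tuple[int, int], float]


def average_over_windows(
    seq: str, window: int, fold_window: Callable[[str], PairProbs]
) -> PairProbs:
    """Average window-local pair probabilities over every window containing the pair.

    fold_window (hypothetical) takes a subsequence and returns
    {(i, j): probability} with 0-based indices relative to that subsequence.
    """
    n = len(seq)
    window = min(window, n)
    if window <= 0:
        return {}

    totals: Dict[Tuple[int, int], float] = defaultdict(float)
    for start in range(n - window + 1):
        local = fold_window(seq[start:start + window])
        for (i, j), p in local.items():
            totals[(start + i, start + j)] += p  # shift to global coordinates

    averaged: PairProbs = {}
    for (i, j), total in totals.items():
        # Window start positions s such that [s, s + window - 1] contains i and j.
        first = max(0, j - window + 1)
        last = min(i, n - window)
        averaged[(i, j)] = total / (last - first + 1)
    return averaged


# Made-up probabilities for the two length-6 windows of a toy sequence.
TOY_TABLE = {
    "AGAAAC": {(1, 5): 0.8},
    "GAAACA": {(0, 4): 0.4},
}


def toy_fold(subseq: str) -> PairProbs:
    """Toy lookup used in place of a real folding routine."""
    return TOY_TABLE.get(subseq, {})


profile = average_over_windows("AGAAACA", 6, toy_fold)
```

In this toy case both windows contain the pair (1, 5) in global coordinates, so its averaged probability is the mean of the two window-local values. A pair that some window fails to predict simply contributes zero for that window.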
Pub Date: 2023-08-01; Epub Date: 2023-09-06; DOI: 10.1142/S021972002350018X
Zhaoyang Hu, Qingsen Liu, Zhong Ni
Over the past decades, many existing drugs and clinical/preclinical compounds have been repositioned for therapeutic indications other than those for which they were originally intended, treating off-target diseases by targeting noncognate protein receptors; this practice, with Sildenafil and Paxlovid as examples, is termed drug repurposing (DRP). Despite its considerable appeal in the current medicinal community, DRP is usually regarded as a matter of accident that cannot be achieved reliably by traditional drug discovery protocols. In this study, we propose an integrated computational/experimental (iC/E) strategy to facilitate DRP within a framework of rational drug design, and apply it to the identification of new neuronal nitric oxide synthase (nNOS) inhibitors from a structurally diverse, functionally distinct drug pool. The iC/E strategy proved efficient and readily feasible: it confirmed that the phosphodiesterase inhibitor DB06237 shows high inhibitory potency against the nNOS synthase domain, while two other general drugs, DB02302 and DB08258, also inhibit the synthase at the nanomolar level. Structural bioinformatics analysis revealed diverse noncovalent interactions, such as hydrogen bonds, hydrophobic forces and van der Waals contacts, across the interface between the nNOS active site and the identified drugs, conferring both stability and specificity on complex recognition and association.
Title: Facilitating the drug repurposing with iC/E strategy: A practice on novel nNOS inhibitor discovery. Journal of Bioinformatics and Computational Biology 21(4): 2350018.
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500142
Pooja Rani, Kamlesh Dutta, Vijay Kumar
Drug synergy has emerged as a viable treatment option for malignancy. Compared with single-drug doses, synergistic combinations reduce toxicity, improve therapeutic efficacy, and overcome drug resistance. They have therefore attracted significant interest from academia and pharmaceutical organizations. Because of the enormous combinatorial search space, it is impossible to experimentally validate every conceivable combination for synergistic interaction. With advances in artificial intelligence, computational techniques are being used to identify synergistic drug combinations, but prior literature has focused on treating certain malignancies. As a result, high-order drug combinations have been given little consideration. Here, DrugSymby, a novel deep-learning model, is proposed for predicting drug combinations. To this end, data were collected from datasets that include information on anti-cancer drugs, gene expression profiles of malignant cell lines, and screening data against a wide range of malignant cell lines. The proposed model was developed using these data and achieved an F1-score of 0.98, a recall of 0.99, and a precision of 0.98. Evaluation of the DrugSymby model on drug combination screening data from the NCI-ALMANAC dataset indicates that its drug combination prediction is effective. The proposed model can be used to determine the most successful synergistic drug combinations and to increase the possibilities of exploring new drug combinations.
Title: Drug synergy model for malignant diseases using deep learning. Journal of Bioinformatics and Computational Biology 21(3): 2350014.
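The three metrics reported in this abstract are linked by the standard definition of the F1-score as the harmonic mean of precision and recall. The short Python check below, which uses only the rounded values quoted in the abstract and no data from the study itself, shows that they are mutually consistent at two-decimal precision.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract.
f1_from_reported = f1_score(precision=0.98, recall=0.99)
print(round(f1_from_reported, 3))  # 0.985
```

Because the inputs are themselves rounded, the result agrees with the reported F1-score of 0.98 only to within rounding.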
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500105
Hongliang Zou, Zizheng Yu, Zhijian Yin
Recent studies have reported that ion binding proteins (IBPs) in phage play a key role in developing drugs to treat diseases caused by drug-resistant bacteria. Correct recognition of IBPs is therefore an urgent task that helps in understanding their biological functions. To address this issue, a new computational model was developed to identify IBPs. First, we used physicochemical (PC) properties and Pearson's correlation coefficient (PCC) to represent protein sequences, and temporal and spatial variabilities were employed to extract features. Next, a similarity network fusion algorithm was employed to capture the correlation characteristics between these two kinds of features. Then, a feature selection method called F-score was used to remove redundant and irrelevant information. Finally, the retained features were fed into a support vector machine (SVM) to discriminate IBPs from non-IBPs. Experimental results showed that the proposed method significantly improves classification performance compared with the state-of-the-art approach. The Matlab code and dataset used in this study are available at https://figshare.com/articles/online_resource/iIBP-TSV/21779567 for academic use.
Title: Integrating temporal and spatial variabilities for identifying ion binding proteins in phage. Journal of Bioinformatics and Computational Biology 21(3): 2350010.
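The F-score feature selection step mentioned in this abstract can be sketched as follows. The abstract does not spell out the formula, so the sketch assumes the commonly used two-class definition: the squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances. It is written in Python rather than the authors' Matlab, and `f_scores` and `select_top` are illustrative names, not functions from the published code.

```python
from statistics import mean, variance
from typing import List, Sequence


def f_scores(
    pos: Sequence[Sequence[float]], neg: Sequence[Sequence[float]]
) -> List[float]:
    """Per-feature F-score for two classes (rows are samples, columns are features).

    Assumes at least two samples per class so that the sample variance exists.
    """
    n_features = len(pos[0])
    scores = []
    for k in range(n_features):
        p = [row[k] for row in pos]
        q = [row[k] for row in neg]
        overall = mean(p + q)
        # Between-class separation of the two class means.
        between = (mean(p) - overall) ** 2 + (mean(q) - overall) ** 2
        # Within-class spread (sample variances, n - 1 in the denominator).
        within = variance(p) + variance(q)
        # A feature that is constant within both classes is scored 0.0 here (a design choice).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def select_top(scores: Sequence[float], keep: int) -> List[int]:
    """Indices of the `keep` highest-scoring features, in ascending index order."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(order[:keep])


# Toy data: feature 0 separates the classes, feature 1 does not.
positives = [[1.0, 5.0], [3.0, 5.5]]
negatives = [[7.0, 5.2], [9.0, 5.3]]
scores = f_scores(positives, negatives)
selected = select_top(scores, keep=1)
```

The retained columns would then be passed to a classifier such as an SVM; that step is omitted here because it depends on the library used.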
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500130
Jie-Huei Wang, Yi-Hau Chen
Precision medicine has become a global trend in medical development, and cancer diagnosis plays an important role in it. With accurate cancer diagnosis, patients can be given appropriate treatments that improve their survival. Since disease development involves complex interplay among multiple factors such as gene-gene interactions, cancer classification based on microarray gene expression profiling data is expected to be effective and has hence attracted extensive attention in computational biology and medicine. However, using genomic data to build a diagnostic model raises several problems, including the high-dimensional feature space and feature contamination. In this paper, we propose using the overlapping group screening (OGS) approach to build an accurate cancer diagnosis model and to predict, within the logistic regression framework, the probability that a patient falls into a given disease classification category. The proposal integrates gene pathway information into the procedure for identifying genes and gene-gene interactions associated with the classification of cancer outcome groups. We conduct a series of simulation studies comparing the predictive accuracy of the proposed method with that of some existing machine learning methods, and find that the proposed method performs better. We apply the proposed method to genomic data from The Cancer Genome Atlas for lung adenocarcinoma (LUAD), liver hepatocellular carcinoma (LHC), and thyroid carcinoma (THCA) to establish accurate cancer diagnosis models.
Title: Overlapping group screening for binary cancer classification with TCGA high-dimensional genomic data. Journal of Bioinformatics and Computational Biology 21(3): 2350013.
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500129
Qi-Shi Song, Hai-Jun Wu, Qian Lin, Yu-Kai Tang
Based on the colorectal cancer microarray gene expression data series (GSE) GSE10972 and GSE74602 and 222 autophagy-related genes, the differential signature between colorectal cancer and paracancerous tissues was analyzed with the RankComp algorithm, yielding a signature of seven autophagy-related reversal gene pairs with stable relative expression orderings (REOs). Scoring based on these gene pairs significantly distinguished colorectal cancer samples from adjacent noncancerous samples, with an average accuracy of 97.5% in the two training sets and 90.25% in four independent validation datasets (GSE21510, GSE37182, GSE33126, and GSE18105). Scoring based on these gene pairs also correctly identified 99.85% of colorectal cancer samples in seven other independent datasets containing a total of 1406 colorectal cancer samples.
Title: Identification of a seven autophagy-related gene pairs signature for the diagnosis of colorectal cancer using the RankComp algorithm. Journal of Bioinformatics and Computational Biology 21(3): 2350012.
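Scoring a sample with relative expression orderings can be sketched in a few lines of Python. The abstract does not state the exact scoring rule, so this sketch assumes a simple vote: each pair is written as (a, b) where a > b is the ordering taken to be characteristic of tumour tissue, and a sample is called cancer when more than half of the pairs show that ordering. The gene names and expression values are hypothetical placeholders, not the seven pairs from the paper.

```python
from typing import Dict, Iterable, Tuple

GenePair = Tuple[str, str]


def reo_score(expression: Dict[str, float], pairs: Iterable[GenePair]) -> float:
    """Fraction of pairs (a, b) for which the sample shows expression[a] > expression[b]."""
    pair_list = list(pairs)
    if not pair_list:
        raise ValueError("at least one gene pair is required")
    votes = sum(1 for a, b in pair_list if expression[a] > expression[b])
    return votes / len(pair_list)


def classify(
    expression: Dict[str, float], pairs: Iterable[GenePair], threshold: float = 0.5
) -> str:
    """Majority-vote call; the 0.5 threshold is an assumption of this sketch."""
    return "cancer" if reo_score(expression, pairs) > threshold else "normal"


# Hypothetical gene names and expression values, for illustration only.
signature = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D"), ("GENE_E", "GENE_F")]
sample = {
    "GENE_A": 5.0, "GENE_B": 1.0,
    "GENE_C": 2.0, "GENE_D": 3.0,
    "GENE_E": 9.0, "GENE_F": 4.0,
}
call = classify(sample, signature)
```

Because only within-sample orderings are compared, a score of this kind does not depend on cross-sample normalization, which is one reason such signatures can be applied to a single sample from a different dataset.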
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500117
Aiqing Ma, Xianhua Dai
p53 protein levels exhibit a series of pulses in response to DNA double-stranded breaks (DSBs). However, how damage strength regulates the physical parameters of p53 pulses remains to be elucidated. This paper establishes two mathematical models describing the mechanism of p53 dynamics in response to DSBs; both models reproduce many results observed in experiments. Numerical analysis of the models suggests that the interval between pulses increases as damage strength decreases, and we propose that the p53 dynamical system responding to DSBs is frequency-modulated. Next, we found that ATM positive self-feedback can account for the system property that pulse amplitude is independent of damage strength. In addition, the pulse interval is negatively correlated with apoptosis: the greater the damage strength, the smaller the pulse interval, the faster p53 accumulates, and the more susceptible cells are to apoptosis. These findings advance our understanding of the mechanism of the p53 dynamical response and offer new insights for experiments probing the dynamics of p53 signaling.
Title: The mechanism accounting for DNA damage strength modulation of p53 dynamical properties. Journal of Bioinformatics and Computational Biology 21(3): 2350011.
Pub Date: 2023-06-01; DOI: 10.1142/S0219720023500154
Büşra Özkan Kök, Yasemin Celik Altunoglu, Ali Burak Öncül, Abdulkadir Karaci, Mehmet Cengiz Baloglu
Expansins are plant cell wall loosening proteins associated with cell growth and have been identified as a multigene family. Plant expansins function in cell growth and in many developmental processes, including wall relaxation, fruit softening, abscission, seed germination, mycorrhiza and root nodule formation, biotic and abiotic stress resistance, pollen tube invasion of the stigma, and organogenesis. In addition, increasing the efficiency of expansin genes in plants is thought to play a significant role, especially in the production of secondary bioethanol. Studies of expansin genes show that they form a significant gene family in the cell wall expansion mechanism, so understanding their efficacy is of great importance. Considering the importance of this multigene family, we aimed to create a comprehensive database of plant expansin proteins and their properties. The expansin gene family database provides comprehensive online data for expansin gene family members in plants. We have designed a new publicly accessible website covering expansin gene family members in 70 plants and their features, including gene, coding and peptide sequences, chromosomal location, amino acid length, molecular weight, stability, conserved motif and domain structure, and predicted three-dimensional architecture. Furthermore, a deep learning system was developed to detect unknown genes belonging to the expansin gene family. In addition, the tools section of the website offers BLAST searches through a connection to the NCBI BLAST site. The expansin gene family database is thus a useful resource that gives researchers access to all datasets simultaneously through a user-friendly interface. The server is freely available at http://www.expansingenefamily.com/.
Title: Expansin gene family database: A comprehensive bioinformatics resource for plant expansin multigene family. Journal of Bioinformatics and Computational Biology 21(3): 2350015.
Pub Date: 2023-04-01; DOI: 10.1142/S0219720023500099
Klairton Lima Brito, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Ulisses Dias, Zanoni Dias
Genome rearrangement events are widely used to estimate a minimum-size sequence of mutations capable of transforming one genome into another. The length of this sequence is called the distance, and determining it is the main goal in genome rearrangement distance problems. Problems in the genome rearrangement field differ in the set of rearrangement events allowed and in the genome representation. In this work, we consider the scenario where the genomes share the same set of genes, gene orientation is known or unknown, and intergenic regions (structures between a pair of genes and at the extremities of the genome) are taken into account. We use two models: the first allows only conservative events (reversals and moves), and the second also includes non-conservative events (insertions and deletions) in the intergenic regions. We show that both models result in NP-hard problems regardless of whether gene orientation is known. When gene orientation is available, we present a 2-approximation algorithm for both models. For the scenario where this information is unavailable, we propose a 4-approximation algorithm for both models.
Title: Rearrangement distance with reversals, indels, and moves in intergenic regions on signed and unsigned permutations. Journal of Bioinformatics and Computational Biology 21(2): 2350009.
Pub Date : 2023-04-01 DOI: 10.1142/S0219720023710014
Wei Xin Chan, Limsoon Wong
Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe that most obstacles share a key underlying cause: the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities go unreported, either because they are deemed irrelevant or because of privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of bias, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of bias and heterogeneity in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.
{"title":"Obstacles to effective model deployment in healthcare.","authors":"Wei Xin Chan, Limsoon Wong","doi":"10.1142/S0219720023710014","DOIUrl":"https://doi.org/10.1142/S0219720023710014","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2371001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"9554623","RegionNum":4,"RegionCategory":"Biology"}