
Open Bioinformatics Journal: Latest Articles

Bioinformatics Based Understanding of Effect of Mutations in the Human β Tubulin Outside Drug Binding Sites and its Significance in Drug Resistance
Q3 Computer Science Pub Date : 2018-03-13 DOI: 10.2174/1875036201811010029
Selvaa Kumar, D. Dasgupta, Nikhil Gadewal
Research article. Selvaa Kumar C, Debjani Dasgupta and Nikhil Gadewal. Affiliations: School of Biotechnology and Bioinformatics, DY Patil University, CBD Belapur, Navi Mumbai 400614, India; Advanced Centre for Treatment, Research and Education in Cancer, Kharghar, Navi Mumbai 410210, India.
Open Bioinformatics Journal, vol. 11, pp. 29-37.
Citations: 0
Investigation of Drought and Salinity Tolerance Related Genes and their Regulatory Mechanisms in Arabidopsis (Arabidopsis thaliana)
Q3 Computer Science Pub Date : 2018-03-07 DOI: 10.2174/1875036201811010012
Nikwan Shariatipour, B. Heidari
Results: Under drought stress, 2558 gene accessions in root tissue and 3691 in shoot tissue were significantly differentially expressed relative to the control condition. Likewise, under salinity stress, 9078 gene accessions in root tissue and 5785 in shoot tissue were differentially expressed between stressed and non-stressed conditions. Furthermore, the transcription regulatory activity of the differentially expressed genes was mainly attributable to hormone-, light-, circadian- and stress-responsive cis-acting regulatory elements, among which ABRE, ERE, P-box, TATC-box, CGTCA-motif, GARE-motif, TGACG-motif, GAG-motif, GA-motif, GATA-motif, TCT-motif, GT1-motif, Box 4, G-Box, I-box, LAMP-element, Sp1, MBS, TC-rich repeats, TCA-element and HSE were the most important elements in the identified up-regulated genes.
Open Bioinformatics Journal, vol. 11, pp. 12-28.
Citations: 16
Prospect and Competence of Quantitative Methods via Real-time PCR in a Comparative Manner: An Experimental Review of Current Methods
Q3 Computer Science Pub Date : 2018-02-28 DOI: 10.2174/1875036201811010001
Hossein Mahboudi, N. Heidari, Zahra Irani Rashidabadi, Ali Houshmand Anbarestani, Soroush Karimi, K. Darestani
Research article. Hossein Mahboudi, Negin Mohammadizadeh Heidari, Zahra Irani Rashidabadi, Ali Houshmand Anbarestani, Soroush Karimi and Kaveh Darabi Darestani. Affiliations: Department of medical biotechnology, School of Advanced Technologies in Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran; Department of Agronomy and Plant breeding, Agricultural Faculty, Zanjan University, Zanjan, Iran; Department of Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran; Nano Drug Delivery Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran; Biology Department, School of advanced sciences regenerative medicine, Tehran Medical Branch Islamic Azad University, Tehran, Iran.
Open Bioinformatics Journal, vol. 11, pp. 1-11.
Citations: 4
Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus
Q3 Computer Science Pub Date : 2017-12-12 DOI: 10.2174/1875036201710010016
Ebenezer S. Owusu Adjah, O. Montvida, Julius Agbeve, S. Paul
Background: Identifying diseased patients from primary care based electronic medical records (EMRs) poses methodological challenges that may affect epidemiologic inferences. Objective: To compare deterministic, clinically guided selection algorithms with probabilistic machine learning (ML) methodologies in their ability to identify patients with type 2 diabetes mellitus (T2DM) from large population-based EMRs in a nationally representative primary care database. Methods: Four cohorts of patients with T2DM were defined by a deterministic approach based on disease codes. The database was mined for a set of best predictors of T2DM, and the performance of six ML algorithms was compared based on cross-validated true positive rate, true negative rate, and area under the receiver operating characteristic curve. Results: In the database of 11,018,025 research-suitable individuals, 379,657 (3.4%) were coded as having T2DM. The logistic regression classifier was selected as the best ML algorithm and yielded a cohort of 383,330 patients with potential T2DM. Eighty-three percent (83%) of this cohort had a T2DM code, and 16% of the patients with a T2DM code were not included in this ML cohort. Of those in the ML cohort without a disease code, 52% had at least one measurement of elevated glucose level and 22% had received at least one prescription for antidiabetic medication. Conclusion: Deterministic cohort selection based on disease coding potentially introduces a significant misclassification problem. ML techniques allow testing for potential disease predictors and, given meaningful data input, are able to identify diseased cohorts in a holistic way.
Open Bioinformatics Journal, vol. 10, pp. 16-27.
Citations: 22
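The cohort figures quoted in the diabetes abstract above can be checked against each other with a little arithmetic. The sketch below is only a consistency check on the published totals, not the authors' evaluation code: the function names are hypothetical, and treating the disease code as the reference label for the true positive and true negative rates is an assumption made here for illustration (the abstract does not say how its cross-validated rates were computed).

```python
def overlap_summary(total, coded, ml_cohort, frac_ml_with_code):
    """Split the population into four groups using the totals from the abstract."""
    both = round(ml_cohort * frac_ml_with_code)  # in the ML cohort and coded
    coded_only = coded - both                    # coded but not in the ML cohort
    ml_only = ml_cohort - both                   # in the ML cohort, no disease code
    neither = total - both - coded_only - ml_only
    return {
        "both": both,
        "coded_only": coded_only,
        "ml_only": ml_only,
        "neither": neither,
    }


def rates_against_codes(groups):
    """TPR and TNR if the disease code is taken as the reference label (assumption)."""
    tpr = groups["both"] / (groups["both"] + groups["coded_only"])
    tnr = groups["neither"] / (groups["neither"] + groups["ml_only"])
    return tpr, tnr


# Totals as reported: population, coded T2DM, ML cohort, share of ML cohort with a code.
groups = overlap_summary(11_018_025, 379_657, 383_330, 0.83)
tpr, tnr = rates_against_codes(groups)
print(groups)
print(f"coded but not in ML cohort: {groups['coded_only'] / 379_657:.1%}")
print(f"TPR {tpr:.3f}, TNR {tnr:.4f}")
```

Under these assumptions the share of coded patients missing from the ML cohort comes out at roughly 16%, which agrees with the figure stated in the abstract.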
Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records
Q3 Computer Science Pub Date : 2017-07-31 DOI: 10.2174/1875036201709010001
O. Montvida, Ognjen Arandjelovic, E. Reiner, S. Paul
Background: Electronic Medical Records (EMRs) from primary/ambulatory care systems present a new and promising source of information for conducting clinical and translational research. Objectives: To address the methodological and computational challenges of extracting reliable medication information from raw data, which are often complex, incomplete and erroneous, and to assess whether the use of specific chaining fields of medication information may further improve data quality. Methods: Guided by a range of challenges associated with missing and internally inconsistent data, we introduce two methods for the robust extraction of patient-level medication data. The first method relies on chaining fields to estimate the duration of treatment ("chaining"), while the second disregards chaining fields and relies on the chronology of records ("continuous"). The Centricity EMR database was used to estimate treatment duration with both methods for two drugs widely prescribed to type 2 diabetes patients: insulin and glucagon-like peptide-1 receptor agonists. Results: At the individual patient level, the "chaining" approach could identify treatment alterations longitudinally and produced more robust estimates of treatment duration for individual drugs, while the "continuous" method was unable to capture those dynamics. At the population level, both methods produced similar estimates of average treatment duration; however, notable differences were observed at the individual patient level. Conclusion: The proposed algorithms explicitly identify and handle longitudinal erroneous or missing entries and estimate treatment duration with the specific drug(s) of interest, which makes them a valuable tool for future EMR-based clinical and pharmaco-epidemiological studies. To improve the accuracy of real-world studies, implementing chaining fields of medication information is recommended.
Open Bioinformatics Journal, vol. 10, pp. 1-15.
Citations: 13
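To make the idea of a chronology-based ("continuous") estimate more concrete, here is a minimal sketch of one way treatment episodes could be derived from prescription dates alone. It is not the authors' algorithm: the function names, the input format (a plain list of dates for one patient and one drug), and the 90-day gap that splits two episodes are all assumptions chosen for illustration.

```python
from datetime import date


def treatment_episodes(prescription_dates, max_gap_days=90):
    """Group prescription dates into episodes; a gap above max_gap_days starts a new one.

    The 90-day default is an illustrative assumption, not a value from the paper.
    """
    dates = sorted(set(prescription_dates))  # tolerate duplicates and unsorted input
    if not dates:
        return []
    episodes = []
    start = previous = dates[0]
    for current in dates[1:]:
        if (current - previous).days > max_gap_days:
            episodes.append((start, previous))  # close the running episode
            start = current
        previous = current
    episodes.append((start, previous))
    return episodes


def total_treatment_days(episodes):
    """Sum the first-to-last-record span of every episode, in days."""
    return sum((end - start).days for start, end in episodes)


# Hypothetical records for one patient and one drug.
records = [
    date(2015, 2, 1), date(2015, 1, 1), date(2015, 3, 1),
    date(2016, 1, 1), date(2016, 2, 1),
]
episodes = treatment_episodes(records)
print(episodes)
print(total_treatment_days(episodes), "days on treatment")
```

A sketch like this sees only record dates, so it cannot tell a dose change from a refill; that limitation is in line with the abstract's remark that the "continuous" method misses treatment alterations that chaining fields expose.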
Using Chou's Pseudo Amino Acid Composition and Machine Learning Method to Predict the Antiviral Peptides
Q3 Computer Science Pub Date : 2015-03-31 DOI: 10.2174/1875036201509010013
M. Zare, H. Mohabatkar, Fatemeh Faramarzi, Majid Mohammad Beigi, M. Behbahani
Traditional antiviral therapies are expensive, of limited availability, and cause several side effects. Designing antiviral peptides is therefore very important, because these peptides interfere with a key stage of the virus life cycle. Most antiviral peptides are derived from viral proteins, for example a peptide derived from the HIV-1 capsid protein. Because of the importance of these peptides, this study uses the concept of pseudo-amino acid composition (PseAAC) and machine learning methods to classify or identify antiviral peptides.
Open Bioinformatics Journal, vol. 9, pp. 13-19.
Citations: 21
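As a rough illustration of what a pseudo-amino acid composition feature vector looks like, the sketch below computes a simplified type-1 PseAAC: 20 normalized amino acid frequencies followed by λ sequence-order correlation factors. The abstract does not give the authors' settings, so several choices here are assumptions: a single physicochemical property (the Kyte-Doolittle hydropathy scale) stands in for the usual set of properties, and the weight `w = 0.05`, `lam = 2` and the example peptide are hypothetical.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values; using this single scale is an assumption.
KYTE_DOOLITTLE = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4, "H": -3.2,
    "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5,
    "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 amino acids."""
    mu = mean(scale.values())
    sigma = pstdev(scale.values())
    return {aa: (value - mu) / sigma for aa, value in scale.items()}


def pseaac(sequence, lam=2, w=0.05):
    """Return a (20 + lam)-dimensional PseAAC vector for a peptide longer than lam."""
    if len(sequence) <= lam:
        raise ValueError("sequence must be longer than lam")
    prop = standardize(KYTE_DOOLITTLE)
    n = len(sequence)
    # Normalized occurrence frequency of each amino acid.
    freqs = [sequence.count(aa) / n for aa in AMINO_ACIDS]
    # k-th tier correlation factor: mean squared property difference at distance k.
    thetas = [
        mean((prop[sequence[i]] - prop[sequence[i + k]]) ** 2 for i in range(n - k))
        for k in range(1, lam + 1)
    ]
    denominator = sum(freqs) + w * sum(thetas)
    return [f / denominator for f in freqs] + [w * t / denominator for t in thetas]


vector = pseaac("GIGKFLHSAKKFGKAFVGEIMNS")  # hypothetical example peptide
print(len(vector), round(sum(vector), 6))
```

The resulting vector has a fixed length regardless of peptide length, which is what lets peptides of different sizes be fed to a standard classifier.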
Protein-Protein Interaction Prediction using PCA and SVR-PHCS
Q3 Computer Science Pub Date : 2015-01-23 DOI: 10.2174/1875036201509010001
S. Mahmoudian, Abdulaziz Yousef, Nasrollah Moghadam Charkari
Protein-Protein Interactions (PPIs) play a key role in many biological systems, so identifying PPIs is critical for understanding cellular processes. Many experimental techniques have been applied to identify PPIs, but the data extracted using these techniques are incomplete and noisy. A number of computational methods, including machine learning classification techniques, have therefore been developed to reduce the noise in the data and predict new PPIs. Because regression methods have given good results on classification problems in other applications, this paper applies a regression view to the PPI prediction classification problem and proposes a new approach using Principal Component Analysis (PCA) and Support Vector Regression (SVR), with SVR improved by a new Parallel Hierarchical Cube Search (PHCS) method. First, PCA is applied to select an optimal subset of features, which reduces processing time and lessens the effect of noise. Then the PPIs are predicted using SVR. To improve the performance of SVR, the new PHCS method is applied to select appropriate values of the SVR parameters. The proposed method achieves a classification accuracy of 74.505% on the KUPS (The University of Kansas Proteomics Service) dataset, outperforming the other methods.
Open Bioinformatics Journal, vol. 9, pp. 1-12.
Citations: 0
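The abstract above does not describe how the PHCS method works, so the following is not that method. It is a generic coarse-to-fine search over a box ("cube") of hyperparameters, offered only to illustrate the general idea of hierarchical parameter search: evaluate a grid, keep the best point, shrink the box around it, and repeat. The function name, the grid size, the number of levels and the toy scoring function are all assumptions; in a real setting `score` would be something like the cross-validated accuracy of an SVR model, and the grid evaluations at each level could be run in parallel.

```python
from itertools import product


def coarse_to_fine_search(score, bounds, points_per_axis=5, levels=4):
    """Maximize score(point) over a box by repeatedly refining a grid.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    best_point, best_score = None, float("-inf")
    for _ in range(levels):
        # Evenly spaced grid values along each axis of the current box.
        axes = [
            [low + i * (high - low) / (points_per_axis - 1) for i in range(points_per_axis)]
            for low, high in bounds
        ]
        for point in product(*axes):
            value = score(point)
            if value > best_score:
                best_point, best_score = point, value
        # Halve the box width and re-center it on the best point found so far.
        bounds = [
            (center - (high - low) / 4, center + (high - low) / 4)
            for center, (low, high) in zip(best_point, bounds)
        ]
    return best_point, best_score


def toy_score(point):
    """Toy objective with its maximum at (1.3, -2.1); stands in for model accuracy."""
    log_c, log_gamma = point
    return -((log_c - 1.3) ** 2 + (log_gamma + 2.1) ** 2)


best, value = coarse_to_fine_search(toy_score, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, value)
```

With 5 points per axis and 4 levels this evaluates 100 points, whereas a single flat grid at the same final resolution over the original box would need thousands; the trade-off is that a coarse first level can miss a narrow optimum.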
Predicting Neutropenia Risk in Breast Cancer Patients from Pre-Chemotherapy Characteristics
Q3 Computer Science Pub Date : 2015-01-13 DOI: 10.2174/1875036201408010016
S. Lawal, M. Korenberg, Natalia M. Pittman, M. Mates
A previous study (Pittman, Hopman, Mates) of breast cancer patients undergoing curative chemotherapy (CT) found that the third most common reason for emergency department (ER) visits and hospital admission (HA) was febrile neutropenia. Factors associated with ER visits and HA included (1) stage of the cancer, (2) size of the tumor, (3) adjuvant versus neo-adjuvant CT ("adjuvance"), and (4) number of CT cycles. We hypothesized that a statistically significant predictor of neutropenia could be built from some of these factors, so that the neutropenia risk predicted for a patient feeling unwell during CT could be used in weighing the need to visit the ER. The number of CT cycles was not used as a factor, so that the predictor could calculate the neutropenia risk for a patient before the first CT cycle. Different models were built corresponding to different pre-chemotherapy factors or combinations of factors. The single factor yielding the best classification accuracy was tumor size (Matthews' correlation coefficient φ = +0.18, Fisher's exact two-tailed probability P < 0.0374). The odds ratio of developing febrile neutropenia for the predicted high-risk group compared to the predicted low-risk group was 5.1875. Combining tumor size with adjuvance yielded a slightly more accurate predictor (Matthews' correlation coefficient φ = +0.19, Fisher's exact two-tailed probability P < 0.0331, odds ratio = 5.5093). Based on the observed odds ratios, we conclude that a simple predictor of neutropenia may have value in deciding whether to recommend an ER visit. The predictor is sufficiently fast that it can run conveniently as an Applet on a mobile computing device.
Open Bioinformatics Journal, vol. 8, pp. 16-21.
Citations: 0
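The three statistics reported in the neutropenia abstract above (Matthews' correlation coefficient, Fisher's exact two-tailed probability and the odds ratio) are all computed from a 2x2 table of predicted risk group against observed outcome. The sketch below shows the standard formulas using only the Python standard library. The abstract does not give the underlying counts, so the table used here is purely hypothetical and will not reproduce the published values; the function names are likewise illustrative.

```python
from math import comb, sqrt


def matthews_cc(tp, fp, fn, tn):
    """Matthews correlation coefficient (phi) for a 2x2 table."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator


def odds_ratio(tp, fp, fn, tn):
    """Odds of the outcome in the predicted high-risk group over the low-risk group."""
    return (tp * tn) / (fp * fn)


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact probability for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def probability(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    observed = probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p for p in (probability(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: high-risk with/without neutropenia, low-risk with/without.
tp, fp, fn, tn = 3, 1, 1, 3
print(matthews_cc(tp, fp, fn, tn))
print(odds_ratio(tp, fp, fn, tn))
print(fisher_exact_two_tailed(tp, fp, fn, tn))
```

Note that `odds_ratio` divides by `fp * fn`, so a table with a zero in either of those cells needs separate handling (for example a continuity correction), which this sketch leaves out.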
Identification of the Factors Responsible for the Interaction of Hsp90α and its Client Proteins
Q3 Computer Science Pub Date : 2014-12-31 DOI: 10.2174/1875036201408010006
Ashutosh Shukla, S. Paul
Hsp90 is a stress protein that acts as a molecular chaperone and is known to assist in the maturation, folding and stabilization of various cellular proteins known as ‘client proteins’. However, the factors that drive the interaction between Hsp90 and its client proteins are not well understood. In the present investigation, we predicted the basis of the different interaction of Hsp90 with both wild and mutant p53 and other client proteins. We have predicted that the presence of hydrophobic patches having substantial value of hydropathy index and a minimum percent similarity of hydrophobic patches between Hsp90 and its client proteins of 40 % is a necessary condition for client proteins to be recognized by Hsp90 . We also predicted that the overall percentage hydrophobicity of client proteins more than 20 is a required condition for them to bind with Hsp90 . The docking energy of p53 with Hsp90 and with multi-chaperone complex was also separately reported. We have reported from docking result that mutant p53 has a stronger interaction with Hsp90 when associated with multi-chaperone complex than wild type p53 and this might be one of the causes of breast cancer pathogenesis.
Open Bioinformatics Journal, vol. 8, no. 1, pp. 6-15.
Citations: 0
Performances of Bioinformatics Pipelines for the Identification of Pathogens in Clinical Samples with the De Novo Assembly Approaches: Focus on 2009 Pandemic Influenza A (H1N1)
Q3 Computer Science Pub Date : 2014-12-31 DOI: 10.2174/1875036201408010001
T. Biagini, B. Bartolini, E. Giombini, M. Capobianchi, F. Ferrè, G. Chillemi, A. Desideri
Diagnostic assays for pathogen detection are critical components of public-health monitoring efforts. In view of the limitations of methods that target specific agents, new approaches are required for the identification of novel, modified or 'unsuspected' pathogens in public-health monitoring schemes. The metagenomic approach is an attractive possibility for rapid identification of these pathogens. The analysis of metagenomic libraries requires fast computation and appropriate algorithms to characterize sequences. In this paper, we compared the computational efficiency of different bioinformatic pipelines, established ad hoc and based on de novo assembly of pathogen genomes, using a data set generated with a 454 genome sequencer from respiratory samples of patients diagnosed with 2009 pandemic influenza A (H1N1). The results indicate high computational efficiency of the different bioinformatic pipelines, which reduce the number of alignments compared with identification based on the alignment of individual reads. The resulting computational time, added to the processing/sequencing time, is compatible with diagnostic needs. The pipelines described here are useful in the unbiased analysis of clinical samples from patients with infectious diseases, and may be relevant not only for the rapid identification but also for the extensive genetic characterization of viral pathogens without the need for culture amplification.
Open Bioinformatics Journal, vol. 8, no. 1, pp. 1-5.
Citations: 1