Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735927
Yi Yang, Si Li, Andrew S. Maxwell, Natalie D. Barker, Yan Peng, Y. Li, Haoni Li, Xi Wu, Pengcheng Li, Tao Huang, Chenhua Zhang, Nan Wang, E. Perkins, Chaoyang Zhang, P. Gong
The etiology of chemically-induced neurotoxicity such as seizures is poorly understood. Using reversible neurotoxicity induced by two neurotoxicants as an example, we demonstrate that a bioinformatics-guided reverse engineering approach can be applied to analyze time series microarray gene expression data and uncover the underlying molecular mechanism. Our results reinforce previous findings that cholinergic and GABAergic synapse pathways are the targets of carbaryl and RDX, respectively. We also conclude that perturbations to these pathways by sublethal concentrations of RDX and carbaryl were temporary, and earthworms were capable of fully recovering by the end of the 7-day recovery phase. In addition, our study indicates that many pathways other than those related to synaptic and neuronal activities were altered during the 6-day exposure phase.
{"title":"Deciphering chemically-induced reversible neurotoxicity by reconstructing perturbed pathways from time series microarray gene expression data","authors":"Yi Yang, Si Li, Andrew S. Maxwell, Natalie D. Barker, Yan Peng, Y. Li, Haoni Li, Xi Wu, Pengcheng Li, Tao Huang, Chenhua Zhang, Nan Wang, E. Perkins, Chaoyang Zhang, P. Gong","doi":"10.1109/GENSIPS.2013.6735927","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735927","url":null,"abstract":"The etiology of chemically-induced neurotoxicity like seizures is poorly understood. Using reversible neurotoxicity induced by two neurotoxicants as example, we demonstrate that a bioinformatics-guided reverse engineering approach can be applied to analyze time series microarray gene expression data and uncover the underlying molecular mechanism. Our results reinforce previous findings that cholinergic and GABAergic synapse pathways are the target of carbaryl and RDX, respectively. We also conclude that perturbations to these pathways by sublethal concentrations of RDX and carbaryl were temporary, and earthworms were capable of fully recovering at the end of the 7-day recovery phase. In addition, our study indicates that many pathways other than those related to synaptic and neuronal activities were altered during the 6-day exposure phase.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126927979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735949
Yan Cui, Xiaodong Cai, Zhong Jin
Gene expression profiles have been used to predict cancer recurrence or other clinical outcomes of cancer patients. However, clinical information of cancer patients is often incomplete, which yields many unlabeled samples that cannot be used in supervised learning. In this paper, we develop a novel semi-supervised learning (SSL) method that uses both labeled and unlabeled patient samples to predict cancer recurrence. Our new SSL algorithm employs a sparse representation approach where a labeled sample is represented as a combination of a small number of properly chosen unlabeled samples. Experiments with a set of gene expression data from patients with colorectal cancer (CRC) demonstrate that our SSL algorithm can improve prediction accuracy compared to two other SSL methods, TSVM and T3VM, and to the traditional support vector machine.
{"title":"Semi-supervised classification using sparse representation for cancer recurrence prediction","authors":"Yan Cui, Xiaodong Cai, Zhong Jin","doi":"10.1109/GENSIPS.2013.6735949","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735949","url":null,"abstract":"Gene expression profiles have been used to predict cancer recurrence or other clinical outcomes of cancer patients. However, clinical information of cancer patients is often incomplete, which yields many unlabeled samples that cannot be used in supervised learning. In this is paper, we develop a novel semi-supervised leaning (SSL) method that uses both labeled and unlabeled patient samples to predict cancer recurrence. Our new SSL algorithm employs a sparse representation approach where a labeled sample is represented as a combination of a small number of properly chosen unlabeled samples. Experiments with a set of gene expression data from patients with colorectal cancer(CRC) demonstrate that our SSL algorithm can improve prediction accuracy compared to other two SSL methods including TSVM and T3VM, and the traditional support vector machine.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116069818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735928
Noah E. Berlow, Saad Haider, R. Pal, C. Keller
A model for drug sensitivity prediction is often inferred from the response of a training drug screen. Quantifying the inference power of perturbations before experimentation will assist in selecting drug screens with higher predictive power. In this article, we present a novel approach to quantify the inference power of a drug screen based on drug target profiles and biologically motivated monotonicity constraints. We have tested our algorithm on synthetically and experimentally generated datasets, and the results illustrate the suitability of the proposed measure in estimating information gained from an experimental drug screen.
{"title":"Quantifying the inference power of a drug screen for predictive analysis","authors":"Noah E. Berlow, Saad Haider, R. Pal, C. Keller","doi":"10.1109/GENSIPS.2013.6735928","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735928","url":null,"abstract":"A model for drug sensitivity prediction is often inferred from the response of a training drug screen. Quantifying the inference power of perturbations before experimentation will assist in selecting drugs screens with higher predictive power. In this article, we present a novel approach to quantify the inference power of a drug screen based on drug target profiles and biologically motivated monotonicity constraints. We have tested our algorithm on synthetically and experimentally generated datasets and the results illustrate the suitability of the proposed measure in estimating information gained from an experimental drug screen.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131129074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735932
Tianyi Yang, Nguyen T. Nguyen, Yufang Jin, M. Lindsey
With the development of new technologies applied to biological experiments, more and more data are generated every day. Mathematical modeling plays a critical role in making predictions about biological systems. Ordinary differential equations (ODEs) make up a large portion of mathematical models, and they inevitably involve parameters. Because noise is intrinsic to all experiments, treating parameters as statistical distributions is realistic. In this paper, we discuss how to calculate the parameter distribution analytically for a first-order ODE common in biological systems, given an experimentally observed output that is assumed to be normally distributed. Conditions under which the parameter can be correctly estimated are elucidated.
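The idea of deriving a parameter distribution from a normally distributed output can be illustrated with a minimal sketch. The abstract does not state the ODE, so the decay model dx/dt = -k x with solution x(t) = x0 exp(-k t), the function names, and all numeric settings below are assumptions chosen for illustration, not the authors' formulation. Under that assumption, k = -ln(x / x0) / t, and a change of variables gives the density of k as f_K(k) = f_X(x0 exp(-k t)) * x0 t exp(-k t).

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma^2) variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def rate_density(k, t, x0, mu, sigma):
    """Density of k = -ln(x / x0) / t induced by x ~ Normal(mu, sigma^2).

    Change of variables: f_K(k) = f_X(x(k)) * |dx/dk|, x(k) = x0 * exp(-k t).
    Only x > 0 maps to a real k, so any probability mass at x <= 0 is lost.
    """
    x = x0 * math.exp(-k * t)
    return normal_pdf(x, mu, sigma) * x * t  # |dx/dk| = x0 * t * exp(-k t)


def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Hypothetical settings: initial value, observation time, output mean and sd.
X0, T, MU, SIGMA = 1.0, 2.0, 0.5, 0.05


def density(k):
    return rate_density(k, T, X0, MU, SIGMA)


total_mass = trapezoid(density, -2.0, 5.0)   # should be close to 1
grid = [i * 0.0005 for i in range(2001)]     # candidate k values in [0, 1]
mode_k = max(grid, key=density)              # most probable rate on the grid
print(total_mass, mode_k, math.log(2.0) / T)
```

In this sketch the density integrates to nearly one only because the output mean lies many standard deviations above zero. When a noticeable share of the normal output falls at or below zero, the logarithm is undefined for those values, which is one simple way such a derivation can fail to recover the parameter.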
{"title":"Parameter distribution estimation in first order ODE","authors":"Tianyi Yang, Nguyen T. Nguyen, Yufang Jin, M. Lindsey","doi":"10.1109/GENSIPS.2013.6735932","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735932","url":null,"abstract":"With development of new technologies applied to biological experiments, more and more data are generated every day. To make predictions in biological systems, mathematical modeling plays a critical role. Ordinary differential equations (ODEs) contribute to a large portion in mathematical modeling. In which parameters are inevitable. Noise is intrinsic in all experiments. Therefore, to think of parameters as statistical distributions is a realistic treatment. In this paper, we discuss in a 1st order ODE common in biological systems, how to calculate parameter distribution analytically according to the experimentally observed output assumed to be normal distribution. Conditions on when parameter can be correctly estimated are elucidated.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131937115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735941
B. Wajid, A. R. Ekti, Amina Noor, E. Serpedin, M. N. Ayyaz, H. Nounou, M. Nounou
A novel assembly pipeline, MiB, employs Minimum Description Length (MDL), de Bruijn graphs and Bayesian estimation for reference-assisted assembly of a novel genome. In a previous study, MiB assembly was compared with nine other assembly algorithms, showing significant improvement in results coupled with very large execution times. This correspondence introduces 'Supersonic MiB', an extension of our previous study on MiB. Supersonic MiB aims to stimulate the assembly pipeline of MiB, showing significant improvement in execution time compared to its predecessor.
{"title":"Supersonic MiB","authors":"B. Wajid, A. R. Ekti, Amina Noor, E. Serpedin, M. N. Ayyaz, H. Nounou, M. Nounou","doi":"10.1109/GENSIPS.2013.6735941","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735941","url":null,"abstract":"A novel assembly pipeline, MiB, employs Minimum Description Length (MDL), de-Bruijn graphs and Bayesian estimation for reference assisted assembly of the novel genome. In a previous study MiB assembly was compared with nine other assembly algorithms showing significant improvement in results coupled with very large execution times. This correspondence introduces `Supersonic MiB', an extension to our previous study MiB. Supersonic MiB aims to stimulate the assembly pipeline of MiB showing significant improvement in execution time compared to its predecessor.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121272076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735942
Roozbeh Dehghannasiri, Byung-Jun Yoon, E. Dougherty
One of the main issues in systems biology is the limited resources available for conducting biological experiments. Therefore, a strategy for prioritizing experiments is needed. Experimental design is the process of planning experiments in such a way as to make them as informative as possible. In this work, we propose a novel strategy for designing effective experiments that can optimally reduce the uncertainty in gene regulatory networks, based on the concept of mean objective cost of uncertainty (MOCU).
{"title":"Designing experiments for optimal reduction of uncertainty in gene regulatory networks","authors":"Roozbeh Dehghannasiri, Byung-Jun Yoon, E. Dougherty","doi":"10.1109/GENSIPS.2013.6735942","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735942","url":null,"abstract":"One of the main issues in systems biology is limited resources for conducting biological experiments. Therefore, a strategy for prioritizing the experiments seems to be inevitable. Experimental design is the process of planning experiments in such a way to make experiments as informative as possible. In this work, we propose a novel strategy for designing effective experiments that can optimally reduce the uncertainty in gene regulatory networks, based on the concept of mean objective cost of uncertainty (MOCU).","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127268629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735926
O. A. Arshad, P. Venkatasubramani, A. Datta, Jijayanagaram Venkatraj
The uncontrolled cell proliferation that is characteristically associated with cancer is usually accompanied by alterations in the genome and cell metabolism. Indeed, the phenomenon of cancer cells metabolizing glucose using a less efficient anaerobic process even in the presence of normal oxygen levels, termed the Warburg effect, is currently considered to be one of the hallmarks of cancer. Diabetes, much like cancer, is defined by significant metabolic changes. Recent epidemiological studies have shown that diabetes patients treated with the anti-diabetic drug Metformin have a significantly lower risk of cancer than patients treated with other anti-diabetic drugs. We utilize a Boolean logic model of the pathways commonly mutated in cancer not only to investigate the efficacy of Metformin for cancer therapeutic purposes but also to demonstrate how Metformin in concert with standard therapeutic drugs could provide better and less toxic clinical outcomes than chemotherapy alone.
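To make the notion of a Boolean logic model concrete, here is a minimal sketch of a synchronously updated Boolean network. The three-node wiring (a drug input switching on an inhibitor that blocks a growth node) and every node name are hypothetical and deliberately tiny; they are not the pathway model used in the paper.

```python
def step(state):
    """One synchronous update: every node reads the previous state."""
    return {
        # Inputs are held fixed by the experimenter.
        "growth_signal": state["growth_signal"],
        "drug": state["drug"],
        # Hypothetical rules for the internal nodes.
        "inhibitor": state["drug"],
        "growth_node": state["growth_signal"] and not state["inhibitor"],
        "proliferation": state["growth_node"],
    }


def run_to_fixed_point(state, max_steps=20):
    """Iterate until the state stops changing or max_steps is reached."""
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt
    return state


def simulate(drug):
    """Start with internal nodes off and a constant growth signal."""
    start = {
        "growth_signal": True,
        "drug": drug,
        "inhibitor": False,
        "growth_node": False,
        "proliferation": False,
    }
    return run_to_fixed_point(start)


untreated = simulate(drug=False)
treated = simulate(drug=True)
print(untreated["proliferation"], treated["proliferation"])
```

Comparing the steady-state value of an output node with and without an intervention is the basic pattern by which such models are used to rank single drugs or drug combinations.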
{"title":"Exploiting the cancer and diabetes metabolic connection for therapeutic purposes","authors":"O. A. Arshad, P. Venkatasubramani, A. Datta, Jijayanagaram Venkatraj","doi":"10.1109/GENSIPS.2013.6735926","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735926","url":null,"abstract":"The uncontrolled cell proliferation that is characteristically associated with cancer is usually accompanied by alterations in the genome and cell metabolism. Indeed, the phenomenon of cancer cells metabolizing glucose using a less efficient anaerobic process even in the presence of normal oxygen levels, termed the Warburg effect, is currently considered to be one of the hallmarks of cancer. Diabetes, much like cancer, is defined by significant metabolic changes. Recent epidemiological studies have shown that diabetes patients treated with the anti-diabetic drug Metformin, have significantly lowered risk of cancer as compared to patients treated with other anti-diabetic drugs. We utilize a Boolean logic model of the pathways commonly mutated in cancer to not only investigate the efficacy of Metformin for cancer therapeutic purposes but also demonstrate how Metformin in concert with standard therapeutic drugs could provide better and less toxic clinical outcomes as compared to using chemotherapy alone.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115027690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735920
Yu-Chiao Chiu, E. Chuang, T. Hsiao, Yidong Chen
Summary form only given. MicroRNAs (miRNAs) are short non-coding RNAs with an average length of 22 nucleotides. They are known to induce mRNA degradation or suppression of translation by complementarily binding to 3' untranslated regions (3' UTRs) of target mRNA transcripts. Recently, an alternative mechanism through which miRNAs participate in gene regulation was postulated and experimentally validated, namely the competing endogenous RNAs (ceRNAs). By competing for a limited pool of common targeting miRNAs (miRNA programs; miRP), pairs of genes (ceRNAs) sharing, fully or partially, identical miRNA binding sites can "talk" to each other: when one ceRNA is up-regulated (or down-regulated) in cells, it attracts (or releases) the targeting miRNAs away from (or toward) the other ceRNA, and in turn has protective (or harmful) effects on expression of the other ceRNA. Based on in silico and in vitro analysis, recent reports suggested the dynamic and condition-specific properties of ceRNA regulation. The essential factors involved in ceRNA regulation include the size of the miRP, the number of miRP binding sites, the expression level of the miRP, and the expression level of the ceRNAs. To better characterize the optimal conditions for ceRNA regulation, in the present study we aim to infer how these essential factors determine the strength of ceRNA regulation in vivo, by analyzing TCGA datasets of glioblastoma multiforme (GBM) patients with 491 tumor samples profiled with paired miRNA and gene expression. Defining two genes that share any number of common targeting miRNAs as a putative ceRNA pair, and using the TargetScan algorithm, we identified 47,451,423 putative ceRNA pairs involving 10,872 ceRNAs (genes). Pairwise correlation coefficients of gene expression profiles were then computed for each of the putative ceRNA pairs, followed by the cumulative distribution function (CDF) of these coefficients. Varying the size of the miRP, for example, generated multiple CDFs, and a goodness-of-fit test was then performed to pinpoint the essential factors and optimal conditions for intensified ceRNA activity. Our analysis results demonstrated that increased size of miRPs as well as the abundance of miRP binding sites stabilize ceRNA activity and strengthen coexpression of ceRNA pairs. Furthermore, the expression levels of both miRPs and ceRNAs affect ceRNA activity and lead to statistically significant differences in the distributions of correlation coefficients. Taken together, the results indicated that ceRNA regulation depends on the states of the essential factors and thus may involve complex and dynamic processes in vivo. Our findings bring biological insights into complex ceRNA crosstalk in glioblastoma multiforme and contribute to further unveiling the complex mechanisms governing ceRNA regulation.
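The analysis steps named in the abstract (pairwise correlation, empirical CDF, comparison of CDFs) can be sketched as follows. The abstract does not name the goodness-of-fit test, so the two-sample Kolmogorov-Smirnov statistic used here is an assumption, and the two lists of correlation coefficients are made-up illustrative values rather than TCGA results.

```python
import bisect
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ecdf(sample):
    """Return the empirical CDF of a sample as a function of x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    cdf_a, cdf_b = ecdf(a), ecdf(b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in list(a) + list(b))


# Illustrative correlation coefficients for gene pairs grouped by the number
# of shared miRNAs (hypothetical values).
few_shared = [0.05, 0.10, -0.02, 0.12, 0.08]
many_shared = [0.30, 0.42, 0.25, 0.38, 0.33]

gap = ks_statistic(few_shared, many_shared)
print(gap)
```

A large gap between the two empirical CDFs indicates that the grouping factor shifts the distribution of coexpression; a formal test would also convert the statistic into a p-value.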
{"title":"Characterization of conditions for competing endogenous RNA regulation in GBM","authors":"Yu-Chiao Chiu, E. Chuang, T. Hsiao, Yidong Chen","doi":"10.1109/GENSIPS.2013.6735920","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735920","url":null,"abstract":"Summary form only given. MicroRNAs (miRNAs) are short non-coding RNAs with the average length of 22 nucleotides. They are known to induce mRNA degradation or suppression of translation by complementarily binding to 3' untranslated regions (3' UTRs) of target mRNA transcripts. Recently, an alternative mechanism through which miRNAs participate in gene regulation was postulated and experimentally validated, namely the competing endogenous RNAs (ceRNAs). By competing for a limited pool of common targeting miRNAs (miRNA programs; miRP), pairs of genes (ceRNAs) sharing, fully or partially, identical miRNAs binding sites can “talk” to each other: when one ceRNA is up-regulated (or down-regulated) in cells, it attracts (or releases) the targeting miRNAs away from (or toward) the other ceRNA, and in turn have protective (or harmful) effects on expression of the other ceRNA. Based on in silico and in vitro analysis, recent reports suggested the dynamic and condition-specific properties of ceRNA regulation. The essential factors involved in ceRNA regulation include size of miRP, number of miRP binding sites, expression level of miRP, and expression level of ceRNAs. For better characterizing the optimal conditions for ceRNA regulation, in the present study we aim to confer how essential factors determine strength of ceRNA regulation in vivo, by analyzing TCGA datasets of glioblastoma multiforme (GBM) patients with 491 tumor samples profiled with paired miRNA and gene expression. Based on the definition that two genes sharing any number of common targeting miRNAs as a putative ceRNA pair, and by utilizing TargetScan algorithm, we identified 47,451,423 putative ceRNA pairs, involving 10,872 ceRNAs (genes). 
Pairwise correlation coefficients of gene expression profiles were then computed for each of the putative ceRNA pairs, and then the CDF. Varying size of miRP, for example, generated multiple CDFs, and then the goodness-of-fit was performed for pinpointing the essential factors and optimal conditions for intensified ceRNA activity. Our analysis results demonstrated that increased size of miRPs as well as the abundance of miRP binding sites stabilize ceRNA activity and strengthen coexpression of ceRNA pairs. Furthermore, the expression levels of both miRPs and ceRNAs affect ceRNA activity and lead to statistically significant differences in distributions of correlation coefficients. Taken together, the results indicated that ceRNA regulation depends on states of the essential factors and thus may involve complex and dynamic processes in vivo. Our findings bring biological insights into complex ceRNA crosstalk in glioblastoma multiforme and contribute to further unveiling complex mechanism governing ceRNA regulation.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132122460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735925
Hua Li, J. Vallandingham, Jing Chen
Following the breakthrough of microarray technology, next generation sequencing (NGS) technology has further advanced approaches in modern biomedical research. High-throughput NGS technology is now frequently used to profile tumor and control samples for the study of DNA copy number variants (CNVs). In particular, the ratio of the read count of the tumor sample to that of the control sample is commonly used for identifying CNV regions. We illustrate that a change-point (or breakpoint) detection method, along with a Bayesian approach, is particularly suitable for identifying CNVs in the reads ratio data. We have implemented our algorithm as a user-friendly R package, SeqBBS (Bayesian breakpoints search for sequencing data), and applied our method to the sequencing data of reads ratio between the breast tumor cell line HCC1954 and its matched normal cell line BL1954. Breakpoints that separate different CNV regions are successfully identified.
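The change-point idea can be sketched on a toy reads-ratio series. This is not the SeqBBS algorithm or its API: the sketch swaps in a plain least-squares search for a single breakpoint instead of the Bayesian approach described above, and the ratio values are invented for illustration.

```python
import math


def sse(values):
    """Sum of squared deviations from the segment mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_breakpoint(values):
    """Return (k, cost) where splitting into values[:k] and values[k:]
    minimizes the total within-segment squared error."""
    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical tumor/control read-count ratios along consecutive genomic
# windows: a copy-neutral stretch followed by a gained region.
ratios = [1.0, 1.1, 0.9, 1.05, 0.95, 2.0, 2.1, 1.9, 2.05, 1.95]
log_ratios = [math.log2(r) for r in ratios]

k, cost = best_breakpoint(log_ratios)
print(k, cost)
```

Working on log2 ratios makes gains and losses symmetric around zero. Detecting several breakpoints would require applying the search recursively or using a multi-change-point model.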
{"title":"SeqBBS: A change-point model based algorithm and R package for searching CNV regions via the ratio of sequencing reads","authors":"Hua Li, J. Vallandingham, Jing Chen","doi":"10.1109/GENSIPS.2013.6735925","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735925","url":null,"abstract":"Following the breakthrough of the microarray technology, the next generation sequencing (NGS) technology further advanced approaches in modern biomedical research. The high-throughput NGS technology is now frequently used in profiling tumor and control samples for the study of DNA copy number variants (CNVs). In particular, the ratio of read count of the tumor sample to that of the control sample is popularly used for identifying CNV regions. We illustrate that a change-point (or a breakpoint) detection method, along with a Bayesian approach, is particularly suitable for identifying CNVs in the reads ratio data. We have written our algorithm into a user friendly R-package, SeqBBS (stands for Bayesian breakpoints search for sequencing data) and applied our method to the sequencing data of reads ratio between the breast tumor cell lines HCC1954 and its matched normal cell line BL1954. Breakpoints that separate different CNV regions are successfully identified.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114404273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2013-11-01 DOI: 10.1109/GENSIPS.2013.6735936
Belhassen Bayar, N. Bouaynaya, R. Shterenberg
The major challenge in reverse-engineering genetic regulatory networks is the small number of (time) measurements or experiments compared to the number of genes, which makes the system under-determined and hence unidentifiable. The only way to overcome the identifiability problem is to incorporate prior knowledge about the system. It is often assumed that genetic networks are sparse. In addition, if the measurements, in each experiment, present an unknown correlation structure, then the estimation problem becomes even more challenging. Estimating the covariance structure will improve the estimation of the network connectivity but will also make the estimation of the already under-determined problem even more challenging. In this paper, we formulate reverse-engineering genetic networks as a multiple linear regression problem. We show that, if the number of experiments is smaller than the number of genes and if the measurements present an unknown covariance structure, then the likelihood function diverges, making the maximum likelihood estimator senseless. We subsequently propose a normalized likelihood function that guarantees convergence while keeping the form of the Gaussian distribution. The optimal connectivity matrix is approximated as the solution of a convex optimization problem. Our simulation results show that the proposed maximum normalized-likelihood estimator outperforms the classical regularized maximum likelihood estimator, which assumes a known covariance structure.
{"title":"Inference of genetic regulatory networks with unknown covariance structure","authors":"Belhassen Bayar, N. Bouaynaya, R. Shterenberg","doi":"10.1109/GENSIPS.2013.6735936","DOIUrl":"https://doi.org/10.1109/GENSIPS.2013.6735936","url":null,"abstract":"The major challenge in reverse-engineering genetic regulatory networks is the small number of (time) measurements or experiments compared to the number of genes, which makes the system under-determined and hence unidentifiable. The only way to overcome the identifiability problem is to incorporate prior knowledge about the system. It is often assumed that genetic networks are sparse. In addition, if the measurements, in each experiment, present an unknown correlation structure, then the estimation problem becomes even more challenging. Estimating the covariance structure will improve the estimation of the network connectivity but will also make the estimation of the already under-determined problem even more challenging. In this paper, we formulate reverse-engineering genetic networks as a multiple linear regression problem. We show that, if the number of experiments is smaller than the number of genes and if the measurements present an unknown covariance structure, then the likelihood function diverges, making the maximum likelihood estimator senseless. We subsequently propose a normalized likelihood function that guarantees convergence while keeping the form of the Gaussian distribution. The optimal connectivity matrix is approximated as the solution of a convex optimization problem. 
Our simulation results show that the proposed maximum normalized-likelihood estimator outperforms the classical regularized maximum likelihood estimator, which assumes a known covariance structure.","PeriodicalId":336511,"journal":{"name":"2013 IEEE International Workshop on Genomic Signal Processing and Statistics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114947040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}