Pub Date: 2015-11-01. Epub Date: 2015-12-17. DOI: 10.1109/BIBM.2015.7359869
Noah Stier, Nicholas Vincent, David Liebeskind, Fabien Scalzo
In acute ischemic stroke treatment, prediction of tissue survival outcome plays a fundamental role in the clinical decision-making process, as it can be used to assess the balance of risk vs. possible benefit when considering endovascular clot-retrieval intervention. For the first time, we construct a deep learning model of tissue fate based on randomly sampled local patches from the hypoperfusion (Tmax) feature observed in MRI immediately after symptom onset. We evaluate the model with respect to the ground truth established by an expert neurologist four days after intervention. Experiments on 19 acute stroke patients evaluated the accuracy of the model in predicting tissue fate. Results show the superiority of the proposed regional learning framework versus a single-voxel-based regression model.
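The contrast between regional (patch-based) and single-voxel inputs can be illustrated with a minimal sketch. It assumes a 2D Tmax slice and a binary outcome map of the same shape. The function name `sample_patches`, the patch size, and the synthetic arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def sample_patches(tmax, outcome, n_patches, half_width, rng):
    """Randomly sample square patches from a 2D map.

    Returns (X, y): each row of X is a flattened patch, and y holds the
    outcome label of the voxel at the center of that patch.
    """
    h, w = tmax.shape
    size = 2 * half_width + 1
    # Pick centers far enough from the border that the whole patch fits.
    rows = rng.integers(half_width, h - half_width, size=n_patches)
    cols = rng.integers(half_width, w - half_width, size=n_patches)
    X = np.empty((n_patches, size * size), dtype=tmax.dtype)
    y = np.empty(n_patches, dtype=outcome.dtype)
    for i, (r, c) in enumerate(zip(rows, cols)):
        patch = tmax[r - half_width:r + half_width + 1,
                     c - half_width:c + half_width + 1]
        X[i] = patch.ravel()
        y[i] = outcome[r, c]
    return X, y

rng = np.random.default_rng(0)
tmax = rng.random((64, 64))               # synthetic stand-in for a Tmax slice
outcome = (tmax > 0.5).astype(np.int64)   # synthetic stand-in for the label map
X, y = sample_patches(tmax, outcome, n_patches=100, half_width=2, rng=rng)
```

A single-voxel baseline would use only the center column of `X` (index 12 for a 5x5 patch), whereas a regional model sees all 25 values.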
Title: Deep Learning of Tissue Fate Features in Acute Ischemic Stroke. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 1316-1321.
Pub Date: 2015-11-01. DOI: 10.1109/BIBM.2015.7359913
Matt Schwartz, Martin Park, John H Phan, May D Wang
Kidney cancer is a prominent concern in modern medicine. Predicting patient survival is critical to patient awareness and to developing proper treatment regimens. Previous prediction models built upon molecular feature analysis are limited to gene expression data. In this study, we investigate the difference in predicting five-year survival between unimodal and multimodal analysis of RNA-seq data from gene, exon, junction, and isoform modalities. Our preliminary findings show higher predictive accuracy, as measured by area under the ROC curve (AUC), for multimodal learning than for unimodal learning with both support vector machine (SVM) and k-nearest neighbor (KNN) methods. The results of this study justify further research on the use of multimodal RNA-seq data to predict survival for other cancer types using a larger sample size and additional machine learning methods.
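For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition. The function name `auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ranked correctly.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This quadratic-time form is only meant to make the definition concrete; a rank-based computation is preferable for large samples.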
Title: Integration of multimodal RNA-seq data for prediction of kidney cancer survival. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 1591-1595.
Pub Date: 2015-11-01. DOI: 10.1109/BIBM.2015.7359776
Priya Ramesh, Annan Wei, Elisabeth Welter, Yvan Bamps, Shelley Stoll, Ashley Bukach, Martha Sajatovic, Satya S Sahoo
Insight is a Semantic Web technology-based platform to support large-scale secondary analysis of healthcare data for neurology clinical research. Insight features the novel use of: (1) provenance metadata, which describes the history or origin of patient data, in clinical research analysis, and (2) support for patient cohort queries across multiple institutions conducting research in epilepsy, which is one of the most common neurological disorders, affecting 50 million persons worldwide. Insight is being developed as a healthcare informatics infrastructure to support a national network of eight epilepsy research centers across the U.S. funded by the U.S. Centers for Disease Control and Prevention (CDC). This paper describes the use of the World Wide Web Consortium (W3C) PROV recommendation for provenance metadata, which allows researchers to create patient cohorts based on the provenance of the research studies. In addition, the paper describes the use of a description logic-based OWL2 epilepsy ontology for cohort queries with "expansion of query expression" using ontology reasoning. Finally, the evaluation results for data integration and query performance are described using data from three research studies with 180 epilepsy patients. The experimental results demonstrate that Insight is a scalable approach to using semantic provenance metadata for context-based data analysis in healthcare informatics.
Title: Insight: Semantic Provenance and Analysis Platform for Multi-center Neurology Healthcare Research. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 731-736.
Pub Date: 2015-11-01. Epub Date: 2015-12-17. DOI: 10.1109/BIBM.2015.7359924
Karthik Devarajan, Nader Ebrahimi, Ehsan Soofi
The objective of this paper is to provide a hybrid algorithm for non-negative matrix factorization based on a symmetric version of Kullback-Leibler divergence, known as intrinsic information. The convergence of the proposed algorithm is shown for several members of the exponential family such as the Gaussian, Poisson, gamma and inverse Gaussian models. The speed of this algorithm is examined and its usefulness is illustrated through some applied problems.
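As a reference point, the standard symmetrized Kullback-Leibler divergence between a non-negative data matrix $V$ and its factorization $WH$ can be written as below. This is the textbook form under a generalized (Poisson-type) KL divergence. The notation $V$, $W$, $H$ is assumed here, and the paper's exact objective for the other exponential-family models may differ.

```latex
\begin{align}
D(V \,\|\, WH) &= \sum_{i,j} \left[ V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right], \\
J(V, WH) &= D(V \,\|\, WH) + D(WH \,\|\, V)
          = \sum_{i,j} \bigl( V_{ij} - (WH)_{ij} \bigr) \log \frac{V_{ij}}{(WH)_{ij}} .
\end{align}
```

The linear terms cancel in the sum, so $J$ is symmetric in its two arguments and is zero exactly when $V = WH$ entrywise.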
Title: A Hybrid Algorithm for Non-negative Matrix Factorization Based on Symmetric Information Divergence. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 1658-1664.
Pub Date: 2015-11-01. DOI: 10.1109/BIBM.2015.7359860
Jing He, Stephanie Zeil, Hussam Hallak, Kele McKaig, Julio Kovacs, Willy Wriggers
Cryo-electron microscopy (cryo-EM) is an important biophysical technique that produces three-dimensional (3D) density maps at different resolutions. Because more and more models are being produced from cryo-EM density maps, validation of the models is becoming important. We propose a method for measuring local agreement between a model and the density map using the central axis of the helix. This method was tested using 19 helices from cryo-EM density maps between 5.5 Å and 7.2 Å resolution and 94 helices from simulated density maps. This method distinguished most of the well-fitting helices, although challenges exist for shorter helices.
Title: Comparison of an Atomic Model and Its Cryo-EM Image at the Central Axis of a Helix. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 1253-1259.
Pub Date: 2015-11-01. Epub Date: 2015-12-17. DOI: 10.1109/BIBM.2015.7359870
Nicholas Vincent, Noah Stier, Songlin Yu, David S Liebeskind, Danny JJ Wang, Fabien Scalzo
Hyperperfusion detected on arterial spin labeling (ASL) images acquired after acute stroke onset has been shown to correlate with the development of subsequent intracerebral hemorrhage. In this study, we present a quantitative hyperperfusion detection model that can provide objective decision support for the interpretation of ASL cerebral blood flow (CBF) maps and rapidly delineate hyperperfusion regions. The detection problem is solved using deep learning such that the model relates ASL image patches to the corresponding label (normal or hyperperfused). Our method takes into account the regional intensity values of the contralateral hemisphere during the labeling of a pixel. Each input vector is associated with a label corresponding to the presence of hyperperfusion, which was manually established by a clinical researcher in neurology. When compared to the manually established hyperperfusion, the predicted maps reached an accuracy of 97.45 ± 2.49% after cross-validation. Pattern recognition based on deep learning can provide an accurate and objective measure of hyperperfusion on ASL CBF images and could therefore improve the detection of hemorrhagic transformation in acute stroke patients.
Title: Detection of Hyperperfusion on Arterial Spin Labeling using Deep Learning. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2015, pp. 1322-1327.
Pub Date: 2014-11-01. DOI: 10.1109/BIBM.2014.6999159
Quazi Abidur Rahman, Larisa G Tereshchenko, Matthew Kongkatong, Theodore Abraham, M Roselle Abraham, Hagit Shatkay
Tests based on electrocardiograms (ECG), which record the heart's electrical activity, can help in early detection of patients with hypertrophic cardiomyopathy (HCM), in which the heart muscle is partially thickened and blood flow is (potentially fatally) obstructed. This paper presents a cardiovascular-patient classifier we developed to identify HCM patients using standard 10-second, 12-lead ECG signals. Patients are classified as having HCM if the majority of their heartbeats are recognized as HCM. Thus, the classifier's underlying task is to recognize individual heartbeats segmented from 12-lead ECG signals as HCM beats, where heartbeats from non-HCM cardiovascular patients are used as controls. We extracted 504 morphological and temporal features, both commonly used and newly developed ones, from ECG signals for heartbeat classification. To assess classification performance, we trained and tested a random forest classifier and a support vector machine classifier using 5-fold cross-validation. The patient-classification precision and F-measure of both classifiers are close to 0.85. Recall (sensitivity) and specificity are approximately 0.90. We also conducted feature selection experiments by gradually removing the least informative features; the results show that a relatively small subset of 304 highly informative features can achieve performance measures comparable to those achieved by using the complete set of features.
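The patient-level rule described above (a patient is HCM if the majority of heartbeats are recognized as HCM) can be sketched as follows. The function name `classify_patient` is hypothetical, and treating a tie as non-HCM is an assumption, since the abstract does not say how ties are handled.

```python
def classify_patient(beat_predictions):
    """Return True (HCM) if a strict majority of beat-level predictions are HCM.

    beat_predictions: iterable of 0/1 (or False/True) labels, one per heartbeat.
    Assumption: an exact tie is labeled non-HCM.
    """
    beats = list(beat_predictions)
    if not beats:
        raise ValueError("need at least one heartbeat prediction")
    return sum(beats) > len(beats) / 2

# Two of three beats flagged as HCM, so the patient is labeled HCM.
is_hcm = classify_patient([1, 1, 0])
```

In practice the beat-level labels would come from the trained random forest or support vector machine classifier. Here they are supplied by hand.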
Title: Identifying Hypertrophic Cardiomyopathy Patients by Classifying Individual Heartbeats from 12-lead ECG Signals. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, 2014, pp. 224-229.
Pub Date: 2013-12-01. DOI: 10.1109/BIBM.2013.6732495
Jun Kong, Fusheng Wang, George Teodoro, Lee Cooper, Carlos S Moreno, Tahsin Kurc, Tony Pan, Joel Saltz, Daniel Brat
In this paper, we present a novel framework for microscopic image analysis of nuclei, data management, and high-performance computation to support translational research involving nuclear morphometry features, molecular data, and clinical outcomes. Our image analysis pipeline consists of nuclei segmentation and feature computation facilitated by high-performance computing with coordinated execution on multi-core CPUs and Graphics Processing Units (GPUs). All data derived from image analysis are managed in a spatial relational database supporting highly efficient scientific queries. We applied our image analysis workflow to 159 glioblastomas (GBM) from The Cancer Genome Atlas dataset. With integrative studies, we found that statistics of four specific nuclear features were significantly associated with patient survival. Additionally, we correlated nuclear features with molecular data and found results that support pathologic domain knowledge. We found that Proneural subtype GBMs had the smallest mean nuclear Eccentricity and the largest mean nuclear Extent and MinorAxisLength. We also found that gene expression of the stem cell marker MYC and the cell proliferation marker MKI67 was correlated with nuclear features. To complement and inform pathologists of relevant diagnostic features, we queried the most representative nuclear instances from each patient population based on genetic and transcriptional classes. Our results demonstrate that specific nuclear features carry prognostic significance and associations with transcriptional and genetic classes, highlighting the potential of high-throughput pathology image analysis as a complementary approach to human-based review and translational research.
Title: High-Performance Computational Analysis of Glioblastoma Pathology Images with Database Support Identifies Molecular and Survival Correlates. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, pp. 229-236.
Pub Date: 2013-01-01. DOI: 10.1109/BIBM.2013.6732493
Lu Liu, Jianhua Ruan
Finding the associations between an input gene set, such as genes associated with a certain phenotype, and annotated gene sets, such as known pathways, is a very important problem in modern molecular biology. The existing approaches mainly focus on the overlap between the two, and may miss important but subtle relationships between genes. In this paper, we propose a method, NetPEA, that combines known pathways with high-throughput networks. Our method not only considers the shared genes, but also takes gene interactions into account. It utilizes a protein-protein interaction network and a random walk procedure to identify hidden relationships between gene sets, and uses a randomization strategy to evaluate the significance of the similarity scores that pathways achieve. Compared with the over-representation-based method, our method can identify more relationships. Compared with a state-of-the-art network-based method, EnrichNet, our method not only provides a ranked list of pathways, but also provides statistical significance information. Importantly, through independent tests, we show that our method likely has a higher sensitivity in revealing the true causal pathways, while at the same time achieving a higher specificity. Literature review of selected results indicates that some of the novel pathways reported by our method are biologically relevant and important.
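To make the random walk idea concrete, here is a minimal sketch of a random walk with restart on a small undirected network. It is a generic textbook formulation, not NetPEA's actual procedure: the restart probability, the column normalization, the function name `random_walk_with_restart`, and the toy four-node path graph are all assumptions, and the randomization-based significance step is not shown.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adj:   symmetric adjacency matrix (n x n) of the interaction network.
    seeds: indices of the input gene set; the walker restarts uniformly on them.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0      # avoid division by zero for isolated nodes
    W = adj / col_sums                 # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy path graph 0 - 1 - 2 - 3 with gene 0 as the only seed.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
p = random_walk_with_restart(adj, seeds=[0])
```

Nodes closer to the seed receive higher probability, so a pathway whose genes lie near the input set in the network can score well even when it shares no genes with that set.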
Title: Network-based Pathway Enrichment Analysis. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, pp. 218-221.
Pub Date: 2013-01-01. DOI: 10.1109/BIBM.2013.6732517
Su Yan, Xiaoqian Jiang, Ying Chen
Identifying drug-drug interactions is an important and challenging problem in computational biology and healthcare research. Two kinds of information are available for building predictive models: domain knowledge that is accurate and structured but limited, and textual information that is abundant but noisy and unstructured. The difficulty lies in mining the true patterns embedded in text data and developing efficient and effective ways to combine heterogeneous types of information. We demonstrate a novel approach that leverages augmented text-mining features to build a logistic regression model with improved prediction performance (in terms of discrimination and calibration). Our model based on synthesized features significantly outperforms the model trained with only structured features (AUC: 96% vs. 91%, sensitivity: 90% vs. 82%, and specificity: 88% vs. 81%). Along with the quantitative results, we also show learned "latent topics", an intermediate result of our text mining module, and discuss their implications.
Title: Text Mining Driven Drug-Drug Interaction Detection. Venue: Proceedings. IEEE International Conference on Bioinformatics and Biomedicine, pp. 349-355.