2009 International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing: Latest Papers

Modeling Confidentiality in a Simplified Database Access
M. Shing, Chen-chi Shing, Kuo Lane Chen, Huei Lee
In a simplified secure database access model, a privileged group and a public group can access data with any distribution. To secure the database, a confidentiality policy must be applied. The management of database privacy is often neglected. This paper examines data confidentiality management and suggests using semi-Markov chains to model the policy; simulation results are discussed.
DOI: https://doi.org/10.1109/IJCBS.2009.43 | Published: 2009-08-03
Citations: 0
Gene Selection for Microarray Expression Data with Imbalanced Sample Distributions
Abu H. M. Kamal, Xingquan Zhu, R. Narayanan
Microarray expression data, which contain expression levels of a large number of simultaneously observed genes, have been used in many scientific research and clinical studies. Because of their high dimensionality, selecting a small number of genes has been shown to be beneficial for tasks such as building prediction models for molecular classification of cancers. Traditional gene selection methods, however, fail to take the sample distributions into consideration. Because samples are scarce, it is very common in biomedical research to have severely biased data distributions, with one class of examples (e.g., diseased samples) containing significantly fewer samples than other classes (e.g., normal samples). Sample sets with biased distributions require special attention when identifying genes responsible for a particular disease. In this paper, we propose three filtering techniques, Higher Weight (HW), Differential Minority Repeat (DMR) and Balanced Minority Repeat (BMR), to identify genes relevant to fatal diseases from biased microarray expression data. Experimental comparisons with the traditional ReliefF method on five microarray datasets demonstrate the effectiveness of the proposed methods in selecting informative genes from microarray expression data with biased sample distributions.
DOI: https://doi.org/10.1109/IJCBS.2009.117 | Published: 2009-08-03
Citations: 12
Proteogenomic Mapping for Structural Annotation of Prokaryote Genomes
Nan Wang, S. Burgess, M. Lawrence, S. Bridges
Structural annotation of genomes is one of the major goals of genomics research. The most popular tools for structural annotation of genomes are computational pipelines. It is well known that these computational methods have a number of shortcomings, including false identifications and incorrect identification of gene boundaries. Proteomic data can be used to confirm genes identified by computational methods and to correct mistakes. A proteogenomic mapping method has been developed that uses peptides identified from mass spectrometry for structural annotation of genomes. Spectra are matched against both a protein database and the genome database translated in all six reading frames. Peptides that match the genome but not the protein database potentially represent novel protein-coding genes or annotation errors. These short, experimentally derived peptides are used to discover potential novel protein-coding genes, called expressed Protein Sequence Tags (ePSTs), by aligning the peptides to the genomic DNA and extending the translation in the 3' and 5' directions. In this paper, an enhanced pipeline has been designed and developed for discovering and evaluating potential novel protein-coding genes. It comprises: 1) a distance-based outlier detection method for validating peptides identified from MS/MS, 2) proteogenomic mapping for discovery of potential novel protein-coding genes, and 3) collection of evidence from a number of sources and automatic evaluation of potential novel protein-coding genes using machine learning techniques such as neural networks, support vector machines, and naïve Bayes.
DOI: https://doi.org/10.1109/IJCBS.2009.126 | Published: 2009-08-03
Citations: 0
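The six-frame matching step described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`six_frame_translations`, `find_peptide`) and the toy sequence are hypothetical, exact substring matching stands in for a real MS/MS search engine, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the organism uses the standard codon-to-amino-acid assignments).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translations(dna):
    """Translate both strands at offsets 0, 1, 2; keys are '+1'..'+3', '-1'..'-3'."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
            # Unknown codons (e.g. containing N) are rendered as 'X'.
            frames[f"{strand}{offset + 1}"] = "".join(
                CODON_TABLE.get(c, "X") for c in codons
            )
    return frames


def find_peptide(dna, peptide):
    """List (frame, amino-acid index) for every exact occurrence of the peptide.

    Indices in '-' frames are relative to the reverse-complemented strand.
    """
    hits = []
    for frame, protein in six_frame_translations(dna.upper()).items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(peptide, start + 1)
    return hits


print(find_peptide("ATGGCCAAATAA", "MAK"))   # [('+1', 0)]
print(find_peptide("ATGGCCAAATAA", "LFGH"))  # [('-1', 0)]
```

A peptide that yields hits here but is absent from the annotated protein database would be the kind of candidate the abstract describes as a potential novel gene or annotation error.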
An Effective Multi-level Immune Algorithm for Graph Bipartitioning
Ming Leng, Lingyu Sun, Songnian Yu
An important application of graph partitioning is data clustering using a graph model: the pairwise similarities between all data objects form a weighted graph adjacency matrix that contains all necessary information for clustering. An effective multi-level algorithm based on AIS (artificial immune systems) for graph bipartitioning is proposed. During its coarsening phase, we adopt an improved matching approach based on the global information of the graph core to develop its guidance function. During its refinement phase, we exploit a hybrid immune refinement algorithm inspired by the CSA (clonal selection algorithm) and affinity maturation of the AIS. The algorithm, which incorporates the early-exit FM (FM-EE) local improvement heuristic into the CSA, is verified to be capable of finding a globally approximate bipartitioning. The success of our algorithm relies on exploiting both the CSA and the concept of the graph core. It is implemented in American National Standards Institute (ANSI) C and compared with MeTiS, a state-of-the-art partitioner in the literature. Our experimental evaluations show that it performs well and produces encouraging solutions on 18 benchmark graphs.
DOI: https://doi.org/10.1109/IJCBS.2009.12 | Published: 2009-08-03
Citations: 1
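To make the objective in the abstract above concrete, here is a small Python sketch of how a bipartition of a weighted adjacency matrix can be scored, together with the move gain that FM-style (Fiduccia-Mattheyses) local improvement relies on. It does not reproduce the paper's immune algorithm or its coarsening scheme; the function names and the four-vertex example graph are assumptions made for illustration.

```python
def cut_weight(adj, side):
    """Total weight of edges whose endpoints lie on different sides.

    adj  -- symmetric weighted adjacency matrix (list of lists)
    side -- side[i] is 0 or 1, the part that vertex i is assigned to
    """
    n = len(adj)
    return sum(
        adj[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if side[i] != side[j]
    )


def move_gain(adj, side, v):
    """Reduction in cut weight if vertex v were moved to the other side.

    Gain = (weight of v's edges crossing the cut) - (weight of v's edges inside its part).
    """
    external = sum(w for u, w in enumerate(adj[v]) if u != v and side[u] != side[v])
    internal = sum(w for u, w in enumerate(adj[v]) if u != v and side[u] == side[v])
    return external - internal


# Hypothetical path-like graph: heavy edges 0-1 and 2-3, light edge 1-2.
adj = [
    [0, 3, 0, 0],
    [3, 0, 1, 0],
    [0, 1, 0, 3],
    [0, 0, 3, 0],
]
print(cut_weight(adj, [0, 0, 1, 1]))    # 1
print(cut_weight(adj, [0, 1, 0, 1]))    # 7
print(move_gain(adj, [0, 1, 0, 1], 1))  # 4
```

A refinement pass would repeatedly move high-gain vertices while respecting a balance constraint; the balance rule and the early-exit criterion are left out here because the abstract does not specify them.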
Designing the Self-Adaptive Fuzzy Neural Networks
Liu Fang
In this paper, an approach for automatically generating fuzzy rules from sample patterns is presented. A self-adaptive fuzzy neural network is then built based on a fuzzy partition that divides the input space using input and output information. The salient characteristics of the self-adaptive fuzzy neural network are: 1) structure identification and parameter estimation are performed automatically and simultaneously; 2) fuzzy rules can be recruited or deleted dynamically; 3) parameters of rules can be obtained by evolutionary computation. Simulation results demonstrate that a compact, high-performance fuzzy rule base can be constructed. Comprehensive comparisons with other approaches show that the proposed approach is superior in terms of learning efficiency and performance.
DOI: https://doi.org/10.1109/IJCBS.2009.40 | Published: 2009-08-03
Citations: 2
Accelerating Genome-Wide Association Studies Using CUDA Compatible Graphics Processing Units
Rui Jiang, Feng Zeng, Wangshu Zhang, Xuebing Wu, Zhihong Yu
Recent advances in highly parallel, multithreaded, manycore Graphics Processing Units (GPUs) have enabled massively parallel implementations of many applications in bioinformatics. In this paper, we describe a parallel implementation of genome-wide association studies (GWAS) using Compute Unified Device Architecture (CUDA). Using a single NVIDIA GTX 280 graphics card, we achieve speedups of about 15 times over an Intel Xeon E5420. We also implement a highly scalable, massively parallel GWAS system using the Message Passing Interface (MPI) and show that a single GTX 280 can deliver performance similar to that of a 16-node cluster. We further apply the GPU program to two real genome-wide case-control data sets. The results show that the GPU program is 17.7 times as fast as the CPU version for an Age-related Macular Degeneration (AMD) data set and 25.7 times as fast as the CPU version for a Parkinson's disease data set.
DOI: https://doi.org/10.1109/IJCBS.2009.32 | Published: 2009-08-03
Citations: 3
Discovery of Biomarker Genes from Earthworm Microarray Data by Discriminant Analysis and Clustering
Ying Li, Nan Wang, E. Perkins, P. Gong
Monitoring, assessment and prediction of the environmental risks that chemicals pose demand rapid and accurate diagnostic assays. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. A variety of toxicological effects have been associated with the explosive compounds 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-triazacyclohexane (RDX). Here we developed a discriminant analysis and cluster (DAC) pipeline to analyze a 248-array dataset with 15,208 non-redundant earthworm (Eisenia fetida) gene probes on each array. Our objective was to identify biomarker genes that can separate earthworm samples into three groups: control (untreated), TNT-treated, and RDX-treated. First, the class comparison statistical algorithm implemented in BRB-ArrayTools was used to infer a total of 869 genes that changed significantly relative to controls as a result of exposure to TNT or RDX at various concentrations for 4 or 14 days. Then, nine tree-based supervised machine learning algorithms were applied to generate classification rules and a set of 286 classifier genes. These classifier genes were ranked by their overall weight of significance in the nine classification methods and were used to build support vector machines (SVMs). An SVM containing all 286 classifier genes had the highest classification accuracy (91.5%). Results of unsupervised clustering show that using the top 100 classifier genes assigns the largest number of the 248 worm samples to the three reference clusters obtained using all 14,188 filtered genes, suggesting that these top-ranked genes may be potential candidates for biomarkers. This study demonstrates that the DAC pipeline can be used to identify a small set of biomarker genes from high-dimensional datasets and to generate a reliable SVM classification model for multiple classes.
DOI: https://doi.org/10.1109/IJCBS.2009.134 | Published: 2009-08-03
Citations: 1
Two Dimensional Modeling and Fractal Characterization of Tumor Vascular Network
R. Dobrescu, L. Ichim
Tumor networks display percolation-like scaling, representing the first evidence for a biological growth process whose key determinants are local substrate properties. In this paper we present a full characterization of a recently proposed model that reproduces the main features of the biological system, focusing on its dynamical properties, on the fractal properties of patterns, and on the percolative phase transition. We propose a simple model that reproduces many features of the biological system.
DOI: https://doi.org/10.1109/IJCBS.2009.71 | Published: 2009-08-03
Citations: 3
A Study on Prognosis of Brain Tumors Using Fuzzy Logic and Genetic Algorithm Based Techniques
Arpita Das, M. Bhattacharya
In the present study, an attempt has been made to determine the degree of malignancy of brain tumors using artificial intelligence. The suspicious regions in the brain, as suggested by radiologists, have been segmented using the fuzzy c-means clustering technique. Fourier descriptors are used for precise extraction of boundary features of the tumor region. Because Fourier descriptors introduce a large number of feature vectors, which may invite the problem of over-learning and increase the chance of misclassification, the proposed diagnosis system efficiently searches for the significant boundary features with a genetic algorithm and feeds them to an adaptive neuro-fuzzy classifier. In addition to shape-based features, textural compositions are also incorporated to achieve a high level of accuracy in the diagnosis of tumors. The study involves 100 brain images and shows an 86% correct classification rate.
DOI: https://doi.org/10.1109/IJCBS.2009.129 | Published: 2009-08-03
Citations: 30
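The fuzzy c-means step mentioned in the abstract above can be sketched in a few lines of Python. This is a generic textbook formulation on one-dimensional intensities, not the authors' segmentation code: the function names, the fuzzifier m = 2, the fixed iteration count, and the toy intensity values are all assumptions.

```python
def memberships(points, centers, m=2.0, eps=1e-12):
    """Fuzzy membership u[k][i] of point k in cluster i; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        # Clamp distances so a point lying exactly on a center cannot divide by zero.
        d = [max(abs(x - c), eps) for c in centers]
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iters):
        u = memberships(points, centers, m)
        # Each center is the mean of the points weighted by membership**m.
        centers = [
            sum(u[k][i] ** m * x for k, x in enumerate(points))
            / sum(u[k][i] ** m for k in range(len(points)))
            for i in range(len(centers))
        ]
    return centers, memberships(points, centers, m)


# Hypothetical pixel intensities: a dark group and a bright group.
intensities = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, u = fuzzy_c_means(intensities, centers=[0.0, 5.0])
labels = [row.index(max(row)) for row in u]  # hard label = cluster with highest membership
print(labels)  # [0, 0, 0, 1, 1, 1]
```

For image segmentation the same updates would run over pixel feature vectors rather than scalars, and a convergence test on the change in centers would normally replace the fixed iteration count.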
Parameterized Complexity of Finding Elementary Modes in Metabolic Networks
Hong Liu, Haodi Feng, Daming Zhu
The concept of elementary (flux) modes provides a rigorous description of pathways in metabolic networks. Finding the elementary modes with the minimum number of reactions (shortest elementary modes) is an interesting problem with potential uses in various applications. However, this problem is NP-hard. This work is an initial step toward analyzing the problem from a parameterized computation perspective. With the number of reactions in elementary modes as the natural parameter, we prove that finding the shortest elementary modes in metabolic networks is W[1]-hard.
DOI: https://doi.org/10.1109/IJCBS.2009.121 | Published: 2009-08-03
Citations: 1