Proceedings. International Conference on Intelligent Systems for Molecular Biology: latest articles

Linear modeling of genetic networks from experimental data.
E P van Someren, L F Wessels, M J Reinders

In this paper, the regulatory interactions between genes are modeled by a linear genetic network that is estimated from gene expression data. The inference of such a genetic network is hampered by the dimensionality problem. This problem is inherent in all gene expression data since the number of genes by far exceeds the number of measured time points. Consequently, there are infinitely many solutions that fit the data set perfectly. In this paper, this problem is tackled by combining genes with similar expression profiles in a single prototypical 'gene'. Instead of modeling the genes individually, the relations between prototypical genes are modeled. In this way, genes that cannot be distinguished based on their expression profiles are grouped together and their common control action is modeled instead. This process reduces the number of signals and imposes a structure on the model that is supported by the fact that biological genetic networks are thought to be redundant and sparsely connected. In essence, the ambiguity in model solutions is represented explicitly by providing a generalized model that expresses the basic regulatory interactions between groups of similarly expressed genes. The modeling approach is illustrated on artificial as well as real data.
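
The grouping step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the greedy correlation-based grouping, the 0.95 threshold, the function names, and the toy profiles are all hypothetical. The weight counts also assume a first-order linear model of the form x(t+1) = W x(t), which the abstract does not spell out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def group_into_prototypes(profiles, threshold=0.95):
    """Greedy grouping (an assumption, not the paper's method): a gene joins
    the first group whose seed profile it correlates with at or above
    `threshold`; otherwise it starts a new group."""
    groups = []  # each group is a list of gene names; the first is the seed
    for gene, profile in profiles.items():
        for members in groups:
            if pearson(profile, profiles[members[0]]) >= threshold:
                members.append(gene)
                break
        else:
            groups.append([gene])
    return groups


def prototype_profile(profiles, members):
    """Average the member profiles time point by time point."""
    n_points = len(profiles[members[0]])
    return [sum(profiles[g][t] for g in members) / len(members)
            for t in range(n_points)]


# Hypothetical toy data: 5 genes measured at 4 time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [1.1, 2.1, 2.9, 4.2],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [3.9, 3.2, 1.8, 1.1],
    "g5": [1.0, 3.0, 1.0, 3.0],
}
groups = group_into_prototypes(profiles)
prototypes = [prototype_profile(profiles, m) for m in groups]

# Under the assumed model x(t+1) = W x(t), W has n*n unknown weights.
weights_before = len(profiles) ** 2
weights_after = len(groups) ** 2
```

With fewer signals, the number of unknown weights drops quadratically, which is the sense in which grouping eases the dimensionality problem.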

Publication date: 2000-01-01
Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins.
G Yona, M Levitt

In search of global principles that may explain the organization of the space of all possible proteins, we study all known protein sequences and structures. In this paper we present a global map of the protein space based on our analysis. Our protein space contains all protein sequences in a non-redundant (NR) database, which includes all major sequence databases. Using the PSI-BLAST procedure we defined 4,670 clusters of related sequences in this space. Of these clusters, 1,421 are centered on a sequence of known structure. All 4,670 clusters were then compared using either a structure metric (when 3D structures are known) or a novel sequence profile metric. These scores were used to define a unified and consistent metric between all clusters. Two schemes were employed to organize these clusters in a meta-organization. The first uses a graph theory method and clusters the clusters in a hierarchical organization. This organization extends our ability to predict the structure and function of many proteins beyond what is possible with existing tools for sequence analysis. The second uses a variation on a multidimensional scaling technique to embed the clusters in a low-dimensional real space. This last approach resulted in a projection of the protein space onto a 2D plane that provides us with a bird's eye view of the protein space. Based on this map we suggest a list of possible target sequences with unknown structure that are likely to adopt new, unknown folds.

Publication date: 2000-01-01
Alignment of flexible protein structures.
M Shatsky, Z Y Fligelman, R Nussinov, H J Wolfson

We present two algorithms which align flexible protein structures. Both apply efficient structural pattern detection and graph-theoretic techniques. The FlexProt algorithm simultaneously detects the hinge regions and aligns the rigid subparts of the molecules. It does so by efficiently detecting maximal congruent rigid fragments in both molecules and calculating their optimal arrangement which does not violate the protein sequence order. The FlexMol algorithm is sequence order independent, yet requires as input the hypothesized hinge positions. Due to its sequence order independence it can also be applied to protein-protein interface matching and drug molecule alignment. It aligns the rigid parts of the molecule using the Geometric Hashing method and calculates optimal connectivity among these parts by graph-theoretic techniques. Both algorithms are highly efficient even compared with rigid structure alignment algorithms. Typical running times on a standard desktop PC (400 MHz) are about 7 seconds for FlexProt and about 1 minute for FlexMol.

Publication date: 2000-01-01
Integrative analysis of protein interaction data.
M Fellenberg, K Albermann, A Zollner, H W Mewes, J Hani

We have developed a method for the integrative analysis of protein interaction data. It comprises clustering, visualization and data integration components. The method is generally applicable for all sequenced organisms. Here, we describe in detail the combination of protein interaction data in the yeast Saccharomyces cerevisiae with the functional classification of all yeast proteins. We evaluate the utility of the method by comparison with experimental data and deduce hypotheses about the functional role of so far uncharacterized proteins. Further applications of the integrative analysis method are discussed. The method presented here is powerful and flexible. We show that it is capable of mining large-scale data sets.

Publication date: 2000-01-01
Efficient attractor analysis based on self-dependent subsets of elements--an application to signal transduction studies.
M Cárdenas-García, J Lagunez-Otero, N Korneev

External signals are transmitted to cells through receptors that activate signal transduction pathways. These pathways form a complicated interconnected network that can respond to different stimuli. Here we analyze an important pathway in oncogenesis, namely the RAS/MAPK signal transduction pathway. We show that the interaction of the elements of this pathway induces a topological structure on the element set and that knowledge of the topology simplifies the analysis of the set. With a computer algorithm, we isolate from a large and complex group smaller, independent, more manageable subsets and build their hierarchy. Introducing subsets makes the search for attractors in a discrete dynamical system easier and permits the prediction of final states for elements involved in signal transduction pathways.
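
To make the idea of a self-dependent subset concrete, here is a minimal sketch on a toy Boolean network with synchronous updates. It is an illustration only: the four-element network, the update rules, and the function names are hypothetical and are not taken from the RAS/MAPK model, and the brute-force attractor search stands in for whatever algorithm the paper actually uses.

```python
from itertools import product

# Hypothetical toy network: element -> (input elements, update rule).
rules = {
    "A": (("A",), lambda a: a),                   # A sustains itself
    "B": (("A", "C"), lambda a, c: a and not c),
    "C": (("B",), lambda b: b),
    "D": (("C", "D"), lambda c, d: c or d),       # D only reads, never feeds back
}


def self_dependent_closure(rules, element):
    """Smallest subset containing `element` whose updates need no outside input."""
    closure, stack = set(), [element]
    while stack:
        node = stack.pop()
        if node not in closure:
            closure.add(node)
            stack.extend(rules[node][0])
    return closure


def step(rules, names, state):
    """One synchronous update of the elements listed in `names`."""
    current = dict(zip(names, state))
    return tuple(bool(rules[n][1](*(current[i] for i in rules[n][0])))
                 for n in names)


def attractors(rules, subset):
    """Brute-force attractor search restricted to a self-dependent subset."""
    names = sorted(subset)
    found = set()
    for start in product((False, True), repeat=len(names)):
        seen = {}
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(rules, names, state)
        # `state` is the first repeated state; later-indexed states form the cycle.
        found.add(frozenset(s for s, i in seen.items() if i >= seen[state]))
    return names, found


subset = self_dependent_closure(rules, "C")
names, found = attractors(rules, subset)
```

Because the subset {A, B, C} never reads D, its attractors can be enumerated over 2^3 states instead of 2^4, which is the kind of saving the subset decomposition is meant to provide.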

Publication date: 2000-01-01
A statistical method for finding transcription factor binding sites.
S Sinha, M Tompa

Understanding the mechanisms that determine the regulation of gene expression is an important and challenging problem. A fundamental subproblem is to identify DNA-binding sites for unknown regulatory factors, given a collection of genes believed to be coregulated, and given the noncoding DNA sequences near those genes. We present an enumerative statistical method for identifying good candidates for such transcription factor binding sites. Unlike local search techniques such as Expectation Maximization and Gibbs samplers that may not reach a global optimum, the method proposed here is guaranteed to produce the motifs with greatest z-scores. We discuss the results of experiments in which this algorithm was used to locate candidate binding sites in several well studied pathways of S. cerevisiae, as well as gene clusters from some of the hybridization microarray experiments.
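
The enumerate-and-score idea can be sketched as below. This is a simplified illustration, not the method of the paper: it assumes an independent-bases background estimated from the input itself and a binomial variance that ignores overlapping occurrences, whereas the abstract does not state which background model is used. The function name and the three toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_z_scores(sequences, k):
    """Score every k-mer that occurs in `sequences` by a z-score.

    Assumed background (a simplification): bases are independent, with
    frequencies estimated from the sequences themselves; occurrence counts
    are treated as binomial, ignoring overlaps between neighbouring k-mers.
    """
    base_counts = Counter("".join(sequences))
    total = sum(base_counts.values())
    freq = {base: count / total for base, count in base_counts.items()}

    positions = sum(max(len(s) - k + 1, 0) for s in sequences)
    observed = Counter(s[i:i + k] for s in sequences
                       for i in range(len(s) - k + 1))

    scores = {}
    for word, n in observed.items():
        p = 1.0
        for base in word:
            p *= freq[base]
        expected = positions * p
        variance = positions * p * (1 - p)
        if variance == 0:
            continue  # degenerate input, e.g. a single repeated base
        scores[word] = (n - expected) / sqrt(variance)
    return scores


# Hypothetical upstream regions sharing the planted word TGACTC.
sequences = ["AATGACTCAT", "CCTGACTCGG", "GTTGACTCAA"]
scores = kmer_z_scores(sequences, k=6)
best = max(scores, key=scores.get)
```

Since every occurring k-mer is scored, the highest-scoring word is found by exhaustive comparison rather than by a local search that could stop at a local optimum.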

Publication date: 2000-01-01
Analysis of gene expression data with pathway scores.
A Zien, R Küffner, R Zimmer, T Lengauer

We present a new approach for the evaluation of gene expression data. The basic idea is to generate biologically possible pathways and to score them with respect to gene expression measurements. We suggest sample scoring functions for different problem specifications. We assess the significance of the scores for the investigated pathways by comparison to a number of scores for random pathways. We show that simple scoring functions can assign statistically significant scores to biologically relevant pathways. This suggests that the combination of appropriate scoring functions with the systematic generation of pathways can be used in order to select the most interesting pathways based on gene expression measurements.
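
The significance assessment against random pathways might look roughly like this. The sketch makes several assumptions that are not in the abstract: the scoring function is a plain mean of per-gene expression changes, a "random pathway" is simply a random gene set of the same size, and the gene names and values are made up.

```python
import random


def pathway_score(expression, pathway):
    """Hypothetical scoring function: mean expression change over the pathway."""
    return sum(expression[g] for g in pathway) / len(pathway)


def empirical_p_value(expression, pathway, n_random=200, seed=0):
    """Compare the pathway's score with scores of random gene sets.

    Returns (observed score, empirical p-value). The +1 terms keep the
    estimate strictly positive, a common convention for permutation tests.
    """
    rng = random.Random(seed)
    genes = sorted(expression)
    observed = pathway_score(expression, pathway)
    hits = sum(
        pathway_score(expression, rng.sample(genes, len(pathway))) >= observed
        for _ in range(n_random)
    )
    return observed, (hits + 1) / (n_random + 1)


# Hypothetical data: 20 genes with small changes, 3 of them strongly induced.
expression = {f"gene{i}": float(i % 3) for i in range(20)}
expression.update({"gene3": 5.0, "gene7": 5.0, "gene11": 5.0})
pathway = ["gene3", "gene7", "gene11"]
observed, p_value = empirical_p_value(expression, pathway)
```

A pathway whose score is rarely matched by random gene sets receives a small empirical p-value, which is one simple way to rank candidate pathways.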

Publication date: 2000-01-01
Intelligent aids for parallel experiment planning and macromolecular crystallization.
V Gopalakrishnan, B G Buchanan, J M Rosenberg

This paper presents a framework called Parallel Experiment Planning (PEP) that is based on an abstraction of how experiments are performed in the domain of macromolecular crystallization. The goal in this domain is to obtain a good-quality crystal of a protein or other macromolecule that can be X-ray diffracted to determine its three-dimensional structure. This domain presents problems encountered in real-world situations, such as a parallel and dynamic environment, insufficient resources and expensive tasks. The PEP framework comprises two types of components: (1) an information management system for keeping track of sets of experiments, resources and costs; and (2) knowledge-based methods for providing intelligent assistance to decision-making. The significance of the developed PEP framework is threefold: (a) the framework can be used for PEP even without one of its major intelligent aids that simulates experiments, simply by collecting real experimental data; (b) the framework with a simulator can provide intelligent assistance for experiment design by utilizing existing domain theories; and (c) the framework can help provide strategic assessment of different types of parallel experimentation plans that involve different tradeoffs.

Publication date: 2000-01-01
Spectrum alignment: efficient resequencing by hybridization.
I Pe'er, R Shamir

Recent high-density microarray technologies allow, in principle, the determination of all k-mers that appear along a DNA sequence, for k = 8 to 10, in a single experiment on a standard chip. The k-mer content, also called the spectrum of the sequence, is not sufficient to uniquely reconstruct a sequence longer than a few hundred bases. We have devised a polynomial-time algorithm that reconstructs the sequence, given the spectrum and a homologous sequence. This situation occurs, for example, in the identification of single nucleotide polymorphisms (SNPs), and whenever a homologue of the target sequence is known. The algorithm is robust, can handle errors in the spectrum and assumes no knowledge of the k-mer multiplicities. Our simulations show that with realistic levels of SNPs, the algorithm correctly reconstructs a target sequence of length up to 2,000 nucleotides when a polymorphic sequence is known. The technique is generalized to handle profiles and HMMs as input instead of a single homologous sequence.
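
A small sketch may help show why the spectrum alone is ambiguous and why a homologous sequence helps. It only computes spectra and compares them; it is not the reconstruction algorithm of the paper. The function name and the short example sequences are chosen for illustration, with k = 3 instead of the 8 to 10 mentioned above to keep the example readable.

```python
def spectrum(sequence, k):
    """Set of k-mers occurring in `sequence`, without multiplicities."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


# Two different sequences of equal length with identical 3-mer spectra:
# the spectrum alone cannot tell them apart.
first = "ATGCGTGGCA"
second = "ATGGCGTGCA"
ambiguous = first != second and spectrum(first, 3) == spectrum(second, 3)

# A known homologue narrows things down: a single substitution removes
# at most k of the reference k-mers from the target's spectrum.
reference = "ACGTTGCA"
target = "ACGTAGCA"   # hypothetical SNP at index 4 (T -> A)
missing = spectrum(reference, 3) - spectrum(target, 3)
novel = spectrum(target, 3) - spectrum(reference, 3)
```

Here `missing` and `novel` localize the difference between reference and target, which hints at how a homologous sequence can resolve ambiguity that the spectrum by itself leaves open.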

Publication date: 2000-01-01
Genomic fold assignment and rational modeling of proteins of biological interest.
J M Sauder, R L Dunbrack

The first available genome of a multicellular organism, C. elegans, was used as a test case for protein fold assignment using PSI-BLAST, followed by rational structure modeling and interpretation of experimental mutagenesis data in the context of collaboration with biologists. Similar results are demonstrated for human disease proteins with known polymorphisms.

Publication date: 2000-01-01