2009 Fifth IEEE International Conference on e-Science: Latest Papers

Supporting the Running and Analysis of Trials of Web-Based Behavioural Interventions: The LifeGuide

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.16

Yang Yang, Adrian Osmond, Xiaoyu Chen, M. Weal, G. Wills, D. D. Roure, J. Joseph, L. Yardley

Behavioural interventions - packages of advice and support for behaviour change - are one of the most important methodologies and technologies employed by social scientists for understanding and changing behaviour. A typical web-based behavioural intervention study includes the designing, deploying, piloting and trialling of the intervention as well as data analysis. We have developed a research environment named LifeGuide, which covers the full scope of this process, enabling social scientists to carry out intervention studies with minimal technical expertise. In this paper, we present how the LifeGuide can assist and accelerate intervention research, particularly focusing on supporting the running and analysis of trials of web-based behavioural interventions along with the case study of an intervention that has been developed within the LifeGuide.

Citations: 9

A Fresh Perspective on Developing and Executing DAG-Based Distributed Applications: A Case-Study of SAGA-Based Montage

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.40

André Merzky, K. Stamou, S. Jha, D. Katz

Most workflow-based applications currently have to adapt to available tools. While this keeps the cost of development low, it can lead to performance and flexibility tradeoffs that the application developer and deployer must make. In this paper, we use the Montage astronomical image mosaicking application as a prototypical DAG-based workflow application to lay out the development and deployment decisions for distributed applications. We discuss and explain the lack of simple (easy-to-use), scalable, and extensible distributed applications. We then introduce SAGA as a technology that permits the construction of abstractions that aid the development and execution of the applications, and thus addresses some of the common shortcomings of traditional distributed application development. We use Montage together with SAGA to examine how legacy applications can be made to run on distributed infrastructures, to see if our reasons are valid, and to compare potential new methods for creating distributed applications with existing technologies that are currently used. We demonstrate the ability to (i) scale out and (ii) use different production infrastructure, while maintaining performance comparable to established systems. Our hope is that by demonstrating the simplicity of development along with other advantages (performance, scalability, extensibility, and infrastructure independence), this example will encourage others to think more broadly about how distributed applications are created and how new programming models such as Dryad can be supported in an infrastructure-independent way, thus eventually leading to more applications that can seamlessly scale out.
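As a rough illustration of what executing a DAG-based workflow involves, the sketch below runs tasks in dependency order using Python's standard `graphlib` module. It does not use SAGA or Montage; the stage names and the `run_dag` helper are hypothetical and chosen only to resemble a simplified mosaicking pipeline.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies, run_task):
    """Run tasks in dependency order; return the order in which tasks ran.

    dependencies maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    order = []
    while sorter.is_active():
        # Every task in this batch has all of its predecessors finished, so a
        # distributed runner could dispatch the whole batch in parallel.
        ready = sorted(sorter.get_ready())
        for task in ready:
            run_task(task)
            order.append(task)
            sorter.done(task)
    return order


# Hypothetical, simplified mosaicking stages (names are illustrative only).
stages = {
    "project_a": set(),
    "project_b": set(),
    "diff_ab": {"project_a", "project_b"},
    "fit_background": {"diff_ab"},
    "correct_a": {"project_a", "fit_background"},
    "correct_b": {"project_b", "fit_background"},
    "add_mosaic": {"correct_a", "correct_b"},
}

executed = run_dag(stages, run_task=lambda name: None)
```

Here `run_task` is a no-op placeholder; in a real deployment it would submit the task to whatever infrastructure is in use, which is the part an abstraction layer would hide.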

Citations: 9

Strategies for Network Motifs Discovery

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/e-Science.2009.20

P. Ribeiro, Fernando M A Silva, Marcus Kaiser

Complex networks from domains like Biology or Sociology are present in many e-Science data sets. Dealing with networks can often form a workflow bottleneck as several related algorithms are computationally hard. One example is detecting characteristic patterns or "network motifs" - a problem involving subgraph mining and graph isomorphism. This paper provides a review and runtime comparison of current motif detection algorithms in the field. We present the strategies and the corresponding algorithms in pseudo-code yielding a framework for comparison. We categorize the algorithms outlining the main differences and advantages of each strategy. We finally implement all strategies in a common platform to allow a fair and objective efficiency comparison using a set of benchmark networks. We hope to inform the choice of strategy and critically discuss future improvements in motif detection.
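To make the subgraph-counting core of motif detection concrete, here is a minimal sketch that exhaustively classifies every 3-node induced subgraph of a small undirected graph. This is a naive enumeration written for illustration, not one of the strategies reviewed in the paper; the function name and the toy graph are assumptions. A full motif analysis would also compare these counts against randomised networks, which the sketch omits.

```python
from itertools import combinations


def count_3node_subgraphs(edges):
    """Count connected induced 3-node subgraphs in an undirected graph.

    Returns a dict with counts for "path" (2 edges) and "triangle" (3 edges).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    counts = {"path": 0, "triangle": 0}
    for a, b, c in combinations(sorted(adjacency), 3):
        present = sum((
            b in adjacency[a],
            c in adjacency[a],
            c in adjacency[b],
        ))
        if present == 3:
            counts["triangle"] += 1
        elif present == 2:
            # Two edges on three nodes always share a node, so this is a path.
            counts["path"] += 1
    return counts


# Toy graph: a triangle 1-2-3 with a pendant node 4 attached to node 3.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
census = count_3node_subgraphs(toy_edges)
```

The loop visits every triple of nodes, so the cost grows cubically with the number of nodes; that cost is one reason such computations can become a workflow bottleneck on large networks.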

Citations: 65

Sharing and Reusing Cancer Image Segmentation Algorithms Using Scientific Workflows: Pros and Cons

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/e-Science.2009.22

M. S. Avila-Garcia, Anne E. Trefethen, N. Joshi, F. Gleeson, W. Ba-alawi

Image analysis researchers would benefit considerably from sharing and reusing image processing algorithms. We consider some of the issues that researchers face in trying to provide algorithms in a shareable and reusable form, illustrating our approach in the context of medical imaging needs and workflow for colorectal cancer. We consider the use of workflow as a model for developing and reusing components of medical imaging, and specifically we consider a solution built using .NET and Windows Workflow Foundation.

Citations: 1

Building Reliable Data Pipelines for Managing Community Data Using Scientific Workflows

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/e-Science.2009.52

Yogesh L. Simmhan, C. Ingen, A. Szalay, R. Barga, J. Heasley

The growing amount of scientific data from sensors and field observations is posing a challenge to “data valets” responsible for managing them in data repositories. These repositories built on commodity clusters need to reliably ingest data continuously and ensure its availability to a wide user community. Workflows provide several benefits to modeling data-intensive science applications and many of these benefits can help manage the data ingest pipelines too. But using workflows is not a panacea in itself, and data valets need to consider several issues when designing workflows that behave reliably on fault-prone hardware while retaining the consistency of the scientific data. In this paper, we propose workflow designs for reliable data ingest in a distributed environment and identify workflow framework features to support resilience. We illustrate these using the data pipeline for the Pan-STARRS repository, one of the largest digital surveys, which accumulates 100 TB of data annually to support 300 astronomers.

Citations: 13

User-Level Virtual Network Support for Sky Computing

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.19

Maurício O. Tsugawa, Andréa M. Matsunaga, J. Fortes

With the emergence of multiple cloud providers of Infrastructure-as-a-Service, it becomes possible to envision a near-future when high-performance computing users could combine services from different clouds to access huge numbers of resources. However, as more administrative privileges are exposed to end users, providers are required to deploy network security measures that present challenges to the network virtualization technologies that are needed to enable inter-cloud communication. This paper studies these challenges and proposes techniques to enable unmodified applications on resources across distinct clouds. The techniques are implemented in TinyViNe, an extension to ViNe, a virtual networking technology for distributed resources in different administrative domains. The results of evaluating TinyViNe on a WAN-based testbed across three sites are reported for a bioinformatics application (BLAST) and MPI benchmarks. The results confirm that TinyViNe enables cross-cloud computing while having little impact on application performance. TinyViNe also has auto-configuration and “download-and-run” capabilities for easy deployment by users who are not knowledgeable about networking.

Citations: 17

Beyond the Document Library: Portal-Based Browsing and Exploration of Community Data Clouds

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.33

Yong Liu, Kailash Kotwani, Alejandro Rodríguez, J. Futrelle, R. McGrath, J. Myers

Every modern portal ships with some form of internal document library portlet tools that can be used to enable groups to share files. Unfortunately, certain limitations -- that the portlet can only view data managed directly by the portal and that document and data files are the only resources that can be browsed -- make these tools less valuable in many real-world collaborations. This paper describes a semantically-enhanced scientific resource library portlet that extends the traditional document library to enable interaction with multiple distributed repositories in the cloud and to broaden the set of resources that can be viewed beyond a simple hierarchical document folder-file structure to include people, sensors, data streams and other complex digital entities and their relationships. Our technology is based on semantic content abstraction and context aggregation functionality supported by Tupelo, a semantic content middleware, and is implemented as a portlet plugin to the Liferay-based CyberCollaboratory portal. We describe the architecture components and the browsing features currently implemented and present a water science use case in which users are able to share documents, raw sensor data streams, and derived virtual sensor data (rainfall) from distributed sources within the same semantic resource library.

Citations: 1

Enabling Advanced Visualization Tools in a Web-Based Simulation Monitoring System

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/E-SCIENCE.2009.57

E. Santos, Julien Tierny, Ayla Khan, Brad Grimm, L. Lins, J. Freire, Valerio Pascucci, Cláudio T. Silva, S. Klasky, Roselyne B. Tchoua, N. Podhorszki

Simulations that require massive amounts of computing power and generate tens of terabytes of data are now part of the daily lives of scientists. Analyzing and visualizing the results of these simulations as they are computed can lead not only to early insights but also to useful knowledge that can be provided as feedback to the simulation, avoiding unnecessary use of computing power. Our work is aimed at making advanced visualization tools available to scientists in a user-friendly, Web-based environment where they can be accessed anytime from anywhere. In the context of turbulent combustion, for example, visualization is used to understand the coupling between turbulence and the turbulent mixing of scalars. Although isosurface generation is a useful technique in this scenario, computing and rendering isosurfaces one at a time is expensive and not particularly well-suited for such a Web-based framework. In this paper we propose the use of a summary structure, called a contour tree, that captures the topological structure of a scalar field and guides the user in identifying useful isosurfaces. We have also designed an interface, integrated with a Web-based simulation monitoring system, that allows users to interact with and explore multiple isosurfaces.

Citations: 18

Phylogenetic Predictions on Grids

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/e-Science.2009.17

Priyanka Katariya, Sathish S. Vadhiyar

A phylogenetic or evolutionary tree is constructed from a set of species or DNA sequences and depicts the relatedness between the sequences. Predictions of future sequences in a phylogenetic tree are important for a variety of applications including drug discovery, pharmaceutical research and disease control. In this work, we predict future DNA sequences in a phylogenetic tree using cellular automata. Cellular automata are used for modeling neighbor-dependent mutations from an ancestor to a progeny in a branch of the phylogenetic tree. Since the number of possible ways of transformations from an ancestor to a progeny is huge, we use computational grids and middleware techniques to explore the large number of cellular automata rules used for the mutations. We use the popular and recurring neighbor-based transitions or mutations to predict the progeny sequences in the phylogenetic tree. We performed predictions for three types of sequences, namely, triose phosphate isomerase, pyruvate kinase, and polyketide synthase sequences, by obtaining cellular automata rules on a grid consisting of 29 machines in 4 clusters located in 4 countries, and compared the predictions of the sequences using our method with predictions by random methods. We found that in all cases, our method gave about 40% better predictions than the random methods.
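The idea of a neighbor-dependent mutation rule can be sketched as a one-dimensional cellular automaton over a DNA string, where each base is updated from its left neighbour, itself, and its right neighbour. The rule encoding, the toy rule, and the position-wise scoring below are assumptions made for illustration; the abstract does not specify how the authors represent rules or score predictions.

```python
def apply_rule(sequence, rule):
    """Apply one cellular-automaton step to a DNA string.

    rule maps (left, centre, right) triples to the new centre base; triples
    absent from the rule leave the base unchanged. End positions are kept
    fixed because they lack one neighbour.
    """
    if len(sequence) < 3:
        return sequence
    updated = [sequence[0]]
    for i in range(1, len(sequence) - 1):
        # Contexts are read from the original sequence, so all positions
        # update synchronously.
        context = (sequence[i - 1], sequence[i], sequence[i + 1])
        updated.append(rule.get(context, sequence[i]))
    updated.append(sequence[-1])
    return "".join(updated)


def match_fraction(predicted, actual):
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have equal length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical rule: a C flanked by A on the left and G on the right becomes T.
toy_rule = {("A", "C", "G"): "T"}
ancestor = "ACGACGTA"
progeny = apply_rule(ancestor, toy_rule)
```

With four bases there are 64 possible neighbourhoods, each of which can map to any of four outcomes, so the space of candidate rules is very large; searching it is the kind of workload the abstract assigns to the grid.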

Citations: 1

The SweDat Project and Swedia Database for Phonetic and Acoustic Research

2009 Fifth IEEE International Conference on e-Science

Pub Date : 2009-12-09 DOI: 10.1109/e-Science.2009.15

J. Lindh, A. Eriksson

The project described here may be seen as a continuation of an earlier project, SweDia 2000, aimed at transforming the database collected in that project to a full-fledged e-science database. The database consists of recordings of Swedish dialects from 107 locations in Sweden and Swedish speaking parts of Finland. The goal of the present project is to make the material searchable in a flexible and simple way to make it available to a much wider sector of the research community than is the case at present. The database will be accessible over the Internet via user-friendly interfaces specifically designed for this type of data. Other more specialized research interfaces will also be designed to facilitate phonetic acoustic research and orientation of the database.

Citations: 5
