Proceedings the Ninth International Symposium on High-Performance Distributed Computing: latest articles

dQUOB: managing large data flows using dynamic embedded queries

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868658

Beth Plale, K. Schwan

The dQUOB system satisfies clients' need for specific information from high-volume data streams. The data streams in question are the flows of data that arise during large-scale visualizations, video streaming to large numbers of distributed users, and high-volume business transactions. We introduce the notion of conceptualizing a data stream as a set of relational database tables so that a scientist can request information with an SQL-like query. Transformation or computation that often needs to be performed on the data en route can be conceptualized as computation performed on consecutive views of the data, with computation associated with each view. The dQUOB system moves the query code into the data stream as a quoblet, that is, as compiled code. The relational data model has the significant advantage of presenting opportunities for efficient reoptimization of queries and sets of queries. Using examples from global atmospheric modeling, we illustrate the usefulness of the dQUOB system. We carry the examples through the experiments to establish, with a baseline benchmark, the viability of the approach for high-performance computing. We define a cost metric of end-to-end latency that can be used to determine realistic cases where optimization should be applied. Finally, we show that end-to-end latency can be controlled through a probability, assigned to each query, that the query will evaluate to true.
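The idea of treating a data stream as a relational table and filtering it with an SQL-like query can be illustrated with a minimal Python sketch. This is not the dQUOB implementation or its query language: the `select` helper, the `events` rows, and their field names are all hypothetical, chosen only to show a predicate being applied to rows as they flow past.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, float]


def select(stream: Iterable[Row],
           columns: List[str],
           where: Callable[[Row], bool]) -> Iterator[Row]:
    """Lazily apply SELECT <columns> WHERE <predicate> to a stream of rows."""
    for row in stream:
        if where(row):
            # Project only the requested columns, like a relational view.
            yield {c: row[c] for c in columns}


# Hypothetical atmospheric events; the field names are assumptions.
events = [
    {"lat": 10.0, "lon": 20.0, "ozone": 0.31},
    {"lat": 45.0, "lon": 20.0, "ozone": 0.52},
    {"lat": 60.0, "lon": 25.0, "ozone": 0.47},
]

# Roughly: SELECT lat, ozone FROM events WHERE lat > 30 AND ozone > 0.5
result = list(select(events, ["lat", "ozone"],
                     lambda r: r["lat"] > 30 and r["ozone"] > 0.5))
print(result)
```

Because `select` is a generator, rows that fail the predicate are dropped as soon as they are seen, which is the property that makes filtering inside the stream attractive for reducing downstream traffic.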

Citations: 51

Robust resource management for metacomputers

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868640

J. Gehring, A. Streit

Presents a robust software infrastructure for metacomputing. The system is intended to serve others as a building block for large and powerful computational grids. Considerable effort has gone into developing a fault-tolerant architecture with no single point of failure. Furthermore, we have designed the system to be modular, lean and portable. It is available as open source code and has been successfully compiled on POSIX- and Microsoft Windows-compliant platforms. The system does not originate from a laboratory environment but has proven its robustness within two large metacomputing installations. It embodies a modular concept which allows easy integration of new or modified components. Hence, it is not necessary to buy into the system as a whole. Rather, we encourage others to use only those components that fit their specific environments.

Citations: 32

Managing network resources in Condor

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868666

J. Basney, M. Livny

Data-intensive applications in the Condor high-throughput computing (HTC) environment can place heavy demands on network resources for checkpointing and remote data access. We have developed mechanisms to monitor, control and schedule network usage in Condor. By managing network resources, these mechanisms provide administrative control over Condor's network usage and improve the execution efficiency of Condor applications.

Citations: 60

Resource management through multilateral matchmaking

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868662

Rajesh Raman, M. Livny, M. Solomon

Federated distributed systems present new challenges to resource management, which cannot be met by conventional systems that employ relatively static resource models and centralized allocators. We previously argued that matchmaking provides an elegant and robust resource management solution for these highly dynamic environments (R. Raman et al., 1998). Although matchmaking is powerful and flexible, it cannot accommodate multiparty policies (e.g., co-allocation). The authors present Gang-Matching, a multilateral matchmaking formalism to address this deficiency.
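The difference between a two-party match and a multiparty (co-allocation) match can be sketched in Python. This is only a brute-force illustration of the problem, not the Gang-Matching formalism or algorithm from the paper, and every name in it (`gang_match`, `wants_machine`, `accepts`, the sample machines and licenses) is hypothetical. A job here needs both a machine and a software license, and the license must also be valid on the chosen machine, so no pair can be judged in isolation.

```python
from itertools import product


def gang_match(job, machines, licenses):
    """Return the first (machine, license) pair on which all parties agree."""
    for machine, lic in product(machines, licenses):
        if (job["wants_machine"](machine)      # job's constraint on the machine
                and job["wants_license"](lic)  # job's constraint on the license
                and machine["accepts"](job)    # machine owner's policy
                and lic["accepts"](machine)):  # license is valid on this host
            return machine["name"], lic["name"]
    return None


# Hypothetical advertisements; attribute names are assumptions.
machines = [
    {"name": "m1", "memory": 256, "accepts": lambda j: True},
    {"name": "m2", "memory": 1024, "accepts": lambda j: j["owner"] != "mallory"},
]
licenses = [
    {"name": "lic-a", "app": "sim", "accepts": lambda m: m["name"] == "m1"},
    {"name": "lic-b", "app": "sim", "accepts": lambda m: m["name"] == "m2"},
]
job = {
    "owner": "alice",
    "wants_machine": lambda m: m["memory"] >= 512,
    "wants_license": lambda l: l["app"] == "sim",
}

match = gang_match(job, machines, licenses)
print(match)
```

In this sketch `lic-a` satisfies the job on its own, but it is tied to `m1`, which the job rejects, so only the combination of `m2` and `lic-b` is acceptable to all three parties. Exhaustive enumeration grows quickly with the number of parties, which is one reason a dedicated formalism is needed.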

Citations: 106

Parallel matching and sorting with TACO's distributed collections: a case study from molecular biology research

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868656

J. Nolte, P. Horton

TACO is a template library that implements higher-order parallel operations on distributed object sets by means of reusable topology classes and C++ function templates. We discuss an experimental application that exploits TACO's distributed object groups and collective operations for computing the similarity between groups of molecular sequences, a computationally intensive core problem in molecular biology research. In particular we show how TACO's distributed collections can be conveniently combined with well known concepts found in the C++ standard template library (STL) to solve matching and sorting problems effectively on distributed hardware platforms. The resulting implementation is concise and gives excellent parallel performance on PC- and workstation clusters.

Citations: 0

A comparative evaluation of implicit coscheduling strategies for networks of workstations

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868653

C. Anglano

Implicit coscheduling strategies enable parallel applications to dynamically share the machines in a network of workstations (NOW) with interactive, CPU-bound and I/O-bound sequential jobs. We present a simulation study that compares 12 coscheduling strategies in terms of their impact on the performance of parallel and sequential applications executed simultaneously on a NOW. Our results show that the coscheduling strategy has a strong impact on the performance of the applications (both parallel and sequential) composing the workload, and that no single strategy is able to effectively handle all workloads. Nevertheless, our results can be used to identify the strategy that represents the best choice for a given application class, or the best compromise for various workloads. Moreover, we show that in many cases simple strategies outperform more complex ones.

Citations: 37

Flexible high-performance access to distributed storage resources

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868648

C. Patten, K. Hawick

Describes a software architecture for storage services in computational grid environments. Based upon a lightweight message-passing paradigm, the architecture enables the provision and composition of active, distributed storage services. These services can then cooperatively provide access to distributed storage in a manner potentially optimized for dataset and resource environments. We report on the design and implementation of a distributed file system and a dataset-specific satellite imagery service using the architecture. We discuss data movement and storage issues and implications for future work with the architecture.

Citations: 9

Distributed data access in the Sequential Access Model at the D0 experiment at Fermilab

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868672

I. Terekhov, V. White

Presents the Sequential Access Model (SAM), which is the data-handling system for D0, one of two primary high-energy experiments at Fermilab. During the next several years, the D0 experiment will store a total of about 1 PByte of data, including raw detector data and data processed at various levels. The design of SAM is not specific to the D0 experiment and carries few assumptions about the underlying mass storage level; its ideas are applicable to any sequential data access. By definition, in the sequential access mode, a user application needs to process a stream of data by accessing each data unit exactly once, the order of the data units in the stream being irrelevant. The units of data are laid out sequentially in files. The adopted model allows for a significant optimization of system performance, a reduction in user file latency and an increase in the overall throughput. In particular, caching is done with the knowledge of all the files that are needed "in the near future", which is defined as all the files being used by already-running or submitted jobs. The bulk of the data is stored in files on tape in the mass storage system Enstore. All of the data managed by SAM is cataloged in great detail in a relational database (Oracle).
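Caching "with the knowledge of all the files that are needed in the near future" can be illustrated with a small Python sketch. This is not SAM's cache manager: the `LookaheadCache` class, its count-based capacity, and the file names are all hypothetical. The sketch only shows how an eviction policy might prefer to drop a file that no running or submitted job has requested, falling back to least-recently-used order otherwise.

```python
from collections import OrderedDict


class LookaheadCache:
    """Toy file cache that avoids evicting files known to be needed soon.

    Assumes capacity >= 1 and counts files rather than bytes.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # file name -> None, least recently used first

    def access(self, name, needed_soon):
        """Bring `name` into the cache; return the evicted file name or None."""
        if name in self.files:
            self.files.move_to_end(name)  # mark as most recently used
            return None
        evicted = None
        if len(self.files) >= self.capacity:
            # Prefer the oldest file that no running or submitted job needs.
            victim = next((f for f in self.files if f not in needed_soon), None)
            if victim is None:
                victim = next(iter(self.files))  # fall back to plain LRU
            del self.files[victim]
            evicted = victim
        self.files[name] = None
        return evicted


# Hypothetical set of files requested by running or submitted jobs.
needed = {"a.dat", "c.dat"}
cache = LookaheadCache(capacity=2)
cache.access("a.dat", needed)
cache.access("b.dat", needed)
evicted = cache.access("c.dat", needed)
print(evicted, list(cache.files))
```

A plain LRU policy would have evicted `a.dat` here, forcing another fetch from tape when a job later asks for it; the lookahead set keeps it resident instead.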

Citations: 3

The Ninth International Symposium On High-performance Distributed Computing

Proceedings the Ninth International Symposium on High-Performance Distributed Computing

Pub Date : 1900-01-01 DOI: 10.1109/HPDC.2000.868628

F. Berman

Citations: 5
