dQUOB: managing large data flows using dynamic embedded queries
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868658
Beth Plale, K. Schwan
Proceedings the Ninth International Symposium on High-Performance Distributed Computing
The dQUOB system satisfies clients' need for specific information from high-volume data streams. The data streams we consider are the flows of data that arise during large-scale visualizations, video streaming to large numbers of distributed users, and high-volume business transactions. We introduce the notion of conceptualizing a data stream as a set of relational database tables so that a scientist can request information with an SQL-like query. Transformation or computation that often needs to be performed on the data en route can be conceptualized as computation performed on consecutive views of the data, with computation associated with each view. The dQUOB system moves the query code into the data stream as a quoblet, that is, as compiled code. The relational data model has the significant advantage of presenting opportunities for efficient reoptimization of queries and sets of queries. Using examples from global atmospheric modeling, we illustrate the usefulness of the dQUOB system. We carry the examples through the experiments to establish, with a baseline benchmark, the viability of the approach for high-performance computing. We define a cost metric of end-to-end latency that can be used to determine realistic cases where optimization should be applied. Finally, we show that end-to-end latency can be controlled through the probability, assigned to each query, that the query will evaluate to true.
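
The idea of treating a stream as relational rows and pushing a compiled query into it can be sketched as follows. This is a minimal illustration under stated assumptions, not the dQUOB implementation: the `Event` fields, `compile_query`, `run_in_stream`, and the toy cost model in `expected_latency` are hypothetical names introduced here, and the abstract does not give the paper's actual cost formula.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """One row of a hypothetical 'atmosphere' table carried by the stream."""
    layer: int
    latitude: float
    ozone: float


def compile_query(min_ozone: float, max_layer: int) -> Callable[[Event], bool]:
    """Stand-in for compiling the SQL-like query
    SELECT * FROM atmosphere WHERE ozone >= min_ozone AND layer <= max_layer
    into a predicate that runs inside the stream."""
    def predicate(ev: Event) -> bool:
        return ev.ozone >= min_ozone and ev.layer <= max_layer
    return predicate


def run_in_stream(events: Iterable[Event],
                  predicate: Callable[[Event], bool]) -> Iterator[Event]:
    """Forward only the events for which the embedded query is true."""
    return (ev for ev in events if predicate(ev))


def expected_latency(t_query: float, t_forward: float, p_true: float) -> float:
    """Assumed toy cost model: every event pays the query cost, and only the
    fraction p_true that satisfies the query pays the forwarding cost."""
    return t_query + p_true * t_forward


stream = [Event(1, 10.0, 0.2), Event(3, 12.5, 0.9), Event(9, 40.0, 0.95)]
query = compile_query(min_ozone=0.5, max_layer=5)
selected = list(run_in_stream(stream, query))
```

Under this assumed model, lowering the probability that a query evaluates to true lowers the expected per-event latency, which is the kind of control the abstract refers to.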
Synchronizing network probes to avoid measurement intrusiveness with the Network Weather Service
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868645
R. Wolski, B. Gaidioz, B. Tourancheau
This paper presents a scalable protocol for conducting periodic probes of network performance in a way that minimizes collisions between separate probes. The goal of the protocol is to enable active performance monitoring of large-scale distributed computational systems and networks. We use the protocol to generate time series of measurement data that are then passed to numerical forecasting models when a prediction of network performance is required. We describe the protocol and demonstrate its effectiveness using the Network Weather Service, a tool for dynamically predicting network, CPU, memory and storage performance.
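
One way to keep periodic probes from colliding is to let only one host probe at a time. The sketch below assumes a simple token-passing scheme; the abstract does not describe the protocol's mechanism, so `token_schedule` and its slot model are hypothetical and serve only to illustrate collision-free probing.

```python
from collections import deque
from typing import Deque, List, Tuple


def token_schedule(hosts: List[str], rounds: int) -> List[Tuple[int, str, str]]:
    """Return (slot, source, destination) probes for a group of hosts.

    Only the host holding the token probes; it measures every other host
    in turn and then passes the token on, so no two probes share a slot.
    """
    ring: Deque[str] = deque(hosts)
    schedule: List[Tuple[int, str, str]] = []
    slot = 0
    for _ in range(rounds * len(hosts)):
        holder = ring[0]
        for target in hosts:
            if target != holder:
                schedule.append((slot, holder, target))
                slot += 1
        ring.rotate(-1)  # pass the token to the next host in the ring
    return schedule


probes = token_schedule(["a", "b", "c"], rounds=1)
```

Each entry in `probes` occupies its own slot, so the resulting measurements form a time series per host pair without overlapping probes.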
Managing network resources in Condor
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868666
J. Basney, M. Livny
Data-intensive applications in the Condor high-throughput computing (HTC) environment can place heavy demands on network resources for checkpointing and remote data access. We have developed mechanisms to monitor, control and schedule network usage in Condor. By managing network resources, these mechanisms provide administrative control over Condor's network usage and improve the execution efficiency of Condor applications.
Resource management through multilateral matchmaking
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868662
Rajesh Raman, M. Livny, M. Solomon
Federated distributed systems present new challenges to resource management, which cannot be met by conventional systems that employ relatively static resource models and centralized allocators. We previously argued that matchmaking provides an elegant and robust resource management solution for these highly dynamic environments (R. Raman et al., 1998). Although matchmaking is powerful and flexible, it cannot accommodate multiparty policies (e.g., co-allocation). We present Gang-Matching, a multilateral matchmaking formalism that addresses this deficiency.
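
The difference between two-party matchmaking and a multilateral match can be sketched as below. This is an illustrative toy, not the Gang-Matching formalism or Condor's ClassAd language: the dictionary-based advertisements, `bilateral_match`, `gang_match`, and the constraint functions are all assumptions made for the example.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Optional, Tuple

Ad = Dict[str, Any]  # a hypothetical advertisement: attribute name -> value


def bilateral_match(request: Ad, offers: List[Ad],
                    ok: Callable[[Ad, Ad], bool]) -> Optional[Ad]:
    """Two-party matchmaking: first offer compatible with the request."""
    return next((o for o in offers if ok(request, o)), None)


def gang_match(request: Ad, machines: List[Ad], licenses: List[Ad],
               ok: Callable[[Ad, Ad, Ad], bool]) -> Optional[Tuple[Ad, Ad]]:
    """Multilateral match: find a machine and a license that satisfy the
    request jointly, so both are allocated together or not at all."""
    for machine, lic in product(machines, licenses):
        if ok(request, machine, lic):
            return machine, lic
    return None


job = {"memory": 512, "app": "sim"}
machines = [{"name": "m1", "memory": 256}, {"name": "m2", "memory": 1024}]
licenses = [{"app": "sim", "host": "m1"}, {"app": "sim", "host": "m2"}]


def enough_memory(j: Ad, m: Ad) -> bool:
    return m["memory"] >= j["memory"]


def co_allocation(j: Ad, m: Ad, lic: Ad) -> bool:
    # The license must be for the job's application and valid on this machine.
    return enough_memory(j, m) and lic["app"] == j["app"] and lic["host"] == m["name"]


single = bilateral_match(job, machines, enough_memory)
gang = gang_match(job, machines, licenses, co_allocation)
```

The bilateral match can pick a machine, but only the joint search can express a constraint that ties the machine and the license together, which is the kind of co-allocation policy the abstract says plain matchmaking cannot accommodate.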
Parallel matching and sorting with TACO's distributed collections: a case study from molecular biology research
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868656
J. Nolte, P. Horton
TACO is a template library that implements higher-order parallel operations on distributed object sets by means of reusable topology classes and C++ function templates. We discuss an experimental application that exploits TACO's distributed object groups and collective operations for computing the similarity between groups of molecular sequences, a computationally intensive core problem in molecular biology research. In particular, we show how TACO's distributed collections can be conveniently combined with well-known concepts found in the C++ standard template library (STL) to solve matching and sorting problems effectively on distributed hardware platforms. The resulting implementation is concise and gives excellent parallel performance on PC and workstation clusters.
A comparative evaluation of implicit coscheduling strategies for networks of workstations
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868653
C. Anglano
Implicit coscheduling strategies enable parallel applications to dynamically share the machines in a network of workstations (NOW) with interactive, CPU-bound, and I/O-bound sequential jobs. We present a simulation study that compares 12 coscheduling strategies in terms of their impact on the performance of parallel and sequential applications executed simultaneously on a NOW. Our results show that the coscheduling strategy has a strong impact on the performance of the applications (both parallel and sequential) composing the workload, and that no single strategy is able to effectively handle all workloads. Nevertheless, our results can be used to identify the strategy that represents the best choice for a given application class, or the best compromise for various workloads. Moreover, we show that in many cases simple strategies outperform more complex ones.
Flexible high-performance access to distributed storage resources
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868648
C. Patten, K. Hawick
This paper describes a software architecture for storage services in computational grid environments. Based upon a lightweight message-passing paradigm, the architecture enables the provision and composition of active, distributed storage services. These services can then cooperatively provide access to distributed storage in a manner potentially optimized for dataset and resource environments. We report on the design and implementation of a distributed file system and a dataset-specific satellite imagery service using the architecture. We discuss data movement and storage issues and implications for future work with the architecture.
Distributed data access in the Sequential Access Model at the D0 experiment at Fermilab
Pub Date : 2000-08-01 DOI: 10.1109/HPDC.2000.868672
I. Terekhov, V. White
This paper presents the Sequential Access Model (SAM), which is the data-handling system for D0, one of two primary high-energy experiments at Fermilab. During the next several years, the D0 experiment will store a total of about 1 PByte of data, including raw detector data and data processed at various levels. The design of SAM is not specific to the D0 experiment and carries few assumptions about the underlying mass storage level; its ideas are applicable to any sequential data access. By definition, in the sequential access mode, a user application needs to process a stream of data by accessing each data unit exactly once, the order of the data units in the stream being irrelevant. The units of data are laid out sequentially in files. The adopted model allows for a significant optimization of system performance, a reduction in user file latency and an increase in the overall throughput. In particular, caching is done with the knowledge of all the files that are needed "in the near future", which is defined as all the files being used by already-running or submitted jobs. The bulk of the data is stored in files on tape in the mass storage system Enstore. All of the data managed by SAM is cataloged in great detail in a relational database (Oracle).
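
Caching with knowledge of the files needed "in the near future" can be illustrated with a small eviction sketch. It is not SAM's actual cache manager: `files_needed_soon`, `choose_evictions`, the job-to-files mapping, and the least-recently-used ordering of the cache are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Set


def files_needed_soon(jobs: Dict[str, Iterable[str]]) -> Set[str]:
    """Union of the files requested by all running or submitted jobs."""
    needed: Set[str] = set()
    for files in jobs.values():
        needed.update(files)
    return needed


def choose_evictions(cached: List[str], needed: Set[str],
                     slots_to_free: int) -> List[str]:
    """Pick cached files to evict, preferring files that no known job needs.

    `cached` is ordered from least to most recently used (an assumption of
    this sketch); needed files are evicted only as a last resort.
    """
    not_needed = [f for f in cached if f not in needed]
    still_needed = [f for f in cached if f in needed]
    return (not_needed + still_needed)[:slots_to_free]


jobs = {"job1": ["f1", "f2"], "job2": ["f2", "f4"]}
cache = ["f1", "f3", "f5", "f2"]
evict = choose_evictions(cache, files_needed_soon(jobs), 2)
```

Because each data unit is read exactly once and order is irrelevant, a file that no running or submitted job still needs is a safe first candidate for eviction in this sketch.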
The Ninth International Symposium On High-performance Distributed Computing
Pub Date : 1900-01-01 DOI: 10.1109/HPDC.2000.868628
F. Berman