Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.170
Maricris L. Mayes, G. Fletcher, M. Gordon
Summary form only given. One of the major challenges of modern quantum chemistry (QC) is applying it to large systems with thousands of correlated electrons and basis functions. Meeting this challenge requires both supercomputers and the development of novel methods. In particular, we employ the linear-scaling Fragment Molecular Orbital (FMO) method, which decomposes a large system into smaller, localized fragments that can each be treated with a high-level QC method such as MP2. FMO is inherently scalable, since the individual fragment calculations can be carried out simultaneously on separate processor groups. It is implemented in GAMESS, a popular ab initio QC program. We present the scalability and performance of FMO on the Intrepid (Blue Gene/P) and Blue Gene/Q systems at the ALCF.
{"title":"Abstract: Towards Highly Accurate Large-Scale Ab Initio Calculations Using Fragment Molecular Orbital Method in GAMESS","authors":"Maricris L. Mayes, G. Fletcher, M. Gordon","doi":"10.1109/SC.Companion.2012.170","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.170","url":null,"abstract":"Summary form only given. One of the major challenges of modern quantum chemistry (QC) is to apply it to large systems with thousands of correlated electrons and basis functions. The availability of supercomputers and development of novel methods are necessary to realize this challenge. In particular, we employ linear scaling Fragment Molecular Orbital (FMO) method which decompose the large system into smaller, localized fragments which can be treated with high-level QC method like MP2. FMO is inherently scalable since the individual fragment calculations can be carried out simultaneously on separate processor groups. It is implemented in GAMESS, a popular ab-initio QC program. We present the scalability and performance of FMO on Intrepid (Blue Gene/P) and Blue Gene/Q systems at ALCF.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"109 1","pages":"1335-1335"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86007733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.294
Mehmet Balman
High-bandwidth networks are poised to provide new opportunities for tackling the large-data challenges of today's scientific applications. However, increasing the bandwidth is not sufficient by itself; we need careful evaluation of future high-bandwidth networks from the applications' perspective. We have experimented with current state-of-the-art data movement tools and found that file-centric data transfer protocols perform poorly when transferring many small files over high-bandwidth networks, even when using parallel streams or concurrent transfers. Current middleware tools need enhancements to take advantage of future networking frameworks. To improve performance and efficiency, we developed an experimental prototype, MemzNet (Memory-mapped Zero-copy Network Channel), which uses a block-based data movement method for moving large scientific datasets. MemzNet aggregates files into blocks and provides dynamic data channel management. In this work, we present our initial results on a 100 Gbps network.
{"title":"Abstract: MemzNet: Memory-Mapped Zero-Copy Network Channel for Moving Large Datasets over 100Gbps Network","authors":"Mehmet Balman","doi":"10.1109/SC.Companion.2012.294","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.294","url":null,"abstract":"High-bandwidth networks are poised to provide new opportunities in tackling large data challenges in today's scientific applications. However, increasing the bandwidth is not sufficient by itself; we need careful evaluation of future high-bandwidth networks from the applications' perspective. We have experimented with current state-of-the-art data movement tools, and realized that file-centric data transfer protocols do not perform well with managing the transfer of many small files in high-bandwidth networks, even when using parallel streams or concurrent transfers. We require enhancements in current middleware tools to take advantage of future networking frameworks. To improve performance and efficiency, we develop an experimental prototype, called MemzNet: Memory-mapped Zero-copy Network Channel, which uses a block-based data movement method in moving large scientific datasets. We have implemented MemzNet that takes the approach of aggregating files into blocks and providing dynamic data channel management. In this work, we present our initial results in 100Gbps network.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"78 1","pages":"1511-1512"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78478286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.264
Brian W. Barrett, R. Brightwell, K. Underwood, K. Hemmert
Portals 4 is an advanced network programming interface that allows for the development of a rich set of upper-layer protocols. Through careful selection of interfaces and strong progress guarantees, Portals 4 is able to support multiple protocols without significant overhead. Recent developments with Portals 4, including the development of MPI, SHMEM, and GASNet protocols, are discussed.
{"title":"Poster: Portals 4 Network Programming Interface","authors":"Brian W. Barrett, R. Brightwell, K. Underwood, K. Hemmert","doi":"10.1109/SC.Companion.2012.264","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.264","url":null,"abstract":"Portals 4 is an advanced network programming interface which allows for the development of a rich set of upper layer protocols. By careful selection of interfaces and strong progress guarantees, Portals 4 is able to support multiple protocols without significant overhead. Recent developments with Portals 4, including development of MPI, SHMEM, and GASNet protocols are discussed.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"19 1","pages":"1467-1467"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81830428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.29
A. Chervenak, David E. Smith, Weiwei Chen, E. Deelman
As scientific applications generate and consume data at ever-increasing rates, scientific workflow systems that manage the growing complexity of analyses and data movement will increase in importance. The goal of our work is to improve the overall performance of scientific workflows by using policy to improve data staging into and out of computational resources. We developed a Policy Service that gives advice to the workflow system about how to stage data, including advice on the order of data transfers and on transfer parameters. The Policy Service gives this advice based on its knowledge of ongoing transfers, recent transfer performance, and the current allocation of resources for data staging. The paper describes the architecture of the Policy Service and its integration with the Pegasus Workflow Management System. The service employs a range of policies for data staging, and we present performance results for one policy that performs a greedy allocation of data transfer streams between source and destination sites. The results show performance improvements for a data-intensive workflow: the Montage astronomy workflow, augmented to perform additional large data staging operations.
{"title":"Integrating Policy with Scientific Workflow Management for Data-Intensive Applications","authors":"A. Chervenak, David E. Smith, Weiwei Chen, E. Deelman","doi":"10.1109/SC.Companion.2012.29","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.29","url":null,"abstract":"As scientific applications generate and consume data at ever-increasing rates, scientific workflow systems that manage the growing complexity of analyses and data movement will increase in importance. The goal of our work is to improve the overall performance of scientific workflows by using policy to improve data staging into and out of computational resources. We developed a Policy Service that gives advice to the workflow system about how to stage data, including advice on the order of data transfers and on transfer parameters. The Policy Service gives this advice based on its knowledge of ongoing transfers, recent transfer performance, and the current allocation of resources for data staging. The paper describes the architecture of the Policy Service and its integration with the Pegasus Workflow Management System. It employs a range of policies for data staging, and presents performance results for one policy that does a greedy allocation of data transfer streams between source and destination sites. The results show performance improvements for a data-intensive workflow: the Montage astronomy workflow augmented to perform additional large data staging operations.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"28 1","pages":"140-149"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90303692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.124
Nicolas Dubé
This presentation debunks three "truths" as seen from Plato's cave: the untold story of PUE, clean coal, and the notion that water is free and available.
{"title":"Philosophy 301: But Can You \"Handle the Truth\"?","authors":"Nicolas Dubé","doi":"10.1109/SC.Companion.2012.124","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.124","url":null,"abstract":"This presentation debunks three \"truths\" as seen from Plato's cave: the untold story of PUE, clean coal, and water is free and available.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"19 1","pages":"993-1017"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90699826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.COMPANION.2012.150
D. Gunter, S. Cholia, Anubhav Jain, M. Kocher, K. Persson, L. Ramakrishnan, S. Ong, G. Ceder
Efforts such as the Human Genome Project provided a dramatic example of opening scientific datasets to the community. Making high-quality scientific data accessible through an online database allows scientists around the world to multiply the value of that data through scientific innovations. Similarly, the goal of the Materials Project is to calculate physical properties of all known inorganic materials and make this data freely available, with the aim of accelerating the invention of better materials. However, the complexity of scientific data, and of the simulations needed to generate and analyze it, poses challenges to the current software ecosystem. In this paper, we describe the approach we used in the Materials Project to overcome these challenges and to create and disseminate a high-quality database of materials properties computed by solving the basic laws of physics. Our infrastructure requires a novel combination of high-throughput approaches with broadly applicable and scalable approaches to data storage and dissemination.
{"title":"Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project","authors":"D. Gunter, S. Cholia, Anubhav Jain, M. Kocher, K. Persson, L. Ramakrishnan, S. Ong, G. Ceder","doi":"10.1109/SC.COMPANION.2012.150","DOIUrl":"https://doi.org/10.1109/SC.COMPANION.2012.150","url":null,"abstract":"Efforts such as the Human Genome Project provided a dramatic example of opening scientific datasets to the community. Making high quality scientific data accessible through an online database allows scientists around the world to multiply the value of that data through scientific innovations. Similarly, the goal of the Materials Project is to calculate physical properties of all known inorganic materials and make this data freely available, with the goal of accelerating to invention of better materials. However, the complexity of scientific data, and the complexity of the simulations needed to generate and analyze it, pose challenges to current software ecosystem. In this paper, we describe the approach we used in the Materials Project to overcome these challenges and create and disseminate a high quality database of materials properties computed by solving the basic laws of physics. Our infrastructure requires a novel combination of highthroughput approaches with broadly applicable and scalable approaches to data storage and dissemination.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"23 1","pages":"1244-1251"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90792496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.358
B. Walkup
This article consists of a collection of slides from the author's conference presentation. The author concludes that the Blue Gene/Q design (simple, low-power cores with four hardware threads per core) results in high instruction throughput, and thus exceptional power efficiency for applications: the hardware threads can effectively fill pipeline stalls and hide latencies in the memory subsystem. The consequence is low performance per thread, so a high degree of parallelization is required for high application performance. Traditional programming methods (MPI, OpenMP, Pthreads) hold up at very large scales. Memory costs can limit scaling when data structures grow linearly with the number of processes; threading helps by keeping the number of processes manageable. Detailed performance analysis is viable at more than 10^6 processes but requires care, and on-the-fly performance data reduction has merits.
{"title":"Application performance characterization and analysis on Blue Gene/Q","authors":"B. Walkup","doi":"10.1109/SC.Companion.2012.358","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.358","url":null,"abstract":"This article consists of a collection of slides from the author's conference presentation. The author concludes that The Blue Gene/Q design, low-power simple cores, four hardware threads per core, resu lts in high instruction throughput, and thus exceptional power efficiency for applications. Can effectively fill in pipeline stalls and hide latencies in the memory subsystem. The consequence is low performance per thread, so a high degree of parallelization is required for high application performance. Traditional programming methods (MPI, OpenMP, Pthreads) hold up at very large scales. Memory costs can limit scaling when there are data-structures with size linear in the number of processes, threading helps by keeping the number of processes manageable. Detailed performance analysis is viable at > 10^6 processes but requires care. On-the-fly performance data reduction has merits.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"77 1","pages":"2247-2280"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80791774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.212
C. Kessler, Usman Dastgeer, M. Majeed, N. Furmento, Samuel Thibault, R. Namyst, S. Benkner, Sabri Pllana, J. Träff, Martin Wimmer
PEPPHER is a three-year EU FP7 project that develops a novel approach and framework to enhance the performance portability and programmability of heterogeneous multi-core systems. Its primary target is single-node heterogeneous systems, where several CPU cores are supported by accelerators such as GPUs. This poster briefly surveys the PEPPHER framework for single-node systems and elaborates on the prospects for leveraging the PEPPHER approach to generate performance-portable code for heterogeneous multi-node systems.
{"title":"Abstract: Leveraging PEPPHER Technology for Performance Portable Supercomputing","authors":"C. Kessler, Usman Dastgeer, M. Majeed, N. Furmento, Samuel Thibault, R. Namyst, S. Benkner, Sabri Pllana, J. Träff, Martin Wimmer","doi":"10.1109/SC.Companion.2012.212","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.212","url":null,"abstract":"PEPPHER is a 3-year EU FP7 project that develops a novel approach and framework to enhance performance portability and programmability of heterogeneous multi-core systems. Its primary target is single-node heterogeneous systems, where several CPU cores are supported by accelerators such as GPUs. This poster briefly surveys the PEPPHER framework for single-node systems, and elaborates on the prospectives for leveraging the PEPPHER approach to generate performance-portable code for heterogeneous multi-node systems.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"90 1","pages":"1395-1396"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86603215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.232
Michael O. Lam, B. Supinski, M. LeGendre, J. Hollingsworth
As scientific computation continues to scale, efficient use of floating-point arithmetic processors is critical. Lower precision allows streaming architectures to perform more operations per second and can reduce memory bandwidth pressure on all architectures. However, using a precision that is too low for a given algorithm and data set leads to inaccurate results. We present a framework that uses binary instrumentation and modification to build mixed-precision configurations of existing binaries that were originally developed to use only double precision. Initial results with the Algebraic MultiGrid kernel demonstrate a nearly 2× speedup.
{"title":"Poster: Automatically Adapting Programs for Mixed-Precision Floating-Point Computation","authors":"Michael O. Lam, B. Supinski, M. LeGendre, J. Hollingsworth","doi":"10.1109/SC.Companion.2012.232","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.232","url":null,"abstract":"As scientific computation continues to scale, efficient use of floating-point arithmetic processors is critical. Lower precision allows streaming architectures to perform more operations per second and can reduce memory bandwidth pressure on all architectures. However, using a precision that is too low for a given algorithm and data set leads to inaccurate results. We present a framework that uses binary instrumentation and modification to build mixed-precision configurations of existing binaries that were originally developed to use only double-precision. Initial results with the Algebraic MultiGrid kernel demonstrate a nearly 2χ speedup.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"96 1","pages":"1424-1424"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88408077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2012-11-10 | DOI: 10.1109/SC.Companion.2012.132
D. Ghoshal, L. Ramakrishnan
Scientific applications are increasingly using cloud resources for their data analysis workflows. However, managing data effectively and efficiently on these cloud resources is challenging due to the myriad storage choices with different performance-cost trade-offs, complex application choices, the complexity associated with elasticity, and failure rates. The explosion in scientific data, coupled with the unique characteristics of cloud environments, requires a more flexible and robust distributed data management solution than the ones currently in existence. This paper describes the design and implementation of FRIEDA, a Flexible Robust Intelligent Elastic Data Management framework. FRIEDA coordinates data in a transient cloud environment while taking into account specific application characteristics. Additionally, we describe a range of data management strategies and show the benefit of flexible data management schemes in cloud environments. We study two distinct scientific applications, from bioinformatics and image analysis, to understand the effectiveness of such a framework.
{"title":"FRIEDA: Flexible Robust Intelligent Elastic Data Management in Cloud Environments","authors":"D. Ghoshal, L. Ramakrishnan","doi":"10.1109/SC.Companion.2012.132","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.132","url":null,"abstract":"Scientific applications are increasingly using cloud resources for their data analysis workflows. However, managing data effectively and efficiently over these cloud resources is challenging due to the myriad storage choices with different performance-cost trade-offs, complex application choices, complexity associated with elasticity and, failure rates. The explosion in scientific data coupled with unique characteristics of cloud environments require a more flexible and robust distributed data management solution than the ones currently in existence. This paper describes the design and implementation of FRIEDA - a Flexible Robust Intelligent Elastic Data Management framework. FRIEDA coordinates data in a transient cloud environment taking into account specific application characteristics. Additionally, we describe a range of data management strategies and show the benefit of flexible data management schemes in cloud environments. We study two distinct scientific applications from bioinformatics and image analysis to understand the effectiveness of such a framework.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"116 1","pages":"1096-1105"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79367490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}