
Latest Publications from 2012 SC Companion: High Performance Computing, Networking Storage and Analysis

Poster: Exploring Design Space of a 3D Stacked Vector Cache - Designing a 3D Stacked Vector Cache using Conventional EDA Tools
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.271
Ryusuke Egawa, J. Tada, Yusuke Endo, H. Takizawa, Hiroaki Kobayashi
Although 3D integration technologies with through silicon vias (TSVs) are expected to overcome the memory and power wall problems in future microprocessor design, there are no promising EDA tools for designing 3D integrated VLSIs. In addition, the effects of 3D integration on microprocessor design have not been thoroughly examined. Under these circumstances, this paper presents a design approach for 3D stacked cache memories using existing EDA tools, and shows an early performance evaluation of 3D stacked cache memories for vector processors.
Citations: 0
Poster: PanDA: Next Generation Workload Management and Analysis System for Big Data
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.302
K. De, A. Klimentov, S. Panitkin, M. Titov, A. Vaniachine, T. Wenaus, D. Yu, G. Záruba
In the real world, any big science project implies the use of a sophisticated Workload Management System (WMS) that deals with a huge amount of highly distributed data, often accessed by large collaborations. The Production and Distributed Analysis System (PanDA) is a high-performance WMS designed to meet the production and analysis requirements of a data-driven workload management system capable of operating at the Large Hadron Collider data-processing scale. PanDA provides execution environments for a wide range of experimental applications, automates centralized data production and processing, enables the analysis activity of physics groups, supports the custom workflows of individual physicists, provides a unified view of distributed worldwide resources, presents the status and history of workflows through an integrated monitoring system, and archives and curates all workflows. PanDA, a WMS already proven at extreme scales, is now being generalized and packaged for wider use by the Big Data community.
Citations: 3
Light-Weight Data Management Solutions for Visualization and Dissemination of Massive Scientific Datasets - Position Paper
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.157
G. Agrawal, Yunde Su
Many of the `big-data' challenges today arise from increasing computing ability, as data collected from simulations has become extremely valuable for a variety of scientific endeavors. With the growing computational capabilities of parallel machines, scientific simulations are being performed at finer spatial and temporal scales, leading to a data explosion. As a specific example, the Global Cloud-Resolving Model (GCRM) currently has a grid-cell size of 4 km and already produces 1 petabyte of data for a 10-day simulation. Future plans include simulations with a grid-cell size of 1 km, which will increase the data generated 64-fold. Finer granularity of simulation data offers both an opportunity and a challenge. On one hand, it can allow understanding of underlying phenomena and features in a way that would not be possible with coarser granularity. On the other hand, larger datasets are extremely difficult to store, manage, disseminate, analyze, and visualize. Neither the memory capacity of parallel machines, memory access speeds, nor disk bandwidths are increasing at the same rate as computing power, contributing to the difficulty of storing, managing, and analyzing these datasets. Simulation data is often disseminated widely, through portals like the Earth System Grid (ESG), and downloaded by researchers all over the world. Such dissemination efforts are hampered by dataset size growth, as wide-area data transfer bandwidths are growing at a much slower pace. Finally, when visualizing datasets, human perception is inherently limited.
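A back-of-the-envelope sketch of where a 64-fold growth factor can come from when the grid spacing drops from 4 km to 1 km: 16x more cells from the two horizontal dimensions and, assuming the output frequency scales with resolution, roughly 4x more time steps. The temporal factor is an assumption for illustration, not something stated in the abstract.

```python
# Back-of-the-envelope check of the 64-fold growth quoted for GCRM output.
coarse_dx_km, fine_dx_km = 4.0, 1.0

horizontal_factor = (coarse_dx_km / fine_dx_km) ** 2  # 4x in each of two horizontal dimensions -> 16x cells
temporal_factor = coarse_dx_km / fine_dx_km           # assumed: output frequency scales with resolution -> 4x
total_factor = horizontal_factor * temporal_factor    # 16 * 4 = 64

coarse_output_pb = 1.0                                # 1 PB for a 10-day run at 4 km (from the abstract)
print(f"growth factor: {total_factor:.0f}x")                                  # 64x
print(f"projected 10-day output at 1 km: {coarse_output_pb * total_factor:.0f} PB")  # 64 PB
```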
Citations: 1
Building a Climatology of Mountain Gap Wind Jets and Related Coastal Upwelling
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.71
S. Graves, Xiang Li, K. Keiser, Deborah K. Smith
Winds accelerating through coastal topography are capable of generating jets that often result in cold-water upwelling events in near-coast locations. In situ measurements are frequently unavailable for many of the remote mountain gap locations globally, so, to provide a record of these events for researchers as well as military and commercial interests, this NASA-funded project is demonstrating how satellite-derived data products, together with fused model and observational data for wind and sea surface temperature, can be used to detect both wind jet and upwelling events. An algorithm was developed to automatically detect gap wind and ocean upwelling events in the gulf regions of Central America using the Cross-Calibrated, Multi-Platform (CCMP) ocean surface wind product and the Optimally Interpolated Sea Surface Temperature (OISST) product. Hierarchical thresholding and region growing methods are used to extract regions of strong winds and temperature anomalies. A post-processing step further links the detected events to generate time series of these events. Though developed for Central America, the algorithm is being extended to other coastal regions so that detected event products are globally consistent. Through collaboration with the Global Hydrology Resource Center (GHRC), a NASA Distributed Active Archive Center, this project is analyzing large climate data records to generate a climatology of wind jet and upwelling events at known geographic locations, which will be available as a resource for other researchers. Likewise, through integration of the project's analysis techniques with the GHRC's data ingest processing, identification and notification of new or current events will be openly available to research, commercial, and military users. This paper reports preliminary results from applying the team's approach to identifying and capturing events at selected mountain gap jet locations.
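A minimal sketch of the thresholding and region-labeling step described in the abstract, applied to a 2D wind-speed field with NumPy and SciPy; the threshold values, minimum region size, and field layout are illustrative assumptions rather than the project's actual parameters.

```python
import numpy as np
from scipy import ndimage

def detect_wind_jet_regions(wind_speed, strong=15.0, weak=10.0, min_cells=20):
    """Hierarchical thresholding plus region growing on a 2D wind-speed grid (m/s).

    Seeds are cells exceeding the 'strong' threshold; each seed region is grown into
    surrounding cells that exceed the weaker threshold. Thresholds and the minimum
    region size are illustrative, not the values used by the project.
    """
    weak_mask = wind_speed >= weak
    labels, n = ndimage.label(weak_mask)              # connected regions above the weak threshold
    events = []
    for region_id in range(1, n + 1):
        region = labels == region_id
        if not np.any(wind_speed[region] >= strong):  # keep only regions containing a strong-wind seed
            continue
        if region.sum() < min_cells:                  # discard tiny regions (noise)
            continue
        events.append(region)
    return events

# Example on a synthetic field: a weak background with one embedded jet.
field = np.full((100, 100), 5.0)
field[40:60, 20:50] = 18.0                            # synthetic gap-wind jet
print(len(detect_wind_jet_regions(field)))            # -> 1
```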
Citations: 0
Poster: HTCaaS: A Large-Scale High-Throughput Computing by Leveraging Grids, Supercomputers and Cloud
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.176
Seungwoo Rho, Seoyoung Kim, Sangwan Kim, Seokkyoo Kim, Jik-Soo Kim, Soonwook Hwang
We present HTCaaS (High-Throughput Computing as a Service), a system that aims to let researchers easily explore large-scale and complex HTC problems by leveraging Supercomputers, Grids, and Clouds. HTCaaS hides the heterogeneity and complexity of harnessing different types of computing infrastructures from users, and efficiently submits a large number of jobs at once by effectively managing and exploiting all available computing resources. Our system has been integrated with national Supercomputers in Korea, international computational Grids, and Amazon EC2, combining a vast amount of computing resources to support the most challenging scientific problems.
Citations: 12
Reducing the De-linearization of Data Placement to Improve Deduplication Performance
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.110
Yujuan Tan, Zhichao Yan, D. Feng, E. Sha, Xiongzi Ge
Data deduplication is a lossless compression technology that replaces redundant data chunks with pointers to already-stored ones. Due to this intrinsic data-elimination feature, deduplication de-linearizes the data placement and forces the chunks that belong to the same data object to be divided into multiple separate parts. In our preliminary study, we found that this de-linearization of data placement weakens the data spatial locality that some deduplication approaches rely on to improve data read performance, deduplication throughput, and efficiency, which significantly affects deduplication performance. In this paper, we first analyze the negative effect of the de-linearization of data placement on deduplication performance with examples and experimental evidence, and then propose an effective approach that reduces the de-linearization of data placement while sacrificing only a small amount of compression ratio. An experimental evaluation driven by real-world datasets shows that our approach effectively reduces the de-linearization of data placement and enhances data spatial locality, which significantly improves deduplication performance, including deduplication throughput, deduplication efficiency, and data read performance, at the cost of a small loss in compression ratio.
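A minimal sketch of the chunk-and-pointer mechanism behind deduplication, using fixed-size chunks and SHA-1 fingerprints; production systems typically use content-defined chunking and on-disk indexes, so this only illustrates why shared chunks turn a logically contiguous object into scattered references.

```python
import hashlib

CHUNK_SIZE = 4096   # fixed-size chunking for illustration; real systems often use content-defined chunking
chunk_store = {}    # fingerprint -> chunk bytes (stands in for the on-disk chunk containers)

def dedup_write(data: bytes):
    """Split an object into chunks, store each unique chunk once, and return the object's
    'recipe': a list of fingerprints (pointers) into the chunk store. Duplicate chunks become
    pointers to chunks written earlier, possibly for other objects, which is what scatters
    (de-linearizes) the on-disk placement of a logically contiguous object."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fp = hashlib.sha1(chunk).hexdigest()
        if fp not in chunk_store:
            chunk_store[fp] = chunk      # new chunk: appended to the store
        recipe.append(fp)                # duplicate chunk: only a pointer is kept
    return recipe

def dedup_read(recipe):
    """Reassemble an object by following its pointers; shared chunks may force non-sequential reads."""
    return b"".join(chunk_store[fp] for fp in recipe)

obj_a = b"A" * 8192 + b"B" * 4096        # 3 chunks: A, A (duplicate), B
obj_b = b"B" * 4096 + b"C" * 4096        # 2 chunks: B (duplicate of obj_a's), C
recipe_a, recipe_b = dedup_write(obj_a), dedup_write(obj_b)
assert dedup_read(recipe_b) == obj_b
print(len(chunk_store))                  # 3 unique chunks stored instead of 5
```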
Citations: 6
Software-Defined Networking for Big-Data Science - Architectural Models from Campus to the WAN
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.341
I. Monga, Eric Pouyoul, C. Guok
University campuses, supercomputer centers, and R&E networks are challenged to architect, build, and support IT infrastructure that deals effectively with the data deluge facing most science disciplines. Hybrid network architectures, multi-domain bandwidth reservations, performance monitoring, and GLIF Open Lightpath Exchanges (GOLE) are examples of network architectures that have been proposed, championed, and implemented successfully to meet the needs of science. Most recently, the Science DMZ, a campus design pattern that bypasses traditional performance hotspots in typical campus network implementations, has been gaining momentum. In this paper and the corresponding demonstration, we build upon the SC11 SCinet Research Sandbox demonstrator with software-defined networking to explore new architectural approaches. A virtual switch network abstraction is explored that, when combined with software-defined networking concepts, provides science users a simple, adaptable network framework to meet their upcoming application requirements.
Citations: 62
Poster: Evaluating Asynchrony in Gibraltar RAID's GPU Reed-Solomon Coding Library
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.285
Xin Zhou, A. Skjellum, M. Curry
GPUs have been utilized for Reed-Solomon coding tasks in arbitrary (k+n) RAID systems. In this project, we apply an asynchronous design with CUDA in order to run multiple coding tasks simultaneously. Results show significant performance boosts for small-block coding tasks when kernels run concurrently.
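As a rough illustration of the asynchronous idea, stated here in Python rather than CUDA (so this is an analogy, not the poster's implementation), the sketch below keeps many small, independent coding tasks in flight at once by submitting them to a pool instead of running them one at a time; XOR parity stands in for Reed-Solomon coding.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import os

def xor_parity(block_group):
    """Toy erasure-coding stand-in: XOR parity over a group of equal-sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, cols) for cols in zip(*block_group))

# 64 small, independent coding tasks of 4 data blocks each (synthetic data).
groups = [[os.urandom(4096) for _ in range(4)] for _ in range(64)]

with ThreadPoolExecutor(max_workers=8) as pool:
    parities = list(pool.map(xor_parity, groups))   # tasks overlap instead of running one by one

print(len(parities), len(parities[0]))               # 64 parity blocks of 4096 bytes each
```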
Citations: 4
Abstract: Preliminary Report for a High Precision Distributed Memory Parallel Eigenvalue Solver
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.255
Toshiyuki Imamura, S. Yamada, M. Machida
This study covers the design and implementation of a DD (double-double) extended-precision parallel eigenvalue solver, namely QPEigenK. We extended most of the underlying numerical software layers, from BLAS, LAPACK, and ScaLAPACK to MPI. Preliminary results show that QPEigenK runs on several platforms and achieves good accuracy and parallel efficiency. We conclude that, from the viewpoint of programming and performance, the DD format is a reasonable data format compared with the REAL(16) format.
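A minimal sketch of the double-double (DD) idea behind such a solver: each value is carried as an unevaluated sum of two doubles, and an error-free transformation (Knuth's two-sum) recovers the rounding error so it can be kept in the low word. This is illustrative only and does not reflect QPEigenK's actual kernels.

```python
def two_sum(a, b):
    """Knuth's error-free transformation: returns (s, e) with s = fl(a+b) and a+b = s+e exactly."""
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e

def dd_add(x, y):
    """Add two double-double numbers, each represented as a (hi, lo) pair of floats."""
    s, e = two_sum(x[0], y[0])   # add the high parts, capturing the rounding error
    e += x[1] + y[1]             # fold in both low parts
    hi, lo = two_sum(s, e)       # renormalize so hi again carries the leading bits
    return hi, lo

# 0.1 + 0.2 in plain doubles loses bits; the DD result keeps them in the low word.
a = (0.1, 0.0)
b = (0.2, 0.0)
hi, lo = dd_add(a, b)
print(hi, lo)  # hi is the usual double result, lo holds the residual error
```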
Citations: 1
Crayons: An Azure Cloud Based Parallel System for GIS Overlay Operations
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.315
Dinesh Agarwal
Processing of extremely large polygonal (vector-based) spatial datasets has been a long-standing research challenge for scientists in the Geographic Information Systems and Science (GIS) community. Surprisingly, it is not for lack of individual parallel algorithms; we discovered that the irregular and data-intensive nature of the underlying processing is the main reason for the meager amount of work on system design and implementation. Furthermore, of all the systems reported in the literature, very few deal with the complexities of vector-based datasets, and none, including commercial systems, runs on a cloud platform. We have designed and implemented an open-architecture-based system named Crayons for the Windows Azure cloud platform using state-of-the-art techniques. We have implemented three different architectures of Crayons with different load-balancing schemes. Crayons scales well for sufficiently large datasets, achieving an end-to-end absolute speedup of over 28-fold using 100 Azure processors. For smaller and more irregular workloads, it still yields over 10-fold speedup.
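Reading the 28-fold figure as parallel efficiency, and inverting Amdahl's law for an implied serial fraction, gives a rough sense of the scaling; the fixed-problem-size Amdahl model is an assumption here, not a claim made by the abstract.

```python
# What a 28x speedup on 100 processors implies under a simple Amdahl's-law reading.
processors = 100
speedup = 28.0

efficiency = speedup / processors  # 0.28 -> 28% parallel efficiency
# Invert Amdahl's law, S = 1 / (f + (1 - f) / p), for the serial fraction f:
serial_fraction = (1 / speedup - 1 / processors) / (1 - 1 / processors)

print(f"parallel efficiency: {efficiency:.0%}")           # ~28%
print(f"implied serial fraction: {serial_fraction:.1%}")  # ~2.6% of the work is effectively serial
```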
Citations: 8