Proceedings of the 2006 ACM/IEEE conference on Supercomputing: latest articles

Scalable software infrastructure project
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188601
A. Nishida, Hisashi Kotakemori, Tamito Kajiyama, Akira Nukada
Recent progress in science and technology has made numerical simulation an important approach for studies in various fields. Although scalable, high-performance numerical libraries for large-scale computing resources are indispensable tools for handling various multiscale phenomena, few projects for integrating these numerical libraries have been reported. The objective of this project is to develop a basic library of the solutions and algorithms required for large-scale scientific simulations, which have so far been developed separately in each field, and to integrate it into a scalable software infrastructure. The components include Lis, a scalable iterative solvers library with a number of solvers, preconditioners, and matrix storage formats that can be flexibly combined; FFTSS, a fast Fourier transform library for various superscalar architectures with SIMD instructions, which outperforms some vendor-provided FFT libraries; and SILC, a matrix computation framework that is independent of language and computing environment. We show some highlights of our achievements on leading high performance computers.
Citations: 6
Human arterial tree simulation on TeraGrid
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188613
Leopold Grinberg, S. Dong, J. Noble, A. Yakhot, G. Karniadakis, N. Karonis
The human arterial tree consists of a complex network of branching blood vessels leading from the heart to the arterioles, capillaries, and venules that comprise the microcirculation. The numerical simulation of the blood flow in a single part of the human arterial tree requires hundreds of CPUs; a full human arterial tree will require thousands of CPUs. Nowadays, we can use geographically distributed supercomputers connected by a fast network to perform large-scale simulations. Nektar-G2 is the grid-enabled version of Nektar, software developed at Brown University; it allows problems to be solved on geographically distributed supercomputers. The topology-aware feature of MPICH-G2 is used to enforce an efficient data distribution strategy. Multi-level message passing algorithms minimize inter-site communication. Our ultimate goal is to model the blood flow interaction of different regions of the cardiovascular system and to establish a biomechanics gateway on the TeraGrid. During the poster presentation we will present results of this ongoing project.
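The idea behind multi-level message passing can be illustrated with a small simulation. The sketch below is not Nektar-G2's algorithm and does not use the MPICH-G2 API; it is a hypothetical pure-Python model (the function names and the `values_by_site` layout are assumptions) that counts how many messages cross site boundaries when a global sum is gathered directly, versus when each site first reduces its own values locally.

```python
def flat_sum(values_by_site, root_site=0):
    """Every process sends its value directly to a root on root_site."""
    total = 0.0
    inter_site_msgs = 0
    for site, values in values_by_site.items():
        for v in values:
            total += v
            if site != root_site:
                inter_site_msgs += 1  # each remote value crosses the wide-area link
    return total, inter_site_msgs


def two_level_sum(values_by_site, root_site=0):
    """Reduce within each site first, then send one partial sum per site."""
    total = 0.0
    inter_site_msgs = 0
    for site, values in values_by_site.items():
        partial = sum(values)  # intra-site traffic only
        total += partial
        if site != root_site:
            inter_site_msgs += 1  # a single message per remote site
    return total, inter_site_msgs


# Hypothetical layout: three sites with 2, 3 and 1 processes.
sites = {0: [1.0, 2.0], 1: [3.0, 4.0, 5.0], 2: [6.0]}
print(flat_sum(sites))
print(two_level_sum(sites))
```

Both functions return the same total; the two-level variant sends one inter-site message per remote site instead of one per remote process, which is the saving a topology-aware scheme aims for.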
Citations: 0
What's inside the grid? A discussion of standards and the future of computing
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188529
Gary Tyreman, Mark Linesch, S. Wheat, Andre Hill
Cluster computing is a disruptive force that has quickly reshaped the HPC market and rapidly gained acceptance as a solution to an ever-increasing number of business and research problems in the data center. To sustain and accelerate the growth of grid and related fields, we need a set of reference architectures, implementations, and technology standards that will enable interoperable components and a ubiquitous computing experience. This panel will debut the initiative to establish standards for discussing and implementing grid systems. Industry and academic representatives will debate current challenges in grid computing, why we need standards, how we should talk about grid, what grid will mean to the industry for the next ten years, and how grid is a springboard for academics to take HPC, virtualization, and distributed computing to new heights. During the Q&A period, attendees will have an opportunity to voice their concerns about, or support for, the initiatives addressed.
Citations: 0
Cluster storage and file system technologies
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188678
B. Welch, M. Unangst
To meet the demands of increasingly hungry cluster applications, cluster-based distributed storage technologies are now capable of delivering performance that scales from tens to hundreds of GB/sec. This tutorial will examine current state-of-the-art high performance file systems and the underlying technologies employed to deliver scalable performance across a range of scientific and industrial applications. The first half of the tutorial provides an in-depth description of the core features common across most high-performance file systems, including details of datapath design, decoupled and scalable metadata operations, data layout techniques, failover techniques, scalable reconstruction, storage interfaces, and security. The second half describes the design trade-offs found in both open-source and commercial solutions, including Lustre, GPFS, Parallel NFS, and Panasas.
Citations: 2
The meeting list tool - a shared application for sharing dynamic information in meetings
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188786
Adam C. Carter
I will describe and demonstrate the "Meeting List Tool", a shared application built with the Access Grid Toolkit for use in AG meetings. At present, the two most common ways of sharing text-based information during an AG meeting are shared presentations and the chat tool built into the Venue Client. The former is ideal for static content which is known in advance. The latter is ideal for sharing short pieces of information, such as a URL. What is apparently missing is an application for which data can be prepared in advance, displayed and quickly manipulated during a meeting, and kept at the close of the meeting by all the participants in the collaboration. This application is designed to fill this gap. The current version provides a list of items that can be highlighted, added, deleted, edited, and re-ordered, with the changes being propagated to all instances of the tool.
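The behaviour described above (a list whose items can be highlighted, added, deleted, edited, and re-ordered, with every change propagated to all instances) can be modelled in a few lines. The following is a minimal in-process sketch, not the tool's implementation and not the Access Grid Toolkit API: the class names `ListReplica` and `SharedSession` and the operation names are hypothetical, and real network transport is replaced by a direct loop over replicas.

```python
class ListReplica:
    """One participant's local copy of the shared list (hypothetical model)."""

    def __init__(self):
        self.items = []  # each item: {"text": str, "highlighted": bool}

    def apply(self, op, *args):
        # Every replica applies the same operations in the same order,
        # so all copies stay identical.
        if op == "add":
            self.items.append({"text": args[0], "highlighted": False})
        elif op == "delete":
            del self.items[args[0]]
        elif op == "edit":
            self.items[args[0]]["text"] = args[1]
        elif op == "highlight":
            self.items[args[0]]["highlighted"] = True
        elif op == "move":
            src, dst = args
            self.items.insert(dst, self.items.pop(src))
        else:
            raise ValueError(f"unknown operation: {op}")


class SharedSession:
    """Stands in for the shared-application channel (an assumption)."""

    def __init__(self):
        self.replicas = []

    def join(self):
        replica = ListReplica()
        self.replicas.append(replica)
        return replica

    def broadcast(self, op, *args):
        # Propagate one change to every instance of the tool.
        for replica in self.replicas:
            replica.apply(op, *args)


session = SharedSession()
alice, bob = session.join(), session.join()
for text in ("Agenda", "Minutes", "Actions"):
    session.broadcast("add", text)
session.broadcast("move", 2, 0)        # bring "Actions" to the top
session.broadcast("highlight", 0)
print([item["text"] for item in bob.items])
```

In this sketch a participant who joins late starts with an empty list; a real shared application would also need to send the current state to newcomers and to order concurrent edits, neither of which the abstract describes.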
Citations: 0
Zero-Force MPI: toward tractable toolkits for high performance computing
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188595
M. Slawinska, Dawid Kurzyniec, Jaroslaw Slawinski, V. Sunderam
Shared HPC platforms continue to require substantial effort for software installation and management, often necessitating manual intervention and tedious procedures. We propose a novel model of resource sharing that shifts resource virtualization and aggregation responsibilities to client-side software, thus reducing the burdens on resource providers. The Zero-Force MPI toolkit automates the installation, build, run, and post-processing stages of HPC applications, thus allowing application scientists to focus on using resources instead of managing them. Through a provided console, MPI runtime systems, support libraries, application executables, and needed data files can be soft-installed across distributed resources with just a few commands. Built-in data synchronization capabilities simplify common HPC development tasks, saving end-user time and effort. To evaluate ZF-MPI, we conducted experiments with the NAS Parallel Benchmarks. Results demonstrate that the proposed run-not-install approach is effective and may substantially increase overall productivity.
Citations: 1
TotalView tips and tricks
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188465
C. Gottbrath, P. Thompson
TotalView is a flexible, scriptable parallel debugger with wide acceptance in the High Performance Computing community. This BOF will be an opportunity for TotalView users to share clever and interesting ways of adapting TotalView to their unique environment, using TotalView to do something unusual, or simply making the day-to-day process of debugging easier. Contact Chris.Gottbrath@etnus.com if you want us to reserve time for you to tell your story, or simply show up at the BOF and step forward.
Citations: 3
IANUS: scientific computing on an FPGA-based architecture
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188633
F. Belletti, M. Cotallo, A. Flor, L. A. Fernández, A. Gordillo, A. Maiorano, F. Mantovani, E. Marinari, V. Martin-Mayor, A. M. Sudupe, D. Navarro, S. P. Gaviro, M. Rossi, J. Ruiz-Lorenzo, S. Schifano, D. Sciretti, A. Tarancón, R. Tripiccione, J. Velasco
IANUS is a massively parallel system based on a 2D array of FPGA-based processors with nearest-neighbor connections. Processors are also directly connected to a central hub attached to a host computer. The prototype, available in October 2006, uses a 4x4 array of Xilinx Virtex4LX160 FPGAs. We map onto the array the computational kernels of scientific applications characterized by regular control flow, an unconventional mix of data-manipulation operations, and limited memory usage. Careful VHDL coding of the kernel algorithms relevant for Monte Carlo simulation of spin-glass systems (our first application) yields impressive performance: single-processor tests concurrently update ~1000 spins, so the average spin-update time is 15 ps. This is ~60 times faster than carefully programmed 3.2 GHz PCs. We plan to build a 256-node system, roughly equivalent to 15000 PCs. This poster describes the architecture, the implementation, and the methodology with which a specific application is mapped onto the system.
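The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is not from the poster: it assumes that the ~1000 concurrent spin updates all complete within one update cycle of a single processor, and that the planned 256-node system scales linearly from the per-processor speedup. The variable names are illustrative.

```python
# Back-of-envelope check of the figures in the IANUS abstract.
# Assumption (not stated in the abstract): all ~1000 concurrent spin
# updates complete within one update cycle of a single processor.
spins_per_cycle = 1000        # "~1000 spins" updated concurrently
avg_update_time_ps = 15       # average spin-update time in picoseconds

# Implied duration of one update cycle, in nanoseconds.
cycle_time_ns = spins_per_cycle * avg_update_time_ps / 1000

# Implied spin-update time on a PC, given the "~60 times faster" claim.
speedup_vs_pc = 60
pc_update_time_ps = avg_update_time_ps * speedup_vs_pc

# Assumption: the planned 256-node system scales linearly.
planned_nodes = 256
pc_equivalents = planned_nodes * speedup_vs_pc

print(cycle_time_ns, pc_update_time_ps, pc_equivalents)
```

Under these assumptions one update cycle lasts about 15 ns, a PC would need roughly 900 ps per spin update, and 256 × 60 = 15360 is consistent with the abstract's "roughly equivalent to 15000 PCs".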
Citations: 7
Realistic visualization for large-scale simulations
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188688
V. Popescu, C. Hoffmann
Simulations that model in detail complex interactions in complex environments are now possible, and visualization is a uniquely powerful analysis tool. However, visualization features of simulation codes are typically intended for scientists designing simulations, providing little support for presenting simulation results in a form suitable for non-experts. Conversely, graphics software and hardware progress has been fueled by applications where precisely abiding by the laws of physics is of secondary importance compared to visual realism. This half-day tutorial presents an approach for state-of-the-art visualization of simulation data based on connecting the worlds of computer simulation and computer animation. Concretely, the attendees will learn how to simplify, import, and integrate the simulation data into the surrounding scene, with examples from our simulation of the September 11 attack on the Pentagon. The resulting realistic visualization enables effective dissemination of simulation results, and helps simulations reach their full potential for high societal impact.
Citations: 0
Large scale drop impact analysis of mobile phone using ADVC on Blue Gene/L
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188503
H. Akiba, T. Ohyama, Y. Shibata, Kiyoshi Yuyama, Yoshikazu Katai, R. Takeuchi, T. Hoshino, S. Yoshimura, H. Noguchi, Manish Gupta, John A. Gunnels, V. Austel, Yogish Sabharwal, R. Garg, S. Kato, T. Kawakami, Satoru Todokoro, Junko Ikeda
Existing commercial finite element analysis (FEA) codes do not exhibit the performance necessary for large scale analysis on parallel computer systems. In this paper, we demonstrate the performance characteristics of a commercial parallel structural analysis code, ADVC, on Blue Gene/L (BG/L). The numerical algorithm of ADVC is described, tuned, and optimized on BG/L, and then a large scale drop impact analysis of a mobile phone is performed. The model of the mobile phone is a nearly full assembly that includes inner structures. The model we have analyzed has 47 million nodal points and 142 million DOFs. This does not seem exceptionally large, but the dynamic impact analysis of a product model of this size, with the contact condition on the entire surface of the outer case, cannot be handled by other CAE systems. Our analysis is an unprecedented attempt in the electronics industry. It took only half a day, 12.1 hours, to analyze about 2.4 milliseconds of impact. The floating point performance obtained was 538 GFLOPS on 4096 nodes of BG/L.
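A few derived quantities help put the abstract's numbers in proportion. The sketch below only rearranges figures quoted above; it assumes that the 2.4 milliseconds refers to simulated physical time and that the 538 GFLOPS is spread evenly across the 4096 nodes. The variable names are illustrative and not from the paper.

```python
# Derived quantities from the figures quoted in the ADVC abstract.
nodal_points = 47e6           # finite element nodal points
dofs = 142e6                  # degrees of freedom
wall_clock_hours = 12.1       # reported run time
simulated_seconds = 2.4e-3    # assumption: 2.4 ms of simulated time
sustained_gflops = 538.0
bgl_nodes = 4096

# Roughly three DOFs per nodal point (e.g. three displacement components).
dofs_per_nodal_point = dofs / nodal_points

# Assumption: performance is spread evenly across the nodes.
gflops_per_node = sustained_gflops / bgl_nodes

# Wall-clock seconds spent per simulated second.
slowdown = wall_clock_hours * 3600 / simulated_seconds

print(round(dofs_per_nodal_point, 2))
print(round(gflops_per_node, 3))
print(round(slowdown / 1e6, 2), "million")
```

The run therefore averaged about 0.13 GFLOPS per node and spent roughly 18 million seconds of wall-clock time per second of simulated impact, which illustrates why explicit dynamic analyses cover only milliseconds.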
Citations: 35