Proceedings of the 2006 ACM/IEEE conference on Supercomputing: latest articles

High performance computing for combustion applications
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188514
G. Staffelbach
The combustion process is at the root of most energy production systems. Understanding combustion is fundamental to exploiting the available natural resources efficiently and to reducing pollutant emissions. The giant leaps made in computer science over the past two decades make it possible to use computer simulation to better understand combustion in real industrial configurations. This presentation discusses and illustrates the application of high performance computing to Computational Fluid Dynamics (CFD). Specific attention is given to the Large Eddy Simulation (LES) approach for industrial energy production configurations, ranging from aeronautical gas turbine engines (including those of helicopters and commercial airliners) and piston engines to stationary gas turbine engines used in large-scale electricity production systems.
Citations: 12
Enabling next generation supercomputing clusters
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188732
M. Vrazel
Innovative cabling solutions will be a key factor in realizing next generation supercomputing clusters. Demands for higher data rates, larger clusters, and increased density cannot be optimally addressed with existing twin-axial cabling solutions. Quellan, Inc.'s family of low power, low latency Lane Manager ICs provides a 2x reach extension over standard cable for single lane data rates up to 6.25 Gb/s. In addition, the Lane Managers can facilitate increased density and improved airflow through clusters by enabling narrow gauge cables to operate at maximum lengths comparable to those of standard 24AWG cabling. Integrated higher layer features ensure compliance with a variety of current and emerging standards such as InfiniBand, PCI Express, and CX-4. This presentation will highlight the performance and advanced feature set of the Lane Manager family while also detailing the benefits of this technology for addressing various signal integrity challenges inherent to the cabling infrastructure of supercomputing clusters.
Citations: 0
Beyond the beyond and the extremes of computing
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188524
T. Sterling
Twenty-five years ago supercomputing was dominated by vector processors and emergent SIMD array processors clocked at tens of megahertz. Today, responding to dramatic advances in semiconductor device fabrication technologies, the world of supercomputing is dominated by multi-core based MPP and commodity cluster systems clocked at gigahertz rates. Twenty-five years in the future, the technology landscape will again have experienced dramatic change with the flat-lining of Moore's Law, the realization of nanoscale devices, and the emergence of potentially alien technologies, architectures, and paradigms. If Moore's Law were to continue to progress as before, we would be deploying systems approaching 100 Exaflops with clock rates nearing a terahertz. But by then, power constraints, quantum effects, or our inability to exploit trillion-way program parallelism may have forced us into entirely new realms of processing. This presentation will consider the range of alternative technologies, architectures, and methods that may drive the extremes of computing beyond the incremental steps of the current era.
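The extrapolation in this abstract can be sanity-checked with a little arithmetic. The sketch below computes the doubling time implied by a given growth factor over the 25-year horizon. The gigahertz-to-terahertz step and the 100 Exaflops target come from the abstract; the 2006 starting point of about 280 teraflops is an assumption introduced here for illustration and is not stated in the abstract.

```python
import math

def implied_doubling_time(years, growth_factor):
    """Years per doubling needed to grow by growth_factor within the span."""
    return years / math.log2(growth_factor)

SPAN_YEARS = 25                  # horizon named in the abstract

# Clock rate: gigahertz today, "nearing a Terahertz" in the projection.
clock_doubling = implied_doubling_time(SPAN_YEARS, 1e12 / 1e9)

# Performance: 100 Exaflops target from the abstract.
ASSUMED_2006_FLOPS = 280e12      # assumption, not a figure from the abstract
flops_doubling = implied_doubling_time(SPAN_YEARS, 100e18 / ASSUMED_2006_FLOPS)

print(f"clock rate doubles every {clock_doubling:.2f} years")
print(f"performance doubles every {flops_doubling:.2f} years")
```

Under these assumptions the clock rate would have to double about every 2.5 years and performance roughly every 16 months; a different starting figure shifts the second number.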
Citations: 0
Secure file sharing
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188707
N. Fujita, H. Ohkawa
SRFS on Ether adds an Ethernet interface to the Shared Rapid File System (SRFS), which is currently used by the HPC system as a distributed file system between nodes. It can be used like NFS and solves the problem of data coherency in high-speed data transmission in a broadband environment, which NFS does not. Moreover, the TCP/IP parameters in the OS do not need to be adjusted to improve speed, and no special hardware is needed, unlike SAN construction with iFCP and others. For additional speed, it stripes data streams automatically (default maximum of 8 streams) and switches protocols between TCP and UDP based on I/O size. In this bandwidth challenge, we demonstrate security using a host-to-host IPSec connection between Tampa and Tokyo. To show performance, we used a hardware IPSec accelerator and tuned TCP/IP with SRFS on Ether's.
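The striping and protocol switching mentioned in the abstract can be illustrated with a minimal sketch. This is not SRFS code: the function names, the 64 KiB cutoff, and the choice of UDP for small transfers are all hypothetical, since the abstract gives only the default maximum of 8 streams and says the protocol depends on I/O size.

```python
MAX_STREAMS = 8               # default maximum stated in the abstract
UDP_BELOW_BYTES = 64 * 1024   # hypothetical cutoff, not from the abstract

def choose_protocol(io_size):
    """Pick a transport from the I/O size (direction of the switch is assumed)."""
    return "UDP" if io_size < UDP_BELOW_BYTES else "TCP"

def stripe(data, chunk_size, streams=MAX_STREAMS):
    """Deal fixed-size chunks round-robin across the streams."""
    lanes = [[] for _ in range(streams)]
    for k, start in enumerate(range(0, len(data), chunk_size)):
        lanes[k % streams].append(data[start:start + chunk_size])
    return lanes

def reassemble(lanes):
    """Interleave the lanes back into the original byte order."""
    rounds = max(len(lane) for lane in lanes)
    parts = [lane[r] for r in range(rounds) for lane in lanes if r < len(lane)]
    return b"".join(parts)

payload = bytes(range(256)) * 10
lanes = stripe(payload, chunk_size=100)
print(len(lanes), "streams,", choose_protocol(len(payload)), "for this transfer")
print("round trip intact:", reassemble(lanes) == payload)
```

Round-robin placement keeps reassembly trivial because chunk k always sits in lane k mod 8; a real implementation would also need sequencing and loss handling on the wire.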
Citations: 0
The meeting list tool - a shared application for sharing dynamic information in meetings
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188786
Adam C. Carter
I will describe and demonstrate the "Meeting List Tool", a shared application built with the Access Grid Toolkit for use in AG meetings. At present, the two most common ways of sharing text-based information during an AG meeting are shared presentations and the chat tool built into the Venue Client. The former is ideal for static content which is known in advance. The latter is ideal for sharing short pieces of information, such as a URL. What is apparently missing is an application for which data can be prepared in advance, displayed and quickly manipulated during a meeting, and kept at the close of the meeting by all the participants in the collaboration. This application is designed to fill this gap. The current version provides a list of items that can be highlighted, added, deleted, edited, and re-ordered, with the changes being propagated to all instances of the tool.
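The behaviour described at the end of the abstract, a list whose edits reach every running instance, can be sketched as a small replicated data structure. The sketch below does not use the Access Grid Toolkit; the class names `SharedList` and `Hub` and the tuple-based operation format are hypothetical and chosen only to show the idea of applying the same ordered operations everywhere.

```python
class SharedList:
    """One participant's copy of the meeting list."""

    def __init__(self):
        self.items = []  # each item: {"text": str, "highlighted": bool}

    def apply(self, op):
        kind, *args = op
        if kind == "add":
            self.items.append({"text": args[0], "highlighted": False})
        elif kind == "delete":
            del self.items[args[0]]
        elif kind == "edit":
            self.items[args[0]]["text"] = args[1]
        elif kind == "highlight":
            self.items[args[0]]["highlighted"] = True
        elif kind == "move":
            self.items.insert(args[1], self.items.pop(args[0]))
        else:
            raise ValueError(f"unknown operation: {kind}")


class Hub:
    """Delivers every operation to all joined instances in the same order."""

    def __init__(self):
        self.instances = []

    def join(self):
        instance = SharedList()
        self.instances.append(instance)
        return instance

    def broadcast(self, op):
        for instance in self.instances:
            instance.apply(op)


hub = Hub()
alice, bob = hub.join(), hub.join()
for op in [("add", "Agenda"), ("add", "Actions"), ("add", "AOB"),
           ("move", 2, 0), ("edit", 1, "Agenda v2"),
           ("highlight", 0), ("delete", 2)]:
    hub.broadcast(op)
print([item["text"] for item in alice.items], alice.items == bob.items)
```

Because every copy applies the same operations in the same order, the copies stay identical. A participant who joins late would also need the current state sent to them, which this sketch leaves out.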
Citations: 0
Zero-Force MPI: toward tractable toolkits for high performance computing
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188595
M. Slawinska, Dawid Kurzyniec, Jaroslaw Slawinski, V. Sunderam
Shared HPC platforms continue to require substantial effort for software installation and management, often necessitating manual intervention and tedious procedures. We propose a novel model of resource sharing that shifts resource virtualization and aggregation responsibilities to client-side software, thus reducing the burdens on resource providers. The Zero-Force MPI toolkit automates the installation, build, run, and post-processing stages of HPC applications, thus allowing application scientists to focus on using resources instead of managing them. Through a provided console, MPI runtime systems, support libraries, application executables, and needed data files can be soft-installed across distributed resources with just a few commands. Built-in data synchronization capabilities simplify common HPC development tasks, saving end-user time and effort. To evaluate ZF-MPI, we conducted experiments with the NAS Parallel Benchmarks. Results demonstrate that the proposed run-not-install approach is effective and may substantially increase overall productivity.
Citations: 1
TotalView tips and tricks
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188465
C. Gottbrath, P. Thompson
TotalView is a flexible, scriptable parallel debugger with wide acceptance in the High Performance Computing community. This BOF will be an opportunity for TotalView users to share clever and interesting ways of adapting TotalView to their unique environment, using TotalView to do something unusual, or simply making the day-to-day process of debugging easier. Contact Chris.Gottbrath@etnus.com if you want us to reserve time for you to tell your story, or simply show up at the BOF and step forward.
Citations: 3
IANUS: scientific computing on an FPGA-based architecture
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188633
F. Belletti, M. Cotallo, A. Flor, L. A. Fernández, A. Gordillo, A. Maiorano, F. Mantovani, E. Marinari, V. Martin-Mayor, A. M. Sudupe, D. Navarro, S. P. Gaviro, M. Rossi, J. Ruiz-Lorenzo, S. Schifano, D. Sciretti, A. Tarancón, R. Tripiccione, J. Velasco
IANUS is a massively parallel system based on a 2D array of FPGA-based processors with nearest-neighbor connections. Processors are also directly connected to a central hub attached to a host computer. The prototype, available in October 2006, uses a 4x4 array of Xilinx Virtex4LX160 FPGAs. We map onto the array the computational kernels of scientific applications characterized by regular control flow, an unconventional mix of data-manipulation operations, and limited memory usage. Careful VHDL coding of the kernel algorithms relevant for Monte Carlo simulation of spin-glass systems (our first application) yields impressive performance: in single-processor tests, ~1000 spins are updated concurrently, so the average spin-update time is 15 ps. This is ~60 times faster than carefully programmed 3.2 GHz PCs. We plan to build a 256-node system, roughly equivalent to 15000 PCs. This poster describes the architecture, the implementation, and the methodology with which a specific application is mapped onto the system.
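The performance figures in this abstract are linked by simple arithmetic, which the sketch below spells out. All four input numbers are taken from the abstract; the interpretation is an assumption made here, namely that the ~1000 concurrent updates happen in one update step of a single processor and that each node of the planned system delivers the same ~60x speedup.

```python
SPINS_PER_STEP = 1000     # "~1000 spins" updated concurrently (abstract)
AVG_UPDATE_S = 15e-12     # 15 ps average spin-update time (abstract)
SPEEDUP_VS_PC = 60        # "~60 times faster" than a 3.2 GHz PC (abstract)
PLANNED_NODES = 256       # planned system size (abstract)

step_s = SPINS_PER_STEP * AVG_UPDATE_S           # time for one parallel update step
step_rate_hz = 1.0 / step_s                      # implied update-step rate
pc_update_s = AVG_UPDATE_S * SPEEDUP_VS_PC       # implied per-spin time on one PC
pc_equivalent = PLANNED_NODES * SPEEDUP_VS_PC    # assumes 60x holds for every node

print(f"update step: {step_s * 1e9:.1f} ns (about {step_rate_hz / 1e6:.0f} MHz)")
print(f"per-spin time on a PC: {pc_update_s * 1e9:.2f} ns")
print(f"PC equivalent of {PLANNED_NODES} nodes: {pc_equivalent}")
```

Read this way, the 15 ps figure is an average over the spins updated in parallel rather than the latency of a single update, and 256 nodes at 60x each gives 15360, in line with the "roughly 15000" quoted.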
Citations: 7
Realistic visualization for large-scale simulations
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188688
V. Popescu, C. Hoffmann
Simulations that model in detail complex interactions in complex environments are now possible, and visualization is a uniquely powerful analysis tool. However, visualization features of simulation codes are typically intended for scientists designing simulations, providing little support for presenting simulation results in a form suitable for non-experts. Conversely, graphics software and hardware progress has been fueled by applications where precisely abiding by the laws of physics is of secondary importance compared to visual realism. This half-day tutorial presents an approach for state-of-the-art visualization of simulation data based on connecting the worlds of computer simulation and computer animation. Concretely, the attendees will learn how to simplify, import, and integrate the simulation data into the surrounding scene, with examples from our simulation of the September 11 Attack on the Pentagon. The resulting realistic visualization enables effective dissemination of simulation results, and helps simulations reach their full potential for high societal impact.
Citations: 0
Large scale drop impact analysis of mobile phone using ADVC on Blue Gene/L
Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188503
H. Akiba, T. Ohyama, Y. Shibata, Kiyoshi Yuyama, Yoshikazu Katai, R. Takeuchi, T. Hoshino, S. Yoshimura, H. Noguchi, Manish Gupta, John A. Gunnels, V. Austel, Yogish Sabharwal, R. Garg, S. Kato, T. Kawakami, Satoru Todokoro, Junko Ikeda
Existing commercial finite element analysis (FEA) codes do not exhibit the performance necessary for large scale analysis on parallel computer systems. In this paper, we demonstrate the performance characteristics of a commercial parallel structural analysis code, ADVC, on Blue Gene/L (BG/L). The numerical algorithm of ADVC is described, tuned, and optimized on BG/L, and then a large scale drop impact analysis of a mobile phone is performed. The model of the mobile phone is a nearly-full assembly that includes inner structures. The model we analyzed has 47 million nodal points and 142 million DOFs. This does not seem exceptionally large, but a dynamic impact analysis of a product model of this size, with the contact condition applied over the entire surface of the outer case, cannot be handled by other CAE systems. Our analysis is an unprecedented attempt in the electronics industry. It took only half a day, 12.1 hours, to analyze about 2.4 milliseconds of the drop impact. The floating point performance obtained was 538 GFLOPS on 4096 nodes of BG/L.
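A few derived quantities make the scale of this run easier to read. The sketch below uses only the numbers quoted in the abstract; the total operation count additionally assumes that the 538 GFLOPS rate was sustained over the whole 12.1 hours, which the abstract does not state.

```python
NODAL_POINTS = 47e6        # nodal points in the model (abstract)
DOFS = 142e6               # degrees of freedom (abstract)
WALL_HOURS = 12.1          # wall-clock time of the analysis (abstract)
SIMULATED_S = 2.4e-3       # simulated impact time, about 2.4 ms (abstract)
SUSTAINED_FLOPS = 538e9    # 538 GFLOPS (abstract)
BGL_NODES = 4096           # BG/L nodes used (abstract)

dofs_per_point = DOFS / NODAL_POINTS            # about 3 per nodal point
flops_per_node = SUSTAINED_FLOPS / BGL_NODES    # per-node share of the rate
wall_s = WALL_HOURS * 3600
wall_per_simulated = wall_s / SIMULATED_S       # wall seconds per simulated second
total_flop = SUSTAINED_FLOPS * wall_s           # assumes the rate held throughout

print(f"DOFs per nodal point: {dofs_per_point:.2f}")
print(f"per-node rate: {flops_per_node / 1e6:.0f} MFLOPS")
print(f"wall time per simulated second: {wall_per_simulated:.3g} s")
print(f"total operations (if rate sustained): {total_flop:.3g}")
```

The ratio of about three DOFs per nodal point is what one would expect if each node carries three displacement components, though the abstract does not describe the element formulation.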
Citations: 35