
2015 International Conference on High Performance Computing & Simulation (HPCS): Latest Publications

Scalable correlation-aware virtual machine consolidation using two-phase clustering
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237045
Xi Li, Anthony Ventresque, Jesús Omana Iglesias, John Murphy
Server consolidation is the most common and effective method to save energy and increase resource utilization in data centers, and virtual machine (VM) placement is the usual way of achieving server consolidation. VM placement is, however, challenging given the scale of today's IT infrastructures and the risk of resource contention among co-located VMs after consolidation. Therefore, the correlation among VMs to be co-located needs to be considered. However, existing solutions do not address the scalability issue that arises once the number of VMs grows to an order of magnitude at which calculating the correlation between each pair of VMs becomes unrealistic. In this paper, we propose a correlation-aware VM consolidation solution, ScalCCon, which uses a novel two-phase clustering scheme to address the aforementioned scalability problem. We propose and demonstrate the benefits of the two-phase clustering scheme in comparison to solutions using one-phase clustering (up to an 84% reduction in execution time when 17,446 VMs are considered). Moreover, our solution manages to reduce the number of physical machines (PMs) required, as well as the number of performance violations, compared to existing correlation-based approaches.
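The abstract does not spell out ScalCCon's clustering details, but the scalability argument can be illustrated with a minimal sketch: first bucket VMs by a cheap per-VM statistic (here, mean utilization), and only compute pairwise correlations inside each bucket, avoiding the full O(n^2) correlation matrix. The grouping rule, thresholds, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def two_phase_grouping(util, n_coarse=10, corr_threshold=0.5):
    """Sketch of two-phase clustering for correlation-aware VM consolidation.

    util: (n_vms, n_samples) array of CPU-utilization time series.
    Phase 1 buckets VMs by mean utilization (no pairwise work at all);
    phase 2 computes pairwise correlations only inside each bucket and
    greedily groups weakly correlated VMs, which are safer to co-locate.
    """
    means = util.mean(axis=1)
    edges = np.linspace(means.min(), means.max(), n_coarse + 1)
    buckets = np.clip(np.digitize(means, edges) - 1, 0, n_coarse - 1)

    groups = []
    for b in range(n_coarse):
        members = np.where(buckets == b)[0]
        if members.size == 0:
            continue
        if members.size == 1:
            groups.append([int(members[0])])
            continue
        corr = np.corrcoef(util[members])      # pairwise work limited to this bucket
        pos = {int(vm): k for k, vm in enumerate(members)}
        unassigned = [int(vm) for vm in members]
        while unassigned:
            seed = unassigned.pop(0)
            group = [seed]
            for vm in list(unassigned):
                # Greedy rule: admit a VM if it is weakly correlated with the seed.
                if corr[pos[seed], pos[vm]] < corr_threshold:
                    group.append(vm)
                    unassigned.remove(vm)
            groups.append(group)
    return groups

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.random((20, 48))                # 20 VMs, 48 utilization samples each
    print(two_phase_grouping(demo))
```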
Citations: 18
On the threats to Cloud-based online service users (and what we can do about them)
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237026
G. Stringhini
In this paper we highlighted some threats that affect the users of cloud-based online services. We then presented a handful of systems that we developed over the past few years to detect and block malicious activity on online services. Although the threat landscape is constantly evolving, we believe that such techniques constitute a good foundation for increasing the security of online service users.
Citations: 0
Measuring cells in phytoplankton images
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237085
M. Mirto, Laura Conte, G. Aloisio, C. Distante, Pietro Vecchio, Alessandra De Giovanni
Phytoplankton is a quality element for determining the ecological status of transitional water ecosystems. In routine analysis, bio-volume and surface area of phytoplankton are the most studied morphometric descriptors. Bio-volume can be estimated by approximating each alga with a similar three-dimensional geometric form and determining its volume, measuring the linear dimensions required for the calculation on images acquired with an inverted microscope. Software such as LUCIA-G (Laboratory Imaging) automatically determines only the linear dimensions of simple forms, such as the circle or ellipse, approximated to a given alga, whereas complex forms require the intervention of an operator, who selects the start and end points of the linear dimensions, with an obvious introduction of human error. In this paper, we propose a novel methodology that detects phytoplankton algae and, by measuring the linear dimensions of 42 geometric forms, automatically computes their area and bio-volume; it has been implemented in a new image-analysis software tool named LUISA.
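The abstract does not list the 42 shape formulas, but the core idea of estimating bio-volume from linear dimensions measured on the image can be shown for two standard solids; treating a sphere and a prolate spheroid as representative of the paper's shape catalogue is an assumption made only for illustration.

```python
import math

def sphere_volume(diameter_um):
    """Bio-volume (um^3) of a cell approximated by a sphere of given diameter."""
    r = diameter_um / 2.0
    return (4.0 / 3.0) * math.pi * r ** 3

def prolate_spheroid_volume(length_um, width_um):
    """Bio-volume (um^3) of a cell approximated by a prolate spheroid:
    V = (pi / 6) * width^2 * length, with both dimensions measured on the image."""
    return (math.pi / 6.0) * width_um ** 2 * length_um

def prolate_spheroid_surface(length_um, width_um):
    """Surface area (um^2) of a prolate spheroid with semi-axes a = length/2, b = width/2."""
    a, b = length_um / 2.0, width_um / 2.0
    if a <= b:                                  # degenerates to a sphere
        return 4.0 * math.pi * a * a
    e = math.sqrt(1.0 - (b * b) / (a * a))
    return 2.0 * math.pi * b * b * (1.0 + (a / b) * math.asin(e) / e)

if __name__ == "__main__":
    # Example: a cell imaged as 18 um long and 7 um wide.
    print(f"sphere (d=10 um):        {sphere_volume(10):.1f} um^3")
    print(f"spheroid bio-volume:     {prolate_spheroid_volume(18, 7):.1f} um^3")
    print(f"spheroid surface area:   {prolate_spheroid_surface(18, 7):.1f} um^2")
```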
Citations: 0
A performance analysis of precopy, postcopy and hybrid live VM migration algorithms in scientific cloud computing environment
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237044
Syed Asif Raza Shah, A. Jaikar, S. Noh
Virtualization technology plays a vital role in cloud computing. One of the core features of virtualization technology is live virtual machine migration. Live migration is the process of transferring the complete state of a virtual machine between physical hosts without any service interruption. This capability is widely used for system maintenance, load balancing, energy efficiency, reconfiguration, and fault tolerance. Live migration has been extensively studied for commercial workloads, and ongoing research mainly focuses on improving the performance of well-known live migration algorithms. Today, scientific communities are actively considering taking advantage of cloud computing for scientific workloads. In this paper, we analyze the performance of the well-known precopy, postcopy, and hybrid live migration algorithms and examine the migration times of VMs running high-throughput computing (HTC) jobs in a scientific cloud computing environment. The results of our research not only show a performance comparison of live migration algorithms but will also be helpful when selecting a live migration algorithm in a scientific cloud computing environment.
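As background on why migration times differ between the algorithms, a back-of-the-envelope model of iterative precopy is sketched below: each round re-sends the pages dirtied during the previous round, so migration time grows as the dirty rate approaches the available bandwidth. The constant dirty rate, the bandwidth figure, and the stop-and-copy threshold are assumptions chosen for illustration, not the paper's measurement setup.

```python
def precopy_migration_time(mem_mb, dirty_rate_mbps, bw_mbps,
                           stop_copy_threshold_mb=50, max_rounds=30):
    """Rough model of iterative precopy live migration.

    Round 0 copies all memory; each later round re-copies the pages dirtied
    while the previous round was in flight. Migration switches to the final
    stop-and-copy phase once the remaining data is small enough.
    Returns (total_time_s, downtime_s, precopy_rounds). Assumes a constant
    dirty rate and bandwidth, which real workloads do not obey exactly.
    """
    if dirty_rate_mbps >= bw_mbps:
        raise ValueError("dirty rate >= bandwidth: precopy never converges in this model")
    total_time = 0.0
    to_copy = float(mem_mb)
    rounds = 0
    while to_copy > stop_copy_threshold_mb and rounds < max_rounds:
        round_time = to_copy / bw_mbps
        total_time += round_time
        to_copy = dirty_rate_mbps * round_time     # pages dirtied during this round
        rounds += 1
    downtime = to_copy / bw_mbps                   # final stop-and-copy with the VM paused
    return total_time + downtime, downtime, rounds

if __name__ == "__main__":
    # 4 GB VM, 200 MB/s page-dirtying rate, 1000 MB/s migration bandwidth.
    total, downtime, rounds = precopy_migration_time(4096, 200, 1000)
    print(f"total ~{total:.1f} s, downtime ~{downtime * 1000:.0f} ms after {rounds} precopy rounds")
```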
Citations: 20
Big data exploration with faceted browsing
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237087
Giovanni Simonini, Song Zhu
Big data analysis now drives nearly every aspect of modern society, from manufacturing and retail, through mobile and financial services, to the life sciences and physical sciences. The ability to continue to use big data to make new connections and discoveries will help to drive the breakthroughs of tomorrow. One of the most valuable means of making sense of big data, and thus making it more approachable to most people, is data visualization. Data visualization can guide decision-making and become a tool to convey information critical in all data analysis. However, to be actually actionable, data visualizations should contain the right amount of interactivity. They have to be well designed, easy to use, understandable, meaningful, and approachable. In this article we present a new approach to visualizing huge amounts of data, based on a Bayesian suggestion algorithm and the widely used enterprise search platform Solr. We demonstrate how the proposed Bayesian suggestion algorithm becomes a key ingredient in a big data scenario, where a query can generate so many results that the user is confused. Thus, the selection of the best results, together with the result path chosen by the user by means of multi-faceted querying and faceted navigation, can be very useful.
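The paper's own suggestion algorithm is not detailed here, so the following is only a minimal sketch of the general idea of Bayesian facet ranking: score each candidate facet value by the smoothed probability that a document matching the current selection also carries that value, and suggest the highest-scoring values for drill-down. The scoring rule, smoothing, and data format are assumptions and are not tied to the Solr integration described in the paper.

```python
def suggest_facets(docs, selected, candidates, alpha=1.0, top_k=3):
    """Rank candidate facet values for drill-down, naive-Bayes style.

    docs:       list of dicts mapping facet field -> value.
    selected:   facet constraints the user has already chosen.
    candidates: list of (field, value) pairs that could be suggested next.
    Score is a smoothed estimate of P(candidate | current selection): how
    often documents matching the current selection also carry the value.
    """
    matching = [d for d in docs
                if all(d.get(f) == v for f, v in selected.items())]
    n = len(matching)
    scored = []
    for field, value in candidates:
        hits = sum(1 for d in matching if d.get(field) == value)
        score = (hits + alpha) / (n + alpha * len(candidates))   # Laplace smoothing
        scored.append(((field, value), score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

if __name__ == "__main__":
    docs = [
        {"year": "2015", "topic": "cloud", "type": "paper"},
        {"year": "2015", "topic": "cloud", "type": "poster"},
        {"year": "2015", "topic": "hpc", "type": "paper"},
        {"year": "2014", "topic": "cloud", "type": "paper"},
    ]
    print(suggest_facets(docs, {"year": "2015"},
                         [("topic", "cloud"), ("topic", "hpc"), ("type", "poster")]))
```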
Citations: 16
A four-decomposition strategies for hierarchically modeling combinatorial optimization problems: framework, conditions and relations
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237081
M. Chaieb, Jaber Jemai, K. Mellouli
We address the problem of modeling combinatorial optimization problems (COPs). COPs are generally complex problems to solve, so a good modeling step is fundamental to making the solution easier. Our approach orients researchers to choose the best modeling strategy from the beginning, avoiding problems in the solving process. This paper aims at proposing a new approach for dealing with hard COPs, particularly when the decomposition process leads to some well-known, canonical optimization sub-problems. We aim to draw a clear framework that will help to model hierarchical optimization problems. The framework is composed of four decomposition strategies: objective-based decomposition, constraint-based decomposition, semantic decomposition, and data partitioning. For each strategy, we present supporting examples from the literature where it was applied. However, not all combinatorial problems can benefit from hierarchical modeling; only particular problems can be modeled as hierarchical optimization problems. Thus, we propose a set of decomposability conditions for decomposing COPs. Furthermore, we define the types of relationships between the obtained sub-problems and how partial solutions can be merged to obtain the final solution.
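The four strategies are described abstractly; as a toy instance of the data-partitioning strategy and of merging partial solutions, the sketch below splits a travelling-salesman-style problem into vertical strips, solves each strip with a simple nearest-neighbour heuristic, and concatenates the partial tours. This example is purely illustrative and is not taken from the paper.

```python
import math
import random

def tour_length(points, tour):
    """Total length of a closed tour visiting the given point indices in order."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbour_tour(points, indices):
    """Sub-problem solver: greedy nearest-neighbour tour over one partition."""
    remaining = list(indices)
    tour = [remaining.pop(0)]
    while remaining:
        last = points[tour[-1]]
        nxt = min(remaining, key=lambda i: math.dist(last, points[i]))
        remaining.remove(nxt)
        tour.append(nxt)
    return tour

def solve_by_data_partitioning(points, n_strips=4):
    """Data-partitioning decomposition: split the customers into vertical strips,
    solve each strip independently, then merge the partial tours in strip order."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    strip_size = math.ceil(len(points) / n_strips)
    merged = []
    for s in range(n_strips):
        strip = order[s * strip_size:(s + 1) * strip_size]
        if strip:
            merged.extend(nearest_neighbour_tour(points, strip))
    return merged

if __name__ == "__main__":
    random.seed(1)
    pts = [(random.random(), random.random()) for _ in range(40)]
    tour = solve_by_data_partitioning(pts)
    print(f"decomposed tour length: {tour_length(pts, tour):.3f}")
```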
Citations: 1
Performance evaluation of Optical Packet Switches on high performance applications
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237062
Hugo Meyer, J. Sancho, W. Miao, H. Dorren, N. Calabretta, Montse Farreras
This paper analyzes the performance impact of Optical Packet Switches (OPS) on parallel HPC applications. Because these devices cannot buffer light, when packets collide while contending for the same output port of the switch, only one packet can proceed and the others are dropped. The analysis focuses on the negative impact of packet collisions in the OPS and the subsequent re-transmissions of dropped packets. To carry out this analysis we have developed a system simulator that mimics the behavior of real HPC application traffic and of optical network devices such as the OPS. By using real application traces we have analyzed how message re-transmissions can affect parallel executions. In addition, we have developed a methodology that processes application traces and determines packet concurrency. The concurrency evaluates the number of simultaneous packets that applications could transmit in the network. Results show that some applications can benefit from the advantages of OPS technology: among the applications analyzed, these are the ones that show less than 1% packet concurrency, whereas the performance of other applications could be impacted by up to 65%. This impact mostly depends on application traffic behavior, which is successfully characterized by our proposed methodology.
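The packet-concurrency metric is only described at a high level, so the sketch below shows one plausible way such a measure could be computed from a trace: bucket packet send times into small windows per output port and report the fraction of packets that share a window with at least one competitor. The window size, trace format, and function name are assumptions, not the paper's methodology.

```python
from collections import defaultdict

def packet_concurrency(trace, window_us=1.0):
    """Fraction of packets that contend for the same output port within the
    same time window: a rough proxy for the collision (and hence drop and
    re-transmission) risk in a bufferless optical packet switch.

    trace: list of (send_time_us, output_port) tuples.
    """
    buckets = defaultdict(int)
    for t, port in trace:
        buckets[(int(t // window_us), port)] += 1
    contended = sum(n for n in buckets.values() if n > 1)
    return contended / len(trace) if trace else 0.0

if __name__ == "__main__":
    # Two of the three packets heading for port 2 fall in the same 1 us window.
    demo = [(0.2, 2), (0.7, 2), (1.4, 2), (5.0, 1), (9.3, 3)]
    print(f"packet concurrency: {packet_concurrency(demo):.2f}")   # 2 of 5 -> 0.40
```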
Citations: 5
Techniques to improve the scalability of collective checkpointing at large scale
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237113
Bogdan Nicolae
Scientific and data-intensive computing have matured over the last couple of years in all fields of science and industry. Their rapid increase in complexity and scale has prompted ongoing efforts dedicated to reaching exascale infrastructure capability by the end of the decade. However, advances in this context are not homogeneous: I/O capabilities in terms of networking and storage are lagging behind computational power and are often considered a major limitation that persists even at petascale [1]. A particularly difficult challenge in this context is posed by collective I/O access patterns (which we henceforth refer to as collective checkpointing), where all processes simultaneously dump large amounts of related data to persistent storage. This pattern is often exhibited by large-scale, bulk-synchronous applications in a variety of circumstances, e.g., when they use checkpoint-restart fault tolerance techniques to save intermediate computational states at regular time intervals [2] or when intermediate, globally synchronized results are needed during the lifetime of the computation (e.g., to understand how a simulation progresses during key phases). Under such circumstances, a decoupled storage system (e.g., a parallel file system such as GPFS [3] or a specialized storage system such as BlobSeer [4]) does not provide sufficient I/O bandwidth to handle the explosion of data sizes: for example, Jones et al. [5] predict dump times on the order of several hours. In order to overcome the I/O bandwidth limitation, one potential solution is to equip the compute nodes with local storage (e.g., HDDs, SSDs, NVMs) or to use I/O forwarding nodes. Using this approach, a large part of the data can be dumped locally, which completely avoids the need to consume and compete for the I/O bandwidth of a decoupled storage system. However, this is not without drawbacks: the local storage devices or I/O forwarding nodes are prone to failures, and as such the data they hold is volatile. Thus, a popular approach in practice is to wait until the local dump has finished, then let the application continue while the checkpoints are in turn dumped to a parallel file system in the background. Such a straightforward solution can be effective at hiding the overhead incurred due to I/O bandwidth limitations, but this is not necessarily the case: there may not be enough time to fully flush everything to the parallel file system before the next collective checkpoint request is issued. In fact, this is a likely scenario at growing scale: as the failure rate increases, checkpoints need to be taken at smaller intervals to compensate for this effect. Furthermore, a smaller checkpoint interval also means local dumps are more frequent, so their own overhead becomes significant.
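As a quantitative aside on why the checkpoint interval shrinks as the failure rate grows, Young's classic approximation of the optimal checkpoint interval, T_opt ~ sqrt(2 * C * MTBF), can be evaluated for a few system sizes; the per-node MTBF and checkpoint cost below are assumed figures, and using this formula here is an illustration rather than part of the cited work.

```python
import math

def young_interval(checkpoint_cost_s, mtbf_s):
    """Young's approximation of the optimal checkpoint interval:
    T_opt ~= sqrt(2 * C * MTBF), where C is the time to write one checkpoint."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

if __name__ == "__main__":
    cost = 300.0                                     # assumed: 5 min per collective checkpoint
    per_node_mtbf = 5 * 365 * 24 * 3600.0            # assumed: 5-year MTBF per node
    for nodes in (1_000, 10_000, 100_000):
        system_mtbf = per_node_mtbf / nodes
        t = young_interval(cost, system_mtbf)
        print(f"{nodes:>7} nodes: system MTBF {system_mtbf / 3600:5.1f} h, "
              f"checkpoint every ~{t / 60:.0f} min")
```

With these assumed numbers the interval drops from roughly 160 minutes at 1,000 nodes to about 16 minutes at 100,000 nodes, which is why hiding the background flush behind computation becomes increasingly difficult at scale.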
Citations: 0
Analyzing available routing engines for InfiniBand-based clusters with Dragonfly topology
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237036
G. Mathey, P. Yébenes, P. García, F. Quiles, J. Escudero-Sahuquillo
Dragonfly topologies have gathered great interest as one of the most promising interconnection patterns for the networks at the core of HPC clusters. In this paper, we configure several simulated InfiniBand-based clusters with a Dragonfly topology, and we analyze, in these configurations, the performance of the routing engines included in the latest release of the InfiniBand Subnet Manager (OpenSM v3.3.19).
Citations: 2
Opportunistic vehicular networking: Large-scale bus movement traces as base for network analysis
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237118
M. Doering, L. Wolf
In many road traffic scenarios the ability to communicate among traffic participants is very helpful. Therefore, research and development in this field has been ongoing in academia and industry for many years and continues in several directions. Some examples are Vehicular Ad-hoc Networks (VANETs), e.g., using technologies like IEEE 802.11p, and vehicles communicating with backend systems, e.g., using 2/3/4G cellular networks. In opportunistic vehicular networks, vehicles may exchange more than data for immediate use, such as Cooperative Awareness Messages (CAMs) in ETSI Intelligent Transport Systems (ITS). Instead, a more general type of network might be set up, also for application scenarios beyond aspects directly related to road traffic. For instance, buses of public transportation systems could collect data from the field or distribute data among several buses. Thus, buses could become an important part of smart cities or Internet of Things (IoT) application scenarios. Important questions are then, e.g., how much data could be distributed in such a bus-based opportunistic network, or how often it is possible to exchange data between buses. Usually, buses in urban public transport systems follow well-planned but nevertheless highly dynamic schedules and trajectories. Thus, traffic conditions have a significant and complex influence on bus mobility, causing very characteristic movement properties that are considerably distinct from those of other road vehicles. Understanding these special characteristics is essential for the design and evaluation of opportunistic vehicular communication networks. For this purpose we inspect two large-scale bus movement traces and describe the available data and metadata. Moreover, we analyze and compare vehicle density, speed, update intervals, and characteristics that are specific to public transport. Especially in large cities, but even in smaller ones, if many devices like vehicles, sensors, and various other IoT things are part of such a network, high-performance computing and simulation approaches are necessary to study, analyse, design, use, and maintain such a system.
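A central quantity for bus-based opportunistic networking is how often, and for how long, two buses are within radio range of each other; the sketch below derives such contact windows from sampled position traces. The trace format, the 100 m radio range, and the 30 s sampling interval are assumptions chosen for illustration, not properties of the traces studied in the paper.

```python
import math

def contact_opportunities(positions, radio_range_m=100.0, step_s=30.0):
    """Derive pairwise bus contacts from sampled position traces.

    positions: dict bus_id -> list of (x_m, y_m) samples taken every `step_s`
               seconds, with all lists aligned to the same sampling instants.
    Returns a list of (bus_a, bus_b, start_s, duration_s) contact windows.
    """
    buses = sorted(positions)
    n_samples = min(len(positions[b]) for b in buses)
    contacts = []
    for i, a in enumerate(buses):
        for b in buses[i + 1:]:
            start = None
            for k in range(n_samples):
                in_range = math.dist(positions[a][k], positions[b][k]) <= radio_range_m
                if in_range and start is None:
                    start = k
                elif not in_range and start is not None:
                    contacts.append((a, b, start * step_s, (k - start) * step_s))
                    start = None
            if start is not None:                    # contact still open at end of trace
                contacts.append((a, b, start * step_s, (n_samples - start) * step_s))
    return contacts

if __name__ == "__main__":
    # Two buses come within range for two consecutive samples, then separate.
    traces = {
        "bus_7":  [(0, 0), (200, 0), (400, 0), (600, 0)],
        "bus_12": [(500, 0), (280, 0), (450, 0), (1000, 0)],
    }
    print(contact_opportunities(traces))
```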
Citations: 7