
2014 International Conference on High Performance Computing & Simulation (HPCS): Latest Papers

Reconfigurable Network-on-chip design for heterogeneous multi-core system architecture
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903730
Jih-Sheng Shen, Pao-Ann Hsiung, Juin-Ming Lu
Due to the need to support concurrent execution of versatile applications, system complexity, in terms of the number of cores, has increased drastically from tens to hundreds or thousands of cores. These complex systems usually contain heterogeneous cores or processing elements such as different processor cores, memories, and several Silicon Intellectual Properties (SIPs). Network-on-chip (NoC) was proposed to provide scalability and higher throughput for these heterogeneous multi-core systems. However, general designs of NoC infrastructures for multi-core systems usually lack the flexibility to support different processing requirements such as performance, power, reliability, and response time. It is helpful if designers can provide a reconfigurable NoC design so that these requirements can be supported more easily. In this work, we take an existing reconfigurable NoC as an example and discuss related hardware and software issues. Some issues, such as the reconfiguration time overhead, must be considered in the design of a reconfigurable NoC so that it can be used for heterogeneous multi-core systems.
Pages: 523-526
Citations: 4
A new approach for binary feature selection and combining classifiers
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903754
A. Asaithambi, V. Valev, A. Krzyżak, V. Zeljkovic
This paper explores feature selection and combining classifiers when binary features are used. The concept of Non-Reducible Descriptors (NRDs) for binary features is introduced. NRDs are descriptors of patterns that do not contain any redundant information. The underlying mathematical model for the present approach is based on learning Boolean formulas which are used to represent NRDs as conjunctions. Starting with a description of a computational procedure for the construction of all NRDs for a pattern, a two-step solution method is presented for the feature selection problem. The method computes weights of features during the construction of NRDs in the first step. The second step in the method then updates these weights based on repeated occurrences of features in the constructed NRDs. The paper then proceeds to present a new procedure for combining classifiers based on the votes computed for different classifiers. This procedure uses three different approaches for obtaining the single combined classifier, using majority, averaging, and randomized vote.
Pages: 681-687
Citations: 2
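The abstract above names three ways of merging classifier votes (majority, averaging, and randomized) without defining them. The sketch below shows one common reading of each, purely as an illustration; the function names, the dictionary-of-weights input format, and the tie-breaking rules are assumptions and are not taken from the paper.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the most classifiers.

    Ties are broken in favour of the label that appears first (assumption).
    """
    return Counter(labels).most_common(1)[0][0]


def averaging_vote(scores):
    """Return the label with the highest mean vote weight.

    `scores` holds one dict per classifier, mapping label -> vote weight.
    A label missing from a classifier's dict counts as weight 0.0.
    """
    labels = sorted({label for s in scores for label in s})
    mean = {l: sum(s.get(l, 0.0) for s in scores) / len(scores) for l in labels}
    return max(labels, key=lambda l: mean[l])


def randomized_vote(labels, rng):
    """Pick one classifier's label uniformly at random.

    A label is therefore chosen with probability proportional to the
    number of classifiers that voted for it.
    """
    return rng.choice(labels)


if __name__ == "__main__":
    print(majority_vote(["a", "b", "a"]))
    print(averaging_vote([{"a": 0.6, "b": 0.4}, {"a": 0.2, "b": 0.8}]))
    print(randomized_vote(["a", "b", "a"], random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the randomized variant reproducible in experiments.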
From hybrid electro-photonic to all-optical on-chip interconnections for future CMPs
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903798
P. Grani
This work is intended as an excursus on the different ways in which an optical Network-on-Chip (NoC) could be applied, starting from passive NoC topologies (Mesh/Torus) enhanced by a simple shared optical ring and moving to more complex all-optical reconfigurable networks, in a state-of-the-art coherence-assisted Chip Multi-Processor (CMP). We investigate the effects on performance and power consumption of a CMP, comparing them against a standard electronic Mesh (passive), a standard Torus (electronic baseline), and an optical Torus with sequential path setup done through a symmetric electronic helper network (optical baseline, active).
Pages: 999-1001
Citations: 1
A distributed self-balancing policy for virtual machine management in cloud datacenters
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903712
Daniela Loreti, A. Ciampolini
Cloud computing is a crucial computational paradigm for modern companies because it can relieve them of managing their ever-growing IT infrastructure. By dynamically offering plenty of computational resources, the cloud can also simplify the execution of CPU-intensive applications. Modern data centers for cloud computing face growing complexity due to the increasing number of users and their growing resource requests. Much effort is now concentrated on providing the cloud infrastructure with autonomic behavior, so that it can make decisions about virtual machine (VM) management across the datacenter's nodes without human intervention. While most of these solutions are intrinsically centralized and suffer from scalability and reliability problems, we investigate the possibility of providing the cloud with decentralized, self-organizing behavior. To this end, we present a novel VM migration policy suitable for a distributed environment, where hosts can exchange status information with each other according to a predefined protocol. The main goal of the policy is to balance the computational load on the datacenter's physical hosts by conveniently moving VMs. We tested the policy's performance by means of a purpose-built simulator.
Pages: 391-398
Citations: 4
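The abstract above describes hosts that exchange status information and move VMs to even out load, but it does not spell out the protocol. The following is a minimal sketch of one possible decentralized scheme, assuming each host only compares loads with a single ring neighbour and migrates at most one VM per exchange. The names `balance_pair` and `balance_ring`, the integer load units, and the threshold are all hypothetical and are not the authors' policy.

```python
def balance_pair(host_a, host_b, threshold=10):
    """Move one VM from the busier host to the other if that narrows the gap.

    Each host is a dict mapping VM name -> load (integer units, assumption).
    Returns the name of the migrated VM, or None if nothing was moved.
    """
    load_a, load_b = sum(host_a.values()), sum(host_b.values())
    src, dst = (host_a, host_b) if load_a >= load_b else (host_b, host_a)
    gap = abs(load_a - load_b)
    if gap <= threshold:
        return None
    # Moving a VM of load w changes the gap to |gap - 2w|; keep only moves
    # that strictly shrink it, then pick the one that shrinks it most.
    candidates = [vm for vm, w in src.items() if abs(gap - 2 * w) < gap]
    if not candidates:
        return None
    vm = min(candidates, key=lambda v: abs(gap - 2 * src[v]))
    dst[vm] = src.pop(vm)
    return vm


def balance_ring(hosts, rounds=10, threshold=10):
    """Let each host exchange status with its ring neighbour for a few rounds.

    Returns the final load of every host.
    """
    for _ in range(rounds):
        moved = False
        for i in range(len(hosts)):
            neighbour = hosts[(i + 1) % len(hosts)]
            if balance_pair(hosts[i], neighbour, threshold) is not None:
                moved = True
        if not moved:
            break
    return [sum(h.values()) for h in hosts]


if __name__ == "__main__":
    cluster = [{"a": 40, "b": 40, "c": 40}, {}, {}]
    print(balance_ring(cluster))
```

No host needs a global view in this sketch: every decision uses only the two loads involved in the exchange, which is the property a decentralized policy relies on.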
Personal access control system using moving object detection and face recognition
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903751
V. Zeljkovic, Du Zhang, V. Valev, Zhongyu Zhang, Sheng-Jun Zhu, Junjie Li
A real-time automated personal access control system is proposed that detects moving objects and then localizes, extracts, and recognizes their faces in real image sequences. The described method encompasses two important issues in personal access control systems that have received increased attention over the years: moving object detection and face recognition. It is tested on video of a personal-access-controlled area. The efficiency of the described system is illustrated on four real-world interior video sequences recorded in a mixed indoor/outdoor environment with slight illumination changes.
Pages: 662-669
Citations: 1
Parallel nonnegative tensor factorization via Newton iteration on matrices
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903803
M. Flatz, M. Vajtersic
Nonnegative Matrix Factorization (NMF) is a technique to approximate a large nonnegative matrix as a product of two significantly smaller nonnegative matrices. Since matrices can be seen as second-order tensors, NMF can be generalized to Nonnegative Tensor Factorization (NTF). To compute an NTF, the tensor problem can be transformed into a matrix problem by using matricization. Any NMF algorithm can be used to process such a matricized tensor, including a method based on Newton iteration. Here, an approach will be presented to adopt our parallel design of the Newton algorithm for NMF to compute an NTF in parallel for tensors of any order.
Pages: 1014-1015
Citations: 2
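The matricization step mentioned in the abstract above, which turns a tensor into a matrix so that any NMF routine can process it, can be illustrated with a small pure-Python sketch. The nested-list tensor representation, the helper names, and the row-major ordering of the column index are assumptions made for this example; other texts order the columns differently, and the paper's own convention is not stated in the abstract.

```python
from itertools import product


def shape_of(tensor):
    """Return the shape of a regular (non-ragged) nested-list tensor."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)


def element(tensor, index):
    """Look up a single entry by a tuple of indices."""
    for i in index:
        tensor = tensor[i]
    return tensor


def unfold(tensor, mode):
    """Mode-`mode` matricization of a tensor of any order.

    Rows are indexed by the chosen mode; columns enumerate all remaining
    modes in row-major order (an arbitrary but fixed convention).
    """
    shape = shape_of(tensor)
    other = [n for k, n in enumerate(shape) if k != mode]
    rows = []
    for i in range(shape[mode]):
        row = []
        for rest in product(*(range(n) for n in other)):
            # Re-insert the fixed mode index at its original position.
            index = rest[:mode] + (i,) + rest[mode:]
            row.append(element(tensor, index))
        rows.append(row)
    return rows


if __name__ == "__main__":
    t = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # a 2 x 2 x 2 tensor
    for mode in range(3):
        print(mode, unfold(t, mode))
```

Because unfolding only rearranges entries, a nonnegative tensor always yields a nonnegative matrix, which is what lets an NMF algorithm be applied to the result.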
The role of medium size facilities in the HPC ecosystem: the case of the new CRESCO4 cluster integrated in the ENEAGRID infrastructure
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903807
Giovanni Ponti, Filippo Palombi, D. Abate, F. Ambrosino, G. Aprea, T. Bastianelli, F. Beone, R. Bertini, G. Bracco, M. Caporicci, B. Calosso, M. Chinnici, Antonio Colavincenzo, A. Cucurullo, P. Dangelo, M. D. Rosa, P. D. Michele, A. Funel, G. Furini, Dante Giammattei, S. Giusepponi, R. Guadagni, G. Guarnieri, A. Italiano, S. Magagnino, Angelo Mariano, G. Mencuccini, C. Mercuri, S. Migliori, P. Ornelli, S. Pecoraro, A. Perozziello, S. Pierattini, S. Podda, F. Poggi, A. Quintiliani, A. Rocchi, C. Sciò, F. Simoni, A. Vita
Medium-size HPC clusters play an important role in the HPC landscape in that they provide both a training environment for system scalability and a flexible production field for a large class of numerical problems. In this poster we present CRESCO4, the latest medium-size HPC cluster purchased by ENEA, which has been in operation for a few months. CRESCO4 is part of a family of HPC systems, all integrated within ENEAGRID, a large infrastructure for cloud computing that includes all the computational facilities installed at several ENEA sites in Italy.
Pages: 1030-1033
Citations: 142
Scalable high-quality 1D partitioning
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903676
Matthias Lieber, W. Nagel
The decomposition of one-dimensional workload arrays into consecutive partitions is a core problem of many load balancing methods, especially those based on space-filling curves. While previous work has shown that heuristics can be parallelized, only sequential algorithms exist for the optimal solution. However, centralized partitioning will become infeasible in the exascale era due to the vast amount of tasks to be mapped to millions of processors. In this work, we first introduce optimizations to a published exact algorithm. Further, we investigate a hierarchical approach which combines a parallel heuristic and an exact algorithm to form a scalable and high-quality 1D partitioning algorithm. We compare load balance, execution time, and task migration of the algorithms for up to 262 144 processes using real-life workload data. The results show a 300 times speed-up compared to an existing fast exact algorithm, while achieving nearly the optimal load balance.
Pages: 112-119
Citations: 7
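To make the 1D partitioning problem from the abstract above concrete, the sketch below finds the optimal bottleneck (the smallest possible maximum part load) for splitting a workload array into consecutive parts. It uses the textbook combination of a greedy probe and bisection, which is exact for non-negative integer weights; it is a sequential illustration only and is not claimed to be the exact or hierarchical algorithm the paper evaluates. The function names are hypothetical.

```python
def parts_needed(weights, limit):
    """Greedy probe: fewest consecutive parts with no part load above `limit`.

    Assumes `limit` is at least the largest single weight.
    """
    parts, load = 1, 0
    for w in weights:
        if load + w > limit:
            # The current part is full; start a new one with this task.
            parts += 1
            load = 0
        load += w
    return parts


def optimal_bottleneck(weights, p):
    """Smallest achievable maximum part load using at most `p` parts.

    Bisection over the bottleneck value; exact for non-negative integers.
    """
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if parts_needed(weights, mid) <= p:
            hi = mid  # feasible, try a tighter bottleneck
        else:
            lo = mid + 1  # infeasible, the bottleneck must be larger
    return lo


if __name__ == "__main__":
    print(optimal_bottleneck([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))
```

Each probe scans the whole array once, so the sequential cost grows with both the array length and the number of bisection steps, which hints at why a centralized exact solver becomes a bottleneck at very large task counts.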
Optimizing Xen inter-domain data transfer
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903799
S. Fremal, P. Manneback
Delivering data to computing resources in a short time is a crucial issue for the effectiveness of High Performance Computing. We encounter this issue when, for example, designing drivers for virtual machines. We developed two tools to speed up data transfers between Xen virtual machines. The first is a circular buffer shared in user memory space between the two communicating domains, allowing transfers without copying. The second pins pages in memory and transfers their Machine Frame Numbers (MFNs), significantly reducing the transferred data volume. This paper briefly unveils the architecture of our tools and compares them with TCP sockets and XenSocket, a circular buffer in kernel memory space.
Pages: 1002-1004
Citations: 3
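The circular buffer mentioned in the abstract above can be illustrated with a small single-producer, single-consumer ring buffer. This sketch only shows the head/tail index arithmetic inside one Python process; it does not use Xen shared memory or grant tables, and unlike the tool described in the abstract it copies bytes in and out. The class and method names are assumptions for the example.

```python
class RingBuffer:
    """Fixed-size byte ring buffer with monotonically increasing counters."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.head = 0  # total bytes ever written (advanced by the producer)
        self.tail = 0  # total bytes ever read (advanced by the consumer)

    def write(self, data):
        """Store as much of `data` as fits; return the number of bytes accepted."""
        free = self.capacity - (self.head - self.tail)
        n = min(len(data), free)
        for i in range(n):
            # Wrap around the end of the underlying storage.
            self.buf[(self.head + i) % self.capacity] = data[i]
        self.head += n
        return n

    def read(self, max_bytes):
        """Return up to `max_bytes` of unread data, oldest first."""
        n = min(max_bytes, self.head - self.tail)
        out = bytes(self.buf[(self.tail + i) % self.capacity] for i in range(n))
        self.tail += n
        return out


if __name__ == "__main__":
    rb = RingBuffer(4)
    print(rb.write(b"abcdef"))
    print(rb.read(2))
    print(rb.write(b"ef"))
    print(rb.read(10))
```

Keeping `head` owned by the writer and `tail` owned by the reader is the usual reason this layout suits two communicating parties: each side updates only its own counter.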
Modeling and verification of ATM security policies with SecBPMN
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903740
Mattia Salnitri, P. Giorgini
High Performance Computing (HPC) techniques are essential in complex systems such as Socio-Technical Systems (STSs), where humans and organizations are elements of the same system along with technical infrastructures and hardware/software components. For example, several HPC approaches have been successfully applied to support and facilitate distribution or aggregation of computation power among independent and atomic components (e.g., smart meters to solve and/or simulate complex models). However, HPC techniques have to be studied and developed without underestimating the problem of security, which, given the interaction-centric nature of STSs, has to be considered not only from the perspective of a single component but for the system as a whole. In our previous work, we proposed SecBPMN, a framework to support the design of secure STSs. It is used to model the interaction design and security policies of an STS, and it supports their verification through a querying engine. In this paper, we describe how SecBPMN has been successfully used to study security in an Air Traffic Management (ATM) system, and we show how it can also provide efficient support for HPC techniques applied in complex and heterogeneous environments.
Pages: 588-591
Citations: 5