
Latest publications: 2015 International Conference on High Performance Computing & Simulation (HPCS)

Effective topic modeling for email
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237060
Hiep Hong, Teng-Sheng Moh
Email has become increasingly popular and is now an indispensable tool for communication and document exchange. Because of its convenience, people use email every day at work, at school, and for personal matters. Consequently, the number of emails people receive daily keeps increasing, causing them to spend more time organizing their mail. People often need to classify emails and move them into folders so that they can go back and read them later. Most email clients available today allow users to filter and organize emails by defining rules for handling incoming messages. However, this manual process requires users to know their expected emails very well, and to make good use of these tools users need to understand how filtering rules work and how to apply them correctly. In reality, most users do not know in advance what their incoming emails will be. The work described in this paper aims to take the burden of organizing emails away from users by using Latent Dirichlet Allocation (LDA) [10] to automatically extract topics from emails and group them into folders of common topics. Experiments show that the proposed method correctly groups emails into appropriate topics with 77% accuracy.
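As an illustration of the kind of pipeline the abstract describes, the minimal sketch below groups a handful of email bodies into topic folders with scikit-learn's LatentDirichletAllocation. It is a hedged demonstration of LDA-based grouping, not the authors' implementation; the sample emails, the number of topics, and the vectorizer settings are assumptions made for the example.
```python
# Minimal sketch: group emails into topic folders with LDA (scikit-learn).
# Illustrative only -- not the authors' pipeline; emails and parameters are made up.
from collections import defaultdict

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

emails = [
    "Meeting moved to 3pm, please update the project schedule",
    "Your invoice for July is attached, payment due in 30 days",
    "Team standup notes and sprint planning for next week",
    "Reminder: tuition payment and billing statement available online",
]

# Bag-of-words representation of the email bodies.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(emails)

# Fit LDA with a small, assumed number of topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)          # shape: (n_emails, n_topics)

# Assign each email to the folder of its most probable topic.
folders = defaultdict(list)
for email, topic_dist in zip(emails, doc_topics):
    folders[int(topic_dist.argmax())].append(email)

for topic_id, members in folders.items():
    print(f"Folder {topic_id}: {members}")
```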
Citations: 6
Toward a fully parallel multigrid in time algorithm in PETSc environment: A case study in ocean models
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237098
L. Carracciuolo, L. D’Amore, Valeria Mele
We consider linear systems that arise from the discretization of evolutionary models. Typically, solution algorithms are based on a time-stepping approach, solving one time step after the other, so parallelism is limited to the spatial dimension only. Because time is sequential in nature, the idea of solving along time steps simultaneously is not intuitive. One approach to achieving parallelism in the time direction is the MGRIT algorithm [7], based on multigrid reduction (MGR) techniques; here we refer to this approach as MGR-1D. Another approach is space-time multigrid, where time is simply another dimension in the grid; analogously, we refer to this approach as MGR-4D. In this work, motivated by the need to maximize the availability of new algorithms to climate science, we propose a new parallel approach that mixes the MGR-1D idea with classical spatial multigrid methods. We refer to it as the MGR3D+1 approach. Moreover, we discuss the implementation of these approaches in the high-performance scientific library PETSc as a starting point for developing more efficient and scalable algorithms in ocean models.
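To make the parallel-in-time idea concrete, the sketch below implements a parareal-style coarse/fine iteration for a scalar model ODE. It is a simplified illustration of solving across time slices simultaneously (emulated serially here), not the MGRIT or MGR3D+1 algorithm and not PETSc code; the model problem, propagators, and iteration count are assumptions.
```python
# Parareal-style illustration of parallel-in-time coupling for du/dt = lam * u.
# Simplified sketch (serially emulated); not MGRIT/MGR3D+1 and not PETSc-based.
import numpy as np

lam = -1.0          # assumed model problem du/dt = lam * u
T, N = 2.0, 8       # time horizon split into N coarse slices
u0 = 1.0
dT = T / N

def coarse(u, dt):
    # One backward-Euler step: cheap, stable coarse propagator.
    return u / (1.0 - lam * dt)

def fine(u, dt, substeps=100):
    # Many small backward-Euler steps: accurate fine propagator.
    h = dt / substeps
    for _ in range(substeps):
        u = u / (1.0 - lam * h)
    return u

# Initial coarse sweep (serial).
U = np.zeros(N + 1)
U[0] = u0
for n in range(N):
    U[n + 1] = coarse(U[n], dT)

# Parareal iterations: the fine solves on the time slices are independent,
# so on a real machine they would run in parallel across the slices.
for k in range(4):
    F = np.array([fine(U[n], dT) for n in range(N)])   # parallelizable loop
    U_new = np.zeros_like(U)
    U_new[0] = u0
    for n in range(N):
        U_new[n + 1] = coarse(U_new[n], dT) + F[n] - coarse(U[n], dT)
    U = U_new

print("parareal:", U[-1], " exact:", u0 * np.exp(lam * T))
```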
Citations: 8
On the run-time cost of distributed-memory communications generated using the polyhedral model
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237034
Ana Moreton-Fernandez, Arturo González-Escribano, D. Ferraris
The polyhedral model can be used to automatically generate distributed-memory communications for affine nested loops. Recently, new communication schemes that reduce the communication volume have been presented. In this paper we study the extra computational effort introduced at run-time by the code generated to manage the communication details across distributed processes. We focus on the most sophisticated communication scheme introduced so far (the FOP scheme). We present an asymptotic cost study of the FOP scheme in terms of two main run-time parameters: the problem size and the number of processors. Based on this study, we identify scalability limitations in current implementations of these techniques and propose a simple implementation alternative that eliminates one of them. Experimental results show the potential performance impact of these implementation limitations when the generated codes are used in large parallel systems.
Citations: 4
NoC-centric partitioning and reconfiguration technologies for the efficient sharing of multi-core programmable accelerators
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237107
Marco Balboni, D. Bertozzi
Today, multi- and many-core architectures are gaining momentum as a potential source of hardware acceleration, bringing new challenges for system designers related to both system virtualization and runtime testing. My research activity tackles these challenges by exploiting and optimizing the capability to reconfigure the routing function at runtime.
Citations: 0
How advanced cloud technologies can impact and change HPC environments for simulation
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237116
M. Mancini, G. Aloisio
In recent years, most enterprises and IT organizations have adopted virtualization and cloud computing solutions to provide features such as flexibility, elasticity, fault tolerance, high availability and reliability for their computational, storage and networking resource infrastructures. Moreover, recent advances in Linux containers [1] and the emergence of technologies such as Docker [2] are revolutionizing the way web and large-scale distributed applications are developed and deployed.
Citations: 3
Revisiting co-scheduling for upcoming ExaScale systems
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237117
Stefan Lankes
Future-generation supercomputers will be a hundred times faster than today's leaders of the Top 500 as they reach the exascale mark. It is predicted that this gain in CPU performance will be achieved by a shift in the ratio of compute nodes to cores per node: the number of nodes will not grow significantly compared to today's systems; instead, nodes will be built from many-core CPUs with hundreds of cores or more, resulting in a widening gap between compute power and I/O performance [1]. Previous studies have identified four key challenges that future exascale systems must cope with at design time: energy and power, memory and storage, concurrency and locality, and resiliency [2].
Citations: 0
Tracing long running applications: A case study using Gromacs
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237031
M. Wagner, J. Doleschal, A. Knüpfer
Performance analysis is essential for developing applications that utilize the enormous capabilities of current HPC systems. While many recent tool studies have focused on large scales, the performance analysis of long-running applications has received little attention. This paper investigates the challenges that arise from monitoring long-running real-life applications, in particular the disruptive bias introduced by intermediate memory buffer flushes in the measurement environment. We propose a concept for in-memory event tracing that completely avoids intermediate memory buffer flushes. We evaluate to what extent such an in-memory event tracing workflow helps overcome the critical properties, such as resulting trace size, application slowdown, and measurement bias. We use a prototype implementation, based on Score-P and OTF2, with the molecular dynamics package Gromacs, an application currently infeasible to monitor in a full production run.
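The hedged sketch below illustrates the basic idea of in-memory event tracing: events are appended to an in-memory buffer during the run and written out only once at the end, so no intermediate flushes perturb the measurement. It is a conceptual toy, not Score-P or OTF2; the event layout, capacity, and the traced workload are assumptions.
```python
# Toy in-memory event tracer: record enter/exit events in RAM, flush once at the end.
# Conceptual sketch only -- not Score-P/OTF2; event layout and capacity are assumed.
import functools
import time

class InMemoryTracer:
    def __init__(self, capacity=1_000_000):
        self.events = []            # a real tracer would pre-allocate; kept simple here
        self.capacity = capacity

    def record(self, kind, name):
        if len(self.events) < self.capacity:       # never flush mid-run
            self.events.append((time.perf_counter(), kind, name))

    def trace(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            self.record("enter", func.__name__)
            try:
                return func(*args, **kwargs)
            finally:
                self.record("exit", func.__name__)
        return wrapper

    def dump(self, path):
        # Single write-out after the application has finished.
        with open(path, "w") as f:
            for ts, kind, name in self.events:
                f.write(f"{ts:.9f} {kind} {name}\n")

tracer = InMemoryTracer()

@tracer.trace
def compute_step():
    sum(i * i for i in range(10_000))

for _ in range(100):
    compute_step()
tracer.dump("trace.txt")
```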
Citations: 4
A runtime/memory trade-off of the continuous Ziggurat method on GPUs
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237018
C. Riesinger, T. Neckel
Pseudo-random number generators are used intensively in many computational applications, e.g. the treatment of uncertainty quantification problems. For this reason, optimizing such generators for various hardware architectures is of great interest. We present a runtime/memory trade-off for the popular Ziggurat method, with a focus on GPUs. Such a trade-off means that the runtime of pseudo-random number generation can be reduced by investing more memory, and vice versa. GPUs especially benefit from this approach, since it reduces the warp divergence that occurs with rejection methods such as the Ziggurat method. To our knowledge, such a trade-off for the Ziggurat method has never before been investigated for GPUs. We show that this approach makes the Ziggurat method competitive with well-established normal pseudo-random number generators on GPUs. Optimal implementations and grid configurations are given for different GPU architectures.
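To illustrate the runtime/memory trade-off behind table-driven rejection sampling, the sketch below builds a piecewise-constant envelope over the half-normal density from k equal-width strips: a larger table means a tighter envelope, so fewer candidates are rejected and less work (and, on a GPU, less warp divergence) is spent per accepted number. This is a simplified Ziggurat-like construction for demonstration only, not the actual Ziggurat tables or the authors' GPU implementation; the cutoff, strip counts, and sample size are assumptions, and the tail beyond the cutoff is ignored.
```python
# Runtime/memory trade-off in table-driven rejection sampling (Ziggurat-like idea).
# Simplified sketch: piecewise-constant envelope over the half-normal on [0, xmax];
# NOT the real Ziggurat construction and NOT a GPU kernel. Tail beyond xmax ignored.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.exp(-0.5 * x * x)       # unnormalized half-normal density
xmax = 4.0                               # assumed cutoff

def build_table(k):
    # k strips of equal width; envelope height = density at the left edge
    # (valid because f is decreasing on [0, xmax]).
    edges = np.linspace(0.0, xmax, k + 1)
    heights = f(edges[:-1])
    areas = heights * (xmax / k)         # envelope area per strip
    return edges, heights, areas / areas.sum()

def sample(n, table):
    edges, heights, probs = table
    accepted, trials = [], 0
    while len(accepted) < n:
        i = rng.choice(len(heights), p=probs)            # pick a strip
        x = rng.uniform(edges[i], edges[i + 1])          # uniform within it
        trials += 1
        if rng.uniform(0.0, heights[i]) < f(x):          # accept under the density
            accepted.append(x)
    return np.array(accepted), trials

for k in (8, 64, 512):                   # bigger table -> fewer rejections
    _, trials = sample(10_000, build_table(k))
    print(f"{k:4d} strips: acceptance rate {10_000 / trials:.3f}")
```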
Citations: 0
Active learning for support vector regression in radiation shielding design
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237055
Paulina Duckic, Krešimir Trontl, M. Matijević
Recently, a novel approach based on the support vector regression technique has been proposed and tested for estimating multi-layer buildup factors in gamma-ray shielding calculations, while for neutron shielding calculations only some initial analyses have been conducted. During the development of the model, a number of questions regarding the possible application of active learning measures were raised. In this paper, the general applicability of active learning measures to the problem is discussed, in particular the data transfer method used in the investigation and the testing of the active procedure.
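As a concrete, hedged example of what active learning for support vector regression can look like, the sketch below trains a committee of bootstrapped SVR models on a small labeled set and repeatedly queries the pool point where the committee predictions disagree most. This is one common pool-based heuristic (query by committee), not necessarily the measures studied in the paper; the synthetic data, kernel, and labeling budget are assumptions.
```python
# Pool-based active learning for SVR: query the point with the largest committee disagreement.
# Hedged sketch with synthetic data; not the paper's shielding data or its exact procedure.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_pool = rng.uniform(0.0, 5.0, size=(200, 1))                           # unlabeled candidate points
y_pool = np.exp(0.8 * X_pool[:, 0]) + 0.05 * rng.standard_normal(200)   # "oracle" labels

labeled = list(rng.choice(len(X_pool), 5, replace=False))               # small initial labeled set

def committee_std(X_train, y_train, X_query, members=10):
    # Train bootstrapped SVRs and measure the spread of their predictions on the pool.
    preds = []
    for _ in range(members):
        b = rng.choice(len(X_train), len(X_train), replace=True)
        preds.append(SVR(kernel="rbf", C=10.0).fit(X_train[b], y_train[b]).predict(X_query))
    return np.std(preds, axis=0)

for _ in range(20):                                                      # assumed labeling budget
    spread = committee_std(X_pool[labeled], y_pool[labeled], X_pool)
    spread[labeled] = -np.inf                                            # never re-query labeled points
    labeled.append(int(np.argmax(spread)))                               # label the most uncertain point

model = SVR(kernel="rbf", C=10.0).fit(X_pool[labeled], y_pool[labeled])
print("labeled points used:", len(labeled))
```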
Citations: 1
In search of the best MPI-OpenMP distribution for optimum Intel-MIC cluster performance
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237072
G. Utrera, Marisa Gil, X. Martorell
Applications for HPC platforms are mainly based on hybrid programming models: MPI for communication and OpenMP for task and fork-join parallelism to exploit shared-memory communication inside a node. On the basis of this scheme, much research has been carried out to improve performance; examples include overlapping communication and computation, or increasing speedup and bandwidth on new network fabrics (e.g. InfiniBand and 10 Gb or 40 Gb Ethernet). Henceforth, as far as computation and communication are concerned, HPC platforms will be heterogeneous systems connected by high-speed networks. In this context, an important issue is deciding how to distribute the workload among all the nodes in order to balance the application's execution, as well as choosing the most appropriate programming model to exploit parallelism inside each node. In this paper we propose a mechanism to dynamically balance the work distribution among the heterogeneous components of a heterogeneous cluster based on their performance characteristics. For our evaluation we run the miniFE mini-application from the Mantevo benchmark suite on a heterogeneous Intel MIC cluster. Experimental results show that making an effort to choose the appropriate number of threads can improve performance significantly over simply choosing the maximum number of cores available on the Intel MIC.
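A hedged sketch of the general idea of performance-based work distribution follows: each MPI rank times its share of the work, the ranks exchange their measured throughput, and the next iteration's chunk sizes are re-proportioned accordingly. It uses mpi4py for brevity rather than C MPI+OpenMP, and the workload, timing loop, and rebalancing rule are illustrative assumptions, not the authors' mechanism.
```python
# Throughput-proportional work distribution across heterogeneous MPI ranks (mpi4py).
# Illustrative sketch only; not the paper's mechanism and not an MPI+OpenMP code.
# Run with e.g.: mpirun -np 4 python balance.py
import time

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

TOTAL_WORK = 1_000_000                    # assumed total number of work items
shares = np.full(size, 1.0 / size)        # start with an even split

def do_work(n_items):
    # Stand-in kernel: cost per item may differ across heterogeneous nodes.
    x = np.arange(n_items, dtype=np.float64)
    return float(np.sum(np.sqrt(x)))

for it in range(5):
    my_items = int(shares[rank] * TOTAL_WORK)
    t0 = time.perf_counter()
    do_work(my_items)
    elapsed = max(time.perf_counter() - t0, 1e-9)

    # Exchange measured throughput (items/second) and re-proportion the shares.
    throughput = np.array(comm.allgather(my_items / elapsed))
    shares = throughput / throughput.sum()
    if rank == 0:
        print(f"iter {it}: shares = {np.round(shares, 3)}")
```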
Citations: 6