
Latest articles: 2015 International Conference on High Performance Computing & Simulation (HPCS)

A fault-tolerant gyrokinetic plasma application using the sparse grid combination technique
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237082
Md. Mohsin Ali, P. Strazdins, B. Harding, M. Hegland, J. Larson
Applications performing ultra-large scale simulations by solving PDEs require very large computational systems for their timely solution. Studies have shown that the rate of failure grows with system size, and these trends are likely to worsen in future machines as less reliable components are used to reduce energy costs. Thus, as systems, and the problems solved on them, continue to grow, the ability to survive failures is becoming a critical aspect of algorithm development. The sparse grid combination technique (SGCT) is a cost-effective method for solving time-evolving PDEs, especially for higher-dimensional problems. It can also be easily modified to provide algorithm-based fault tolerance for these problems. In this paper, we show how the SGCT can produce a fault-tolerant version of the GENE gyrokinetic plasma application, which evolves a 5D complex density field over time. We use an alternate component grid combination formula to recover data from lost processes. User Level Failure Mitigation (ULFM) MPI is used to recover the processes, and our implementation is robust over multiple failures and recoveries, for both process and node failures. An acceptable degree of modification of the application is required. Results using the SGCT on two of the field's dimensions show competitive execution times with acceptable error (within 0.1%), compared to the same simulation with a single full-resolution grid. The benefits improve when the SGCT is used over three dimensions. Our experiments show that the GENE application can successfully recover from multiple process failures, and applying the SGCT the corresponding number of times minimizes the error for the lost sub-grids. Application recovery overhead via ULFM MPI increases from ~1.5 s at 64 cores to ~5 s at 2048 cores for a one-off failure. This compares favourably to using GENE's in-built checkpointing with job restart in conjunction with the classical SGCT on failure, which has overheads four times as large for a single failure, excluding the backtrack overhead. An analysis for a long-running application taking checkpoint backtrack times into account indicates a reduction in overhead of over an order of magnitude.
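As a quick, generic illustration of the technique this abstract builds on (a sketch, not the paper's GENE-specific implementation): the classical 2-D combination technique approximates the full-grid solution as u ≈ Σ_{i+j=n} u_{i,j} − Σ_{i+j=n−1} u_{i,j}, and its fault tolerance comes from the freedom to switch to an alternate combination formula when a component grid is lost. The helper below merely enumerates the component-grid levels and their ±1 coefficients; the exact level convention is an assumption.

```python
def combination_grids_2d(n):
    """Component grids and coefficients for the classical 2-D sparse grid
    combination technique at level n:
        u_sparse ~= sum_{i+j=n} u_{i,j} - sum_{i+j=n-1} u_{i,j}
    Each entry is ((i, j), coeff), where grid (i, j) has 2^i x 2^j points.
    """
    grids = []
    for i in range(1, n):          # grids on the diagonal i + j = n: coefficient +1
        grids.append(((i, n - i), +1))
    for i in range(1, n - 1):      # grids on the diagonal i + j = n - 1: coefficient -1
        grids.append(((i, n - 1 - i), -1))
    return grids
```

For any level the coefficients sum to 1, so the combination reproduces constants exactly; if one component grid is lost to a process failure, an alternate (slightly less accurate) combination over the surviving grids can be formed, which is the recovery idea the paper exploits.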
Citations: 17
Designing HPC libraries in the modern C++ world
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237076
J. Falcou
Numerical simulation running on computers is the most fundamental tool that most sciences - from physics to social science - use as a substitute for experiments when those experiments cannot realistically be run within a satisfactory duration, budget, or ethical framework. This also means that the accuracy and speed at which such computer simulations can be done are a crucial factor for global scientific advancement. If the accuracy of a simulation is tied to the field knowledge of scientists, the speed of a simulation is tied to the way one takes advantage of the computer hardware.
Citations: 1
Real-time mixed-criticality Network-on-Chip resource allocation
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237091
L. Indrusiak
This paper summarises the latest developments on the resource allocation of real-time and mixed-criticality applications onto multiprocessor platforms based on Networks-on-Chip. The paper focuses on priority-preemptive Networks-on-Chip, and on the use of analytical models as fitness functions to guide search-based heuristics towards fully schedulable allocations, therefore suitable for mixed criticality applications with hard real-time constraints.
Citations: 0
Market-inspired dynamic resource allocation in many-core high performance computing systems
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237070
Ashutosh Kumar Singh, P. Dziurzański, L. Indrusiak
Many-core systems are envisioned to fulfill the increased performance demands of several computing domains, such as embedded and high performance computing (HPC). HPC systems are often overloaded with a number of dynamically arriving jobs to execute. In overload situations, market-inspired resource allocation heuristics have been found to provide better results than various other heuristics in terms of the overall profit (value) earned by completing the execution of a number of jobs. However, conventional market-inspired heuristics lack the concept of holding low-value executing jobs to free their occupied resources for use by high-value arrived jobs, in order to maximize the overall profit. In this paper, we propose a market-inspired heuristic that accomplishes the aforementioned concept and utilizes design-time profiling results of jobs to facilitate efficient allocation. Additionally, the remaining executions of the held jobs are performed on freed resources at later stages to make some profit out of them. The holding process identifies the appropriate jobs to be put on hold to free the resources, and ensures that the loss incurred due to holding is lower than the profit achieved by the high-value arrived jobs using the freed resources. Experiments show that the proposed approach achieves 8% higher savings compared to existing approaches, which can be a significant amount when dealing in the order of millions of dollars.
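At its core, the holding decision the abstract describes reduces to an inequality: hold a running low-value job only if the profit from the arriving high-value job exceeds the loss incurred by delaying the held job. A minimal sketch under a hypothetical value model (the paper's actual profit functions are not given in the abstract, so the loss formula here is purely illustrative):

```python
def should_hold(running_value, remaining_frac, arriving_value):
    """Decide whether to put a running job on hold to free its resources.

    Hypothetical model: the loss from holding is the held job's value scaled
    by its remaining fraction of work; hold only if the arriving job's value
    exceeds that loss, so overall profit increases.
    """
    loss_from_holding = running_value * remaining_frac
    return arriving_value > loss_from_holding
```

The held job is not discarded: as the abstract notes, its remaining execution resumes on freed resources later, so the real loss is a delay cost rather than the whole value, which this sketch deliberately over-approximates.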
Citations: 8
Data and process abstractions for cloud computing
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237108
Mikolaj Baranowski, M. Bubak, A. Belloum
Applying suitable application models is essential to achieve efficient execution of applications and an effective development process, and to assure application portability and reusability. We observe that execution environments, and the cloud environment in particular, lack tools that would make them more accessible from a programmer's perspective. The complexity and variety of libraries and authentication methods do not correspond with the simplistic functionality of those services. We approach this issue in several fields - workflow systems, domain-specific languages, and cloud computing. Our work focuses on two points: optimizing existing application models by supplementing them with additional tools and providing new features, and developing new models based on our research on applications in the cloud environment.
Citations: 0
On the problem-decomposition of scalable 4D-Var Data Assimilation models
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237097
Rossella Arcucci, L. D’Amore, L. Carracciuolo
We present an innovative approach for solving Four Dimensional Variational Data Assimilation (4D-VAR DA) problems. The approach starts from a decomposition of the physical domain; it uses a partitioning of the solution and a modified regularization functional describing the 4D-VAR DA problem on the decomposition. We provide a mathematical formulation of the model and perform a feasibility analysis in terms of computational cost and algorithmic scalability. We use the scale-up factor, which measures the performance gain in terms of time-complexity reduction. We verify the reliability of the approach on a consistent test case (the Shallow Water Equations).
Citations: 22
Acceleration of dRMSD calculation and efficient usage of GPU caches
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237020
J. Filipovič, Jan Plhak, D. Střelák
In this paper, we introduce a GPU acceleration of the dRMSD algorithm, used to compare different structures of a molecule. Compared to a multithreaded CPU implementation, we have reached a 13.4× speedup in clustering and a 62.7× speedup in 1:1 dRMSD computation using a mid-end GPU. The dRMSD computation exposes strong memory locality and thus is compute-bound. Along with a conservative implementation using shared memory, we have implemented variants of the algorithm using GPU caches to maintain memory locality. Our implementation using the cache reaches 96.5% and 91.6% of shared-memory performance on Fermi and Maxwell, respectively. We have identified several performance pitfalls related to cache blocking in compute-bound codes and suggested optimization techniques to improve the performance.
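For reference, the quantity being accelerated is the distance RMSD between two conformations: the root-mean-square difference over all intra-molecular pairwise distances, which makes it invariant to rotation and translation. A plain, unaccelerated sketch (the pairwise normalization convention is an assumption; the paper's GPU kernels are not reproduced here):

```python
import math

def drmsd(a, b):
    """Distance RMSD between two conformations a and b, each a list of
    (x, y, z) coordinates for the same atoms:
        dRMSD = sqrt( mean over i<j of (d_ij(a) - d_ij(b))^2 )
    """
    assert len(a) == len(b), "conformations must have the same atom count"
    n = len(a)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # difference of the same pairwise distance in both structures
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            total += diff * diff
            pairs += 1
    return math.sqrt(total / pairs)
```

Because only internal distances enter the formula, rigidly moving one structure leaves the result unchanged; the O(n²) pair loop is also what makes the computation compute-bound and attractive for GPU caching, as the abstract discusses.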
Citations: 0
A fuzzy logic controlled mobility model based on simulated traffics' characteristics in MANET
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237051
Chen Chen, Qingqi Pei
In this paper, we investigate the self-similarity characteristic of MANET (Mobile Ad-hoc NETworks) traffic through simulations, and then construct a fuzzy logic controlled mobility model according to this traffic feature to optimize network performance. First, based on traffic generated using OPNET, the self-similarity of MANET traffic is verified with a qualitative analysis. Then, by exploring the relation between the self-similarity indicator, i.e., the Hurst parameter, and network performance metrics such as Packet Delivery Ratio (PDR), Average Transmission Delay (ATD) and nodal Average Moving Speed (AMS), a fuzzy logic controller is designed to make the mobility model adapt its behaviour in order to deliver satisfactory performance. By estimating the self-similarity of incoming traffic online using R/S analysis, the Packet Size (PS) and AMS of each vehicle can be intelligently adjusted to maximize the PDR and minimize the experienced ATD. Numerical results indicate that our proposed mobility model has better performance in terms of PDR and ATD than the classic RWP (Random WayPoint) mobility model.
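The R/S analysis mentioned above estimates the Hurst parameter H as the slope of log(R/S) against log(window size), where R/S is the range of the mean-adjusted cumulative sum divided by the standard deviation; H ≈ 0.5 indicates no long-range dependence, while H > 0.5 indicates self-similar traffic. A minimal sketch (the window sizes and averaging scheme are illustrative choices, not the paper's):

```python
import math
import random

def rescaled_range(x):
    """R/S statistic of a series: range of the mean-adjusted cumulative sum
    divided by the standard deviation."""
    n = len(x)
    mean = cum = sq = 0.0
    mean = sum(x) / n
    cums = []
    for v in x:
        cum += v - mean
        cums.append(cum)
        sq += (v - mean) ** 2
    r = max(cums) - min(cums)
    s = math.sqrt(sq / n)
    return r / s if s > 0 else 0.0

def hurst(x, sizes=(8, 16, 32, 64)):
    """Estimate H as the least-squares slope of log(R/S) vs log(n),
    averaging R/S over non-overlapping windows of each size n."""
    pts = []
    for n in sizes:
        rs = [rescaled_range(x[i:i + n]) for i in range(0, len(x) - n + 1, n)]
        pts.append((math.log(n), math.log(sum(rs) / len(rs))))
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    num = sum((px - mx) * (py - my) for px, py in pts)
    den = sum((px - mx) ** 2 for px, py in pts)
    return num / den
```

On uncorrelated traffic the estimate hovers around 0.5 (with a known upward bias for small windows); persistent, self-similar traffic pushes it toward 1, which is the signal the fuzzy controller reacts to.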
Citations: 2
Cph CT Toolbox: A performance evaluation
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237019
J. Bardino, Martin Rehr, B. Vinter
With the first version of the Cph CT Toolbox released and introduced, we turn to intensively evaluating the performance of the FDK and Katsevich reconstruction implementations in the second major release. The evaluation focuses on comparisons between different hardware platforms from the two major GPU compute vendors, AMD and NVIDIA, using our updated CUDA and new OpenCL implementations. Such a performance comparison is interesting in itself from a narrow CT scanning and reconstruction perspective, but it also sheds some light on the performance of those AMD and NVIDIA platforms and GPU technologies: something of general interest to anyone building or considering GPU solutions for their scientific calculations. Results from the best system reveal that the chosen streaming strategy scales linearly up to problem sizes one order of magnitude larger than the available GPU memory, with only a minor decrease in scaling when the problem size is increased further to the next order of magnitude.
Citations: 2
DBSCAN on Resilient Distributed Datasets
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237086
Irving Cordova, Teng-Sheng Moh
DBSCAN is a well-known density-based data clustering algorithm that is widely used due to its ability to find arbitrarily shaped clusters in noisy data. However, DBSCAN is hard to scale, which limits its utility when working with large data sets. Resilient Distributed Datasets (RDDs), on the other hand, are a fast data-processing abstraction created explicitly for in-memory computation of large data sets. This paper presents a new algorithm based on DBSCAN using the Resilient Distributed Datasets approach: RDD-DBSCAN. RDD-DBSCAN overcomes the scalability limitations of the traditional DBSCAN algorithm by operating in a fully distributed fashion. The paper also evaluates an implementation of RDD-DBSCAN using Apache Spark, the official RDD implementation.
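For context, a plain single-machine DBSCAN looks as follows; it is the repeated neighbourhood queries and the cluster expansion across partition boundaries that make the distributed RDD version non-trivial. This sketch is the textbook algorithm, not RDD-DBSCAN itself:

```python
def dbscan(points, eps, min_pts):
    """Textbook DBSCAN. points: list of coordinate tuples.
    Returns one label per point: cluster ids 0, 1, ... and -1 for noise."""
    def neighbors(i):
        # all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps * eps]

    labels = [None] * len(points)          # None = not yet visited
    cid = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1                 # noise (may become a border point later)
            continue
        labels[i] = cid                    # start a new cluster from core point i
        seeds, k = list(nbrs), 0
        while k < len(seeds):              # breadth-first cluster expansion
            j = seeds[k]; k += 1
            if labels[j] == -1:
                labels[j] = cid            # noise reached from a core: border point
            if labels[j] is not None:
                continue
            labels[j] = cid
            nbrs_j = neighbors(j)
            if len(nbrs_j) >= min_pts:     # only core points expand the frontier
                seeds.extend(nbrs_j)
        cid += 1
    return labels
```

Every point triggers a scan over the whole data set, so the naive version is O(n²) and memory-resident, which is exactly the scalability wall the abstract says RDD-DBSCAN removes by partitioning the space and merging clusters across partitions.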
Citations: 45