Fine-Grained Parallel Compacting Garbage Collection through Hardware-Supported Synchronization
O. Horvath and M. Meyer. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.28

Parallel garbage collection seeks to exploit the inherent parallelism of graph tracing by evenly distributing the set of objects in the heap among all available processing resources. Any straightforward implementation, however, suffers from prohibitive overheads since each access to the worklist of objects and to the objects themselves needs to be protected by synchronization, especially so in the case of compacting collectors. For this reason, known parallel collectors sacrifice a great deal of work distribution granularity and scalability to keep the synchronization costs acceptable. In this paper, we present a case study of a different approach. Our parallel compacting collector is based on Cheney's copying algorithm, employs a single worklist and distributes garbage collection work on an object-by-object basis. This way, it achieves well-balanced work distribution and good scalability. To solve the synchronization problem, we introduce a low-cost multi-core garbage collection coprocessor and take advantage of hardware-supported synchronization. We built an FPGA-based prototype with a single-core main processor supported by a multi-core garbage collection coprocessor. Measurement results show that an 8-core garbage collection coprocessor decreases the duration of garbage collection cycles by a factor of up to 7.4, while a 16-core configuration still achieves a factor of up to 12.1.
Effectively Presenting Call Path Profiles of Application Performance
L. Adhianto, J. Mellor-Crummey, and Nathan R. Tallent. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.35

Call path profiling is a scalable measurement technique that has been shown to provide insight into the performance characteristics of complex modular programs. However, poor presentation of accurate and precise call path profiles obscures insight. To enable rapid analysis of an execution's performance bottlenecks, we make the following contributions for effectively presenting call path profiles. First, we combine a relatively small set of complementary presentation techniques to form a coherent synthesis that is greater than the constituent parts. Second, we extend existing presentation techniques to rapidly focus an analyst's attention on performance bottlenecks. In particular, we (1) show how to scalably present three complementary views of calling-context-sensitive metrics; (2) treat a procedure's static structure as first-class information with respect to both performance metrics and constructing views; (3) enable construction of a large variety of user-defined metrics to assess performance inefficiency; and (4) automatically expand hot paths based on arbitrary performance metrics --- through calling contexts and static structure --- to rapidly highlight important program contexts. Our work is implemented within HPCToolkit, which collects call path profiles using low-overhead asynchronous sampling.
Resource Discovery and Scheduling in Unstructured Peer-to-Peer Desktop Grids
S. Kwan and J. Muppala. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.49

In this paper, we explore resource discovery and scheduling issues that arise in unstructured peer-to-peer (P2P) desktop grids. We examine the use of a super-peer based approach to address these issues. The super-peers form a resource information tracking and exchange overlay to enable users to rapidly locate resources for remote execution of jobs. Resource availability information is exchanged among the super-peers using a lightweight threshold-driven gossip protocol with the aim of minimizing the resource discovery overhead. We conduct detailed simulation experiments to illustrate the comparative results. Our results indicate that this approach offers a lightweight and scalable method for managing resources in a desktop grid.
Hierarchical Load Balancing for Charm++ Applications on Large Supercomputers
G. Zheng, Esteban Meneses, A. Bhatele, and L. Kalé. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.65

Large parallel machines with hundreds of thousands of processors are being built. Recent studies have shown that ensuring good load balance is critical for scaling certain classes of parallel applications on even thousands of processors. Centralized load balancing algorithms suffer from scalability problems, especially on machines with a relatively small amount of memory. Fully distributed load balancing algorithms, on the other hand, tend to yield poor load balance on very large machines. In this paper, we present an automatic dynamic hierarchical load balancing method that overcomes the scalability challenges of centralized schemes and the poor solutions of traditional distributed schemes. This is done by creating multiple levels of load balancing domains which form a tree. This hierarchical method is demonstrated within a measurement-based load balancing framework in Charm++. We present techniques to deal with the scalability challenges of load balancing at very large scale. We show performance data of the hierarchical load balancing method on up to 16,384 cores of Ranger (at TACC) for a synthetic benchmark. We also demonstrate the successful deployment of the method in a scientific application, NAMD, with results on the Blue Gene/P machine at ANL.
Efficient Zero-Copy Noncontiguous I/O for Globus on InfiniBand
Weikuan Yu, Yuan Tian, and J. Vetter. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.56

Noncontiguous I/O access is one of the main access patterns in parallel and distributed applications. EXIO is an I/O architecture that enables Globus, a popular run-time environment for distributed computing, on RDMA networks such as InfiniBand. In this paper, we investigate the benefits of InfiniBand zero-copy RDMA for noncontiguous I/O on Globus. Our experimental results demonstrate that, by enabling zero-copy RDMA on InfiniBand, EXIO significantly improves the performance of Globus noncontiguous I/O. Compared to packing and unpacking, zero-copy RDMA improves bandwidth by up to 2.7 times. Compared to both IPoIB and 10GigE, it increases bandwidth by more than three times. While achieving efficient noncontiguous I/O, RDMA-based noncontiguous I/O on InfiniBand also leads to a dramatic reduction in CPU utilization on Globus clients and servers.
A Performance Estimation Technique for the SegBus Distributed Architecture
M. F. Niazi, T. Seceleanu, and H. Tenhunen. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.24

We propose a performance estimation technique for a multi-core segmented bus platform, SegBus. The technique enables us to assess the performance aspects of any specific application on a particular platform configuration, modeled in the Unified Modeling Language (UML). We present methods to transform Packet Synchronous Data Flow (PSDF) and Platform Specific Model (PSM) models of the application into Extensible Markup Language (XML) schemes using a modeling tool, and show how the generated XML schemes can be utilized by the emulator program to obtain execution results. The technique allows us to estimate the performance aspects of an application mapped onto a number of different platform configurations during the early stages of the design process.
A Cooperative Intrusion Detection System Framework for Cloud Computing Networks
Chi-Chun Lo, Chun-Chieh Huang, and Joy Ku. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.46

Cloud computing provides a framework that lets end users easily access powerful services and applications through the Internet. Providing secure and reliable services in a cloud computing environment is an important issue. One of the security issues is how to reduce the impact of denial-of-service (DoS) or distributed denial-of-service (DDoS) attacks in this environment. To counter these kinds of attacks, a framework for a cooperative intrusion detection system (IDS) is proposed. The proposed system reduces the impact of such attacks. To provide this ability, the IDSs in the cloud computing regions exchange their alerts with each other. In the system, each IDS has a cooperative agent used to compute and determine whether to accept the alerts sent from other IDSs. In this way, the IDSs can prevent the same type of attack from happening elsewhere. The implementation results indicate that the proposed system can resist DoS attacks. Moreover, by comparison, the proposed cooperative IDS adds only a small amount of computation compared with a pure Snort-based IDS, while protecting the system from single-point-of-failure attacks.
A Novel RSS-Based Indoor Positioning Algorithm Using Mobility Prediction
Lyu-Han Chen, Gen-Huey Chen, Ming-Hui Jin, and E. Wu. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.80

Severe received signal strength (RSS) fluctuation is one of the crucial problems in indoor positioning systems that use fingerprint-based algorithms. Even at a fixed location, the RSS values received by a mobile device at different times can differ considerably. Using these fluctuating signals for positioning may lead to inaccurate results. To mitigate this problem, any existing fingerprint-based indoor positioning algorithm can be integrated into our positioning system to estimate the location of the mobile device. Then, a mobility prediction algorithm based on the Brownian motion model is presented to assess the plausibility of the estimated location and to correct inaccurate results. To be realistic, experiments in a real WLAN environment, with a multitude of people moving in the testing area, demonstrate the noticeably better accuracy of this approach. The solution ensures a low and stable positioning error. In addition, regions whose training records are out of date can also be identified.
A Region-Based Hierarchical Location Service with Road-Adapted Grids for Vehicular Networks
Guey-Yun Chang, Yun-Yu Chen, and J. Sheu. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.81

In VANETs, communication between two vehicles is very important, but obtaining the correct position of a vehicle is not easy. Because vehicles move fast, the topology of a VANET changes rapidly. As a result, location services are more difficult to provide in VANETs than in MANETs. In this work, we propose a hierarchical location service system that provides low-cost and rapid service. First, we divide the network into grids along the main arteries, which carry more vehicles than ordinary roads, and design a mechanism that determines when vehicles need to send update packets. This mechanism decreases the number of update packets while still obtaining correct vehicle locations. Second, we organize the grids into three levels; the higher the level, the larger the area it covers. Each level stores the update packets sent within its area. Vehicles using our system first search for the destination vehicle, in a distributed manner, within a small area; if the target is not found there, the search expands to a larger area. In addition, we propose a packet collection method whose collection area can be adjusted in size. The simulation results show that our scheme effectively decreases the number of location update packets while keeping a high success rate for the location service.
Jedule: A Tool for Visualizing Schedules of Parallel Applications
S. Hunold, Ralf Hoffmann, and F. Suter. 2010 39th International Conference on Parallel Processing Workshops. DOI: https://doi.org/10.1109/ICPPW.2010.34

Task scheduling is one of the most prominent problems in the era of parallel computing. We find scheduling algorithms in every domain of computer science, e.g., mapping multiprocessor tasks to clusters, mapping jobs to grid resources, or mapping fine-grained tasks to cores of multicore processors. Many tools exist that help understand or debug an application by presenting visual representations of a certain program run, e.g., visualizations of MPI traces. However, developers often want to get a global and abstract view of their schedules first. In this paper we introduce Jedule, a tool dedicated to visualizing schedules of parallel applications. We demonstrate the effectiveness of Jedule by showing how it helped analyze problems in several case studies.