Predictibility of inter-component latency in a software communications architecture operating environment
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470783
Gael Abgrall, F. Roy, J. Diguet, G. Gogniat, J. Delahaye
This paper presents an in-depth analysis of the behavior of an SCA component-based waveform application in terms of inter-component communication latency. The main limitation of SCA in the context of embedded systems is the additional cost introduced by the use of CORBA. Previous studies have already defined the major metrics of interest for this issue: CPU cost, memory requirements and inter-component latency. Real-time systems cannot afford high latency; consequently, this paper focuses on that metric. The starting point of this paper is to determine whether the SCA Core Framework (CF) also introduces an overhead. Measurements were performed with omniORB as the CORBA distribution and OSSIE as the SCA implementation. To carry out these measurements, an SCA waveform composed of several "empty components" was created; empty components are software components compliant with SCA but without any signal-processing part, so the study focuses only on communication between components. The same kind of inter-component link was also measured between two components using CORBA without SCA. Comparing the latency values from the two setups shows that they are approximately the same: the CORBA bus is the part that introduces the overhead to the system. The final part of this paper introduces a statistical estimation of the latency distributions, derived from measurements performed with various data packet sizes and fitted using a combination of Gaussian functions.
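The abstract mentions fitting the measured latency distributions with a combination of Gaussian functions. As a rough illustration of that kind of fit (not the authors' actual procedure, and using synthetic latency samples), a Gaussian mixture can be estimated with scikit-learn's GaussianMixture:

```python
# Illustrative only: fit a mixture of Gaussians to latency samples,
# in the spirit of the statistical estimation described above.
# The paper's actual fitting method and data are not reproduced here.
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic "inter-component" latency samples (microseconds); placeholder data.
rng = np.random.default_rng(0)
latencies = np.concatenate([
    rng.normal(120, 5, 800),    # bulk of round trips
    rng.normal(180, 15, 200),   # slower tail (e.g. scheduling jitter)
]).reshape(-1, 1)

# Fit a 2-component Gaussian mixture and report the estimated modes.
gmm = GaussianMixture(n_components=2, random_state=0).fit(latencies)
for mean, var, weight in zip(gmm.means_.ravel(), gmm.covariances_.ravel(), gmm.weights_):
    print(f"mode: mean={mean:.1f} us, std={np.sqrt(var):.1f} us, weight={weight:.2f}")
```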
{"title":"Predictibility of inter-component latency in a software communications architecture operating environment","authors":"Gael Abgrall, F. Roy, J. Diguet, G. Gogniat, J. Delahaye","doi":"10.1109/IPDPSW.2010.5470783","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470783","url":null,"abstract":"This paper presents an in-depth analysis of the behavior of a SCA component-based waveform application in terms of ¿inter-component¿ communication latency. The main limitation with SCA, in the context of embedded systems, is the additional cost introduced by the use of CORBA. Previous studies have already defined the major metrics of interest regarding this issue, these are CPU cost, memory requirements and ¿inter-component¿ latency. Real-time systems can not afford high latency, in consequence, this paper focuses on this metric. The starting point of this paper is the desire of knowing if the SCA CF does not also bring an overhead. Measurements have been realized with OmniORB as CORBA distribution and OSSIE for SCA implementation. In order to perform these measurements, a SCA waveform composed of several ¿empty-components¿ have been created. ¿Empty-components¿ are software components compliant to SCA without any signal processing part. The study only focuses on communications between components. The same kind of ¿inter-component¿ link has been measured between two components using CORBA without SCA. It is possible to compare the latency values between the two measurements and to show as a result that they are approximately the same. The CORBA bus is really the part which brings an overhead to the system. The final part of this paper introduces a statistical estimation of the latency distributions. It results from measurements performed with various data packet sizes and uses a fitting method based on a combination of Gaussian functions.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129843492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Welcome to CAC/SSPS 2010
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470843
S. Pakin, C. Stunkel, J. Flich, H. Andrade, Vibhore Kumar, D. Turaga
Efficient data motion is the cornerstone of both traditional parallel computing and more recent stream processing systems. This special combined meeting of the Communication Architecture for Clusters (CAC) and the Scalable Stream Processing Systems (SSPS) workshops showcases the latest research advances in both forms of data motion: network communication within a computer system and data streamed from a remote source and processed through a continuous processing framework supporting data analysis applications.
{"title":"Welcome to CAC/SSPS 2010","authors":"S. Pakin, C. Stunkel, J. Flich, H. Andrade, Vibhore Kumar, D. Turaga","doi":"10.1109/IPDPSW.2010.5470843","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470843","url":null,"abstract":"Efficient data motion is the cornerstone of both traditional parallel computing and more recent stream processing systems. This special combined meeting of the Communication Architecture for Clusters (CAC) and the Scalable Stream Processing Systems (SSPS) workshops showcases the latest research advances in both forms of data motion: network communication within a computer system and data streamed from a remote source and processed through a continuous processing framework supporting data analysis applications.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127132000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dense linear algebra solvers for multicore with GPU accelerators
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470941
S. Tomov, Rajib Nath, H. Ltaief, J. Dongarra
Solving dense linear systems of equations is a fundamental problem in scientific computing. Numerical simulations involving complex systems represented in terms of unknown variables and relations between them often lead to linear systems of equations that must be solved as fast as possible. We describe current efforts toward the development of these critical solvers in the area of dense linear algebra (DLA) for multicore with GPU accelerators. We describe how to code/develop solvers to effectively use the high computing power available in these new and emerging hybrid architectures. The approach taken is based on hybridization techniques in the context of Cholesky, LU, and QR factorizations. We use a high-level parallel programming model and leverage existing software infrastructure, e.g. optimized BLAS for CPU and GPU, and LAPACK for sequential CPU processing. Included also are architecture and algorithm-specific optimizations for standard solvers as well as mixed-precision iterative refinement solvers. The new algorithms, depending on the hardware configuration and routine parameters, can lead to orders of magnitude acceleration when compared to the same algorithms on standard multicore architectures that do not contain GPU accelerators. The newly developed DLA solvers are integrated and freely available through the MAGMA library.
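The hybridization described here typically splits each factorization step into a small panel factorization kept on the CPU and a large trailing-matrix update offloaded to the GPU. The minimal NumPy sketch below shows that panel/update split for a blocked right-looking Cholesky factorization; it only illustrates the structure, not the MAGMA implementation, and the GPU offload itself is not shown:

```python
# Minimal sketch of a blocked right-looking Cholesky factorization.
# In a hybrid CPU/GPU solver the small diagonal-block factorization would
# stay on the CPU while the large trailing-matrix update (the syrk/gemm-like
# step below) would be offloaded to the GPU; this sketch keeps everything
# in NumPy and is not the MAGMA code.
import numpy as np

def blocked_cholesky(A, nb=64):
    A = A.copy()
    n = A.shape[0]
    for k in range(0, n, nb):
        end = min(k + nb, n)
        # Panel: factor the diagonal block (CPU-sized work).
        A[k:end, k:end] = np.linalg.cholesky(A[k:end, k:end])
        if end < n:
            L_kk = A[k:end, k:end]
            # Solve the sub-diagonal panel against the diagonal block.
            A[end:, k:end] = np.linalg.solve(L_kk, A[end:, k:end].T).T
            # Trailing-matrix update (the bulk of the flops; GPU-friendly).
            A[end:, end:] -= A[end:, k:end] @ A[end:, k:end].T
    return np.tril(A)

# Quick check against NumPy's reference Cholesky.
M = np.random.rand(300, 300)
A = M @ M.T + 300 * np.eye(300)   # symmetric positive definite test matrix
assert np.allclose(blocked_cholesky(A), np.linalg.cholesky(A))
```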
{"title":"Dense linear algebra solvers for multicore with GPU accelerators","authors":"S. Tomov, Rajib Nath, H. Ltaief, J. Dongarra","doi":"10.1109/IPDPSW.2010.5470941","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470941","url":null,"abstract":"Solving dense linear systems of equations is a fundamental problem in scientific computing. Numerical simulations involving complex systems represented in terms of unknown variables and relations between them often lead to linear systems of equations that must be solved as fast as possible. We describe current efforts toward the development of these critical solvers in the area of dense linear algebra (DLA) for multicore with GPU accelerators. We describe how to code/develop solvers to effectively use the high computing power available in these new and emerging hybrid architectures. The approach taken is based on hybridization techniques in the context of Cholesky, LU, and QR factorizations. We use a high-level parallel programming model and leverage existing software infrastructure, e.g. optimized BLAS for CPU and GPU, and LAPACK for sequential CPU processing. Included also are architecture and algorithm-specific optimizations for standard solvers as well as mixed-precision iterative refinement solvers. The new algorithms, depending on the hardware configuration and routine parameters, can lead to orders of magnitude acceleration when compared to the same algorithms on standard multicore architectures that do not contain GPU accelerators. The newly developed DLA solvers are integrated and freely available through the MAGMA library.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127221826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling memory resources distribution on multicore processors using games on cellular automata lattices
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470700
Michail-Antisthenis I. Tsompanas, G. Sirakoulis, I. Karafyllidis
Nowadays, there is an increasingly recognized need for more computing power, which has led to multicore processors. However, this evolution is still restrained by the poor efficiency of memory chips. As a possible solution to the problem, this paper examines a model for redistributing the memory resources assigned to the processor, especially the on-chip memory, in order to achieve higher performance. The proposed model uses basic concepts of game theory applied to cellular automata lattices and the iterated spatial prisoner's dilemma game. A simulation was set up to evaluate the performance of this model under different circumstances. Moreover, a corresponding FPGA logic circuit was designed as part of an embedded, real-time co-circuit aimed at fair distribution of memory resources. The proposed FPGA implementation proved advantageous in terms of low cost, high speed, compactness and portability. Finally, a significant improvement in the performance of the memory resources was ascertained from the simulation results.
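For illustration, an iterated spatial prisoner's dilemma on a cellular automata lattice can be sketched as below; the payoff values and the "imitate the best neighbour" update rule are common textbook choices, not necessarily the exact model used in the paper:

```python
# Illustrative sketch of an iterated spatial prisoner's dilemma on a CA lattice.
# Payoffs and the "copy the best-scoring neighbour" rule are generic choices.
import numpy as np

T, R, P, S = 1.9, 1.0, 0.0, 0.0   # temptation, reward, punishment, sucker payoffs
rng = np.random.default_rng(1)
grid = rng.integers(0, 2, (50, 50))   # 1 = cooperate, 0 = defect

def step(grid):
    payoff = np.zeros_like(grid, dtype=float)
    shifts = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # von Neumann neighbourhood
    for dy, dx in shifts:
        nb = np.roll(np.roll(grid, dy, 0), dx, 1)
        # Accumulate pairwise payoffs against each neighbour.
        payoff += np.where(grid == 1, np.where(nb == 1, R, S),
                                      np.where(nb == 1, T, P))
    new, best = grid.copy(), payoff.copy()
    for dy, dx in shifts:
        nb_strat = np.roll(np.roll(grid, dy, 0), dx, 1)
        nb_pay = np.roll(np.roll(payoff, dy, 0), dx, 1)
        better = nb_pay > best          # imitate the best-scoring neighbour
        new[better] = nb_strat[better]
        best[better] = nb_pay[better]
    return new

for _ in range(100):
    grid = step(grid)
print("cooperator fraction:", grid.mean())
```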
{"title":"Modeling memory resources distribution on multicore processors using games on cellular automata lattices","authors":"Michail-Antisthenis I. Tsompanas, G. Sirakoulis, I. Karafyllidis","doi":"10.1109/IPDPSW.2010.5470700","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470700","url":null,"abstract":"Nowadays, there is an increasingly recognized need for more computing power, which has led to multicore processors. However, this evolution is still restrained by the poor efficiency of memory chips. As a possible solution to the problem, this paper examines a model of re-distributing the memory resources assigned to the processor, especially the on-chip memory, in order to achieve higher performance. The proposed model uses the basic concepts of game theory applied to cellular automata lattices and the iterated spatial prisoner's dilemma game. A simulation was established in order to evaluate the performance of this model under different circumstances. Moreover, a corresponding FPGA logic circuit was designed as a part of an embedded, real-time co-circuit, aiming at memory resources fair distribution. The proposed FPGA implementation proved advantageous in terms of low-cost, high-speed, compactness and portability features. Finally, a significant improvement on the performance of the memory resources was ascertained from simulation results.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127400010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Characterizing energy efficiency of I/O intensive parallel applications on power-aware clusters
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470904
Rong Ge, Xizhou Feng, Sindhu Subramanya, Xian-He Sun
Energy efficiency and parallel I/O performance have become two critical measures in high performance computing (HPC). However, there is little empirical data characterizing the energy-performance behavior of parallel I/O workloads. In this paper, we present a methodology to profile the performance, energy, and energy efficiency of parallel I/O access patterns, and we report our findings on the factors affecting parallel I/O energy efficiency. Our study shows that choosing the right buffer size can change the energy-performance efficiency by up to 30 times. High spatial and temporal spacing can also lead to a significant improvement in energy-performance efficiency (about 2X). We observe that CPU frequency has a more complex impact, depending on the I/O operations, their spatial and temporal characteristics, and the memory buffer size. The presented methodology and findings are useful for evaluating the energy efficiency of I/O-intensive applications and provide a guideline for developing energy-efficient parallel I/O technology.
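As a rough sketch of the kind of measurement such a methodology implies, the snippet below times a buffered sequential read and converts it into an energy-performance figure (bytes per joule) given an externally measured average power draw; the metric, file path, buffer sizes and power value are illustrative assumptions, not the paper's instrumentation:

```python
# Hedged sketch: compare energy-performance efficiency across I/O buffer sizes.
# "Efficiency" here is taken as bytes moved per joule; the exact metric,
# power-measurement hook, and numbers in the paper may differ.
import time

def read_file(path, buffer_size):
    """Read a file sequentially with a given buffer size; return bytes and seconds."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(buffer_size):
            total += len(chunk)
    return total, time.perf_counter() - start

def efficiency(nbytes, seconds, avg_power_watts):
    """Bytes per joule, assuming an externally measured average power draw."""
    return nbytes / (avg_power_watts * seconds)

# Example sweep (hypothetical file path and power-meter reading):
# for buf in (4 << 10, 64 << 10, 1 << 20, 16 << 20):
#     nbytes, secs = read_file("/tmp/testfile", buf)
#     print(buf, efficiency(nbytes, secs, avg_power_watts=95.0))
```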
{"title":"Characterizing energy efficiency of I/O intensive parallel applications on power-aware clusters","authors":"Rong Ge, Xizhou Feng, Sindhu Subramanya, Xian-He Sun","doi":"10.1109/IPDPSW.2010.5470904","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470904","url":null,"abstract":"Energy efficiency and parallel I/O performance have become two critical measures in high performance computing (HPC). However, there is little empirical data that characterize the energy-performance behaviors of parallel I/O workload. In this paper, we present a methodology to profile the performance, energy, and energy efficiency of parallel I/O access patterns and report our findings on the impacting factors of parallel I/O energy efficiency. Our study shows that choosing the right buffer size can change the energy-performance efficiency by up to 30 times. High spatial and temporal spacing can also lead to significant improvement in energy-performance efficiency (about 2X). We observe CPU frequency has a more complex impact, depending on the IO operations, spatial and temporal, and memory buffer size. The presented methodology and findings are useful for evaluating the energy efficiency of I/O intensive applications and for providing a guideline to develop energy efficient parallel I/O technology.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129149623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A supplying partner strategy for mobile networks-based 3D streaming - proof of concept
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470795
H. Maamar, R. Pazzi, A. Boukerche, E. Petriu
With the advances in wireless communication and mobile computing, there is growing interest among researchers in augmented reality and streaming 3D graphics on mobile devices for training first responders to be better prepared for disaster scenarios. However, several challenges need to be resolved before this technology becomes a commodity. One of the major difficulties in 3D streaming to thin mobile devices is the supplying partner strategy: it is not easy to discover a peer that has the correct information and possesses enough bandwidth to send the required data quickly and efficiently to the peers in need. In this paper, we propose a new supplying partner strategy for mobile networks-based 3D streaming. The primary goals of the work presented in this paper are, first, to address the low storage capabilities of thin mobile devices and, second, to avoid the flooding problem that most wireless mobile networks suffer from. Our proposed protocol is based on the quick discovery of multiple supplying partners, optimizing the time required by peers to acquire data, avoiding unnecessary message propagation and network congestion, and decreasing latency and network bandwidth over-utilization.
{"title":"A supplying partner strategy for mobile networks-based 3D streaming - proof of concept","authors":"H. Maamar, R. Pazzi, A. Boukerche, E. Petriu","doi":"10.1109/IPDPSW.2010.5470795","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470795","url":null,"abstract":"With the advances of wireless communication and mobile computing, there is a growing interest among researchers about augmented reality and streaming 3D graphics on mobile devices for training first responders to be better prepared in a case of disaster scenarios. However, several challenges need to be resolved before this technology become a commodity. One of the major difficulties in 3D streaming over thin mobile devices is related to the supplying partner strategy as it is not easy to discover the peer that has the correct information and that posses enough bandwidth to send the required data quickly and efficiently to the peers in need. In this paper, we propose a new supplying partner strategy for mobile networks-based 3D streaming. The primary goal of the work presented in this paper is first to address the thin mobile devices low storage capabilities; and second to avoid the flooding problem that most wireless mobile networks suffer from. Our proposed protocol is based on the quick discovery of multiple supplying partners, by optimizing the time required by peers to acquire data, avoiding unnecessary messages propagation and network congestion, and decreasing the latency and the network bandwidth over utilization.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130601019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
pALS: An object-oriented framework for developing parallel cooperative metaheuristics
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470697
Andres Bernal, H. Castro
pALS, an acronym for parallel Adaptive Learning Search, is an object-oriented computational framework for the development of parallel and cooperative metaheuristics for solving complex optimization problems. The library exploits parallelization, mainly allowing the deployment of two models: the parallel execution of operators and the execution of separate instances (multi-start models). pALS also allows cooperation strategies to be included in the design of a problem's solution, such as the islands model for genetic algorithms or the parallel exploration of neighborhoods in metaheuristics derived from local searches, together with a broad set of topologies associated with these models. pALS has been successfully used on different optimization problems and has proven to be a flexible, extensible and powerful library for promptly developing prototypes, offering a collection of ready-to-use operators that form the nucleus of many metaheuristics, including hybrid metaheuristics.
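The islands model mentioned above can be sketched generically: several sub-populations evolve independently and periodically migrate their best individuals. The toy genetic algorithm below illustrates that cooperation strategy only; it does not reflect the pALS API:

```python
# Generic sketch of the islands model for genetic algorithms: independent
# sub-populations evolve in parallel and periodically exchange their best
# individuals over a ring topology. Not the pALS API.
import random

def fitness(x):                       # toy objective: maximise -(x - 3)^2
    return -(x - 3.0) ** 2

def evolve(pop, generations=20):
    for _ in range(generations):
        pop = sorted(pop, key=fitness, reverse=True)[: len(pop) // 2]  # selection
        pop += [p + random.gauss(0, 0.1) for p in pop]                 # mutation
    return pop

islands = [[random.uniform(-10, 10) for _ in range(20)] for _ in range(4)]
for epoch in range(5):
    islands = [evolve(pop) for pop in islands]                 # independent evolution
    # Ring migration: each island receives the previous island's best individual.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        pop[pop.index(min(pop, key=fitness))] = bests[i - 1]   # replace the worst
best = max((max(pop, key=fitness) for pop in islands), key=fitness)
print("best solution:", best)
```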
{"title":"pALS: An object-oriented framework for developing parallel cooperative metaheuristics","authors":"Andres Bernal, H. Castro","doi":"10.1109/IPDPSW.2010.5470697","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470697","url":null,"abstract":"pALS acronym for parallel Adaptive Learning Search is a computational object oriented framework for the development of parallel and cooperative metaheuristics for solving complex optimization problems. The library exploits the paralellization allowing the deployment of mainly two models: the parallel execution of operators and the execution of separate instances or multi-start models. pALS also allows to include in the design of the problem's solution cooperation strategies such as the islands model for genetic algorithms or the parallel exploration of neighborhoods in metaheuristics derived from local searches, including a broad set of topologies associated with these models. pALS has been successfully used in different optimization problems and has proven to be a flexible, extensible and commanding library to promptly develop prototypes offering a collection of ready to use operators that encompass the nucleus of many metaheuristics including hybrid metaheuristics.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132982872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new probabilistic Linear Exponential Backoff scheme for MANETs
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470789
M. B. Yassein, S. Manaseer, Asmahan Abu Al-hassan, Zeinab Abu Taye, A. Al-Dubai
Broadcasting is an essential operation in Mobile Ad hoc Network (MANET) environments. It is used in the initial phase of the route discovery process in many reactive protocols. Although broadcasting is simple, it causes the well-known broadcast storm problem, which is a result of packet redundancy, contention and collision. A probabilistic scheme has been proposed to overcome this problem. This work studies the effect of network density and network mobility on probabilistic schemes using different thresholds (fixed, 2p, 3p and 4p) with the Pessimistic Linear Exponential Backoff (PLEB) algorithm, and compares the results with the standard MAC. A number of simulation experiments have been conducted to examine the performance of the proposed PLEB under different operating conditions. The simulation results show that in dense networks the normalized routing load, delay and routing packets are high, and that PLEB outperforms the standard MAC in terms of delay.
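A hedged sketch of a linear-exponential backoff window is shown below, assuming exponential growth of the contention window for the first few retransmission attempts and linear growth afterwards; the exact schedule, thresholds and constants of PLEB are defined in the paper and may differ:

```python
# Hedged sketch of a linear-exponential backoff window: exponential growth for
# the first few retransmission attempts, linear growth afterwards. The exact
# PLEB schedule, switch point and constants are assumptions here.
import random

CW_MIN = 32          # initial contention window (slots); illustrative value
SWITCH_ATTEMPT = 4   # attempt index at which growth switches to linear
LINEAR_STEP = 64     # linear increment per attempt after the switch

def contention_window(attempt):
    if attempt <= SWITCH_ATTEMPT:
        return CW_MIN * (2 ** attempt)                          # exponential phase
    exp_part = CW_MIN * (2 ** SWITCH_ATTEMPT)
    return exp_part + LINEAR_STEP * (attempt - SWITCH_ATTEMPT)  # linear phase

def backoff_slots(attempt):
    """Uniformly chosen backoff, as in CSMA/CA, from the current window."""
    return random.randint(0, contention_window(attempt) - 1)

for attempt in range(8):
    print(attempt, contention_window(attempt))
```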
{"title":"A new probabilistic Linear Exponential Backoff scheme for MANETs","authors":"M. B. Yassein, S. Manaseer, Asmahan Abu Al-hassan, Zeinab Abu Taye, A. Al-Dubai","doi":"10.1109/IPDPSW.2010.5470789","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470789","url":null,"abstract":"Broadcasting is an essential operation in Mobile ad hoc Networks (MANETs) environments. It is used in the initial phase of route discovery process in many reactive protocols. Although broadcasting is simple, it causes the well known broadcast storm problem, which is a result of packet redundancy, contention and collision. A probabilistic scheme has been proposed to overcome this problem. This work aims to study the effect of network density and network mobility on probabilistic schemes using different thresholds (fixed, 2p, 3p and 4p) with the Pessimistic Linear Exponential Backoff (PLEB) algorithm and compare the results with the standard MAC. A number of simulation experiments have been conducted to examine the performance of the proposed PLEP under different operating conditions. The simulation results show that in dense networks the normalized routing load, delay and routing packets are high and the PLEB outperforms the standard MAC in terms of delay.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132999734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating operating conditions in a Peer-to-Peer Session Initiation Protocol overlay network
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470935
Jouni Mäenpää, G. Camarillo
Distributed Hash Table (DHT) based peer-to-peer overlays are decentralized, scalable, and fault tolerant. However, due to their decentralized nature, it is very hard to know the state and prevailing operating conditions of a running overlay. If the system could determine its operating conditions, it would be easier to monitor it and re-configure it in response to changing conditions. Many DHT-based systems, such as the Peer-to-Peer Session Initiation Protocol (P2PSIP), would benefit from the ability to accurately estimate the prevailing operating conditions of the overlay. In this paper, we evaluate mechanisms that can be used to do this, focusing on network size, join rate, and leave rate. We start from existing mechanisms and show that their accuracy is not sufficient. Next, we show how the mechanisms can be improved to achieve a higher level of accuracy. The improvements we study include various mechanisms for improving the accuracy of leave rate estimation, the use of a secondary network size estimate, the sharing of estimates between peers, and statistical mechanisms to process shared estimates.
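One common, textbook way to estimate overlay size in a Chord-like DHT is from the density of identifiers among a peer's known successors; the sketch below illustrates that idea only and is not one of the specific mechanisms evaluated in the paper (the identifier-space size and example values are assumptions):

```python
# Hedged sketch of a density-based network size estimate in a Chord-like DHT:
# if a peer knows its d nearest successors in an identifier space of size 2^m,
# the average gap between them suggests roughly (2^m) / gap peers overall.
ID_SPACE = 2 ** 32   # illustrative identifier-space size

def estimate_network_size(own_id, successor_ids):
    """Estimate total peers from the spacing of known successor identifiers."""
    ids = sorted((s - own_id) % ID_SPACE for s in successor_ids)
    span = ids[-1]                       # distance covered by the known successors
    avg_gap = span / len(successor_ids)  # average inter-peer distance
    return ID_SPACE / avg_gap

# Example: 8 successors spread over ~0.8% of the ring suggests roughly 1000 peers.
print(estimate_network_size(0, [4_295_000 * i for i in range(1, 9)]))
```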
{"title":"Estimating operating conditions in a Peer-to-Peer Session Initiation Protocol overlay network","authors":"Jouni Mäenpää, G. Camarillo","doi":"10.1109/IPDPSW.2010.5470935","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470935","url":null,"abstract":"Distributed Hash Table (DHT) based peer-to-peer overlays are decentralized, scalable, and fault tolerant. However, due to their decentralized nature, it is very hard to know the state and prevailing operating conditions of a running overlay. If the system could figure out the operating conditions, it would be easier to monitor the system and re-configure it in response to changing conditions. Many DHT-based system such as the Peer-to-Peer Session Initiation Protocol (P2PSIP) would benefit from the ability to accurately estimate the prevailing operating conditions of the overlay. In this paper, we evaluate mechanisms that can be used to do this. We focus on network size, join rate, and leave rate. We start from existing mechanisms and show that their accuracy is not sufficient. Next, we show how the mechanisms can be improved to achieve a higher level of accuracy. The improvements we study include various mechanisms improving the accuracy of leave rate estimation, use of a secondary network size estimate, sharing of estimates between peers, and statistical mechanisms to process shared estimates.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132135533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast Smith-Waterman hardware implementation
Pub Date: 2010-04-19 | DOI: 10.1109/IPDPSW.2010.5470748
Z. Nawaz, K. Bertels, H. Sumbul
The Smith-Waterman (SW) algorithm is one of the most widely used algorithms for sequence alignment in computational biology. With the growing size of sequence databases, there is always a need for ever faster implementations of SW. In this paper, we implement two techniques based on Recursive Variable Expansion (RVE), which are shown to give better speedup than the best dataflow approach at the cost of extra area. Compared to the dataflow approach, our hardware implementation is 2.29 times faster at the expense of 2.82 times more area.
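For reference, the basic Smith-Waterman local-alignment recurrence with a linear gap penalty is sketched below in software; the paper's contribution is the RVE-based hardware design, which is not reproduced here:

```python
# Reference software sketch of the Smith-Waterman local-alignment recurrence
# (linear gap penalty). This is only the baseline dynamic-programming algorithm
# that the RVE-based hardware design accelerates.
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            score = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                        # local-alignment floor
                          H[i - 1][j - 1] + score,  # match/mismatch
                          H[i - 1][j] + gap,        # gap in b
                          H[i][j - 1] + gap)        # gap in a
            best = max(best, H[i][j])
    return best   # best local-alignment score

print(smith_waterman("ACACACTA", "AGCACACA"))   # prints the best score for this pair
```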
{"title":"Fast Smith-Waterman hardware implementation","authors":"Z. Nawaz, K. Bertels, H. Sumbul","doi":"10.1109/IPDPSW.2010.5470748","DOIUrl":"https://doi.org/10.1109/IPDPSW.2010.5470748","url":null,"abstract":"The Smith-Waterman (SW) algorithm is one of the widely used algorithms for sequence alignment in computational biology. With the growing size of the sequence database, there is always a need for even faster implementation of SW. In this paper, we have implemented two Recursive Variable Expansion (RVE) based techniques, which are proved to give better speedup than any best dataflow approach at the cost of extra area. Compared to dataflow approach, our HW implementation is 2.29 times faster at the expense of 2.82 times more area.","PeriodicalId":329280,"journal":{"name":"2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125489368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}