
2010 First International Conference on Networking and Computing: Latest Publications

Multiprocessor Architectures Specialized for Multi-agent Simulation
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.34
Christian Schäck, W. Heenes, R. Hoffmann
Two new multiprocessor architectures are presented that accelerate the simulation of multi-agent worlds based on the massively parallel GCA (Global Cellular Automata) model. The GCA model is suited to describing and simulating different multi-agent worlds. The designed and implemented architectures mainly consist of a set of processors (NIOS II) and a network. The multiprocessor systems can be programmed flexibly, so that different behaviors can be simulated on the same architecture. Two architectures with up to 16 processors were implemented on an FPGA. The first architecture uses hardware hash functions in order to reduce the overall simulation time, but lacks scalability. The second architecture uses an agent memory and a cell field memory, which improves scalability and further increases performance.
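The GCA model's defining feature is that each cell can read an arbitrary, dynamically computed global neighbor rather than a fixed local neighborhood, and all cells update synchronously from the previous generation. The minimal sketch below illustrates that update discipline in Python; the cell rule and data layout are invented for illustration and are not the paper's NIOS II implementation.

```python
# Minimal sketch of one synchronous Global Cellular Automata (GCA) step.
# Each cell holds (data, link); the link is a dynamically computed global
# address, which is what distinguishes GCA from a classical CA.
# The agent rule below is purely illustrative, not the paper's rule set.

def gca_step(cells, rule):
    """Read phase uses only the old generation, so all cells update in parallel."""
    new_cells = []
    for i, (data, link) in enumerate(cells):
        neighbor = cells[link]              # global access via the cell's own link
        new_cells.append(rule(i, (data, link), neighbor))
    return new_cells

def example_rule(i, cell, neighbor):
    data, link = cell
    n_data, _ = neighbor
    # Toy behavior: accumulate the neighbor's value and advance the link.
    return (data + n_data, (link + 1) % 4)

cells = [(1, 1), (2, 2), (3, 3), (4, 0)]
for _ in range(3):
    cells = gca_step(cells, example_rule)
print(cells)
```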
Citations: 4
An Adaptive Timeout Strategy for Profiling UDP Flows
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.15
Jing Cai, Zhibin Zhang, P. Zhang, Xinbo Song
With the increase of network bandwidth, more and more new applications such as audio, video and online games have become the main body of network traffic. For real-time reasons, these new applications mostly use UDP as the transport-layer protocol, which directly increases UDP traffic. However, traditional studies assume that TCP dominates Internet traffic, so previous traffic measurements were generally based on TCP while UDP was ignored. In view of this, this paper mainly discusses an adaptive timeout strategy for UDP flows. Firstly, because of the significant differences in flow characteristics between TCP flows and UDP flows, we show that the existing adaptive timeout strategies are not appropriate for UDP flows. Secondly, we present our adaptive strategy using Support Vector Machine techniques. We build six classifiers to accurately predict each flow's maximum packet inter-arrival time and adapt its timeout value over the flow duration. Because the prediction accuracy is limited, we further introduce an adjusted accuracy rating that gives probabilistic guarantees (90%, 95%, 98%) against long flows being cut into short flows. The experimental results reveal that our adaptive strategy has the potential to achieve significant performance advantages over widely used fixed timeouts and other adaptive timeout schemes.
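The core of the strategy is to give each UDP flow its own expiry timeout derived from a classifier's prediction of the flow's maximum packet inter-arrival time. The sketch below uses scikit-learn's SVC as a stand-in; the features, the three inter-arrival classes, and the class-to-timeout mapping are illustrative assumptions rather than the paper's exact design (which uses six classifiers).

```python
# Sketch of an adaptive flow-timeout scheme: a classifier predicts a class of
# maximum packet inter-arrival time from early-flow features, and that class
# sets the flow's timeout. Features, classes and timeouts are assumptions.
import numpy as np
from sklearn.svm import SVC

# Toy training data: [mean inter-arrival (s), packet size (bytes), dst port]
X_train = np.array([
    [0.01, 1200, 5004],   # streaming-like
    [0.02, 1100, 5004],
    [0.50,  100,   53],   # DNS-like
    [0.40,  120,   53],
    [2.00,  300, 27015],  # game-like, bursty
    [1.80,  280, 27015],
])
y_train = np.array([0, 0, 1, 1, 2, 2])       # class = inter-arrival regime

CLASS_TIMEOUT = {0: 1.0, 1: 5.0, 2: 30.0}    # one timeout (s) per predicted class

clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

def timeout_for_flow(mean_iat, pkt_size, dst_port):
    cls = int(clf.predict([[mean_iat, pkt_size, dst_port]])[0])
    return CLASS_TIMEOUT[cls]

print(timeout_for_flow(0.015, 1150, 5004))   # short timeout for a dense flow
print(timeout_for_flow(1.90, 290, 27015))    # long timeout avoids splitting the flow
```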
Citations: 4
An Evaluation on Sensor Network Technologies for AMI Associated Mudslide Warning System
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.10
Cheng-Jen Tang, Miau-Ru Dai
In order to detect occurrences of mudslides, a mudslide warning system has to address three major issues: sensor sensitivity, coverage area, and deployment cost. With the emergence of AMI, which is considered a fundamental step towards the Smart Grid, a mudslide warning system can utilize the AMI communication network to provide broad coverage at a relatively low deployment cost. However, realizing the sensor networks for such an AMI-associated mudslide warning system must satisfy many constraints. In addition to the design factors identified by previous studies, such as fault tolerance, scalability, cost, hardware, topology change, environment, and power consumption, AMI brings limitations that come along with the electricity grid infrastructure. This paper studies the state of the art of current communication technologies in sensor networks and identifies which of them meet the requirements of an AMI-associated mudslide warning system.
Citations: 6
Implementation of SIVARM: A Simple VMM for the ARM Architecture
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.23
A. Suzuki, S. Oikawa
By using Virtual Machine Monitors (VMMs), we can overcome many issues in embedded systems. The performance gains of recent hardware enable the use of a VMM even in small embedded systems. We implemented a VMM for the ARM architecture, the most widely used CPU architecture for embedded systems. We name it SIVARM: a simple VMM for the ARM architecture. Since the VMM executes in privileged mode and its guest OS executes in non-privileged mode, the VMM can catch the execution of sensitive instructions as exceptions and emulate them appropriately. The guest OS can execute in non-privileged mode thanks to the virtual banked registers and the virtual processor mode provided by the VMM. Domains are used to control access between the guest OS and the VMM. The VMM was implemented for the ARM926EJ-S processor and can successfully boot Linux on it.
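The enabling mechanism is classic trap-and-emulate: the deprivileged guest faults on sensitive instructions, and the VMM emulates them against virtual CPU state (virtual processor mode, virtual banked registers). The Python sketch below models only that control loop over an invented two-instruction "ISA"; it is a conceptual illustration, not SIVARM's code.

```python
# Conceptual trap-and-emulate loop: the "guest" runs until it hits a sensitive
# instruction, which traps to the VMM; the VMM emulates it against virtual
# CPU state and resumes the guest. The instruction set is invented purely to
# show the control flow.

class VirtualCPU:
    def __init__(self):
        self.mode = "usr"                    # virtual processor mode kept by the VMM
        self.banked = {"usr": 0, "svc": 0}   # virtual banked register (e.g. SP)

class PrivilegeTrap(Exception):
    def __init__(self, insn):
        self.insn = insn

def guest_execute(insn):
    """Guest runs deprivileged: sensitive instructions raise a trap."""
    if insn[0] in ("set_mode", "write_banked"):
        raise PrivilegeTrap(insn)            # hardware exception in the real system
    print("guest ran", insn)

def vmm_emulate(vcpu, insn):
    op, arg = insn
    if op == "set_mode":
        vcpu.mode = arg                      # mode change touches only virtual state
    elif op == "write_banked":
        vcpu.banked[vcpu.mode] = arg
    print("VMM emulated", insn, "->", vcpu.mode, vcpu.banked)

vcpu = VirtualCPU()
for insn in [("add", 1), ("set_mode", "svc"), ("write_banked", 0x8000), ("add", 2)]:
    try:
        guest_execute(insn)
    except PrivilegeTrap as trap:
        vmm_emulate(vcpu, trap.insn)
```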
Citations: 1
The Design of On-the-Fly Virtual Channel Allocation for Low Cost High Performance On-Chip Routers
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.25
S. Nguyen, S. Oyanagi
Network-on-Chip (NoC) is an important communication infrastructure for Systems-on-Chip (SoCs). Designing high-performance NoCs with minimal area overhead is becoming a major technical challenge. In this paper, we propose on-the-fly virtual channel (VC) allocation for low-cost, high-performance on-chip routers. By performing VC allocation based on the result of switch allocation, the dependency between VC allocation and switch traversal is removed, and these stages can be performed in parallel. In this manner, the pipeline of a packet transfer can be shortened in a non-speculative fashion. We have implemented the proposed router on an FPGA and evaluated it in terms of communication latency, throughput, and hardware cost. The experimental results show that the proposed router with on-the-fly VC allocation reduces communication latency by 27.3% and improves throughput by 21.4% compared to the conventional VC router. In comparison with the look-ahead speculative router, it improves throughput by 6.2% with a 17.6% reduction in area for control logic.
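The latency benefit comes purely from pipeline depth: once VC allocation is derived from the switch-allocation result, it can overlap with switch traversal instead of preceding it. The sketch below just does the zero-load latency arithmetic for an assumed 4-stage baseline (RC, VA, SA, ST); the stage names and serialization term are textbook assumptions, not figures from the paper.

```python
# Back-of-envelope per-hop pipeline comparison.
# Conventional VC router (assumed 4 stages): RC -> VA -> SA -> ST.
# On-the-fly VC allocation: VA is derived from the SA result, so VA overlaps
# with ST:                   RC -> SA -> (VA || ST), i.e. 3 cycles per hop.

CONVENTIONAL = ["RC", "VA", "SA", "ST"]
ON_THE_FLY   = ["RC", "SA", "VA||ST"]

def zero_load_latency(hops, stages_per_hop, serialization_flits=4):
    # head latency through the routers plus body-flit serialization at the tail
    return hops * stages_per_hop + serialization_flits

for hops in (4, 8):
    conv = zero_load_latency(hops, len(CONVENTIONAL))
    otf = zero_load_latency(hops, len(ON_THE_FLY))
    print(f"{hops} hops: conventional={conv} cycles, on-the-fly={otf} cycles")
```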
Citations: 8
Design and Implementation of a Uniform Platform to Support Multigenerational GPU Architectures for High Performance Stream-Based Computing
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.35
S. Yamagiwa, Masahiro Arai, K. Wada
GPU-based computing has become one of the popular high performance computing fields; the field is called GPGPU. This paper focuses on the design and implementation of a uniform GPGPU application that is optimized for both legacy and recent GPU architectures. As a typical example of such a GPGPU application, this paper discusses the uniform implementation of the Caravela platform. In particular, the flow-model execution mechanism is considered with reference to recent GPU architectures. To verify the design and implementation, this paper evaluates compatibility across the architectures and also measures performance.
Citations: 1
Loop Performance Improvement for Min-cut Program Decomposition Method
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.47
K. Ootsu, Takeshi Abe, T. Yokota, T. Baba
In recent years, speedup through thread-level parallel processing has become increasingly important with the spread of multi-core processors, and various techniques have been developed for parallelizing single-threaded code into efficient multithreaded code. Speculative multithreading is an important technology for achieving high performance through thread-level parallel processing, and to improve execution performance it is necessary to decompose the program code appropriately. Against this background, T. A. Johnson et al. proposed a program decomposition technique (hereafter, the min-cut method) that finds the decomposition pattern minimizing the effects of performance degradation factors by finding the minimum cut set in the weighted control flow graph (CFG) of the program. The min-cut method has wide coverage and is very promising, since the whole program can be decomposed without being restricted to logical structures such as loops. However, it cannot fully exploit loop-level parallelism. In this paper, we propose an improvement to the min-cut method that enhances loop execution performance: our method applies loop unrolling to the loops in the target program code during the decomposition process. We apply our method to practical program codes selected from the SPEC CINT2000 benchmarks. The results show that loops that the original min-cut method does not decompose are now decomposed, increasing the opportunities to exploit loop-level parallelism. In addition, performance evaluation using a cycle-based simulator shows that our method improves loop execution performance compared to the original min-cut method.
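A simplified reading of the approach: edge weights in the CFG model the cost of placing a thread boundary on that edge, a minimum cut picks the cheapest boundary, and unrolling a loop exposes iteration boundaries (which carry only loop-carried dependences) as additional, often cheaper, cut candidates. The sketch below illustrates this on a toy CFG with networkx; the graph shape and weights are invented for illustration.

```python
# Toy min-cut program decomposition on a weighted CFG (networkx max-flow/min-cut).
# Edge capacities model the cost of placing a thread boundary on that edge
# (data that would have to cross between the resulting threads).
import networkx as nx

def cfg(unrolled=False):
    g = nx.DiGraph()
    if not unrolled:
        # while-loop: the back edge keeps the whole loop on one side of any cheap cut
        g.add_edge("entry", "head", capacity=3)
        g.add_edge("head", "body", capacity=8)   # intra-iteration data: expensive to cut
        g.add_edge("body", "head", capacity=1)   # loop-carried dependence (back edge)
        g.add_edge("head", "exit", capacity=3)
    else:
        # 2x unrolled: only the loop-carried dependence crosses the iteration boundary
        g.add_edge("entry", "iter1", capacity=3)
        g.add_edge("iter1", "iter2", capacity=1)
        g.add_edge("iter2", "exit", capacity=3)
    return g

for unrolled in (False, True):
    cost, (a, b) = nx.minimum_cut(cfg(unrolled), "entry", "exit")
    print("unrolled" if unrolled else "original",
          "cut cost:", cost, "threads:", sorted(a), "|", sorted(b))
```

On this toy graph the original CFG's cheapest boundary sits before the loop (cost 3), while the unrolled version exposes the iteration boundary at cost 1, which is the effect the improved method aims for.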
Citations: 2
Design an Implementation of Bee Hive in a Mult-agent Based Resource Discovery Method in P2P Systems
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.20
Junpei Yamasaki, Y. Kambayashi
We have proposed and implemented an efficient resource locating method in a pure P2P system based on a multiple agent system. All resources, as well as resource information, are managed by cooperating multiple agents. To optimize the behavior of the cooperating agents, we now utilize a honey bee algorithm that guides mobile agents to migrate toward nodes that are likely to hold the requested resources. In this paper, we report on our implementation.
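A generic honey-bee-style search lets successful scout agents advertise the nodes where they found the resource (a waggle-dance analogue), so that later agents bias their migration toward those nodes instead of walking randomly. The sketch below is such a generic scheme over a toy overlay; the data structures and parameters are illustrative assumptions, not the paper's protocol.

```python
# Generic bee-colony-flavored resource search over a toy P2P overlay.
# "Scout" agents walk randomly; successful scouts advertise their node in a
# shared hive, and "forager" agents migrate toward the most advertised nodes.
import random

random.seed(1)

NODES = {n: set() for n in range(20)}           # node id -> resources held
for n in random.sample(list(NODES), 4):
    NODES[n].add("fileX")                        # a few nodes hold the target

hive = {}                                        # node id -> advertisement strength

def scout(start, target, ttl=6):
    node = start
    for _ in range(ttl):
        if target in NODES[node]:
            hive[node] = hive.get(node, 0) + 1   # waggle dance: advertise the find
            return node
        node = random.choice(list(NODES))        # random walk on the toy overlay
    return None

def forager(target):
    if hive:
        node = max(hive, key=hive.get)           # bias migration toward advertised nodes
        if target in NODES[node]:
            return node
    return scout(random.choice(list(NODES)), target)

for _ in range(10):
    scout(random.choice(list(NODES)), "fileX")
print("advertised nodes:", hive)
print("forager found fileX at node:", forager("fileX"))
```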
Citations: 0
An Efficient Path Setup for a Photonic Network-on-Chip
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.31
Cisse Ahmadou Dit Adi, Hiroki Matsutani, M. Koibuchi, H. Irie, T. Miyoshi, T. Yoshinaga
Electrical Networks-on-Chip (NoCs) face critical challenges in meeting the high-performance and low-power-consumption requirements of future multicore processor interconnection. Recent tremendous advances in CMOS-compatible optical components give photonics the potential to deliver efficient NoC performance at an acceptable energy cost. However, the lack of in-flight processing and buffering of optical data makes the realization of a fully optical NoC complicated. A hybrid architecture that uses high-bandwidth optical transfer together with a tiny electrical control network can take advantage of both interconnection methods to offer an efficient performance-per-watt infrastructure for connecting multicore processors and Systems-on-Chip (SoCs). In this paper, we propose a hybrid photonic torus NoC (HPNoC) that uses predictive switching to improve the performance of a hybrid architecture. By using prediction techniques, we can reduce the path setup latency of the electrical control network, hence improving the overall end-to-end communication delay in the HPNoC. Simulation results using a cycle-accurate simulator under uniform, neighbor, and bit-reversal traffic patterns for 64 nodes show that predictive switching considerably improves the overall HPNoC performance.
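The point of prediction is to start the slow electrical path setup before the message actually needs it, so that the optical transfer finds the path already established. The sketch below uses a per-source last-destination predictor as one simple instance of this idea; the latencies and the predictor itself are assumptions for illustration and may differ from the paper's scheme.

```python
# Sketch of hiding optical path-setup latency with destination prediction.
# If the path to the predicted destination was pre-established, the setup cost
# is hidden; on a misprediction the full electrical setup latency is paid.

SETUP_LATENCY = 8      # cycles for electrical path setup (assumed)
TRANSFER_LATENCY = 2   # cycles for the photonic transfer itself (assumed)

last_dst = {}          # source -> predicted (pre-set-up) destination

def send(src, dst):
    predicted = last_dst.get(src)
    setup = 0 if predicted == dst else SETUP_LATENCY
    last_dst[src] = dst                  # pre-establish the path for next time
    return setup + TRANSFER_LATENCY

traffic = [(0, 5), (0, 5), (0, 5), (0, 7), (0, 7), (1, 3), (1, 3)]
total = sum(send(s, d) for s, d in traffic)
print("average latency:", total / len(traffic), "cycles")
```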
Citations: 16
Optimization Vector Quantization by Adaptive Associative-Memory-Based Codebook Learning in Combination with Huffman Coding
Pub Date : 2010-11-17 DOI: 10.1109/IC-NC.2010.38
A. Kawabata, T. Koide, H. Mattausch
In the presented research on codebook optimization for vector quantization, an associative memory architecture is applied, which searches for the most similar data among previously stored reference data. To realize the learning of new codebook data, a learning algorithm based on this associative memory is implemented, imitating the concept of human short-/long-term memory. The quality improvement of the vector-quantization codebook created with the proposed learning algorithm, and its dependence on the learning parameters, are evaluated with the Peak Signal-to-Noise Ratio (PSNR), an index of image quality. A quantitative PSNR improvement of 2.5 - 3.0 dB could be verified. Since the learning algorithm orders the codebook elements according to their usage frequency in the vector-quantization process, Huffman coding is additionally applied and verified to further improve the compression ratio from 12.8 to 14.1.
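The overall pipeline is: nearest-codeword search (the associative memory's match operation), counting how often each codeword is used, and then Huffman-coding the codeword indices so that frequently used codewords get the shortest codes. The sketch below walks through that pipeline in software on toy data; the short-/long-term-memory learning algorithm itself is not modeled.

```python
# Software sketch of the VQ + Huffman pipeline: nearest-codeword search,
# usage counting, then Huffman codes over codeword indices so that frequently
# used codewords get the shortest codes. Codebook and data are toy values.
import heapq
from collections import Counter

codebook = [(0, 0), (8, 8), (0, 8), (8, 0)]          # toy 2-D codewords

def quantize(vec):
    # nearest-reference search: what the associative memory does in hardware
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def huffman_lengths(freqs):
    """Return code length per symbol from a frequency table (classic heap build)."""
    heap = [(f, [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    while len(heap) > 1:
        f1, syms1 = heapq.heappop(heap)
        f2, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, syms1 + syms2))
    return lengths

vectors = [(1, 1), (0, 1), (7, 7), (1, 0), (0, 2), (8, 7), (0, 7), (1, 1)]
indices = [quantize(v) for v in vectors]
freqs = Counter(indices)
lengths = huffman_lengths(freqs)
fixed_bits = len(indices) * 2                         # 2 bits/index for 4 codewords
huff_bits = sum(lengths[i] for i in indices)
print("indices:", indices, "-> fixed", fixed_bits, "bits, Huffman", huff_bits, "bits")
```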
Citations: 3