This paper proposes a bit-rate estimation method that can improve the performance of the H.264 encoder and avoid the strict data dependencies of mode decision. Building on it, an efficient parallel algorithm for an H.264 encoder with CABAC entropy coding is presented, based on the Macro-Block Region Partition (MBRP) parallel method and the bit-rate estimation technique. Simulation results show that the proposed parallel algorithm can improve the performance of the H.264 encoder efficiently while maintaining rate-distortion (RD) performance similar to that of JM 10.2.
{"title":"An Efficient Parallel Algorithm for H.264/AVC Encoder","authors":"Shuwei Sun, Shuming Chen","doi":"10.1109/PDCAT.2008.19","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.19","url":null,"abstract":"In this paper, a bit-rate estimation method is proposed, which could improve the performance of the H.264 encoder and avoid the strict data-dependencies of mode decision. After that, an efficient parallel algorithm for H.264 encoder with CABAC entropy coding is presented based on the Macro-Block Region Partition (MBRP) parallel method and the bit-rate estimation technique. Simulation results show that, the proposed parallel algorithm could improve the performance of H.264 encoder efficiently while maintaining the similar RD performance as JM 10.2.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133506273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The paper provides insight into an implementation framework and a test-bed for multi-channel correlation on a high-performance hybrid CPU+FPGA platform. A solution based on a commodity PC, a PCIe Altera Stratix II GX board, and a C-to-HDL tool is demonstrated.
{"title":"Feasibility Study of Implementing Multi-Channel Correlation for DSP Applications on Reconfigurable CPU+FPGA Platform","authors":"M. Leonov, V. V. Kitaev","doi":"10.1109/PDCAT.2008.62","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.62","url":null,"abstract":"The paper provides an insight to implementation framework and a test-bed of multi-channel correlation on high-performance CPU+FPGA hybrid platform. Solution based on commodity PC, PCIe Altera Stratix II GX board, and C-to-HDL tool has been demonstrated.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131320143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Extensible Authentication Protocol (EAP), which is typically used over wireless LANs and point-to-point links, allows a server to request authentication information from a client. The Protocol for Carrying Authentication for Network Access (PANA) is designed to transport EAP messages over IP networks. This paper presents a formal coloured Petri net model and analysis of PANA, focusing on the initial authentication and authorisation phase. State space analysis of selected configurations reveals that a deadlock may occur at the client when the server aborts a PANA authentication session. The analysis also derives a formal definition of the service between PANA and EAP, which is important for verifying that PANA correctly interfaces with EAP and can later be used for automated testing.
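The kind of state space analysis mentioned above can be illustrated with a minimal sketch. It uses an ordinary (uncoloured) Petri net rather than a coloured one, and the places and transitions (`client_waiting`, `server_abort`, and so on) are hypothetical names chosen for illustration, not taken from the paper's PANA model. The search enumerates every reachable marking of a bounded net and reports those in which no transition is enabled.

```python
from collections import deque


def enabled(marking, consume):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in consume.items())


def fire(marking, consume, produce):
    """Return the marking reached by firing one enabled transition."""
    result = dict(marking)
    for place, n in consume.items():
        result[place] -= n
    for place, n in produce.items():
        result[place] = result.get(place, 0) + n
    return result


def freeze(marking):
    """Hashable form of a marking; empty places are dropped."""
    return frozenset((place, n) for place, n in marking.items() if n)


def dead_markings(initial, transitions):
    """Breadth-first search of the state space (assumes a bounded net)."""
    start = freeze(initial)
    seen = {start}
    queue = deque([start])
    dead = []
    while queue:
        state = queue.popleft()
        marking = dict(state)
        successors = [
            fire(marking, consume, produce)
            for consume, produce in transitions.values()
            if enabled(marking, consume)
        ]
        if not successors:
            dead.append(marking)  # no transition can fire here
        for successor in successors:
            key = freeze(successor)
            if key not in seen:
                seen.add(key)
                queue.append(key)
    return dead


# Hypothetical toy net: the server may either complete or abort a session.
INITIAL = {"client_waiting": 1, "server_ready": 1}
TRANSITIONS = {
    "auth_ok": (
        {"client_waiting": 1, "server_ready": 1},
        {"client_done": 1, "server_done": 1},
    ),
    "server_abort": ({"server_ready": 1}, {"server_aborted": 1}),
}

for marking in dead_markings(INITIAL, TRANSITIONS):
    print(marking)
```

In this toy net two dead markings are reachable. One is the intended terminal state; the other leaves a token in `client_waiting` after the abort, which is the kind of unwanted dead marking that such an analysis is meant to expose.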
{"title":"Formal Analysis of PANA Authentication and Authorisation Protocol","authors":"S. Gordon","doi":"10.1109/PDCAT.2008.12","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.12","url":null,"abstract":"The extensible authentication protocol (EAP), which is typically used over wireless LANs and point-to-point links, allows a server to request authentication information from a client. The protocol for carrying authentication for network access (PANA) is designed to transport EAP messages over IP networks. This paper presents a formal coloured Petri net model and analysis of PANA, focusing on the initial authentication and authorisation phase. State space analysis of selected configurations reveals a deadlock may occur at the client when the server aborts a PANA authentication session. The analysis also derives a formal definition of the service between PANA and EAP, which is important for verifying that PANA correctly interfaces with EAP, and can later be used for automated testing.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134480446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents MIM (Multimedia Integration Middleware), which integrates existing tools and multimedia standards and offers a set of high-level grid services that provide extensible functionality for multimedia information manipulation. MIM provides a high-level API that allows developers to easily incorporate this functionality into their applications.
{"title":"MIM: Multimedia Integration Middleware, a Multimedia Services Platform for Grid Environments","authors":"Leonardo Mancilla-Amaya, Claudia Jiménez-Guarín","doi":"10.1109/PDCAT.2008.71","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.71","url":null,"abstract":"This paper presents MIM: multimedia integration middleware, which integrates existing tools, multimedia standards and offers a set of high-level grid services to provide extensible functionality for multimedia information manipulation. MIM provides a high-level API, allowing developers to easily incorporate these functionalities into their applications.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124772682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An-quan Qin, Haiyan Yu, Chengchun Shu, Xiaoqian Yu, Y. Jégou, C. Morin
In computational grids, a virtual organization (VO) is a dynamic coupling of multiple Linux/Unix nodes for resource sharing under specific policies. Currently, VO support functionality is generally implemented as grid middleware. However, the usability of grids is often impaired by the complexity of configuring and maintaining a new layer of security infrastructure, as well as of adapting to new interfaces of security-enabled services. In this paper, we present an OS-level approach to providing native VO support, which is part of the XtreemOS project [18]. Our approach uses pluggable frameworks that already exist in current OSes as extension points for implementing VO support, avoiding modification of kernel code and easily turning traditional OSes into grid-aware ones. A performance evaluation with the NAS Parallel Benchmarks (NPB) shows that our current implementation incurs negligible overhead relative to the original systems.
{"title":"Operating System-Level Virtual Organization Support in XtreemOS","authors":"An-quan Qin, Haiyan Yu, Chengchun Shu, Xiaoqian Yu, Y. Jégou, C. Morin","doi":"10.1109/PDCAT.2008.48","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.48","url":null,"abstract":"In computational grids, a virtual organization (VO) is a dynamic coupling of multiple Linux/Unix nodes for resource sharing under specific polices. Currently, VO support functionalities are generally implemented as grid middleware. However, the usability of grids is often impaired by the complexity of configuring and maintaining a new layer of security infrastructure as well as adapting to new interfaces of security enabled services. In this paper, we present an OS-level approach to provide native VO support functionalities, which is a part of XtreemOS project [18]. Our approach adopts pluggable frameworks existing in current OS as extension points to implement VO support, avoiding modification of kernel codes and easily turning traditional OSes into grid-aware ones. The performance evaluation of NAS parallel benchmarks (NPB) shows that our current implementation incurs trivial overhead on original systems.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124781200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional cache coherence protocols provide either low-latency cache misses (snooping protocols) or bandwidth efficiency (directory protocols). Token Coherence has recently been proposed to capture the best attributes of both. This protocol can quickly resolve cache misses by means of transient requests. However, since transient requests are unordered messages, they may sometimes fail to resolve cache misses, mainly because of protocol races. When a cache miss cannot be completed by transient requests, Token Coherence uses a starvation prevention mechanism to ensure its completion. Although several implementations of starvation prevention mechanisms have been proposed, all of them are broadcast-based, which significantly limits the scalability of Token Coherence. To tackle this problem, in this work we apply a switch-based packing technique that alleviates the harm of broadcast messages and improves the scalability of the protocol.
{"title":"Switch-Based Packing Technique for Improving Token Coherence Scalability","authors":"B. Cuesta, A. Robles, J. Duato","doi":"10.1109/PDCAT.2008.25","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.25","url":null,"abstract":"Traditional cache coherence protocols either provide low latency cache misses (snooping protocols) or bandwidth efficiency (directory protocols). To simultaneously capture the best attributes of traditional protocols, Token Coherence has been recently proposed. This protocol can quickly resolve cache misses by transient requests. However, since transient requests are unordered messages, they may sometimes fail in solving cache misses mainly due to the occurrence of protocol races. Thus, when the completion of cache misses is not possible by transient requests, Token Coherence uses a starvation prevention mechanism to ensure their completion. Although several implementation options of starvation prevention mechanisms have been proposed, all of them are broadcast-based. This fact represents a large detriment to the Token Coherence scalability. To tackle this problem, in this work we apply a switch-based packing technique that alleviates the harm of broadcast messages and improves the protocol scalability.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128308260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiyi Huang, A. Trotman, Jiaqi Zhang, Xiangfei Jia, M. Nowostawski, Nathan Rountree, P. Werstein
Parallel computing has been in the spotlight since the advent of multi-core computers. The popular multithreading model does not scale well to hundreds or thousands of cores, since it only helps exploit coarse-grained parallelism. There is a great deal of fine-grained parallelism to be exploited in I/O tasks and memory accesses during the execution of a thread. Our counter-Amdahl's law tells us that, to maximize speedup, it is more effective to parallelize the serial fraction of a parallel algorithm than the already parallelized fraction. In this paper, we propose a virtual aggregated processor that aims to speed up the execution of a thread by exploiting the fine-grained parallelism in I/O tasks and memory accesses. We have proposed and implemented two techniques, helper thread and I/O specialization, to demonstrate the potential effectiveness of the virtual aggregated processor technology.
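The reasoning behind targeting the serial fraction can be checked numerically with the classic form of Amdahl's law, S = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of cores. The sketch below is an illustration only: the authors' counter-Amdahl's law is not defined in this abstract, and the `serial_gain` parameter and the example numbers are assumptions.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Classic Amdahl's law: only the parallel part scales with workers."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def speedup_with_faster_serial(
    serial_fraction: float, workers: int, serial_gain: float
) -> float:
    """Same model, but the serial part itself runs serial_gain times faster.

    serial_gain is a hypothetical factor, e.g. from overlapping I/O or
    memory accesses with computation.
    """
    if serial_gain <= 0:
        raise ValueError("serial_gain must be positive")
    return 1.0 / (
        serial_fraction / serial_gain + (1.0 - serial_fraction) / workers
    )


baseline = amdahl_speedup(0.1, 100)                      # 10% serial, 100 cores
more_cores = amdahl_speedup(0.1, 200)                    # double the cores
faster_serial = speedup_with_faster_serial(0.1, 100, 2)  # halve serial time

print(f"baseline:      {baseline:.2f}x")
print(f"200 cores:     {more_cores:.2f}x")
print(f"faster serial: {faster_serial:.2f}x")
```

With these assumed numbers, doubling the core count barely moves the speedup, while halving the time spent in the serial part nearly doubles it.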
{"title":"Virtual Aggregated Processor in Multi-core Computers","authors":"Zhiyi Huang, A. Trotman, Jiaqi Zhang, Xiangfei Jia, M. Nowostawski, Nathan Rountree, P. Werstein","doi":"10.1109/PDCAT.2008.27","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.27","url":null,"abstract":"Parallel computing has been in the spotlight with the advent of multi-core computers. The popular multithreading model does not scale very well when there are hundreds or thousands of cores, since it can only help exploit coarse-grained parallelism. There exist a lot of fine-grained parallelism to be exploited in I/O tasks and memory accesses during execution of a thread. Our counter-Amdahl's law tells us that it is more effective to parallelize the serial fraction of a parallel algorithm rather than the parallelized fraction in order to maximize the speedup. In this paper, we have proposed a virtual aggregated processor that is aiming at speeding up execution of a thread through exploiting the fine-grained parallelism in I/O tasks and memory accesses. We have proposed and implemented two techniques, helper thread and I/O specialization, to demonstrate the potential effectiveness of the virtual aggregated processor technology.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128530583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we focus on periodic task scheduling on a variable-voltage processor with d discrete voltage/speed levels. We propose an intra-task DVS algorithm that constructs a minimum-energy schedule for k tasks in O(d + k log k) time. We also give an inter-task DVS algorithm that constructs a schedule of n jobs in O(d + n log n) time, where each task is composed of a sequence of jobs. Previous approaches to DVS problems have to generate a canonical schedule in advance and change the speed/voltage in O(dn log n) or O(n^3) time. However, the length of a canonical schedule depends on the LCM of the task periods and is exponential in general. In this paper, tasks with arbitrary periods are transformed into tasks with harmonic periods, so that the relative start time, finish time and preemption time of each task can be derived easily. These task features greatly benefit the predictability of schedules and the control of power awareness.
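The effect of harmonic periods on schedule length can be seen in a small sketch. The hyperperiod (the LCM of the task periods) bounds the length of a canonical schedule, and for harmonic periods it collapses to the largest period. The `to_harmonic` function below is an assumption for illustration: it rounds each period down to the base period times a power of two, which is one simple transformation and not necessarily the one used by the authors.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """LCM of all task periods (positive integers)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def is_harmonic(periods):
    """True if every period divides every larger period."""
    ordered = sorted(periods)
    return all(b % a == 0 for a, b in zip(ordered, ordered[1:]))


def to_harmonic(periods):
    """Hypothetical transformation: round each period down to base * 2**k.

    Shortening a period never makes a task run less often than required,
    but it raises processor utilization.
    """
    base = min(periods)
    result = []
    for period in periods:
        harmonic = base
        while harmonic * 2 <= period:
            harmonic *= 2
        result.append(harmonic)
    return result


arbitrary = [7, 11, 13, 17]
harmonic = to_harmonic(arbitrary)

print(hyperperiod(arbitrary))  # 17017
print(harmonic)                # [7, 7, 7, 14]
print(hyperperiod(harmonic))   # 14
```

For these assumed periods the hyperperiod drops from 17017 time units to 14, which is why start, finish and preemption times become easy to derive once the periods are harmonic.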
{"title":"Efficient Algorithms for Jitterless Real-Time Tasks to DVS Schedules","authors":"Da-Ren Chen, Shu-Ming Hsieh, Ming-Fong Lai","doi":"10.1109/PDCAT.2008.15","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.15","url":null,"abstract":"In this paper, we focus on periodic task scheduling on a variable voltage processor with d discrete voltage/speed levels. We propose an intra-task DVS algorithm which constructs a minimum energy schedule for k tasks in O(d+k log k) time. We also give an inter-task DVS algorithm for constructing a schedule of n jobs in O(d+n log n) time where each task is composed of a sequence of jobs. Previous approaches for solving DVS problems have to generate a canonical schedule in advance and change the speed/voltage in O(dn log n) or O(n3) time. However, the length of a canonical schedule depends on the LCM of those of task periods and is of exponential length in general. In this paper, the tasks with arbitrary periods are transformed into harmonic periods so that the relative start time, finish time and preemption time of each task can be derived easily. These task features benefit greatly the predictability of schedules and the control on power-awareness.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125526715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Increasingly, collaborative activities in science are using the concept of the virtual organization as an organizing principle. One benefit of viewing these collaborations from an organizational perspective is that there is a long history of studying how organizations can be structured to function effectively. Many of these organizational principles are reflected in the design of enterprise architectures and in the use of service-oriented architecture concepts as an implementation vehicle for capturing these organizational constructs. One approach to meeting organizational requirements in systems architecture has been to express organizational structure in terms of business roles, business processes and business rules. To date, however, this type of analysis and the associated infrastructure tools have not been applied in any consistent way to virtual organizations and their associated scientific applications. In this talk, the author explores these established approaches to business IT systems and their applicability to the virtual organizations that are being created to support scientific endeavors. As an example, the author describes how data management policies for virtual organizations can be expressed as business rules and implemented via existing business rules engines.
{"title":"Virtual Organizations By the Rules","authors":"C. Kesselman","doi":"10.1109/PDCAT.2008.87","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.87","url":null,"abstract":"Increasingly, collaborative activities in science are using the concept of virtual organization as an organizing principle. One benefit of viewing these collaborations from an organizational perspective is that there is a long history of studying how organizations can be structured to function effectively. Many of these organizational principles have been reflected in the design of enterprise architectures and the use of service oriented architecture concepts as an implementation vehicle for capturing these organizational constructs. One approach to meeting organizational requirements in systems architecture has been to express organizational structure in terms of business roles, business processes and business rules. To date however, this type of analysis and associated infrastructure tools has not been applied in any consistent way to the concept of virtual organizations and their associated scientific applications. In this talk, the author explores these established approaches to business IT systems and their applicability to the virtual organizations that are being created to support scientific endeavors. As an example, I will describes how data management policies for virtual organization can be expressed as business rules, and implemented via existing business rules engines.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126945567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the last few years, GPUs (Graphics Processing Units) have developed rapidly. Their ever-increasing computing power and decreasing cost have attracted attention from both industry and academia. In addition to graphics applications, researchers are interested in using them for general-purpose computing. Recently, NVIDIA released a new computing architecture, CUDA (Compute Unified Device Architecture), for its GeForce 8 series, Quadro FX, and Tesla GPU products. This new architecture can fundamentally change the way in which GPUs are used. In this paper, we study the programmability of CUDA and its GeForce 8 GPU and compare its performance with that of general-purpose processors, in order to investigate its suitability for general-purpose computation.
{"title":"GPU as a General Purpose Computing Resource","authors":"Qihang Huang, Zhiyi Huang, P. Werstein, M. Purvis","doi":"10.1109/PDCAT.2008.38","DOIUrl":"https://doi.org/10.1109/PDCAT.2008.38","url":null,"abstract":"In the last few years, GPUs(Graphics Processing Units) have made rapid development. Their ever-increasing computing power and decreasing cost have attracted attention from both industry and academia. In addition to graphics applications, researchers are interested in using them for general purpose computing. Recently, NVIDIA released a new computing architecture, CUDA (compute united device architecture), for its GeForce 8 series, Quadro FX, and Tesla GPU products. This new architecture can change fundamentally the way in which GPUs are used. In this paper, we study the programmability of CUDA and its GeForce 8 GPU and compare its performance with general purpose processors, in order to investigate its suitability for general purpose computation.","PeriodicalId":282779,"journal":{"name":"2008 Ninth International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130258494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}