In this paper, a Model Driven Architecture (MDA) approach is applied to semi-automatically translate sequential programs into corresponding distributed code. The novelty of our work is the use of MDA in the process of translating serial into distributed code. The transformation comprises the automatic generation of platform-independent and then platform-specific models from the sequential code. To generate the PIM, a meta-model defining the overall architecture of the resulting distributed code is developed. This meta-model serves as the basis for developing platform-independent models (PIMs) of the resulting distributed code. A set of transformation rules is defined to transform the resulting PIM into a corresponding platform-specific model. These transformation rules can be modified by the user, depending on the details of the underlying middleware applied for the distribution. The platform-independent model provides a better understanding of the distributed code and helps the programmer modify the code more easily.
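A user-editable PIM-to-PSM rule set of the kind described above might be sketched as follows. This is a hedged illustration only: the element kinds, the Java-RMI-style target constructs, and the rule table are hypothetical, not the paper's actual rules.

```python
# Hypothetical PIM-to-PSM transformation rules: a user-editable mapping
# from platform-independent element kinds to middleware-specific constructs
# (here, a Java-RMI-flavored target; the rule table is illustrative only).
PIM_TO_PSM_RULES = {
    "RemoteClass":  "class {name} extends UnicastRemoteObject",
    "RemoteMethod": "{name}() throws RemoteException",
    "LocalClass":   "class {name}",
}

def transform(pim_elements, rules=PIM_TO_PSM_RULES):
    """Map each PIM element (kind, name) to its platform-specific form."""
    return [rules[kind].format(name=name) for kind, name in pim_elements]

pim = [("RemoteClass", "Worker"), ("RemoteMethod", "compute")]
print(transform(pim))
```

Because the rules live in a plain table, swapping the middleware (say, from an RMI-style to a message-passing target) only requires editing the table, which mirrors the user-modifiable rules the abstract describes.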
{"title":"Semi-automatic Transformation of Sequential Code to Distributed Code Using Model Driven Architecture Approach","authors":"S. Karimi, Saeed Parsa","doi":"10.1109/ISPA.2009.71","DOIUrl":"https://doi.org/10.1109/ISPA.2009.71","url":null,"abstract":"In this paper, a Model Driven Architecture (MDA) approach is applied to Semi-automatically translate sequential programs into corresponding distributed code. The novelty of our work is the use of MDA in the process of translating serial into distributed code. The transformation comprises automatic generation of platform independent and then platform specific models from the sequential code. In order to generate the PIM, a meta-model defining the overall architecture of the resultant distributed code is developed. The meta-model is used as a basis for the development of platform independent models (PIM) for the resultant distributed code. A set of transformation rules are defined to transform the resulted PIM into a corresponding platform-specific model. These transformation rules can be modified by the user, depending on the details of the underlying middle-ware applied for the distribution. The platform independent model provides a better understanding of the distributed code and helps the programmer to modify the code more easily.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124324187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In in-network storage wireless sensor networks, sensed data are stored locally for the long term and retrieved on demand rather than in real time. To maximize data survival, the sensed data are normally stored in a distributed fashion at multiple nearby nodes. This raises the problem of how to check and guarantee the integrity of distributed data storage under resource constraints. In this paper, a technique called Two-Granularity Linear Code (TGLC), consisting of intra-codes and inter-codes, is presented, and an efficient and lightweight data integrity check scheme based on TGLC is proposed. Data integrity can be checked by anyone who holds the short inter-codes, and the checking credentials are short, dynamically generated intra-codes. The proposed scheme is efficient and lightweight with respect to storage and communication overhead, while checking validity is maintained. Our conclusion is justified by extensive analysis.
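The core intuition behind a linear-code-based integrity check — a verifier keeps only a short linear combination of the stored blocks — can be sketched as follows. This is a generic illustration of that intuition, not the paper's TGLC construction; the prime modulus and coefficient range are arbitrary choices.

```python
import random

def linear_tag(blocks, coeffs, q=257):
    """A short 'code' of the data: a random linear combination of the
    stored blocks modulo a small prime (generic sketch, not TGLC)."""
    return sum(c * b for c, b in zip(coeffs, blocks)) % q

blocks = [17, 42, 99, 5]                              # data spread across storage nodes
coeffs = [random.randrange(1, 257) for _ in blocks]   # verifier's coefficients (nonzero mod 257)
tag = linear_tag(blocks, coeffs)                      # the only value the checker must keep

print(linear_tag(blocks, coeffs) == tag)              # → True: intact data passes the check
print(linear_tag([17, 42, 100, 5], coeffs) == tag)    # → False: any single-block change shifts the tag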
{"title":"Efficient and Lightweight Data Integrity Check in In-Networking Storage Wireless Sensor Networks","authors":"Wei Ren, Yi Ren, Hui Zhang","doi":"10.1109/ISPA.2009.103","DOIUrl":"https://doi.org/10.1109/ISPA.2009.103","url":null,"abstract":"In In-networking storage Wireless Sensor Networks, sensed data are stored locally for a long term and retrieved on-demand instead of real-time. To maximize data survival, the sensed data are normally distributively stored at multiple nearby nodes. It arises a problem that how to check and grantee data integrity of distributed data storage in the context of resource constraints. In this paper, a technique called Two Granularity Linear Code (TGLC) that consists of Intra-codes and Inter-codes is presented. An efficient and lightweight data integrity check scheme based on TGLC is proposed. Data integrity can be checked by any one who holds short Inter-codes, and the checking credentials is short Intra-codes that is dynamically generated. The proposed scheme is efficient and lightweight with respect to low storage and communication overhead, and yet checking validity is maintained. Our conclusion is justified by extensive analysis.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127062255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2009-08-18. DOI: 10.1080/00207160903477175
Shuming Zhou
In evaluating the fault tolerance of a network structure, it is essential to estimate the order of a maximal connected component of the network when faulty vertices may break its connectedness, and it is crucial to locate and replace faulty processors to maintain the system’s high reliability. Fault diagnosis is the process of identifying faulty processors in a system through testing. Conditional diagnosis requires that, for each processor v in a system, the processors directly connected to v do not all fail at the same time. In this paper, we show that the conditional diagnosability of the twisted cube TQn under the comparison diagnosis model is 3n-5 when n ≥ 6. Hence the conditional diagnosability of TQn is roughly three times its classical diagnosability.
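The bound is easy to evaluate numerically. In the snippet below, the comparison against a classical value of n is an assumption added for illustration (the abstract states only the "three times" ratio, not the classical figure).

```python
def conditional_diagnosability_tq(n):
    """Conditional diagnosability of the twisted cube TQ_n under the
    comparison diagnosis model, per the paper's result (stated for n >= 6)."""
    if n < 6:
        raise ValueError("the result is stated for n >= 6")
    return 3 * n - 5

# For TQ_7: 3*7 - 5 = 16, roughly three times a classical diagnosability
# of n = 7 (the classical value n is an assumption here, for illustration).
print(conditional_diagnosability_tq(7))  # → 16
```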
{"title":"The Conditional Diagnosability of Twisted Cubes under the Comparison Model","authors":"Shuming Zhou","doi":"10.1080/00207160903477175","DOIUrl":"https://doi.org/10.1080/00207160903477175","url":null,"abstract":"In evaluating the fault tolerance of an network structure, it is essential to estimate the order of a maximal connected component of this network provided the faulty vertices may break its connectedness, and it is crucial to local and to replace the faulty processors to maintain system’s high reliability. The fault diagnosis is the process of identifying fault processors in a system through testing. The conditional diagnosis requires that for each processor v in a system, all the processors that are directly connected to v do not fail at the same time. In this paper, the conditional diagnosability of the twisted cubes TQn under the comparison diagnosis model is 3n-5 when n≫6. Hence the conditional diagnosability of TQn is three times larger than its classical diagnosability.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121982934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Service composition is an effective approach to multimedia delivery in pervasive environments. In previous work, there is a single fixed functional path composed of several underlying services in a certain order. In practice, however, there are several functional paths delivering multimedia of different quality levels from the source to the end user. Due to the dynamicity and mobility of pervasive spaces, the system should generate a reliable, low-delay service path for multimedia delivery in real time. Since some multimedia service components change the data transmission volume, which strongly affects the transmission delay, media delivery reduces to the Multi-Constrained Path problem, which is known to be NP-complete. We propose an efficient algorithm, LD/RPath (Lowest Delay/Reliability Path), for adaptive multimedia delivery. LD/RPath generates a low-delay service path from several functional paths while guaranteeing reliability. Experimental results show that LD/RPath performs well and is an effective algorithm for multimedia delivery in pervasive spaces.
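The underlying setting — minimize delay while keeping end-to-end reliability (the product of edge reliabilities) above a threshold — can be sketched with a simple best-first search. This is an illustration of the multi-constrained path formulation, not the paper's LD/RPath algorithm, and its dominance pruning is a heuristic simplification (the exact problem is NP-complete).

```python
import heapq

def lowest_delay_reliable_path(graph, src, dst, min_rel):
    """Best-first search for a low-delay path whose cumulative reliability
    stays >= min_rel.  graph: {u: [(v, delay, reliability), ...]}.
    Sketch of the multi-constrained setting, not the paper's LD/RPath;
    the per-node delay pruning is a heuristic, not an exact dominance test."""
    heap = [(0.0, src, 1.0, [src])]   # (delay so far, node, reliability so far, path)
    best_delay = {}
    while heap:
        d, u, r, path = heapq.heappop(heap)
        if u == dst:
            return d, r, path
        for v, edge_delay, edge_rel in graph.get(u, []):
            nr = r * edge_rel
            if nr < min_rel:
                continue                       # violates the reliability constraint
            nd = d + edge_delay
            if best_delay.get(v, float("inf")) <= nd:
                continue                       # heuristic pruning on delay alone
            best_delay[v] = nd
            heapq.heappush(heap, (nd, v, nr, path + [v]))
    return None

graph = {"A": [("B", 2, 0.9), ("C", 3, 0.7)], "B": [("C", 2, 0.9)]}
# The direct edge A->C is faster but too unreliable (0.7 < 0.8), so the
# search returns the A->B->C path: delay 4, reliability 0.81.
print(lowest_delay_reliable_path(graph, "A", "C", 0.8))
```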
{"title":"An Efficient Algorithm for Multimedia Delivery in Pervasive Space","authors":"S. Zhang, Zhuzhong Qian, M. Guo, Sanglu Lu","doi":"10.1109/ISPA.2009.34","DOIUrl":"https://doi.org/10.1109/ISPA.2009.34","url":null,"abstract":"Service composition is an effective approach for multimedia delivery in pervasive environment. In previous works, there is one fixed functional path which is composed of several underlying services in a certain order. Actually, there are several functional paths delivering different quality level multimedia from the source to the end user. Due to the dynamicity and mobility of pervasive space, system should generate a reliable and low-delay service path for multimedia delivery in real-time. Since some multimedia service components change the data transmission volume which has a deep impact on the transmission delay, it makes the media delivery problem equal to Multi-Constrained Path problem which is known to NP-Complete. We propose an efficient algorithm LD/RPath(Lowest Delay/Reliability Path) for adaptive multimedia delivery. LD/RPath generates a low-delay service path based on several functional paths with reliability guarantee. Experiment results show that LD/RPath has a good performance and it is an effective algorithm for multimedia delivery in pervasive space.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124196806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deniable authentication allows the receiver to identify the source of received messages but not to prove it to any third party. However, deniability of the content, called restricted deniability in this paper, is of concern in electronic voting and similar applications. At present, most non-interactive deniable authentication protocols cannot resist the weaken key-compromise impersonation (W-KCI) attack. To address this problem, a non-interactive identity-based restricted deniable authentication protocol is proposed. It not only resists the W-KCI attack but also offers communication flexibility, and it meets security requirements such as correctness and restricted deniability. Therefore, the protocol can be applied to electronic voting.
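The standard intuition behind deniability — anything the receiver can verify, the receiver could equally have forged — is illustrated below with a plain shared-key MAC. This is a textbook sketch of the concept, not the paper's identity-based construction.

```python
import hashlib
import hmac

# Shared-key MAC illustration of deniability (not the paper's protocol):
# the receiver can check the tag, but since it holds the same key, any
# transcript it shows a third party could have been forged by the receiver.
key = b"shared between sender and receiver"
msg = b"ballot: candidate A"

tag = hmac.new(key, msg, hashlib.sha256).digest()   # sender authenticates the message
ok = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())
print(ok)  # → True: the receiver is convinced of the source...

# ...yet the receiver can mint an equally valid tag for any message,
# so a tag proves nothing about the sender to an outside judge.
forged_tag = hmac.new(key, b"ballot: candidate B", hashlib.sha256).digest()
```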
{"title":"An Identity-Based Restricted Deniable Authentication Protocol","authors":"Chengyu Fan, Shijie Zhou, Fagen Li","doi":"10.1109/ISPA.2009.113","DOIUrl":"https://doi.org/10.1109/ISPA.2009.113","url":null,"abstract":"A deniable authentication allows the receiver to identify the source of the received messages but cannot prove it to any third party. However, the deniability of the content, which is called restricted deniability in this paper, is concerned in electronic voting and some other similar application. At present, most non-interactive deniable authentication protocols cannot resist weaken key-compromise impersonation (W-KCI) attack. To settle this problem, a non-interactive identity-based restricted deniable authentication protocol is proposed. It not only can resist W-KCI attack but also has the properties of communication flexibility. It meets the security requirements such as correctness, restricted deniability as well. Therefore, this protocol can be applied in electronic voting.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116573559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An enterprise application system is a composite Web service consisting of a collection of Web services related by data and control flow. Formalization and modeling techniques, together with graphic tools, are necessary for constructing reliable Web services and their applications. In this paper we present a formalization of Web services and an algorithm for constructing compositions. First, we propose a dynamic timed colored Petri net (DTCPN) to model and analyze a Web service. In this Petri net, colors, including parameters and the user’s QoS (Quality of Service) requirements, represent data flow. The time delay of a transition is a function of the colors in its input place rather than a constant, which captures the dynamic behavior of the Web service. DTCPNs allow the modeling of the dynamic behavior of large and complex systems. Second, we give an algorithm for constructing the DTCPN composition model of an application composed of Web services. To reduce model complexity and the state-explosion problem in reachability analysis of Petri nets, we give a reduction algorithm of DTCPNs for the four basic structures of Web service composition. Finally, we discuss the correctness and the time and cost performance of the Web service composition by reducing the DTCPN model and analyzing the reachable service graph.
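The distinguishing idea — a transition's delay computed from the token colors in its input place, instead of a fixed constant — can be sketched minimally as follows. The marking structure, token fields, and delay function here are illustrative assumptions, not the paper's DTCPN definition.

```python
# Minimal colored-Petri-net flavor of the idea: the firing delay of a
# transition is a function of the colors (attributes) of the tokens it
# consumes, rather than a time constant.  Names are illustrative only.

def fire(marking, transition, delay_fn):
    """Fire transition = (input places, output places): consume one token
    per input place, compute the color-dependent delay, deposit the
    consumed tokens (as one composite token) in each output place."""
    inputs, outputs = transition
    tokens = [marking[p].pop(0) for p in inputs]
    delay = delay_fn(tokens)              # delay depends on token colors
    for p in outputs:
        marking[p].append(tokens)
    return marking, delay

marking = {"p_in": [{"size_kb": 120, "qos": "gold"}], "p_out": []}
t = (["p_in"], ["p_out"])
# e.g. service time grows with payload size, discounted for gold-QoS users
delay_fn = lambda toks: toks[0]["size_kb"] * (0.5 if toks[0]["qos"] == "gold" else 1.0)
marking, d = fire(marking, t, delay_fn)
print(d)  # → 60.0
```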
{"title":"Composition and Reduction of Web Service Based on Dynamic Timed Colored Petri Nets","authors":"Yaojun Han, Xuemei Luo","doi":"10.1109/ISPA.2009.21","DOIUrl":"https://doi.org/10.1109/ISPA.2009.21","url":null,"abstract":"An application system for enterprises is a composite Web service that consists of a collection of Web services related by data and control flow. It’s necessary to get formalizing and modeling techniques and graphic tools for reliable Web Service and its application construction. In this paper we present the formalization of Web services and the algorithm for constructing composition. Firstly, we propose a dynamic timed colored Petri net (DTCPN) to model and analyze a Web service. In this Petri net, the colors including parameters and user’s QoS (Quality of Service) requirements represent data flow. The time delay of transition is a function of colors in input place instead of time constant, which shows the dynamic property of Web service. The DTCPN allows the modeling of dynamic behavior of large and complex systems. Secondly, we give an algorithm for constructing composition of DTCPN model for an application composed of Web services. In order to reduce the complexity of model and the state explosion problem in reachability analysis of Petri nets, we give a reduction algorithm of DTCPN for four basic structures of the Web service composition. Finally, we discuss the correctness and time and cost performance of the Web service composition by reducing DTCPN model and analyzing the reachable service graph.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130087846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The instance-aspect-oriented workflow management system vertically combines multiple workflow activity instances and submits them for execution as a whole, according to batch or combination logic. It is inspired by the aspect-oriented programming methodology and aims at improving the execution efficiency of business processes. Traditional workflow systems do not support workflow models with instance aspects. In our previous work, we studied workflow instance modeling technology. This paper investigates the principles, methods, and implementation of a visual GUI tool for modeling instance aspects in workflows. It is based on an open-source GUI tool, the Together Workflow Editor, extended with instance-aspect functionality.
{"title":"Implementation of a Visual Modeling Tool for Defining Instance Aspect in Workflow","authors":"Jianxun Liu, Zefeng Zhu, Yiping Wen, Jinjun Chen","doi":"10.1109/ISPA.2009.59","DOIUrl":"https://doi.org/10.1109/ISPA.2009.59","url":null,"abstract":"The instance-aspect oriented workflow management system is to vertically combine multiple workflow activity instances and submit them for execution as a whole according to some batch or combination logics. It is inspired by the idea of aspect-oriented programming methodology and aims at improving the execution efficiency of business processes. Traditional workflow systems do not support workflow model with instance aspects. In our previous work, we have studied workflow instance modeling technology. This paper makes a research on the principles, methods and implementation of a workflow visual GUI tool for modeling instance aspects in workflow. It is based on an open source GUI tool, Together Workflow Editor, and makes some expansion in instance aspect functionality.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"292 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131525441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper proposes a Virtualized Self-Adaptive Parallel Programming Framework for Heterogeneous High Productivity Computers (VAPPF), composed of a Virtualization-Based Runtime System (VRTS) and a Virtualized Adaptive Parallel Programming Model (VAPPM). The runtime system consists of a Node-Level Virtual Machine Monitor (NVMM) and a System-Level Virtual Infrastructure (SVI). The VAPPM programming model is not only compatible with conventional data parallelism but also supports task parallelism. Moreover, with the concepts of Domains and the virtualized process Locale, the runtime system can map computation onto processors according to a system-level resource view and a performance model. By concealing hardware details at both the runtime-system and programming-model levels through virtualization, the framework provides programmers with a middle-level view independent of hardware details. Programmers write and debug against this middle-level view, and the runtime system then maps it onto the specific hardware environment. In this way, programming is relatively decoupled from specific hardware architectures; the model realizes an efficient division of work between programmers and the system, and helps improve the system’s programmability, scalability, portability, robustness, performance, and productivity.
{"title":"A Virtualized Self-Adaptive Parallel Programming Framework for Heterogeneous High Productivity Computers","authors":"Hua Cheng, Zuoning Chen, Ninghui Sun, Fenbin Qi, Chaoqun Dong, Laiwang Cheng","doi":"10.1109/ISPA.2009.76","DOIUrl":"https://doi.org/10.1109/ISPA.2009.76","url":null,"abstract":"This paper proposed a Virtualized Self-Adaptive Heterogeneous High Productivity Computers Parallel Programming Framework (VAPPF), which is composed of Virtualization-Based Runtime System (VRTS) and Virtualized Adaptive Parallel Programming Model (VAPPM). Virtualization-Based Runtime System is composed of Node-Level Virtual Machine Monitor (NVMM) and System-Level Virtual Infrastructure (SVI). VAPPM program model is not only compatible with conventional data parallel, but also support task parallel. Moreover, with the concept of Domains and virtualized process Locale, Virtualization-Based Runtime System can map between computation and processors according to system-level resources view and performance model. By conceal the hardware details through both runtime system level and programming model level by virtualization, the framework provides programmers a middle-level view independent of hardware details. Programmers can do their programming and debugging works on this middle-level view, and then, the runtime system map it into specific hardware environment. By this way, programming can be relatively separated from specific hardware architectures, this model realized an efficient work division between programmers and systems, and can help to improve the system’s programmability, scalability, portability, robustness, performance, and productivity.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"182 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133648900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graphics Processing Units (GPUs) that support general-purpose programs are promising platforms for high performance computing. However, the fundamental architectural differences between GPUs and CPUs, the complexity of GPU platforms, and the diversity of GPU specifications have made generating highly efficient code for GPUs increasingly difficult. Manual code generation is time consuming, and the result tends to be difficult to debug and maintain. On the other hand, the code generated by today's GPU compilers often performs far worse than the best hand-tuned code. A promising code generation strategy, implemented by systems such as ATLAS, FFTW, SPIRAL, and X-Sort, uses empirical search to find the implementation parameter values, such as tile sizes and instruction schedules, that deliver near-optimal performance on a particular machine. However, this approach has only proved successful on CPUs, where program performance is relatively well understood. Clearly, empirical search should be extended to general-purpose programs on GPUs. In this paper, we propose an empirical optimization technique for one of the most important sorting routines on GPUs, radix sort, that generates highly efficient code for a number of representative NVIDIA GPUs with a wide variety of architectural specifications. Our study focuses on the algorithmic parameters of radix sort that can be adapted to different environments and the GPU architectural factors that affect its performance. We present a powerful empirical optimization approach shown to find highly efficient code for different NVIDIA GPUs. Our results show that this approach is effective at accounting for the complex interactions among architectural characteristics, and that the resulting code performs significantly better than two radix sort implementations previously shown to outperform other GPU sort routines, with a maximal speedup of 33.4%.
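The empirical-search idea — time several candidate values of an algorithmic parameter on a sample and keep the fastest — can be sketched on the CPU side in a few lines. This is a Python illustration of tuning radix sort's digit width, not the paper's GPU implementation; the candidate widths are arbitrary.

```python
import time

def radix_sort(a, bits):
    """LSD radix sort over non-negative ints with a tunable digit width
    (`bits` per pass) -- the kind of parameter an empirical search tunes."""
    mask = (1 << bits) - 1
    out = list(a)
    maxv = max(out, default=0)
    shift = 0
    while maxv >> shift:
        buckets = [[] for _ in range(1 << bits)]
        for x in out:                      # stable scatter by the current digit
            buckets[(x >> shift) & mask].append(x)
        out = [x for b in buckets for x in b]
        shift += bits
    return out

def tune_digit_width(sample, candidates=(2, 4, 8, 11)):
    """Empirically pick the digit width that sorts the sample fastest."""
    def cost(bits):
        t0 = time.perf_counter()
        radix_sort(sample, bits)
        return time.perf_counter() - t0
    return min(candidates, key=cost)

data = [170, 45, 75, 90, 802, 24, 2, 66]
print(radix_sort(data, 4))  # → [2, 24, 45, 66, 75, 90, 170, 802]
```

On a GPU the cost model would also fold in architectural factors (shared-memory size, bank conflicts, occupancy), which is why, as the abstract argues, a single fixed parameter choice cannot serve every device.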
{"title":"An Empirically Optimized Radix Sort for GPU","authors":"Bonan Huang, Jinlan Gao, Xiaoming Li","doi":"10.1109/ISPA.2009.89","DOIUrl":"https://doi.org/10.1109/ISPA.2009.89","url":null,"abstract":"Graphics Processing Units (GPUs) that support general purpose program are promising platforms for high performance computing. However, the fundamental architectural difference between GPU and CPU, the complexity of GPU platform and the diversity of GPU specifications have made the generation of highly efficient code for GPU increasingly difficult. Manual code generation is time consuming and the result tends to be difficult to debug and maintain. On the other hand, the code generated by today's GPU compiler often has much lower performance than the best hand-tuned codes. A promising code generation strategy, implemented by systems like ATLAS~cite{Whaley}, FFTW~cite{FFTW_org}, SPIRAL~cite{Pueschel:05} and X-Sort~cite{Li:05}, uses empirical search to find the parameter values of the implementation, such as the tile size and instruction schedules, that deliver near-optimal performance for a particular machine. However, this approach has only proved successful when applied to CPU where the performance of CPU programs has been relatively better understood. Clearly, empirical search must be extended to general purpose programs on GPU. In this paper, we propose an empirical optimization technique for one of the most important sorting routines on GPU, the radix sort, that generates highly efficient code for a number of representative NVIDIA GPUs with a wide variety of architectural specifications. Our study has been focused on the algorithmic parameters of radix sort that can be adapted to different environments and the GPU architectural factors that affect the performance of radix sort. We present a powerful empirical optimization approach that is shown to be able to find highly efficient code for different NVIDIA GPUs. Our results show that such an empirical optimization approach is quite effective at taking into account the complex interactions between architectural characteristics and that the resulting code performs significantly better than two radix sort implementations that have been shown outperforming other GPU sort routines with the maximal speedup of 33.4%.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123116100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The problem of piracy has troubled people’s daily lives for hundreds of years and has not been relieved to this day, despite the many anti-counterfeit solutions that have been applied. With the emergence of Radio Frequency IDentification (RFID) technologies, however, there is a more reliable alternative for constructing authentication systems. This raises another issue: how to simplify the deployment of RFID-centric anti-counterfeit systems over the Internet. In this article, we propose an approach, the Web Service Locating Unit (WSLU), to achieve this goal by managing a number of RFID-centric authentication services (built on web services).
{"title":"Web Service Locating Unit in RFID-Centric Anti-counterfeit System","authors":"Zhiyuan Tan, Xiangjian He, P. Nanda","doi":"10.1109/ISPA.2009.94","DOIUrl":"https://doi.org/10.1109/ISPA.2009.94","url":null,"abstract":"The problem of piracy has disturbed people’s daily life for hundreds of years and has not been relieved until now, though many existing anti-counterfeit solutions have been applied. However, due to the emergences of Radio Frequency IDentification (RFID) technologies, there is a more reliable alternative solution to construct authentication system. On the other hand, there arises another issue of how to simplify the deployment of RFID-centric anti-counterfeit system over the Internet. In this article, we propose an approach, Web Service Locating Unit (WSLU), to achieve this goal to manage numbers of RFID-centric authentication services (relied on web services).","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"85 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114339687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}