In this paper, a novel technique based on the nearest feature space (NFS), known as the incenter-based nearest feature space (INFS), is proposed for supervised hyperspectral image classification. Owing to its use of class separability and neighborhood structure, the traditional NFS performs well for the classification of remote sensing images. However, although NFS attains high classification accuracy in normal cases, overlapping training samples can cause classification errors. The INFS is proposed to overcome this problem. The INFS method makes use of the incircle of a triangle, which is tangent to the triangle's three sides, to form the incenter-based feature space, and the incenter can be computed efficiently from three training samples of the same class. Furthermore, to speed up computation, this paper proposes a parallel version of INFS, namely parallel INFS (PINFS), which uses a modern graphics processing unit (GPU) architecture with NVIDIA's compute unified device architecture (CUDA) technology to improve the computational speed of INFS. Experimental results demonstrate that the proposed INFS approach is suitable for land cover classification in earth remote sensing and achieves better performance than the NFS classifier when the class sample distributions overlap. GPU computation with CUDA also yields a substantial speedup.
{"title":"Incenter-based nearest feature space method for hyperspectral image classification using GPU","authors":"Yang-Lang Chang, Hsien-Tang Chao, Min-Yu Huang, Lena Chang, Jyh-Perng Fang, Tung-Ju Hsieh","doi":"10.1109/PADSW.2014.7097911","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097911","url":null,"abstract":"In this paper a novel technique based on nearest feature space (NFS), known as incenter-based nearest feature space (INFS), is proposed for supervised hyperspectral image classification. Due to the class separability and neighborhood structure, the traditional NFS can perform well for classification of remote sensing images. However, in some instances, the overlapping training samples might cause classification errors in spite of the high classification accuracy of NFS for normal cases. In response, the INFS is proposed to overcome this problem in this paper. INFS method makes use of the incircle of a triangle which is tangent to its three sides and form a INFS. In addition, an incenter can be calculated by three training samples of the same class efficiently. Furthermore, in order to speed up the computation performance, this paper proposes a parallel computing version of INFS, namely parallel INFS (PINFS). It uses a modern graphics processing unit (GPU) architecture with NVIDIA's compute unified device architecture (CUDA) technology to improve the computational speed of INFS. Experimental results demonstrate the proposed INFS approach is suitable for land cover classification in earth remote sensing. It can achieve the better performance than NFS classifier when the class sample distribution overlaps. Through the computation of GPU by CUDA, we can also gain better speedup.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129767526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097818
Jen-Yu Wang, Yarsun Hsu
As CMOS technology develops, the number of buffers required in a network-on-chip increases with the flit width. This increase in buffering adds power and area overhead to a network router. This paper proposes a hybrid packet-switched and circuit-switched network in which the total buffer requirement depends only on the short-message width and the buffer depth, and does not increase with the network width. Performance is maintained through low-latency circuit switching that uses a simple reverse-path reservation method. The simulation results indicate that the buffer reduction saves a considerable amount of power and area while performance is maintained.
{"title":"A hybrid on-chip network with a low buffer requirement","authors":"Jen-Yu Wang, Yarsun Hsu","doi":"10.1109/PADSW.2014.7097818","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097818","url":null,"abstract":"As the CMOS technology develops, the number of buffers required in a network-on-chip increases with flit width. This increase of buffers provides more power and area overhead to a network router. This paper proposes a hybrid packet-switched and circuit-switched network in which the total buffer requirement depends on only the width of the short message and buffer depth, and does not increase with the network width. The performance is maintained through a low latency circuit-switch by using a simple reverse path reservation method. The simulation results indicated that a considerable amount of power and area can be saved by the buffer reduction, whereas performance is maintained.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"523 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129790925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097925
Peng Li, Kunal Agrawal, J. Buhler, R. Chamberlain
Streaming computing is a paradigm of distributed computing that features networked nodes connected by first-in-first-out data channels. Communication between nodes may include not only high-volume data tokens but also infrequent and unpredictable control messages carrying control information, such as data set boundaries, exceptions, or reconfiguration requests. In many applications, it is necessary to order delivery of control messages precisely relative to data tokens, which can be especially challenging when nodes can filter data tokens. Existing approaches, mainly data serialization protocols, do not exploit the low-volume nature of control messages and may not guarantee that synchronization of these messages with data will be free of deadlock. In this paper, we propose an efficient messaging system for adding precisely ordered control messages to streaming applications. We use a credit-based protocol to avoid the need to tag data tokens and control messages. For potential deadlocks caused by filtering behavior and global synchronization, we propose deadlock avoidance solutions and prove their correctness.
{"title":"Orchestrating safe streaming computations with precise control","authors":"Peng Li, Kunal Agrawal, J. Buhler, R. Chamberlain","doi":"10.1109/PADSW.2014.7097925","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097925","url":null,"abstract":"Streaming computing is a paradigm of distributed computing that features networked nodes connected by first-in-first-out data channels. Communication between nodes may include not only high-volume data tokens but also infrequent and unpredictable control messages carrying control information, such as data set boundaries, exceptions, or reconfiguration requests. In many applications, it is necessary to order delivery of control messages precisely relative to data tokens, which can be especially challenging when nodes can filter data tokens. Existing approaches, mainly data serialization protocols, do not exploit the low-volume nature of control messages and may not guarantee that synchronization of these messages with data will be free of deadlock. In this paper, we propose an efficient messaging system for adding precisely ordered control messages to streaming applications. We use a credit-based protocol to avoid the need to tag data tokens and control messages. For potential deadlocks caused by filtering behavior and global synchronization, we propose deadlock avoidance solutions and prove their correctness.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121051239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097847
Shan Chang, Hongzi Zhu, M. Dong, K. Ota, Xiaoqiang Liu, Guangtao Xue, Xuemin Shen
With the popularity of intelligent mobile devices, an enormous amount of urban information is generated and demanded by the public. In response, ShanghaiGrid (SG) aims to provide abundant information services to the public. With fixed schedules and city-wide coverage, an appealing service in SG is free message delivery using buses, which allows mobile device users to send messages to locations of interest via buses. The main challenge in realizing this service is to provide an efficient routing scheme with privacy preservation under highly dynamic urban traffic conditions. In this paper, we present an innovative scheme, BusCast, to tackle this problem. In BusCast, buses can pick up personal messages and forward them to their destination locations in a store-carry-forward fashion. For each message, BusCast conservatively associates a routing graph rather than a fixed routing path with the message in order to adapt to the dynamics of urban traffic. Meanwhile, privacy information about the user and the message destination is concealed from both intermediate relay buses and outside adversaries. Both rigorous privacy analysis and extensive trace-driven simulations demonstrate the efficacy of the BusCast scheme.
{"title":"BusCast: Flexible and privacy preserving message delivery using urban buses","authors":"Shan Chang, Hongzi Zhu, M. Dong, K. Ota, Xiaoqiang Liu, Guangtao Xue, Xuemin Shen","doi":"10.1109/PADSW.2014.7097847","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097847","url":null,"abstract":"With the popularity of intelligent mobile devices, enormous urban information has been generated and required by the public. In response, ShanghaiGrid (SG) aims to providing abundant information services to the public. With fixed schedule and urban-wide coverage, an appealing service in SG is to provide free message delivery service to the public using buses, which allows mobile device users to send messages to locations of interest via buses. The main challenge in realizing this service is to provide efficient routing scheme with privacy preservation under highly dynamic urban traffic condition. In this paper, we present an innovative scheme BusCast to tackle this problem. In BusCast, buses can pick up and forward personal messages to their destination locations in a store-carry-forward fashion. For each message, BusCast conservatively associates a routing graph rather than a fixed routing path with the message in order to adapt the dynamic of urban traffic. Meanwhile, the privacy information about the user and the message destination is concealed from both intermediate relay buses and outside adversaries. Both rigorous privacy analysis and extensive trace-driven simulations demonstrate the efficacy of BusCast scheme.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123339684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097905
Yunyun Su, Haibin Cai, Jingmin Shi
The information field is undergoing a new round of technological revolution, from the Internet to the Internet of Things. The vehicular ad hoc network (VANET), an application of the Internet of Things used in Intelligent Transportation Systems (ITS), has attracted broad attention worldwide in recent years. It mainly provides vehicle-to-vehicle and vehicle-to-infrastructure communications, which significantly improve road transport efficiency, reduce energy consumption, and ease traffic congestion. In this paper, we developed a client that makes SUMO and NS3 work in parallel through TraCI (Traffic Control Interface) in NS3. It lets NS3 obtain SUMO's information and send instructions to change the states of vehicles and traffic lights. We present a realistic road traffic model with several kinds of vehicles and intelligent traffic lights. The model is built in SUMO (Simulation of Urban Mobility). We use OpenStreetMap to generate a realistic map of the area near the Bund in Shanghai. The traffic flow is built according to a survey, which gives us meaningful and reliable statistics. A mechanism for changing the traffic lights dynamically is introduced to minimize traffic jams and give high priority to emergency vehicles. As a result, the waiting time and the travel duration of the vehicles in the scenario are reduced significantly after using the mechanism, and the emergency vehicle's waiting time is less than that of the other vehicles.
{"title":"An improved realistic mobility model and mechanism for VANET based on SUMO and NS3 collaborative simulations","authors":"Yunyun Su, Haibin Cai, Jingmin Shi","doi":"10.1109/PADSW.2014.7097905","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097905","url":null,"abstract":"The information field is undergoing a new round of technological revolution from the Internet to the Internet of things. Vehicle ad-hoc network (VANET), an application of the internet of things using in Intelligent Transportation System (ITS), has already attracted broad attention in the world in recent years. It mainly provides communications between vehicle-to-vehicle and vehicle-to-infrastructure, which significantly improve road transport efficiency, reduce energy consumption and ease traffic congestion. In this paper, we developed a client to make SUMO and NS3 work parallel by TraCI (Traffic Control Interface) in NS3. It helps NS3 get SUMO's information and sends instructions to change the states of vehicles and traffic lights. We present a realistic road traffic model with kinds of vehicles and intelligent traffic lights. The model is built in SUMO (Simulation of Urban Mobility). We use OpenStreetMap to generate a realistic map near the bund in Shanghai. The traffic flow is built according to a survey which makes us get meaningful and reliable statistics. A mechanism of changing the traffic lights dynamically is introduced to minimize traffic jams and give high priority to emergency vehicle. As a result, the waiting time and the duration of the vehicles in the scenario have reduced significantly after using the mechanism. The emergency vehicle's waiting time is less than others.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132518796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097825
Hailong Shi, Dong Li, H. Chen, J. Qiu, Li Cui
Traditional wireless sensor networks (WSNs) can be integrated into the Internet and regarded as its sensing infrastructure, which supports the development and running of multiple third-party applications simultaneously. Therefore, given the constrained resources of sensor nodes, it is necessary to establish a runtime framework that improves sensor sharing efficiency for concurrent third-party applications. This paper presents EasiCAE, a runtime framework for concurrent applications that greatly enhances sensor sharing efficiency by combining task allocation with redundancy elimination. In brief, EasiCAE decomposes the applications into tasks and distributes the tasks to the sensors that will consume the least energy to run them. EasiCAE has three salient features. First, we define a task-sensor correlation that indicates how many samplings of a sensor can be shared with the new task. Second, EasiCAE reduces energy consumption by assigning tasks to sensors with higher task-sensor correlation. Finally, a lightweight merging algorithm is proposed to eliminate redundant samplings on the assigned sensors. Experimental results show that EasiCAE reduces energy consumption by 31% to 79% compared with existing methods, while introducing tolerable overheads. We also evaluate EasiCAE under various influencing parameters, showing that its performance increases stably as the network scale and the number of concurrent applications increase.
{"title":"EasiCAE: A runtime framework for efficient sensor sharing among concurrent IoT applications","authors":"Hailong Shi, Dong Li, H. Chen, J. Qiu, Li Cui","doi":"10.1109/PADSW.2014.7097825","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097825","url":null,"abstract":"Traditional wireless sensor networks (WSNs) can be integrated into Internet and be regarded as its sensing infrastructure, which supports development and running of multiple third-party applications simultaneously. Therefore, due to constrained resource of sensor nodes, it is necessary to establish a runtime framework to improve sensor sharing efficiency for concurrent third-party applications. This paper presents EasiCAE, a concurrent applications runtime framework, to enhance sensor sharing efficiency greatly by incorporating task allocation with redundancy elimination. In brief, EasiCAE decompose the applications into tasks and distributes tasks to the sensors which will bring the least energy to run them. EasiCAE has three salient features. Firstly, we define task-sensor correlation to indicate how many samplings of a sensor can be shared with the new task. Secondly, EasiCAE reduces energy consumption by assigning tasks to a sensor with higher task-sensor correlation. Finally, a light-weight merging algorithm is proposed to eliminate redundant samplings for the assigned sensors. Experimental results show that EasiCAE reduces energy consumption by 31% to 79% compared with existing methods, while introducing tolerable overheads. We also evaluate EasiCAE with various influencing parameters, showing that the performance of EasiCAE increases stably as the network scale and the number of concurrent applications increases.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132625759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097795
Chiao-Yun Tu, Yuan-Ying Chang, C. King, Chien-Ting Chen, Tai-Yuan Wang
General-purpose computing on graphics processing units (GPGPU) can provide orders of magnitude more computing power than general-purpose processors (CPUs) for highly parallel applications. For such parallel applications, the memory traffic pattern of GPGPUs differs considerably from that of CPUs. This gives rise to opportunities for optimizing the on-chip interconnection network (NoC) of GPGPUs. In this work, we first investigate the characteristics of GPGPU memory traffic for typical benchmarks and categorize the memory traffic patterns. Different traffic patterns require different throughput in the request and reply paths of the NoC to match the network load. To meet this requirement, we examine the feasibility of scaling the network frequency dynamically to balance the throughput of the request and reply networks. The decision is guided by monitoring some shader cores to identify the memory traffic pattern. Performance evaluation shows that this dynamic frequency tuning design achieves up to a 27% execution speedup over a baseline setting and a 7.4% improvement on average.
{"title":"Traffic-aware frequency scaling for balanced on-chip networks on GPGPUs","authors":"Chiao-Yun Tu, Yuan-Ying Chang, C. King, Chien-Ting Chen, Tai-Yuan Wang","doi":"10.1109/PADSW.2014.7097795","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097795","url":null,"abstract":"General-purpose computing on graphics processing units (GPGPU) can provide orders of magnitude more computing power than general purpose processors (CPU) for highly parallel applications. For such parallel applications, the memory traffic pattern of GPGPUs behaves considerably different from that of CPUs. This gives rise to opportunities for optimizing the on-chip interconnection network (NoC) of GPGPUs. In this work, we first investigate the characteristics of GPGPU memory traffic of typical benchmarks and categorize the memory traffic patterns. Different traffic patterns require different throughput in the request and reply paths of the NoC to match the network load. To meet this requirement, we examine the feasibility of scaling the network frequency dynamically to balance the throughput of the request and reply networks. The decision is guided by monitoring some shader cores to identify the memory traffic pattern. Performance evaluation shows that this dynamic frequency tuning design can achieve up to 27% improvement in terms of execution speedup compared to a baseline setting and 7.4% improvement on average.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"206 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128612028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097794
Jie Sun, Xiaofei Liao, Long Zheng, Hai Jin, Yu Zhang
Deterministic multithreading (DMT) systems are well known for eliminating the harmful program behaviors caused by nondeterminism, i.e., they always drive program execution along the same thread schedule for the same given input. To achieve this goal, two kinds of schedules are enforced by existing DMT systems: 1) a mem-based schedule ensures determinism with a total order on the shared memory accesses, and 2) a sync-based schedule achieves it by enforcing only a total order on the synchronization operations. A mem-schedule achieves full determinism but suffers from prohibitive overhead, while a sync-schedule mitigates this overhead but cannot ensure determinism for schedules with data races, i.e., it provides only partial determinism. Much recent research is devoted to hybrid schedules that combine the determinism of mem-schedules with the efficiency of sync-schedules. However, these systems suffer from practicality and scalability problems caused by characteristics such as trace collection in advance and huge schedule memoization. To address this problem, this paper proposes esDMT, an efficient and scalable DMT system based on a new memory isolation technique. It improves efficiency by executing each thread in parallel within its private virtual memory, and it defers the determinism guarantee by committing private memory to shared memory in a deterministic order according to a deterministic lock algorithm, thus further reducing the overhead of inter-thread waiting. In contrast to previous hybrid work, which avoids the nondeterminism of schedules with data races offline based on enormous historical records, our key insight is to eliminate this nondeterminism online at runtime. Our experimental results on the PARSEC benchmarks show that esDMT eliminates the nondeterminism successfully, attains almost the same performance as a sync-schedule (with at most an 18% slowdown compared with the pthread library), and exhibits good scalability on an 8-core machine.
{"title":"esDMT: Efficient and scalable deterministic multithreading through memory isolation","authors":"Jie Sun, Xiaofei Liao, Long Zheng, Hai Jin, Yu Zhang","doi":"10.1109/PADSW.2014.7097794","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097794","url":null,"abstract":"Deterministic multithreading (DMT) system is well-known to eliminate the harmful program behaviors caused by nondeterminism, i.e., always proceeding the program execution into the same thread schedule for the same given input. To achieve this goal, two kinds of schedules are enforced by existing DMT systems. 1) A mem-based schedule ensures the determinism with the total order of the shared memory accesses, and 2) A sync-based schedule makes it by only enforcing the total order of the synchronization operations. Mem-schedule achieves full determinism but suffers from prohibitive overhead; while sync-schedule mitigates this overhead but cannot ensure the determinism for the race schedules, i.e., part determinism. Much recent research is devoted to the hybrid schedule combining the determinism of mem-schedule and efficiency of sync-schedule. However, they suffer from the practicability and scalability problems due to the defects of their technical characteristics, such as trace collection in advance and huge schedule memoization. To address the above problem, this paper proposes esDMT, an efficient and scalable DMT system using a new technique of memory isolation. It can improve the efficiency by proceeding the execution of each thread in parallel within its private virtual memory, and defers the determinism guarantee by updating private memory into shared memory in a deterministic order according to deterministic lock algorithm, thus further reducing the overhead of inter-thread waiting. In contrast to the previous hybrid work avoiding the nondeterminism of race schedules offline based on the enormous historical records, our key insight is to eliminate the nondeterminism of race schedules online at runtime. Our experimental results on PARSEC benchmarks show that esDMT eliminates the nondeterminism successfully, almost gains the same performance as the sync-schedule (with <;18% slowdown compared with pthread library at most), and manifests good scalability on an 8-core machine.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126892420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097881
A. Banerjee, H. Paul, A. Mukherjee, P. Datta, Sajal K. Das
In a mobile grid computing framework where mobile devices are used as computing resources, minimizing the task offloading time remains an important issue. A task is an independent unit of execution consisting of an input data volume for execution and, optionally, a target-specific executable. We consider a mobile grid infrastructure where mobile devices are connected via a Wi-Fi network and the grid infrastructure has a set of tasks (i.e., a set of data volumes) to be transferred to a subset of the mobile devices. In a Wi-Fi network, mobile devices usually associate themselves with the access points (APs) having the strongest radio signal. In this paper, we address the problem of AP activation (by frequency assignment) and AP-device association so as to minimize the overall data-transfer completion time. We present a constraint-based formulation and a heuristic as solutions. Simulation results are presented that contrast our proposed methods with some earlier works.
{"title":"An access point to device association technique for optimized data transfer in mobile grids","authors":"A. Banerjee, H. Paul, A. Mukherjee, P. Datta, Sajal K. Das","doi":"10.1109/PADSW.2014.7097881","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097881","url":null,"abstract":"In a mobile grid computing framework where mobile devices are used as computing resources, minimizing the task offloading time remains an important issue. A task is an independent unit of execution consisting of a input data volume for execution and optionally a target-specific executable. We consider a mobile grid infrastructure where mobile devices are connected via Wi-Fi network and the grid infrastructure has a set of tasks (i.e. a set of data volumes) to be transferred to a subset of the mobile devices. In a Wi-Fi network, mobile devices usually associate themselves to the access points (APs) having the strongest radio signal. In this paper, we address the problem of AP activation (by frequency assignment) and association of AP with devices in the context of minimizing the overall data-transfer completion time. We present a constraint based formulation and also a heuristic as solutions. Simulations results are presented which contrast our proposed methods with some of the earlier works.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126673489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2014-12-01, DOI: 10.1109/PADSW.2014.7097801
Kai Bu, Jia Liu, Bin Xiao, Xuan Liu, Shigeng Zhang
Radio-frequency identification (RFID) technology has fostered many object monitoring systems. Along with this trend, the value and privacy of tagged objects become a primary concern. A correspondingly important problem is to verify the intactness of a set of tagged objects without leaking tag identifiers (IDs). However, existing solutions necessitate knowledge of tag IDs. Without tag IDs known a priori, this paper studies intactness verification in anonymous RFID systems. We identify three critical solution requirements: deterministic verification, anonymity preservation, and scalability. We propose Cardiff and Divar, two crypto-free, lightweight protocols that isolate tag IDs from intactness verification and satisfy these requirements. Cardiff uses tag cardinality as the intactness proof, while Divar leverages Direct-Sequence Spread Spectrum (DSSS)-enabled RFID. Both analytical and simulation results demonstrate that Cardiff and Divar satisfy the requirements of accuracy, privacy, and scalability.
{"title":"Intactness verification in anonymous RFID systems","authors":"Kai Bu, Jia Liu, Bin Xiao, Xuan Liu, Shigeng Zhang","doi":"10.1109/PADSW.2014.7097801","DOIUrl":"https://doi.org/10.1109/PADSW.2014.7097801","url":null,"abstract":"Radio-Frequency Identification (RFID) technology has fostered many object monitoring systems. Along with this trend, tagged objects' value and privacy become a primary concern. A corresponding important problem is to verify the intactness of a set of tagged objects without leaking tag identifiers (IDs). However, existing solutions necessitate the knowledge of tag IDs. Without tag IDs as a priori, this paper studies intactness verification in anonymous RFID systems. We identify three critical solution requirements, that is, deterministic verification, anonymity preservation, and scalability. We propose Cardiff and Divar, two crypto-free, lightweight protocols that isolate tag IDs from intactness verification and satisfy solution requirements. Cardiff explores tag cardinality as intactness proof while Divar leverages Direct-Sequence Spread Spectrum (DSSS) enabled RFID. Both analytical and simulation results demonstrate that Cardiff and Divar can satisfy the requirements of accuracy, privacy, and scalability.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116819023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}