Caching in cloud environments has become an important research topic in recent years, and cache replacement in hybrid clouds deserves particular attention. Existing work gives too little consideration to several aspects, such as the selection of files pending caching, the prefetching of those files among different clouds, and the cost of recovering files. To address these shortcomings, this paper proposes an optimized LRU algorithm based on pre-selection and cache prefetching of files. Before adding a file to the cache, the algorithm checks whether the file meets the pre-selection and cache prefetching conditions, and it implements a priority-based LRU cache replacement algorithm. The algorithm divides the cache into multiple priority queues and uses LRU to select a replacement candidate in each queue. It then gathers the candidates from all priorities and replaces the one with the minimum probability of being accessed again. Compared with three typical cache replacement algorithms (GD-Size, LRU, and LFU), experimental results show that the proposed algorithm not only saves cost effectively but also greatly improves the byte hit rate, delay savings rate, and cache hit rate.
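The replacement step described in this abstract (one LRU queue per priority, then a choice among the per-queue candidates) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PriorityLRUCache`, the capacity counted in number of files, and the caller-supplied `reaccess_score` function (an estimate of the probability of being accessed again) are all assumptions, and the pre-selection and prefetching conditions are not modeled.

```python
from collections import OrderedDict


class PriorityLRUCache:
    """Cache split into per-priority LRU queues (illustrative sketch only)."""

    def __init__(self, capacity, num_priorities, reaccess_score):
        self.capacity = capacity  # assumed: maximum number of cached files (>= 1)
        self.queues = [OrderedDict() for _ in range(num_priorities)]
        # Hypothetical helper: maps a file name to an estimated
        # probability of being accessed again.
        self.reaccess_score = reaccess_score
        self.size = 0

    def get(self, name):
        for queue in self.queues:
            if name in queue:
                queue.move_to_end(name)  # mark as most recently used
                return queue[name]
        return None

    def put(self, name, data, priority):
        """Insert a file; return the evicted file name, or None."""
        for queue in self.queues:
            if name in queue:  # re-inserting an existing file
                del queue[name]
                self.size -= 1
                break
        evicted = self._evict() if self.size >= self.capacity else None
        self.queues[priority][name] = data
        self.size += 1
        return evicted

    def _evict(self):
        # The LRU candidate of each non-empty queue is its oldest entry.
        candidates = [next(iter(queue)) for queue in self.queues if queue]
        # Among the candidates, drop the one least likely to be accessed again.
        victim = min(candidates, key=self.reaccess_score)
        for queue in self.queues:
            if victim in queue:
                del queue[victim]
                break
        self.size -= 1
        return victim
```

Each `put` on a full cache compares only one candidate per priority queue, so the cost of choosing a victim grows with the number of priorities rather than with the number of cached files.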
{"title":"The Optimization of LRU Algorithm Based on Pre-Selection and Cache Prefetching of Files in Hybrid Cloud","authors":"Shumeng Du, Chunlin Li, XiJun Mao, Wei Yan","doi":"10.1109/PDCAT.2016.039","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.039","url":null,"abstract":"In recent years, the research on caching in cloud environment has become an important research topic, and it has profound meaning to research the cache replacement algorithm in hybrid Cloud. There aren't enough considerations on some aspects, such as the selection of pending cache files, the prefetching of pending cache files among different clouds and the cost of recovery of files. Considering those shortages, this paper proposes an optimized LRU algorithm based on pre-selection and cache prefetching of files. This algorithm determines whether the file is to meet the pre-selection and cache prefetching conditions before adding a cache file, and it implements the LRU cache replacement algorithm which is based on priority. The algorithm divides the cache into multiple priority queues, and uses the LRU cache replacement algorithm to select the replacement file in each queue. Then select the files in each priority and put them together, select the file to perform replacement operation which has minimum probability of being accessed again. 
Compared with three typical cache replacement algorithm GD-Size, LRU, LFU, experimental results show that the cache replacement algorithm in this paper not only effectively save cost, but also greatly enhance the byte hit rate, delay savings rate and cache hit rate.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116806365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Peibing Du, Hao Jiang, Housen Li, Lizhi Cheng, Canqun Yang
Polynomials are widely used in scientific computing and engineering. In this paper, we present an accurate and fast compensated algorithm to evaluate bivariate polynomials with floating-point coefficients. The algorithm applies error-free transformations to the bivariate Horner scheme and sums the final decomposition accurately. Using forward error analysis, we also prove that the accuracy of the computed result is similar to that of the bivariate Horner scheme computed in twice the working precision. Numerical experiments illustrate this behavior and show that the algorithm is more efficient than the bivariate Horner scheme implemented with a double-double library.
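To make the idea of error-free transformations concrete, the sketch below shows the well-known univariate compensated Horner scheme, built from the classical TwoSum and TwoProduct (Veltkamp/Dekker splitting) transformations. It is not the bivariate algorithm of this paper; the function names are illustrative, and the code assumes IEEE 754 double precision with round-to-nearest and no overflow or underflow.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e


def split(a):
    """Veltkamp splitting of a double into high and low parts."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Return (p, e) with p = fl(a * b) and a * b = p + e exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def horner(coeffs, x):
    """Plain Horner scheme; coeffs are ordered from highest degree down."""
    s = coeffs[0]
    for a in coeffs[1:]:
        s = s * x + a
    return s


def compensated_horner(coeffs, x):
    """Horner scheme plus a running correction built from rounding errors."""
    s = coeffs[0]
    c = 0.0
    for a in coeffs[1:]:
        p, pi = two_prod(s, x)    # rounding error of the product
        s, sigma = two_sum(p, a)  # rounding error of the sum
        c = c * x + (pi + sigma)  # Horner recurrence on the errors
    return s + c


if __name__ == "__main__":
    # (x - 1)**5 in expanded form is ill-conditioned near x = 1.
    coeffs = [1.0, -5.0, 10.0, -10.0, 5.0, -1.0]
    x = 1.0001
    print("plain      :", horner(coeffs, x))
    print("compensated:", compensated_horner(coeffs, x))
    print("reference  :", (x - 1.0) ** 5)
```

The correction term `c` accumulates the rounding errors that the plain recurrence discards, which is why the result behaves as if the evaluation had been carried out in roughly twice the working precision.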
{"title":"Accurate Evaluation of Bivariate Polynomials","authors":"Peibing Du, Hao Jiang, Housen Li, Lizhi Cheng, Canqun Yang","doi":"10.1109/PDCAT.2016.026","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.026","url":null,"abstract":"Polynomials are widely used in scientific computing and engineering. In this paper, we present an accurate and fast compensated algorithm to evaluate bivariate polynomials with floating-point coefficients. This algorithm is applying error free transformations to the bivariate Horner scheme and sum the final decomposition accurately. We also prove the proposed algorithm's accuracy with forward error analysis that the accuracy of the computed result is similar to the result computed by the bivariate Horner scheme in twice the working precision. Numerical experiments illustrate the behavior and it has higher efficiency than the bivariate Horner scheme implemented in double-double library.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122842358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimizing for Non-Uniform Memory Access (NUMA) systems could be considered inappropriate because hardware-architecture-aware optimizations are not portable. On the contrary, this paper supports the idea that developing NUMA-aware optimizations improves performance and energy consumption on NUMA systems and that these optimizations may be considered portable when they are not static. This paper introduces NUMA Balanced Thread and Data Mapping (BTDM), an extension of the PThreads4w API [1]. NUMA-BTDM employs the concept of balanced data locality, improving thread and data mapping for NUMA systems. The purpose is to combine task parallelism with balanced data locality in order to obtain both better performance and reduced energy consumption on NUMA systems at run time. The implementation of NUMA-BTDM targets homogeneous architectures based on the energy model with constant energy consumption or on the energy model in which each core is powered from a separate source (architectures on which parallel execution may reduce energy consumption compared to serial execution).
{"title":"NUMA-BTDM: A Thread Mapping Algorithm for Balanced Data Locality on NUMA Systems","authors":"Iulia Stirb","doi":"10.1109/PDCAT.2016.074","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.074","url":null,"abstract":"Optimizing for Non-Uniform Memory Access (NUMA) systems could be considered inappropriate because hardware architecture aware optimizations are not portable. On the contrary, this paper supports the idea that developing NUMA aware optimizations improves performance and energy consumption on NUMA systems and that these optimizations may be considered portable when they are non static. This paper introduces NUMA Balanced Thread and Data Mapping (BTDM), an extension of PThreads4w API [1]. NUMA-BTDM employs balanced data locality concept, improving thread and data mapping for NUMA systems. The purpose is to combine task parallelism with balanced data locality in order to obtain both better performance and reduced energy consumption on NUMA systems at run-time. The implementation of NUMA-BTDM targets homogeneous architectures based on the energy model with constant energy consumption or on the energy model in which each core is powered from a separate source (architectures on which parallel execution may reduce energy consumption compared to serial execution).","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130608723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although the broadband access network has improved greatly as technologies have developed, it still faces challenges in managing and maintaining existing resources efficiently. To build an intelligent and open network architecture and to solve the problem of heterogeneous networks consisting of devices from different vendors, we have developed a web-based management system implementing the concepts of Software-Defined Networking (SDN) and Network Functions Virtualization. The control plane is centralized in the Controller layer and decoupled from the forwarding layer. The proposed framework is also applicable to old routers that do not support SDN, by means of an Agent on the router that translates OpenFlow messages. For a more intelligent routing scheme, the controller can compute routes with a fine-tuned ant colony optimization algorithm. On top of the controller, the web-based management system is accessible to operators, who can manage the resources they possess. With this framework, we achieve the goal of an intelligent and open network architecture and verify it.
{"title":"Managing Broadband Access Network with a SDN-Based System","authors":"Junpeng Guo, Xiaohan Gao, Rentao Gu","doi":"10.1109/PDCAT.2016.022","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.022","url":null,"abstract":"Admittedly, the broadband access network has been improved largely with the developing technologies, it is still facing challenges on managing and maintaining existed resources efficiently. In order to build up an intelligent and open network architecture, and solve the problem of heterogeneous networks consisting of devices from different vendors, we have worked out a web-based managing system implementing the concept of Software-Defined Network (SDN) and Network Functions Virtualization. The controlling plane is centered into the Controller layer and decoupled from the forwarding layer. The frame we proposed is also applicable for old routers, which do not support SDN, with an Agent on it to translate the OpenFlow messages. For a more intelligent routing schema, the controller is able to calculate with a fine-tuned ant colony optimization algorithm. At the top of the controller, the web-based managing system is accessible for operators, and they can manage the resource they possessed. With the above framework, we achieve the goal of an intelligent and open network architecture and verify it.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126287199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The huge energy consumption of large-scale cloud data centers damages the environment through excessive carbon emissions. More and more data center operators are seeking to reduce their carbon footprint by using various types of renewable energy sources. However, the intermittent availability of renewable energy makes it quite challenging to match dynamic workload arrivals. In this paper, we investigate how to coordinate multiple types of renewable energy (e.g., wind power and solar power) to reduce the long-term energy cost, given the spatio-temporal diversity of electricity prices, for geo-distributed cloud data centers under the constraints of the service level agreement (SLA) and carbon footprint. To tackle the randomness of workload arrivals, dynamic electricity price changes, and renewable energy generation, we first formulate the energy cost minimization problem as a constrained stochastic optimization problem. Then, based on the Lyapunov optimization technique, we design an online control algorithm that solves the problem without long-term future system information. Finally, we evaluate the effectiveness of the algorithm with extensive simulations based on real-world workload traces, electricity prices, and historical climate data.
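The reason a Lyapunov-based controller needs no long-term future information can be seen in the generic drift-plus-penalty rule, sketched below for a single workload queue. This is the textbook form of the technique, not the algorithm of this paper: renewable sources, SLA, and carbon constraints are left out, and the names `drift_plus_penalty_step`, `simulate`, `service_options`, and the trade-off parameter `v` are illustrative assumptions.

```python
def drift_plus_penalty_step(backlog, price, arrivals, service_options, v):
    """Choose how much work to serve in one time slot.

    Minimizes v * cost - backlog * service over the allowed service
    amounts, using only the current backlog and the current price.
    """
    served = min(service_options, key=lambda b: v * price * b - backlog * b)
    # Queue update: work that is not served stays queued, new work arrives.
    next_backlog = max(backlog - served, 0) + arrivals
    return served, next_backlog


def simulate(prices, arrivals, service_options, v):
    """Run the per-slot rule over a trace; return (total cost, final backlog)."""
    backlog = 0
    total_cost = 0.0
    for price, arrived in zip(prices, arrivals):
        served, backlog = drift_plus_penalty_step(
            backlog, price, arrived, service_options, v
        )
        total_cost += price * served
    return total_cost, backlog
```

With a linear cost the rule reduces to a threshold: work is served only when the backlog exceeds `v * price`. A larger `v` therefore waits for cheaper slots and lowers cost at the expense of longer queues.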
{"title":"Green-Aware Online Resource Allocation for Geo-Distributed Cloud Data Centers on Multi-Source Energy","authors":"Huaiwen He, Hong Shen","doi":"10.1109/PDCAT.2016.037","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.037","url":null,"abstract":"Huge energy consumption of large-scale cloud data centers damages the environment with excessive carbon emission. More and more data center operators are seeking to reduce carbon footprint via various types of renewable energy sources. However, the intermittent availability of renewable energy source makes it quite challenging to cooperate the dynamic workload arrivals. In this paper, we investigate how to coordinate multi-type renewable energy (e.g. wind power and solar power) in order to reduce the long-term energy cost with spatio-temporal diversity of electricity price for geo-distributed cloud data centers under the constraints of service level agreement (SLA) and carbon footprints. To tackle the randomness of workload arrival, dynamic electricity price change and renewable energy generation, we first formulate the minimizing energy cost problem into a constrained stochastic optimization problem. Then, based on Lyapunov optimization technique, we design an online control algorithm which can work without long-term future system information for solving the problem. 
Finally, we evaluate the effectiveness of the algorithm with extensive simulations based on real-world workload traces, electricity price and historic climate data.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125747701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtualization technology has brought new vitality to data centers but has also brought some thorny issues. It creates an abstract intermediate layer that separates upper-layer applications from the underlying infrastructure, which makes it difficult for those applications to use resources effectively. Based on this observation, we propose a dynamic load balancing (DLB) system for virtualized environments. The system adjusts the weights of virtual machines in real time in order to balance the load of physical servers and improve data center efficiency. In addition, by monitoring physical machine failure information, we can evacuate physical machines as soon as they fail, ensuring high availability of the data center. We design a virtualized environment monitoring system and propose an efficient algorithm. We evaluate the proposed system with real implementations, which show that DLB performs well.
{"title":"Dynamic Load Balancing for Physical Servers in Virtualized Environment","authors":"Mingming Zhang, Songyun Wang, Gaopan Huang, Yefei Li, S. Zhang, Zhuzhong Qian","doi":"10.1109/PDCAT.2016.057","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.057","url":null,"abstract":"Virtualization technology has brought new vitality to data centers but also brought some thorny issues. Virtualization technology creates an abstract intermediate layer, separating the upper layer applications from the underlying infrastructure, which cause some difficulties to the upper layer applications for the effective use of resources. Based on this observation, we propose a dynamic load balancing system in virtualized environment. The system adjusts the weights of virtual machines in real-time in order to balancing physical servers' load and improve data center efficiency. In addition, by monitoring physical machine failure information, we can evacuate physical machines as soon as they failed to ensure high availability of the data center. We design a virtualized environment monitoring system and propose an efficient algorithm. We evaluate the proposed system with real implementations which show DLB has rather good performance.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121746487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Duplication and dynamic voltage/frequency scaling (DVFS) create an interesting trade-off when scheduling task graphs on multiprocessors to improve energy consumption and schedule length (or makespan). With DVFS, tasks are run at low voltages, which decreases their computation power. However, this also increases their execution costs and hence may increase the schedule length. Furthermore, applying DVFS to processors does not affect the communication delay or the communication energy consumption. Duplicating a task on multiple processors reduces the communication delay among them, which in turn reduces the schedule length. Although duplication reduces the communication energy among processors, it also increases the overall computation energy. In this paper, we explore this trade-off between duplication and DVFS and propose a polynomial-time heuristic to schedule task graphs on heterogeneous multiprocessors. Tasks are carefully duplicated with DVFS to reduce the impact of duplication on the computation energy. The results demonstrate that the proposed algorithm balances makespan and energy consumption more effectively than other algorithms in various scenarios.
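The DVFS side of this trade-off can be illustrated with the common first-order CMOS approximation, in which dynamic power scales as C·V²·f and execution time as cycles/f. The sketch below uses that textbook model only; it is not the energy model of this paper, and the function name `dvfs_cost` and its parameters are assumptions made for illustration.

```python
def dvfs_cost(cycles, voltage, frequency, capacitance=1.0):
    """Return (execution time, energy) under a first-order CMOS model.

    Assumed model: dynamic power = capacitance * voltage**2 * frequency,
    execution time = cycles / frequency, static power ignored.
    """
    time = cycles / frequency
    power = capacitance * voltage ** 2 * frequency
    return time, power * time


if __name__ == "__main__":
    # Same task at full speed and at half voltage / half frequency.
    full = dvfs_cost(cycles=8.0, voltage=1.0, frequency=4.0, capacitance=2.0)
    slow = dvfs_cost(cycles=8.0, voltage=0.5, frequency=2.0, capacitance=2.0)
    print("full speed (time, energy):", full)
    print("scaled down (time, energy):", slow)
```

Under this model the energy of a task is capacitance · voltage² · cycles, so halving the voltage cuts the energy to a quarter while halving the frequency doubles the execution time. That is the tension between energy and schedule length that the abstract describes.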
{"title":"Energy Aware Scheduling on Heterogeneous Multiprocessors with DVFS and Duplication","authors":"Jagpreet Singh, Aditya Gujral, Harmandeep Singh, Jagbeer Singh, Nitin Auluck","doi":"10.1109/PDCAT.2016.036","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.036","url":null,"abstract":"Duplication and dynamic voltage/frequency scaling (DVFS) creates an interesting trade-off for scheduling task graphs on multiprocessors to improve energy consumption and schedule length (or makespan). With DVFS, tasks are made to run on low voltages, which decreases their computation power. However, it also increases their execution costs and hence, may increase the schedule length. Furthermore, applying DVFS on processors does not impact the communication delay/energy consumption. Duplicating a task on multiple processors reduces the communication delay among them, which further reduces the schedule length. Although duplication reduces the communication energy among processors, it also increases the overall computation energy. In this paper, we explore this trade-off between duplication and DVFS, and propose a polynomial time heuristic to schedule task graphs on heterogeneous multiprocessors. The tasks are carefully duplicated with DVFS to reduce its impact on the computation energy. 
The results demonstrate that the proposed algorithm is able to effectively balance the makespan and energy consumption over other algorithms in various scenarios.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131744789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Truck arrival management is a very active stream of research and a crucial challenge for cross-dock terminals. This study focuses on the truck congestion problem, which leads to lower operational efficiency and longer waiting times at the gate and in the yard. One of the operational measures for solving this problem is the truck appointment system, which is used to coordinate the major cross-dock planning activities and to regulate the arrival times of trucks at the cross-dock. When a trucker gets an appointment time different from the preferred time, the difference is called the truck deviation time. Because this deviation affects the daily operations schedule, an optimization model for truck appointment is proposed in this paper. In the model, the truck deviation time is minimized subject to resource availability constraints, including dock doors, yard zones, gate lanes, workforce, and material handling systems. To solve the model, we design a multi-agent-based method for real-time truck scheduling that takes into account the uncertainty of arrival time as an operational characteristic. It ensures negotiation among truck agents and resource agents. Lastly, numerical experiments are provided to illustrate the validity of the model and the working and benefits of our approach.
{"title":"An Application Oriented Multi-Agent Based Approach to Dynamic Truck Scheduling at Cross-Dock","authors":"Houda Zouhaier, L. B. Said","doi":"10.1109/PDCAT.2016.058","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.058","url":null,"abstract":"Truck arrival management forms a very active stream of research and a crucial challenge for a cross-dock terminals. The study focuses on the truck congestion problem, which leads to a lower operation efficiency and a longer waiting time at the gate and at the yard. One of the operational measures to solve this problem is the truck appointment system. It is used to coordinate the major cross-dock planning activities and to regulate the arrival time of trucks at the cross-dock. When the trucker get an appointment time different to its preference time, then we are talking about a truck deviation time. Because the deviation will result in daily operations schedule, an optimization model for truck appointment was proposed in this paper. In the model, the truck deviation time was minimized subject to the constraints of resources availability including dock doors, yard zones, gate lanes, workforce and material handling systems. To solve the model, a method based multi-agent system to real-time truck scheduling, that take into account the uncertainty of arrival time as an operational characteristic, was designed. It ensures a negotiation among truck agents and resource agents. 
Lastly, a numerical experiments are provided to illustrate the validity of the model and to illustrate the working and benefit of our approach.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123191796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many parallel programs are intended to yield deterministic results, but unpredictable thread or process interleavings can lead to subtle bugs and nondeterminism. We proposed a producer-consumer virtual memory, SPMC, for efficient system-enforced deterministic parallelism, and prototyped the SPMC model and its software stack entirely in Linux user space, called DLinux. This paper summarizes the implementation policies and limitations of our previous DLinux. To reduce the SPMC page fault overhead and the suspend/resume overhead, which severely degrade the performance of DLinux, we enhance the SPMC model with nonblocking test and direct read and write primitives. Based on the extended SPMC model, we improve the implementation of upper programming abstractions. Experimental results show that, relative to the previous version, the new DLinux can improve the performance of NPB workloads by up to 2.33X and 1.76X on 8 and 16 processes, respectively. For CG on 8 processes, its runtime relative to MPICH2 decreases from 4.12X to 1.77X.
{"title":"Making User-Level VMM for Deterministic Parallelism Nonblocking and Efficient","authors":"Yu Zhang, Jiange Zhang, Qiliang Zhang","doi":"10.1109/PDCAT.2016.042","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.042","url":null,"abstract":"Many parallel programs are intended to yield deterministic results, but unpredictable thread or process interleavings can lead to subtle bugs and nondeterminism. We proposed a producer-consumer virtual memory–Many parallel programs are intended to yield deterministic results, but unpredictable thread or process interleavings can lead to subtle bugs and nondeterminism. We proposed a producer-consumer virtual memory–SPMC–for efficient system-enforced deterministic parallelism, and prototyped the SPMC model and its software stack entirely in Linux user space, called DLinux. This paper summarizes the implementation policies and limitations in our previous DLinux. To reduce SPMC page fault overhead and suspend/resume overhead which severely degrade the performance of DLinux, we enhance the SPMC model with nonblocking test and direct read and write primitives. Based on the extended SPMC model, we improve the implementation of upper programming abstractions. Experimental results show that relative to the previous version, the new DLinux can improve the performance of NPB workloads up to 2.33X and 1.76X on 8 and 16 processes, respectively. For CG on 8 processes, its runtime relative to MPICH2 decreases from 4.12X to 1.77X. SPMC–for efficient system-enforced deterministic parallelism, and prototyped the SPMC model and its software stack entirely in Linux user space, called DLinux. This paper summarizes the implementation policies and limitations in our previous DLinux. To reduce SPMC page fault overhead and suspend/resume overhead which severely degrade the performance of DLinux, we enhance the SPMC model with nonblocking test and direct read and write primitives. 
Based on the extended SPMC model, we improve the implementation of upper programming abstractions. Experimental results show that relative to the previous version, the new DLinux can improve the performance of NPB workloads up to 2.33X and 1.76X on 8 and 16 processes, respectively. For CG on 8 processes, its runtime relative to MPICH2 decreases from 4.12X to 1.77X.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126304331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Resisting combined geometric attacks effectively while maintaining a high embedding capacity remains a challenging task in digital watermarking research. This paper proposes an affine-correction-based algorithm that can resist combined geometric attacks while keeping a high watermark embedding capacity. The SURF algorithm and the RANSAC algorithm are used to extract, match, and select feature points from the attacked image and the original image. Then, the least squares algorithm is used to estimate the affine matrix of the geometric attacks from the relationship between the matched feature points. The attacks are corrected based on the estimated affine matrix. A fine correction step is included to improve the precision of watermark detection. To resist cropping attacks, the watermark information is encoded with LT coding. The encoded watermark is embedded in the DWT-DCT composite domain of the image. Experimental results show that the proposed algorithm not only has a high embedding capacity but is also robust to many kinds of geometric attacks.
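The least squares step mentioned in this abstract, estimating an affine matrix from matched feature points, can be sketched in plain Python as below. This is a generic illustration under stated assumptions, not the authors' code: the point pairs are assumed to be already matched and filtered (the SURF and RANSAC stages are not shown), and the function names `solve3` and `estimate_affine` are hypothetical.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def estimate_affine(src, dst):
    """Least squares fit of u = a*x + b*y + c and v = d*x + e*y + f.

    src and dst are equally long lists of matched (x, y) and (u, v) points.
    Returns ([a, b, c], [d, e, f]).
    """
    ata = [[0.0] * 3 for _ in range(3)]  # normal matrix A^T A
    atu = [0.0] * 3                      # A^T u
    atv = [0.0] * 3                      # A^T v
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atu[i] += row[i] * u
            atv[i] += row[i] * v
    return solve3(ata, atu), solve3(ata, atv)
```

At least three non-collinear point pairs are needed. With more pairs the fit averages out localization noise in the feature points, and applying the inverse of the estimated transform to the attacked image is what undoes the geometric distortion.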
{"title":"Affine Correction Based Image Watermarking Robust to Geometric Attacks","authors":"Wuyong Zhang, Jianhua Chen, Rongshu Wang, Xiaolong Wang, Tian Meng","doi":"10.1109/PDCAT.2016.046","DOIUrl":"https://doi.org/10.1109/PDCAT.2016.046","url":null,"abstract":"How to resist combined geometric attacks effectively while maintain a high embedding capacity is still a challenging task for the digital watermarking research. An affine correction based algorithm is proposed in this paper, which can resist combined geometric attacks and keep a higher watermark embedding capacity. The SURF algorithm and the RANSAC algorithm are used to extract, match and select feature points from the attacked image and the original image. Then, the least square algorithm is used to estimate the affine matrix of the geometric attacks according to the relationship between the matched feature points. The attacks are corrected based on the estimated affine matrix. A fine correction step is included to improve the precision of the watermark detection. To resist the cropping attacks, the watermark information is encoded with LT-coding. The encoded watermark is embedded in the DWT-DCT composite domain of the image. 
Experimental results show that the proposed algorithm not only has a high embedding capacity, but also is robust to many kinds of geometric attacks.","PeriodicalId":203925,"journal":{"name":"2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122263365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}