2012 International Conference on High Performance Computing & Simulation (HPCS): Latest Papers

Acceleration of variance of color differences-based demosaicing using CUDA
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266965
Muhammad Ismail Faruqi, Fumihiko Ino, K. Hagihara
Image demosaicing algorithms reconstruct a full-color image from the incomplete color samples (RAW data) output by an image sensor overlaid with a Color Filter Array (CFA). Better demosaicing algorithms are superior in terms of acuity, dynamic range, signal-to-noise ratio, and artifact suppression, which makes them suitable for high-quality delivery such as theatrical broadcast. In this paper, we examine the feasibility of exploiting the Graphics Processing Unit (GPU) as an emerging accelerator to create an on-the-fly implementation of Variance of Color Differences (VCD) demosaicing, a state-of-the-art heuristic demosaicing algorithm developed to eliminate false-color artifacts in textured regions of images. Our contributions are 1) implementing the algorithm as several kernels to separate the bottleneck portion of the algorithm from the rest and to minimize idle threads, and 2) reducing I/O between shared and global memory during green channel interpolation by separating the input RAW data into four channels. We then compare the implementation featuring both acceleration methods with a single-kernel implementation. In our experiments, the proposed acceleration methods achieved a per-frame processing time of 343 ms on an NVIDIA GTX 480, which translates into 2.95 fps. The proposed methods also improved the kernel time and the effective memory bandwidth by a factor of 2.1× compared with the single-kernel counterpart.
Citations: 1
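The abstract above mentions separating the input RAW data into four channels before green channel interpolation. A minimal host-side sketch of that separation follows, assuming an RGGB Bayer layout; the layout, the function name `split_bayer_rggb`, and the sample values are assumptions for illustration and are not taken from the paper, whose CUDA kernels are not shown in this listing.

```python
def split_bayer_rggb(raw):
    """Split a 2D Bayer mosaic (list of rows) into four half-resolution planes.

    Assumes an RGGB layout (an assumption of this sketch): even rows hold
    R, G, R, G, ... and odd rows hold G, B, G, B, ...
    """
    if len(raw) % 2 or any(len(row) % 2 for row in raw):
        raise ValueError("mosaic must have even height and width")
    r = [row[0::2] for row in raw[0::2]]    # red samples
    g1 = [row[1::2] for row in raw[0::2]]   # green samples on red rows
    g2 = [row[0::2] for row in raw[1::2]]   # green samples on blue rows
    b = [row[1::2] for row in raw[1::2]]    # blue samples
    return r, g1, g2, b


# Hypothetical 4x4 mosaic used only to show the indexing.
raw = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
r, g1, g2, b = split_bayer_rggb(raw)
```

Each plane is contiguous after the split, which is the property that would let neighbouring threads read neighbouring samples of the same color.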
A method for communication efficient work distributions in stencil operation based applications on heterogeneous clusters
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266960
J. Schneible, L. Ríha, Maria Malik, T. El-Ghazawi, A. Alexandru
In recent years, the use of accelerators in conjunction with CPUs, known as heterogeneous computing, has brought about significant performance increases for scientific applications. One of the best examples is Lattice Quantum Chromodynamics (QCD), a stencil-operation-based simulation. These simulations have a large memory footprint, necessitating the use of many graphics processing units (GPUs) in parallel. This requires a heterogeneous cluster with one or more GPUs per node. To obtain optimal performance, it is necessary to determine an efficient communication pattern between GPUs on the same node and between nodes. In this paper we present a performance-model-based method for minimizing the communication time of applications with stencil operations, such as Lattice QCD, on heterogeneous computing systems with a non-blocking InfiniBand interconnection network. The proposed method increases the performance of the most computationally intensive kernel of Lattice QCD by 25 percent due to improved overlapping of communication and computation.
Citations: 0
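The abstract above attributes its gain to overlapping communication with computation. The sketch below illustrates why overlap helps, using a common textbook simplification rather than the paper's own performance model: the halo exchange runs while the interior is computed, and the boundary is updated once both have finished. The function name `step_time` and the timings are hypothetical.

```python
def step_time(t_interior, t_boundary, t_comm, overlap=True):
    """Estimated time of one stencil step under a simplified model.

    Without overlap, interior compute, halo exchange, and boundary compute
    run back to back. With overlap, the halo exchange proceeds while the
    interior is computed, and the boundary is updated after both finish.
    """
    if overlap:
        return max(t_interior, t_comm) + t_boundary
    return t_interior + t_comm + t_boundary


# Hypothetical timings in milliseconds.
serial = step_time(8.0, 1.0, 3.0, overlap=False)
overlapped = step_time(8.0, 1.0, 3.0, overlap=True)
```

In this model communication is fully hidden whenever the interior computation takes at least as long as the exchange, so the work distribution matters as much as the raw link speed.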
Scalable high performance computing in wide area network
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266993
R. Hassani, P. Luksch
Many parallel applications in High Performance Computing have to communicate via a wide area network (WAN), e.g., in a grid or cloud environment that spans multiple sites. Communication across WAN links slows down the application due to high latency and low bandwidth. Much of this overhead is due to the current implementations of the MPI (Message Passing Interface) standard. My project aims at improving the WAN performance of MPI. Most of today's wide-area MPI implementations rely on the TCP/IP protocol. I propose to replace it with an innovative concurrent multipath communication method (CMC-SCTP) and integrate it with the Open MPI project, which will increase bandwidth and enhance fault resilience within the MPI protocol stack in WAN environments. I plan to make my research results available to the community within the scope of the Open MPI project.
Citations: 5
An energy-aware multi-start local search heuristic for scheduling VMs on the OpenNebula cloud distribution
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266899
Y. Kessaci, N. Melab, E. Talbi
Reducing energy consumption is an increasingly important issue in cloud computing, particularly when dealing with a cloud distribution spread over a huge number of machines. Minimizing energy consumption can significantly reduce both energy bills and greenhouse gas emissions. Therefore, much research is being carried out to develop new methods that consume less energy. In this paper, we present an Energy-aware Multi-start Local Search algorithm for an OpenNebula-based Cloud (EMLS-ONC) that optimizes the energy consumption of an OpenNebula-managed, geographically distributed cloud computing infrastructure. The results of our EMLS-ONC scheduler are compared to those obtained by the default scheduler of OpenNebula. The two approaches were evaluated using different VM arrival scenarios and different hardware infrastructures. The results show that EMLS-ONC outperforms OpenNebula's default scheduler by a significant margin in terms of energy consumption. In addition, EMLS-ONC is also shown to schedule more applications.
Citations: 10
Hybrid parallel solutions of the Black-Scholes PDE with the truncated combination technique
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266992
J. Benk, D. Pflüger
This paper presents an efficient approach to parallel pricing of multi-dimensional financial derivatives based on the Black-Scholes Partial Differential Equation (BS-PDE). One of the main challenges for such multi-dimensional problems is the curse of dimensionality, which our approach tackles with the combination technique (CT). This technique combines several solutions obtained on anisotropic full grids. Hence, the BS-PDE can be solved on each grid in an embarrassingly parallel way. Besides parallelizing on the CT level, we have developed a shared-memory parallel multigrid solver for the BS-PDE. The parallel efficiency of our hybrid parallel approach is demonstrated by strong scaling results for 5D and 6D pricing problems.
Citations: 19
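The abstract above describes the combination technique as a combination of solutions on anisotropic full grids. As a hedged illustration, the sketch below enumerates the grids and coefficients of the classical combination technique, u_n^c = Σ_{q=0}^{d-1} (-1)^q C(d-1, q) Σ_{|l|_1 = n+(d-1)-q} u_l, with levels starting at 1 in each dimension. The truncated variant used in the paper restricts the admissible levels further and is not reproduced here; the function name `combination_grids` is an assumption.

```python
from itertools import product
from math import comb


def combination_grids(d, n):
    """Return {level_vector: coefficient} for the classical combination technique.

    d is the dimension and n the target level; every level index is >= 1.
    Each level vector names one anisotropic full grid that can be solved
    independently of the others.
    """
    grids = {}
    for q in range(d):
        coeff = (-1) ** q * comb(d - 1, q)
        target = n + (d - 1) - q  # required sum of the level indices
        for level in product(range(1, target + 1), repeat=d):
            if sum(level) == target:
                grids[level] = coeff
    return grids


grids_2d = combination_grids(2, 3)
```

Because each entry is an independent full-grid problem, the dictionary keys map directly onto tasks that can be distributed across processes, which is the embarrassingly parallel level the abstract refers to.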
Accurate CUDA performance modeling for sparse matrix-vector multiplication
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266964
Ping Guo, Liqiang Wang
This paper presents an integrated analytical and profile-based CUDA performance modeling approach to accurately predict the kernel execution times of sparse matrix-vector multiplication for CSR, ELL, COO, and HYB SpMV CUDA kernels. Based on our experiments conducted on a collection of 8 widely-used testing matrices on NVIDIA Tesla C2050, the execution times predicted by our model match the measured execution times of NVIDIA's SpMV implementations very well. Specifically, for 29 out of 32 test cases, the performance differences are under or around 7%. For the remaining 3 test cases, the differences are between 8% and 10%. For the CSR, ELL, COO, and HYB SpMV kernels, the differences are 4.2%, 5.2%, 1.0%, and 5.7% on average, respectively.
Citations: 13
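For readers unfamiliar with the storage formats named in the abstract above, the following is a minimal sequential sketch of sparse matrix-vector multiplication in the CSR (compressed sparse row) format. It is a plain reference loop for illustration, not NVIDIA's CUDA kernel and not the paper's performance model; the function name and the sample matrix are assumptions.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR format.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The inner loop length varies per row, which is one reason the execution time of a CSR kernel depends on the sparsity pattern and is worth modeling per format.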
Visual words selection for human action classification
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266910
J. R. Cózar, José María González-Linares, Nicolás Guil Mata, Ruber Hernández, Yanio Heredia
Human action classification is an important task in computer vision. The Bag-of-Words model uses spatio-temporal features assigned to the visual words of a vocabulary, together with a classification algorithm, to attain this goal. In this work we study the effect of reducing the vocabulary size using a video word ranking method. We apply this method to the KTH dataset to obtain a vocabulary of more descriptive words, yielding a more compact and efficient representation. Two feature descriptors, STIP and MoSIFT, and two classifiers, KNN and SVM, are used to check the validity of our approach. Results for different vocabulary sizes show that the recognition rate improves as the number of words is reduced by removing non-descriptive words. Additionally, state-of-the-art performance is reached with this new compact vocabulary representation.
Citations: 8
Connection reservation algorithm in a Web server with service differentiation
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266955
Paulo S. F. Eustaquio, Ricardo Figueiredo, S. Bruschi, R. Santana, M. J. Santana
This paper presents an architecture prototype, named Web Server with Service Differentiation, that is able to provide QoS to different classes of service. Using the implemented prototype, an admission control algorithm, named the connection reservation algorithm, is proposed and compared to the negotiation algorithm. The performance evaluation showed that both algorithms served a proportionally higher number of high-priority (Class 1) requests than low-priority (Class 2) requests, although the connection reservation algorithm adapted better to all workload variations. The connection reservation algorithm can be extended to the Web, where dynamic workload characteristics predominate.
Citations: 0
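The abstract above does not spell out how the connection reservation algorithm works. The sketch below shows one plausible reading of reservation-based admission control, in which a fixed number of connection slots is held back for Class 1 requests. The class name `ReservationAdmission`, its parameters, and the policy itself are assumptions for illustration and should not be taken as the authors' algorithm.

```python
class ReservationAdmission:
    """Admission control that reserves connection slots for Class 1.

    Hypothetical policy: Class 1 may use every slot, while Class 2 may use
    only the slots that are not reserved.
    """

    def __init__(self, capacity, reserved_for_class1):
        if not 0 <= reserved_for_class1 <= capacity:
            raise ValueError("reservation must be between 0 and capacity")
        self.capacity = capacity
        self.reserved = reserved_for_class1
        self.active = 0  # connections currently being served

    def admit(self, priority_class):
        """Return True and occupy a slot if the request is accepted."""
        if priority_class == 1:
            limit = self.capacity
        else:
            limit = self.capacity - self.reserved
        if self.active < limit:
            self.active += 1
            return True
        return False

    def release(self):
        """Free one slot when a connection finishes."""
        if self.active > 0:
            self.active -= 1


ctl = ReservationAdmission(capacity=4, reserved_for_class1=2)
decisions = [ctl.admit(c) for c in (2, 2, 2, 1, 1, 1)]
```

Under this policy Class 2 traffic can never occupy the reserved slots, so a burst of low-priority requests cannot starve high-priority ones.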
A cloud-based watermarking method for health data security
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266986
Zhiwei Yu, C. Thomborson, Chaokun Wang, Jianmin Wang, Rui Li
Private health information once confined to local medical institutions is migrating onto the Internet as an Electronic Health Record (EHR) that is accessed by cloud computing. No matter where it is hosted, health data is subject to security breaches, privacy abuses, and access control violations. However, novel technologies have new vulnerabilities, and allow new mitigations. In this paper, we propose a watermarking method in the architecture of cloud computing, to mitigate the risks of insider disclosures. Our design and preliminary implementation are accomplished by exploiting the MapReduce mechanism in the cloudlet we built. Our evaluation shows that our proposal addresses all of the requirements of the Cloud Oriented Architecture (COA) framework of the Jericho Forum.
Citations: 19
Dynamically reducing overestimated design margin of MultiCores
Pub Date : 2012-07-02 DOI: 10.1109/HPCSim.2012.6266944
Toshinori Sato, Takanori Hayashida, Ken Yano
The multicore processor is one of the promising techniques for satisfying the computing demands of future consumer devices. However, multicore processors are still threatened by increasing energy consumption due to PVT (Process-Voltage-Temperature) variations. These variations require large design margins in the supply voltage, resulting in large energy consumption. The combination of the DVS (Dynamic Voltage Scaling) technique and the Canary FF (flip-flop), named Canary-DVS, has been proposed to eliminate the overestimated voltage margin, but it has only been evaluated under the assumption of typical delay. This paper considers C2C (Core-to-Core) variations and evaluates how Canary-DVS eliminates the energy waste under the practical assumption of delay variations. We apply Canary-DVS to a commercial processor, Toshiba's quad-core Media embedded Processor (MeP). Monte Carlo simulations show that energy is reduced by 18.6% on average, with no noticeable discrepancies from the typical situation, when a σ/μ value of 0.064 is assumed for gate delay.
Citations: 3