
2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS): Latest Papers

Peak Power Management for scheduling real-time tasks on heterogeneous many-core systems
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097809
Waqaas Munawar, Heba Khdr, Santiago Pagani, M. Shafique, Jian-Jia Chen, J. Henkel
The number and diversity of cores in on-chip systems is increasing rapidly. However, due to the Thermal Design Power (TDP) constraint, it is not possible to continuously operate all cores at the same time. Exceeding the TDP constraint may activate the Dynamic Thermal Management (DTM) to ensure thermal stability. Such hardware based closed-loop safeguards pose a big challenge in using many-core chips for real-time tasks. Managing the worst-case peak power usage of a chip can help toward resolving this issue. We present a scheme to minimize the peak power usage for frame-based and periodic real-time tasks on many-core processors by scheduling the sleep cycles for each active core and introduce the concept of a sufficient test for peak power consumption for task feasibility. We consider both inter-task and inter-core diversity in terms of power usage and present computationally efficient algorithms for peak power minimization for these cases, i.e., a special case of “homogeneous tasks on homogeneous cores” to the general case of “heterogeneous tasks on heterogeneous cores”. We evaluate our solution through extensive simulations using the 48-core SCC platform and gem5 architecture simulator. Our simulation results show the efficacy of our scheme.
Citations: 22
Directory Lookaside Table: Enabling scalable, low-conflict, many-core cache coherence directory
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097798
Xudong Shi, Feiqi Su, J. Peir
Maintaining hardware cache coherence on future CMPs becomes increasingly important and difficult as the number of cores keeps accelerating in mainstream multicore chips. The simple snooping-bus coherence scheme is not suitable due to its limited scalability. The sparse coherence directory approach may incur extra cache invalidations due to a topological mismatch between the coherence directory and the directories of all cache modules. In this paper, we propose an innovative CMP coherence directory that has three important properties. First, the directory has a simple set-associative design with small associativity. The number of directory entries matches the total number of cache blocks. Second, an augmented Directory Lookaside Table (DLT) allows blocks to be displaced from their primary sets in the coherence directory for alleviating hot-set conflicts. Third, to avoid expensive presence bits, each copy of a block along with the located core ID occupies a separate directory entry. Performance evaluations based on multithreaded and multi-programmed workloads demonstrate significant advantages of the proposed CMP directory over directories with traditional set-associative or skewed associative designs.
Citations: 1
Sensor-free corner shape detection by wireless networks
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097822
Yuxi Wang, Zimu Zhou, Kaishun Wu
Due to the rapid growth of smartphone applications and the fast development of Wireless Local Area Networks (WLANs), numerous indoor location-based techniques have been proposed during the past several decades. Floorplans, which define the structure and functionality of a specific indoor environment, have become a hot topic. Conventional floorplan techniques leverage smartphone sensors combined with WiFi signals to construct the floorplan of a building. However, existing sensor-based approaches cannot detect the shape of a corner, and the sensors consume a large amount of energy during the whole floorplan construction process. In this paper, we propose a sensor-free approach that detects the shape of a given corner by leveraging WiFi signals, without using sensors on smartphones. Instead of the traditional wireless communication indicator, Received Signal Strength (RSS), we leverage a finer-grained indicator, Channel State Information (CSI), to detect the shape of a corner. The evaluation of our approach shows that CSI is more robust in sensor-free corner shape detection, and we have achieved over 85% detection accuracy in simulation and over 70% detection accuracy in real indoor experiments.
Citations: 2
Accelerated variance reduction methods on GPU
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097926
Chuan-Hsiang Han, Yu-Tuan Lin
Monte Carlo simulations have become widely used in computational finance. Standard error is the basic measure of the quality of a Monte Carlo estimator, and the square of the standard error is defined as the variance divided by the total number of simulations. Variance reduction methods have been developed as efficient algorithms by means of probabilistic analysis. GPU acceleration plays a crucial role in increasing the total number of simulations. We show that combining variance reduction methods, as efficient software algorithms, with GPU acceleration, as a parallel-computing hardware device, can yield a tremendous speedup for financial applications such as the evaluation of option prices and the estimation of joint default probabilities.
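The relation between standard error, variance and the number of simulations can be illustrated with a small CPU-only sketch. It is not the authors' GPU implementation: the antithetic-variates technique, the Black-Scholes call payoff, and all parameter values (`s0`, `k`, `r`, `sigma`, `t`) are assumptions chosen purely for illustration.

```python
import math
import random


def mc_estimate(samples):
    """Return (mean, standard_error); standard_error**2 = sample variance / n."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(var / n)


def call_payoff(z, s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0):
    """Discounted European call payoff for one standard normal draw z.

    All default parameter values are hypothetical.
    """
    st = s0 * math.exp((r - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
    return math.exp(-r * t) * max(st - k, 0.0)


def plain_mc(n, rng):
    """Plain Monte Carlo with n independent payoff evaluations."""
    return mc_estimate([call_payoff(rng.gauss(0.0, 1.0)) for _ in range(n)])


def antithetic_mc(n_pairs, rng):
    """Antithetic variates: each sample averages the payoffs at z and -z."""
    samples = []
    for _ in range(n_pairs):
        z = rng.gauss(0.0, 1.0)
        samples.append(0.5 * (call_payoff(z) + call_payoff(-z)))
    return mc_estimate(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    print("plain     :", plain_mc(20000, rng))
    print("antithetic:", antithetic_mc(10000, rng))
```

Both runs use the same number of payoff evaluations. Because the payoff is monotone in `z`, the antithetic pairs are negatively correlated, so the second estimator should report the smaller standard error. Raising the simulation count, which is where a GPU helps, shrinks the standard error further in proportion to the inverse square root of that count.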
Citations: 2
Streaming in NoSQL
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097898
Chia-Ping Tsai, Hung-Chang Hsiao
Present NoSQL databases are passive entities, where users proactively access the databases. While NoSQL databases are scalable due to their horizontal scale-out designs, data items stored in potentially very large databases are difficult to retrieve in terms of access delay, programmability and usability. In this paper, we advocate supporting events in NoSQL. By introducing publishers and subscribers to NoSQL, our proposed NoSQL data store is capable of delivering data items that users are interested in. Additionally, our NoSQL database decouples publishers and subscribers such that application developers can focus on data manipulation without paying attention to the communication needed to deliver data. We formally discuss the design requirements for streaming in NoSQL, and present a prototype implementation that addresses the design issues. We also outline our ongoing work in this paper.
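The publisher/subscriber idea can be sketched with a toy in-memory key-value store. This is only an illustration under stated assumptions: the class `StreamingStore`, its prefix-based subscriptions and its synchronous callbacks are hypothetical and are not taken from the paper's prototype.

```python
from collections import defaultdict


class StreamingStore:
    """Toy key-value store that pushes writes to interested subscribers."""

    def __init__(self):
        self._data = {}
        # key prefix -> list of callbacks (hypothetical subscription model)
        self._subscribers = defaultdict(list)

    def subscribe(self, key_prefix, callback):
        """Register callback(key, value) for every write under key_prefix."""
        self._subscribers[key_prefix].append(callback)

    def put(self, key, value):
        """Store the item, then deliver it to all matching subscribers.

        The publisher only calls put(); it never learns who is subscribed.
        """
        self._data[key] = value
        for prefix, callbacks in self._subscribers.items():
            if key.startswith(prefix):
                for callback in callbacks:
                    callback(key, value)

    def get(self, key):
        """Conventional pull-style access still works."""
        return self._data.get(key)


if __name__ == "__main__":
    store = StreamingStore()
    store.subscribe("sensor/", lambda k, v: print("delivered", k, v))
    store.put("sensor/1", 20)   # pushed to the subscriber
    store.put("user/1", "x")    # stored, but nobody is subscribed
```

The writer and the reader never reference each other, which is the decoupling the abstract describes. A real data store would additionally need asynchronous delivery and distribution across nodes.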
Citations: 1
Leverage similarity and locality to enhance fingerprint prefetching of data deduplication
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097802
Yongtao Zhou, Yuhui Deng, Junjie Xie
Data deduplication has been widely used in data backup systems because it significantly reduces the requirements for storage capacity and network bandwidth. However, the performance of data deduplication gradually decreases as the amount of deduplicated data grows. This is because the volume of fingerprints grows significantly with the increase of backup data, and a large portion of the fingerprints have to be stored on disk drives. This incurs frequent disk accesses to locate fingerprints and blocks the process of data deduplication. Furthermore, the fingerprints belonging to the same file may be scattered across disk drives. This generates small, random disk accesses and results in significant performance degradation when the fingerprints are referenced. Additionally, a single fingerprint may appear only once during a backup process. This results in a very low cache hit ratio due to the lack of temporal locality. This paper proposes to employ file similarity to enhance fingerprint prefetching, thus improving the cache hit ratio and the performance of data deduplication. Furthermore, the fingerprints are arranged sequentially according to the backup data stream to maintain locality and improve performance. Experimental results demonstrate that the proposed idea can effectively reduce the number of fingerprint accesses going to disk drives and decrease the query overhead of fingerprints, thus significantly alleviating the disk bottleneck of data deduplication.
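The benefit of storing fingerprints in backup-stream order can be sketched as follows. This is a simplified illustration of locality-based prefetching only; it does not model the file-similarity component, and the class name, the segment layout and the `disk_reads` counter are assumptions rather than details from the paper.

```python
from collections import OrderedDict


class FingerprintCache:
    """LRU fingerprint cache that prefetches a whole on-disk segment on a miss."""

    def __init__(self, on_disk_segments, capacity):
        # Each segment lists fingerprints in the order they appeared in an
        # earlier backup stream (hypothetical on-disk layout).
        self.segments = on_disk_segments
        # Stand-in for the on-disk index: fingerprint -> segment number.
        self.index = {fp: i for i, seg in enumerate(on_disk_segments) for fp in seg}
        self.cache = OrderedDict()
        self.capacity = capacity
        self.disk_reads = 0

    def is_duplicate(self, fingerprint):
        """Return True if the chunk was stored before, prefetching on a miss."""
        if fingerprint in self.cache:
            self.cache.move_to_end(fingerprint)
            return True
        self.disk_reads += 1  # every cache miss consults the on-disk index
        segment_id = self.index.get(fingerprint)
        if segment_id is None:
            return False  # new chunk, nothing to prefetch
        # Prefetch the neighbours: they are likely to be queried next.
        for fp in self.segments[segment_id]:
            self.cache[fp] = True
            self.cache.move_to_end(fp)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)
        return True


if __name__ == "__main__":
    cache = FingerprintCache([["a", "b", "c"], ["d", "e"]], capacity=8)
    for fp in ["a", "b", "c", "d", "z"]:
        print(fp, cache.is_duplicate(fp), cache.disk_reads)
```

When a new backup stream repeats the chunk order of an earlier one, a single disk read serves a whole run of later queries, which is the locality effect the abstract relies on.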
Citations: 8
RBPP: A row based DRAM page policy for the many-core era
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097922
Xiaowei Shen, Fenglong Song, Haibo Meng, Shuqian An, Zhimin Zhang
Memory requests in many-core systems are interleaved with each other, and the locality of many-core systems decreases sharply. Page policies designed for traditional single-core systems are not effective in many-core systems, because the open-page policy relies on high locality of memory requests and the close-page policy takes no advantage of the remaining locality of many-core systems. There are some related memory page management policies, but their high complexity makes them unsuitable for many-core systems. They either need too much modification of the operating system or have large area and power overhead. To overcome these shortcomings of current page policies, in this paper we propose the row based page policy (RBPP) for many-core systems, which tracks the row addresses of memory requests to each bank and uses row addresses as the indicator to decide whether or not to close the row buffer when the active memory request finishes. We evaluate the proposed RBPP via Gem5 and DRAMSim2, and the results show that the row based page policy can decrease the average memory latency by 14.7% and 4.0% over the open-page policy and the close-page policy, respectively. The area overhead of the row based page policy is 91.4% and 91.5% lower than that of the access based page policy and the two-level predictor page policy, respectively.
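The abstract does not spell out the exact decision rule, so the following is only a guess at what a row-address-driven policy might look like. The rule used here (keep a row open only if that row recently saw back-to-back requests in its bank) and the names `RowBasedPagePolicy` and `rows_tracked_per_bank` are assumptions made for illustration.

```python
from collections import OrderedDict, defaultdict


class RowBasedPagePolicy:
    """Per-bank tracker that decides whether to keep the row buffer open."""

    def __init__(self, rows_tracked_per_bank=4):
        self.capacity = rows_tracked_per_bank
        self.reused_rows = defaultdict(OrderedDict)  # bank -> rows that showed reuse
        self.open_row = {}   # bank -> currently open row (absent if precharged)
        self.last_row = {}   # bank -> row of the previous request

    def access(self, bank, row):
        """Serve one request; return 'hit', 'closed' or 'conflict'."""
        current = self.open_row.get(bank)
        if current == row:
            outcome = "hit"        # row buffer already holds this row
        elif current is None:
            outcome = "closed"     # bank precharged, only an activate is needed
        else:
            outcome = "conflict"   # another row is open, precharge + activate

        # Learn: two consecutive requests to the same row signal locality.
        if self.last_row.get(bank) == row:
            rows = self.reused_rows[bank]
            rows[row] = True
            rows.move_to_end(row)
            if len(rows) > self.capacity:
                rows.popitem(last=False)
        self.last_row[bank] = row

        # Decide when the request finishes: keep open only for reused rows.
        if row in self.reused_rows[bank]:
            self.open_row[bank] = row
        else:
            self.open_row.pop(bank, None)
        return outcome


if __name__ == "__main__":
    policy = RowBasedPagePolicy()
    for bank, row in [(0, 5), (0, 5), (0, 5), (0, 9), (0, 7)]:
        print(bank, row, policy.access(bank, row))
```

Under this rule, rows without observed reuse behave as under a close-page policy and reused rows as under an open-page policy, which is the kind of middle ground the abstract motivates.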
Citations: 6
Model to estimate the size of a Hadoop cluster - HCEm
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097897
J. Brito, Aleteia P. F. Araujo
This paper describes a model that estimates the size of a cluster running the Hadoop framework needed to process large datasets within a given timeframe. As its main contributions, it (i) defines a light optimization layer for MapReduce jobs, (ii) presents a model to estimate the cluster size for a Hadoop framework, and (iii) performs tests in a real environment, Amazon Elastic MapReduce. The proposed approach works with MapReduce to define the main configuration parameters and determines the computational resources of the hosts in the cluster in order to meet the desired runtime for a given workload. The results show that the proposed model is able to avoid over-allocation or under-allocation of computing resources on a Hadoop cluster.
Citations: 0
RWFS: Design and implementation of file system executing access control based on user's location
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097886
Yuki Yagi, Naofumi Kitsunezaki, H. Saito, Y. Tobe
In this research, we designed and implemented the Real-World File System (RWFS), which manages files as if we could put them onto or pick them up from places in the real world. RWFS treats places in the real world as directories of the file system by associating a directory with a place. We create directories called Real-World Directories (RWDs), which form a hierarchical structure to reflect the natural properties of places. In addition to the conventional access rights of read, write, and execute implemented in other file systems, RWFS uses the location information of the target user in access rights; RWFS can decide whether or not the user can access a particular file or directory based on the user's location. Therefore, the files accessible to a user change depending on the user's location. This mechanism enables creating information that can be read or written only by users who are physically present at a particular place. We evaluated this system by measuring the turnaround time of file system operations, together with simulation.
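A location-aware permission check of this kind can be sketched in a few lines. The data model below (a circular region in local planar coordinates, a set of permitted operations, and the names `RealWorldDirectory` and `can_access`) is an assumption for illustration and is not the RWFS implementation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RealWorldDirectory:
    """Directory bound to a place, modelled here as a circle (hypothetical)."""
    name: str
    center: tuple      # (x, y) in metres, local planar coordinates
    radius_m: float    # users within this distance count as present
    allowed_ops: set = field(default_factory=set)  # subset of {"r", "w", "x"}


def can_access(directory, user_position, op):
    """Grant op only if it is permitted AND the user is inside the place."""
    dx = user_position[0] - directory.center[0]
    dy = user_position[1] - directory.center[1]
    user_is_present = math.hypot(dx, dy) <= directory.radius_m
    return op in directory.allowed_ops and user_is_present


if __name__ == "__main__":
    lab = RealWorldDirectory("lab", center=(0.0, 0.0), radius_m=10.0,
                             allowed_ops={"r"})
    print(can_access(lab, (3.0, 4.0), "r"))    # inside the lab
    print(can_access(lab, (30.0, 40.0), "r"))  # too far away
```

The same user therefore sees different accessible files as they move, because the location test is evaluated on every access rather than stored with the file.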
Citations: 1
Routing table status influence of Monitoring Kad
Pub Date : 2014-12-01 DOI: 10.1109/PADSW.2014.7097892
Qiang Li, Jie Yu, Zhoujun Li
Kad is the most popular P2P file sharing system. Monitoring the lookup traffic of Kad peers is important for analyzing and optimizing Peer-to-Peer (P2P) networks. During the monitoring process, we find that the peer's status significantly influences the monitoring results. Each lookup action changes the searching peer's routing table status, which may break the monitoring process. In this paper, we analyze the changes in the routing table to verify their effect on the monitoring process. If the distance between the target ID and the searcher's Kad ID is within a certain critical range, previous searches may cause future searches to fail with high probability. We estimate the boundary of this critical range. Experiments performed on eMule show that such a critical range exists, and that deploying more than 1024 IP addresses does not improve the success rate of the monitoring process.
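The notion of distance between a target ID and a searcher's Kad ID can be made concrete with a short sketch. Kad follows the Kademlia design, in which the distance between two identifiers is their bitwise XOR. The range check below is only an illustration: the `low` and `high` bounds are hypothetical placeholders and do not reproduce the boundary estimated in the paper.

```python
ID_BITS = 128  # Kad identifiers are 128 bits wide


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def shared_prefix_bits(a: int, b: int, bits: int = ID_BITS) -> int:
    """Count the leading bits two IDs have in common (more means closer)."""
    return bits - xor_distance(a, b).bit_length()


def in_critical_range(target_id: int, searcher_id: int,
                      low: int, high: int) -> bool:
    """Return True when low <= distance < high.

    `low` and `high` are hypothetical placeholders, not the boundary
    estimated in the paper.
    """
    return low <= xor_distance(target_id, searcher_id) < high


searcher = 0xA5 << 120               # example searcher ID
near_target = searcher ^ 0x1         # differs only in the last bit
far_target = searcher ^ (1 << 127)   # differs in the first bit
```

Here `near_target` shares 127 leading bits with `searcher` and `far_target` shares none, so only the former falls inside a small range such as `[0, 2**8)`.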
Citations: 0