
Measurement and Modeling of Computer Systems: latest papers

TS-CLOCK: temporal and spatial locality aware buffer replacement algorithm for NAND flash storages
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2592028
Donghyun Kang, Changwoo Min, Y. Eom
NAND flash storage is widely adopted in all classes of computing devices. However, random write performance and lifetime issues remain to be addressed. In this paper, we propose a novel buffer replacement algorithm called TS-CLOCK that effectively resolves the remaining problems. Our experimental results show that TS-CLOCK outperforms state-of-the-art algorithms in terms of performance and lifetime.
Citations: 4
Index policies for a multi-class queue with convex holding cost and abandonments
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2591983
M. Larrañaga, U. Ayesta, M. Verloop
We investigate a resource allocation problem in a multi-class server with convex holding costs and user impatience under the average cost criterion. In general, the optimal policy has a complex dependency on all the input parameters and state information. Our main contribution is to derive index policies that can serve as heuristics and are shown to give good performance. Our index policy attributes to each class an index, which depends on the number of customers currently present in that class. The index values are obtained by solving a relaxed version of the optimal stochastic control problem and combining results from restless multi-armed bandits and queueing theory. They can be expressed as a function of the steady-state distribution probabilities of a one-dimensional birth-and-death process. For linear holding cost, the index can be calculated in closed-form and turns out to be independent of the arrival rates and the number of customers present. In the case of no abandonments and linear holding cost, our index coincides with the cμ-rule, which is known to be optimal in this simple setting. For general convex holding cost we derive properties of the index value in limiting regimes: we consider the behavior of the index (i) as the number of customers in a class grows large, which allows us to derive the asymptotic structure of the index policies, and (ii) as the abandonment rate vanishes, which allows us to retrieve an index policy proposed for the multi-class M/M/1 queue with convex holding cost and no abandonments. In fact, in a multi-server environment it follows from recent advances that the index policy is asymptotically optimal for linear holding cost. To obtain further insights into the index policy, we consider the fluid version of the relaxed problem and derive a closed-form expression for the fluid index. The latter coincides with the stochastic model in case of linear holding costs. 
For arbitrary convex holding cost the fluid index can be seen as the Gcμθ-rule, that is, including abandonments into the generalized cμ-rule (Gcμ-rule). Numerical experiments show that our index policies become optimal as the load in the system increases.
Citations: 24
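The cμ-rule mentioned in the abstract above can be illustrated with a small sketch. In the setting with linear holding cost and no abandonments, the server gives priority to the non-empty class with the largest product of holding cost c and service rate μ. The class names, the numeric values, and the function name `c_mu_priority` are hypothetical, and the code shows only the classical rule, not the index policy derived in the paper.

```python
from typing import Dict, Optional, Tuple


def c_mu_priority(classes: Dict[str, Tuple[float, float]],
                  queue_lengths: Dict[str, int]) -> Optional[str]:
    """Return the non-empty class with the largest c * mu, or None if all queues are empty."""
    candidates = [name for name, n in queue_lengths.items() if n > 0]
    if not candidates:
        return None
    # classes[name] is (holding cost c, service rate mu)
    return max(candidates, key=lambda name: classes[name][0] * classes[name][1])


# Hypothetical parameters: class -> (holding cost c, service rate mu)
classes = {"A": (1.0, 3.0), "B": (2.0, 2.0), "C": (5.0, 0.5)}
queues = {"A": 4, "B": 1, "C": 7}

served = c_mu_priority(classes, queues)  # c*mu is 3.0 for A, 4.0 for B, 2.5 for C
```

Note that the priority does not depend on how many customers are waiting in each class, only on whether a class is non-empty, which matches the abstract's remark that the linear-cost index is independent of the number of customers present.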
Neighbor-cell assisted error correction for MLC NAND flash memories
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2591994
Yu Cai, Gulay Yalcin, O. Mutlu, E. Haratsch, O. Unsal, A. Cristal, K. Mai
Continued scaling of NAND flash memory to smaller process technology nodes decreases its reliability, necessitating more sophisticated mechanisms to correctly read stored data values. To distinguish between different potential stored values, conventional techniques to read data from flash memory employ a single set of reference voltage values, which are determined based on the overall threshold voltage distribution of flash cells. Unfortunately, the phenomenon of program interference, in which a cell's threshold voltage unintentionally changes when a neighboring cell is programmed, makes this conventional approach increasingly inaccurate in determining the values of cells. This paper makes the new empirical observation that identifying the value stored in the immediate-neighbor cell makes it easier to determine the data value stored in the cell that is being read. We provide a detailed statistical and experimental characterization of the threshold voltage distribution of flash memory cells conditional upon the immediate-neighbor cell values, and show that such conditional distributions can be used to determine a set of read reference voltages that leads to error rates much lower than when a single set of reference voltage values based on the overall distribution is used. Based on our analyses, we propose a new method for correcting errors in a flash memory page, neighbor-cell assisted correction (NAC). The key idea is to re-read a flash memory page that fails error correction codes (ECC) with the set of read reference voltage values corresponding to the conditional threshold voltage distribution assuming a neighbor cell value and use the re-read values to correct the cells that have neighbors with that value. Our simulations show that NAC effectively improves flash memory lifetime by 33% while having no (at nominal lifetime) or very modest (less than 5% at extended lifetime) performance overhead.
Citations: 91
An online auction framework for dynamic resource provisioning in cloud computing
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2591980
Weijie Shi, Linquan Zhang, Chuan Wu, Zongpeng Li, F. Lau
Auction mechanisms have recently attracted substantial attention as an efficient approach to pricing and resource allocation in cloud computing. This work, to the authors' knowledge, represents the first online combinatorial auction designed in the cloud computing paradigm, which is general and expressive enough to both (a) optimize system efficiency across the temporal domain instead of at an isolated time point, and (b) model dynamic provisioning of heterogeneous Virtual Machine (VM) types in practice. The final result is an online auction framework that is truthful, computationally efficient, and guarantees a competitive ratio of about $e + \frac{1}{e-1} \approx 3.30$ in social welfare in typical scenarios. The framework consists of three main steps: (1) a tailored primal-dual algorithm that decomposes the long-term optimization into a series of independent one-shot optimization problems, with an additive loss of $\frac{1}{e-1}$ in competitive ratio, (2) a randomized auction sub-framework that applies primal-dual optimization for translating a centralized co-operative social welfare approximation algorithm into an auction mechanism, retaining a similar approximation ratio while adding truthfulness, and (3) a primal-dual update plus dual fitting algorithm for approximating the one-shot optimization with a ratio $\lambda$ close to $e$. The efficacy of the online auction framework is validated through theoretical analysis and trace-driven simulation studies. We also hope that the framework, as well as its three independent modules, can be instructive in auction design for other related problems.
Citations: 172
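As a quick arithmetic check of the competitive ratio quoted in the abstract above, the sketch below adds the additive loss 1/(e-1) from step (1) to a one-shot approximation ratio λ from step (3). Combining the two terms by simple addition is an assumption based on how the figures in the abstract appear to fit together, and the function name `competitive_ratio` is hypothetical.

```python
import math


def competitive_ratio(lam: float) -> float:
    """One-shot approximation ratio lam plus the additive loss 1/(e-1) (assumed combination)."""
    return lam + 1.0 / (math.e - 1.0)


ratio = competitive_ratio(math.e)  # lam taken equal to e, the value it is said to approach
print(f"{ratio:.2f}")  # prints 3.30
```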
Traffic congestion: models, costs and optimal transport
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2592014
Chinmoy Mandayam, B. Prabhakar
We develop two models of highway traffic: (i) a deterministic fluid model based on conservation laws building on previous work and (ii) a mean-field model of a series of infinite server queues, where each stage in the tandem models a segment of highway. The models define the "highway-map": a transformation of the time-varying arrival rate functions, according to which vehicles arrive at the highway, to the corresponding departure rate functions of vehicles exiting the highway. The two models are shown to be equivalent in that they obtain the same highway-map. The cost of congestion for vehicles traversing the highway is the total extra time they spend on the highway due to congestion. This cost is shown to be equal to the "d-bar" distance between the input and the output rate measures of the highway-map. This fact is used to formulate a convex optimization problem for determining the optimal way to shift users from peak to off-peak hours using incentives so that congestion costs are lowered.
Citations: 8
Is sharing with retransmissions causing instabilities?
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2592001
P. Jelenkovic, E. Skiani
Retransmissions represent a primary failure recovery mechanism on all layers of communication network architecture. Similarly, fair sharing, e.g. processor sharing (PS), is a widely accepted approach to resource allocation among multiple users. Recent work has shown that retransmissions in failure-prone, e.g. wireless ad hoc, networks can cause heavy tails and long delays. In this paper, we discover a new phenomenon showing that PS-based scheduling induces complete instability in the presence of retransmissions, regardless of how low the traffic load may be. This phenomenon occurs even when the job sizes are bounded/fragmented, e.g. deterministic. Our analytical results are further validated via simulation experiments. Moreover, our work demonstrates that scheduling one job at a time, such as first-come-first-serve, achieves stability and should be preferred in these systems.
Citations: 6
CApRI: CAche-conscious data reordering for irregular codes
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2591992
W. Ding, M. Kandemir
Caches play a critical role in today's computer systems and optimizing their performance has been a critical objective in the last couple of decades. Unfortunately, compared to a plethora of work in software and hardware directed code/data optimizations, much less effort has been spent in understanding the fundamental characteristics of data access patterns exhibited by application programs and their interaction with the underlying cache hardware. Therefore, in general it is hard to reason about cache behavior of a program running on a target system. Motivated by this observation, we first set up a "locality model" that can help us determine the theoretical bounds of the cache misses caused by irregular data accesses. We then explain how this locality model can be used for different data locality optimization purposes. After that, based on our model, we propose a data reordering (data layout reorganization) scheme that can be applied after any existing data reordering schemes for irregular applications to improve cache performance by further reducing the cache misses. We evaluate the effectiveness of our scheme using a set of 8 programs with irregular data accesses, and show that it brings significant improvements over the state-of-the-art on two commercial multicore machines.
Citations: 4
A comparison of core power gating strategies implemented in modern hardware
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2592017
Manish Arora, Srilatha Manne, Yasuko Eckert, Indrani Paul, N. Jayasena, D. Tullsen
Idle power is a significant contributor to overall energy consumption in modern multi-core processors. Cores can enter a full-sleep state, also known as C6, to reduce idle power; however, entering C6 incurs performance and power overheads. Since power gating can result in negative savings, hardware vendors implement various algorithms to manage C6 entry. In this paper, we examine state-of-the-art C6 entry algorithms and present a comparative analysis in the context of consumer and CPU-GPU benchmarks.
Citations: 13
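The remark in the abstract above that power gating can result in negative savings can be made concrete with a simplified break-even model. The sketch assumes a fixed energy overhead for entering and leaving the gated state and constant power draw in each state; this model, the function names, and all numbers are illustrative assumptions, not measurements or algorithms from the paper.

```python
def net_energy_saved_uj(idle_time_us: float, idle_power_w: float,
                        gated_power_w: float, overhead_uj: float) -> float:
    """Energy saved (microjoules) by gating for one idle period; negative means gating cost more."""
    # watts * microseconds = microjoules
    return (idle_power_w - gated_power_w) * idle_time_us - overhead_uj


def break_even_time_us(idle_power_w: float, gated_power_w: float,
                       overhead_uj: float) -> float:
    """Idle duration (microseconds) at which gating neither saves nor wastes energy."""
    return overhead_uj / (idle_power_w - gated_power_w)


# Hypothetical core: 2.0 W when idle but powered, 0.5 W when gated, 150 uJ entry/exit overhead
threshold = break_even_time_us(2.0, 0.5, 150.0)
short_idle = net_energy_saved_uj(50.0, 2.0, 0.5, 150.0)   # idle period shorter than the threshold
long_idle = net_energy_saved_uj(400.0, 2.0, 0.5, 150.0)   # idle period longer than the threshold
```

Under this model an entry policy only helps when it gates on idle periods longer than the break-even time, which is one way to see why the choice of entry algorithm matters.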
The behavior of epidemics under bounded susceptibility
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2591977
Subhashini Krishnasamy, Siddhartha Banerjee, S. Shakkottai
We investigate the sensitivity of epidemic behavior to a bounded susceptibility constraint -- susceptible nodes are infected by their neighbors via the regular SI/SIS dynamics, but subject to a cap on the infection rate. Such a constraint is motivated by modern social networks, wherein messages are broadcast to all neighbors, but attention spans are limited. Bounded susceptibility also arises in distributed computing applications with download bandwidth constraints, and in human epidemics under quarantine policies. Network epidemics have been extensively studied in literature; prior work characterizes the graph structures required to ensure fast spreading under the SI dynamics, and long lifetime under the SIS dynamics. In particular, these conditions turn out to be meaningful for two classes of networks of practical relevance -- dense, uniform (i.e., clique-like) graphs, and sparse, structured (i.e., star-like) graphs. We show that bounded susceptibility has a surprising impact on epidemic behavior in these graph families. For the SI dynamics, bounded susceptibility has no effect on star-like networks, but dramatically alters the spreading time in clique-like networks. In contrast, for the SIS dynamics, clique-like networks are unaffected, but star-like networks exhibit a sharp change in extinction times under bounded susceptibility. Our findings are useful for the design of disease-resistant networks and infrastructure networks. More generally, they show that results for existing epidemic models are sensitive to modeling assumptions in non-intuitive ways, and suggest caution in directly using these as guidelines for real systems.
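The bounded susceptibility constraint described in the abstract above can be sketched as a cap on the rate at which a susceptible node becomes infected. The sketch assumes the standard SI form in which the uncapped rate is a per-contact rate β times the number of infected neighbors, and assumes the cap is applied as a simple minimum; the function name and the numbers are hypothetical.

```python
def infection_rate(infected_neighbors: int, beta: float,
                   cap: float = float("inf")) -> float:
    """Rate at which a susceptible node is infected, limited by an optional cap."""
    return min(beta * infected_neighbors, cap)


# Hypothetical values: beta = 0.5 per infected neighbor, cap = 2.0
many_neighbors_uncapped = infection_rate(100, beta=0.5)
many_neighbors_capped = infection_rate(100, beta=0.5, cap=2.0)
single_neighbor_capped = infection_rate(1, beta=0.5, cap=2.0)
```

With these values the cap binds only for the node with many infected neighbors, while the node with a single infected neighbor is unaffected.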
Citations: 8
Randomized routing schemes for large processor sharing systems with multiple service rates
Pub Date : 2014-06-16 DOI: 10.1145/2591971.2592015
Arpan Mukhopadhyay, R. Mazumdar
We consider randomized job routing techniques for a system consisting of a large number of parallel processor sharing servers with heterogeneous server speeds. In particular, we propose a scheme that routes each incoming job to the server providing the highest instantaneous processing rate per job among two servers chosen uniformly at random. We show that, unlike the homogeneous case, in the heterogeneous case such randomized dynamic schemes need not always perform better than the optimal static scheme (in which jobs are assigned to servers with fixed probabilities independent of server states) in terms of reducing the mean response time of jobs. Specifically, we show that the stability region under the proposed scheme is a subset of that under the optimal static routing scheme. We also obtain the stationary tail distribution of server occupancies for the proposed scheme in the limit as the system size grows to infinity. This distribution is shown to be insensitive to the job length distribution and to decay super-exponentially.
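The routing rule in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that a processor sharing server of speed C holding n jobs would give an arriving job the rate C / (n + 1), that the two candidate servers are sampled without replacement, and that ties go to the first candidate. The names `rate_per_job`, `pick_server`, and `route_job` are hypothetical.

```python
import random


def rate_per_job(speed, jobs_present):
    """Processing rate each job would get if one more job joined the server.

    Assumption: under processor sharing the speed is split equally among
    all jobs, including the arriving one.
    """
    return speed / (jobs_present + 1)


def pick_server(speeds, occupancy, candidates):
    """Return the candidate server offering the highest rate per job."""
    return max(candidates, key=lambda s: rate_per_job(speeds[s], occupancy[s]))


def route_job(speeds, occupancy, rng=random):
    """Sample two servers uniformly at random and send the job to the better one.

    Mutates `occupancy` to record the arrival and returns the chosen index.
    """
    candidates = rng.sample(range(len(speeds)), 2)  # assumption: without replacement
    chosen = pick_server(speeds, occupancy, candidates)
    occupancy[chosen] += 1
    return chosen
```

For example, with speeds `[1.0, 4.0]` and occupancies `[0, 2]`, the fast server would give the new job a rate of 4/3 against 1 on the idle slow server, so `pick_server` selects the fast one; with occupancies `[0, 4]` the comparison is 0.8 against 1, and the slow server is selected.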
Citations: 5