
Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems: Latest Publications

Using Burstable Instances in the Public Cloud: Why, When and How?
Cheng Wang, B. Urgaonkar, N. Nasiriani, G. Kesidis
To attract more customers, public cloud providers offer virtual machine (instance) types that trade off lower prices for poorer capacities. As one salient approach, the providers employ aggressive statistical multiplexing of multiple cheaper instances on a single physical server, resulting in tenants experiencing higher dynamism in the resource capacity of these instances. Examples of this are EC2's "t2" instances and GCE's "shared-core" instances. We collectively refer to these as burstable instances for their ability to dynamically "burst" (increase the capacity of) their resources. Burstable instances are significantly cheaper than "regular" instances, and offer time-varying CPU capacity comprising a minimum guaranteed base capacity/rate, which is much smaller than a short-lived peak capacity that becomes available after operating below the base rate for a sufficient duration. Table 1 summarizes our classification of resource capacity dynamism for GCE and EC2 instances along with the nature of disclosure made by the provider. To exploit burstable instances cost-effectively, a tenant would need to carefully understand the significant additional complexity of such instances beyond that disclosed by the providers.
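The credit-based bursting behavior described in this abstract can be pictured with a toy token-bucket model. The sketch below is illustrative only; the base rate, peak rate, and credit cap are assumed placeholder values, not the actual parameters of any EC2 or GCE instance type.

```python
class BurstableInstance:
    """Toy CPU-credit model of a burstable instance.

    The base_rate, peak_rate, and credit_cap values are hypothetical
    placeholders, not the parameters of a real instance type.
    """

    def __init__(self, base_rate=0.10, peak_rate=1.0, credit_cap=30.0):
        self.base_rate = base_rate      # guaranteed fraction of a full core
        self.peak_rate = peak_rate      # short-lived burst ceiling
        self.credit_cap = credit_cap    # maximum banked credit (core-minutes)
        self.credits = 0.0

    def step(self, demand, minutes=1.0):
        """Advance the model by `minutes` given CPU demand (fraction of a core).

        Running below the base rate banks credits; running above it spends
        them. Returns the CPU capacity actually delivered in this step.
        """
        if demand <= self.base_rate:
            # Idle headroom below the base rate accrues as burst credit.
            self.credits = min(self.credit_cap,
                               self.credits + (self.base_rate - demand) * minutes)
            return demand
        # Bursting: deliver up to peak_rate while banked credits last.
        wanted_extra = (min(demand, self.peak_rate) - self.base_rate) * minutes
        granted_extra = min(wanted_extra, self.credits)
        self.credits -= granted_extra
        return self.base_rate + granted_extra / minutes


if __name__ == "__main__":
    vm = BurstableInstance()
    # 60 minutes mostly idle, then a sustained burst: delivered capacity
    # collapses to the base rate once the banked credits run out.
    for t in range(70):
        demand = 0.02 if t < 60 else 0.9
        delivered = vm.step(demand)
        if t in (0, 59, 60, 65, 69):
            print(f"t={t:3d} min  demand={demand:.2f}  "
                  f"delivered={delivered:.2f}  credits={vm.credits:.2f}")
```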
Citations: 28
A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
Lingda Li, Robel Geda, Ari B. Hayes, Yan-Hao Chen, Pranav Chaudhari, E. Zhang, M. Szegedy
Graph edge partition models have recently become an appealing alternative to graph vertex partition models for distributed computing due to both their flexibility in balancing loads and their performance in reducing communication cost. In this paper, we propose a simple yet effective graph edge partitioning algorithm. In practice, our algorithm provides good partition quality while maintaining low partition overhead. It also outperforms similar state-of-the-art edge partition approaches, especially for power-law graphs. In theory, previous work showed that an approximation guarantee of O(d_max √(log n log k)) applies to graphs with m = Ω(k²) edges (n is the number of vertices, and k is the number of partitions). We further rigorously prove that this approximation guarantee holds for all graphs. We also demonstrate the applicability of the proposed edge partition algorithm in real parallel computing systems. We draw our example from GPU program locality enhancement and demonstrate that the graph edge partition model applies not only to distributed computing with many computer nodes, but also to parallel computing in a single computer node with a many-core processor.
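To make the edge-partitioning setting concrete, the sketch below implements a generic greedy edge-partitioning heuristic (an illustration of the problem, not the algorithm proposed in this paper): each edge is placed on a partition that already replicates one of its endpoints, subject to a per-partition load cap, which keeps both vertex replication and load imbalance low.

```python
import math
from collections import defaultdict

def greedy_edge_partition(edges, k, imbalance=1.1):
    """Generic greedy edge partitioning (illustrative, not the paper's
    algorithm): prefer partitions that avoid creating new vertex replicas,
    subject to a per-partition edge budget that bounds load imbalance."""
    cap = math.ceil(len(edges) * imbalance / k)   # per-partition edge budget
    loads = [0] * k
    replicas = defaultdict(set)   # vertex -> set of partitions holding it
    assignment = {}

    for (u, v) in edges:
        shared = replicas[u] & replicas[v]
        touched = replicas[u] | replicas[v]
        # Prefer partitions already holding both endpoints, then either one,
        # then any partition; always respect the load cap when possible.
        candidates = ([p for p in shared if loads[p] < cap]
                      or [p for p in touched if loads[p] < cap]
                      or list(range(k)))
        target = min(candidates, key=lambda p: loads[p])
        assignment[(u, v)] = target
        loads[target] += 1
        replicas[u].add(target)
        replicas[v].add(target)

    # Average number of partitions each vertex is replicated to.
    rf = sum(len(s) for s in replicas.values()) / max(1, len(replicas))
    return assignment, loads, rf

# Example: split 7 edges into 3 partitions.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4), (4, 5)]
assignment, loads, rf = greedy_edge_partition(edges, k=3)
print("per-partition edge counts:", loads)
print("average vertex replication factor: %.2f" % rf)
```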
Citations: 12
Optimal Posted Prices for Online Cloud Resource Allocation
Zijun Zhang, Zongpeng Li, Chuan Wu
We study online resource allocation in a cloud computing platform through a posted pricing mechanism: the cloud provider publishes a unit price for each resource type, which may vary over time; upon arrival at the cloud system, a cloud user either takes the current prices, renting resources to execute its job, or refuses the prices without running its job there. We design pricing functions based on the current resource utilization ratios for a wide array of demand-supply relationships and resource occupation durations, and prove worst-case competitive ratios of the pricing functions in terms of social welfare. In the basic case of a single-type, non-recycled resource (i.e., allocated resources are not later released for reuse), we prove that our pricing function design is optimal, in that any other pricing function can only lead to a worse competitive ratio. Insights obtained from the basic cases are then used to generalize the pricing functions to more realistic cloud systems with multiple types of resources, where a job occupies allocated resources for a number of time slots until completion, at which time the resources are returned to the cloud resource pool.
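The flavor of such utilization-based posted pricing can be illustrated with an exponential pricing rule, a shape commonly used in online posted-price mechanisms to obtain worst-case competitive guarantees. The function and constants below are assumptions for illustration, not the exact design analyzed in this paper.

```python
def posted_price(utilization, p_min=0.01, p_max=10.0):
    """Illustrative utilization-based unit price (an assumption, not the
    paper's exact function): the price grows exponentially from p_min at an
    empty system to p_max at full utilization."""
    assert 0.0 <= utilization <= 1.0
    return p_min * (p_max / p_min) ** utilization

def tenant_decision(job_value, demand, duration, utilization):
    """A tenant accepts the posted price iff its value covers the rental cost."""
    price = posted_price(utilization)
    cost = price * demand * duration
    return cost <= job_value, price, cost

if __name__ == "__main__":
    # As utilization rises, the posted price climbs and low-value jobs
    # start refusing the offer, reserving capacity for higher-value jobs.
    for util in (0.0, 0.25, 0.5, 0.75, 0.95):
        accepted, price, cost = tenant_decision(job_value=8.0, demand=2.0,
                                                duration=1.0, utilization=util)
        print(f"utilization={util:.2f}  unit price={price:6.3f}  "
              f"cost={cost:6.2f}  accepted={accepted}")
```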
Citations: 9
Investigation of the 2016 Linux TCP Stack Vulnerability at Scale
Alan Quach, Zhongjie Wang, Zhiyun Qian
To combat blind in-window attacks against TCP, changes proposed in RFC 5961 have been implemented in Linux since late 2012. While successfully eliminating the old vulnerabilities, the new TCP implementation was reported in August 2016 to have introduced a subtle yet serious security flaw. Assigned CVE-2016-5696, the flaw exploits the challenge ACK rate limiting feature, which could allow an off-path attacker to infer the presence/absence of a TCP connection between two arbitrary hosts, terminate such a connection, and even inject malicious payload. In this work, we perform a comprehensive measurement of the impact of the new vulnerability. This includes (1) tracking the vulnerable Internet servers, (2) monitoring the patch behavior over time, and (3) picturing the overall security status of TCP stacks at scale. Towards this goal, we design a scalable measurement methodology to scan the Alexa top 1 million websites for almost 6 months. We also present how notifications impact the patching behavior, and compare the results with the Heartbleed and Debian PRNG vulnerabilities. The measurement represents a valuable data point in understanding how Internet servers react to serious security flaws in the operating system kernel.
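The side channel at the core of CVE-2016-5696 stems from a single global challenge-ACK budget (100 per second by default, pre-patch) shared across all connections. The toy simulation below illustrates why such a shared counter is observable off-path; it is a conceptual sketch, not kernel code or the authors' measurement tooling.

```python
# Toy simulation of the shared challenge-ACK counter behind CVE-2016-5696.
# Structure and numbers are illustrative, not the Linux implementation.

CHALLENGE_ACK_LIMIT = 100   # global per-second budget (pre-patch default)

class Server:
    def __init__(self):
        self.remaining = CHALLENGE_ACK_LIMIT  # reset each simulated second

    def spoofed_segment(self, connection_exists):
        """A spoofed in-window segment for an existing connection triggers a
        challenge ACK, consuming the shared budget; for a non-existent
        connection it does not."""
        if connection_exists and self.remaining > 0:
            self.remaining -= 1

    def probe(self, n):
        """The attacker elicits challenge ACKs on its own legitimate
        connection and counts how many it receives back."""
        sent = min(n, self.remaining)
        self.remaining -= sent
        return sent

def infer_connection(connection_exists):
    server = Server()
    # Step 1: spoof segments pretending to come from the guessed client.
    for _ in range(10):
        server.spoofed_segment(connection_exists)
    # Step 2: drain the rest of the global budget on the attacker's own
    # connection. Fewer than 100 replies means the spoofed segments consumed
    # budget, i.e. the guessed victim connection exists.
    received = server.probe(CHALLENGE_ACK_LIMIT)
    return received < CHALLENGE_ACK_LIMIT

print("victim connection present:", infer_connection(True))    # True
print("victim connection present:", infer_connection(False))   # False
```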
Citations: 6
Session details: SIGMETRICS Achievement Award: Sem Borst
L. Golubchik
{"title":"Session details: SIGMETRICS Achievement Award: Sem Borst","authors":"L. Golubchik","doi":"10.1145/3248534","DOIUrl":"https://doi.org/10.1145/3248534","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"858 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133872251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Design-Induced Latency Variation in Modern DRAM Chips: Characterization, Analysis, and Latency Reduction Mechanisms
Donghyuk Lee, S. Khan, Lavanya Subramanian, Saugata Ghose, Rachata Ausavarungnirun, Gennady Pekhimenko, V. Seshadri, O. Mutlu
Variation has been shown to exist across the cells within a modern DRAM chip. Prior work has studied and exploited several forms of variation, such as manufacturing-process- or temperature-induced variation. We empirically demonstrate a new form of variation that exists within a real DRAM chip, induced by the design and placement of different components in the DRAM chip: different regions in DRAM, based on their relative distances from the peripheral structures, require different minimum access latencies for reliable operation. In particular, we show that in most real DRAM chips, cells closer to the peripheral structures can be accessed much faster than cells that are farther. We call this phenomenon design-induced variation in DRAM. Our goals are to i) understand design-induced variation that exists in real, state-of-the-art DRAM chips, ii) exploit it to develop low-cost mechanisms that can dynamically find and use the lowest latency at which to operate a DRAM chip reliably, and, thus, iii) improve overall system performance while ensuring reliable system operation. To this end, we first experimentally demonstrate and analyze design-induced variation in modern DRAM devices by testing and characterizing 96 DIMMs (768 DRAM chips). Our experimental study shows that i) modern DRAM chips exhibit design-induced latency variation in both row and column directions, ii) access latency gradually increases in the row direction within a DRAM cell array (mat) and this pattern repeats in every mat, and iii) some columns require higher latency than others due to the internal hierarchical organization of the DRAM chip. Our characterization identifies DRAM regions that are vulnerable to errors, if operated at lower latency, and finds consistency in their locations across a given DRAM chip generation, due to design-induced variation. Variations in the vertical and horizontal dimensions, together, divide the cell array into heterogeneous-latency regions, where cells in some regions require longer access latencies for reliable operation. Reducing the latency uniformly across all regions in DRAM would improve performance, but can introduce failures in the inherently slower regions that require longer access latencies for correct operation. We refer to these inherently slower regions of DRAM as design-induced vulnerable regions. Based on our extensive experimental analysis, we develop two mechanisms that reliably reduce DRAM latency. First, DIVA Profiling uses runtime profiling to dynamically identify the lowest DRAM latency that does not introduce failures. DIVA Profiling exploits design-induced variation and periodically profiles only the vulnerable regions to determine the lowest DRAM latency at low cost. It is the first mechanism to dynamically determine the lowest latency that can be used to operate DRAM reliably. DIVA Profiling reduces the latency of read/write requests by 35.1%/57.8%, respectively, at 55°C. Our second mechanism, DIVA Shuffling, shuffles data such that values stored in vulnerable regions are mapped to multiple error-correcting code (ECC) codewords. As a result, DIVA Shuffling can correct 26% more multi-bit errors than conventional ECC. Combined, our two mechanisms reduce read/write latency by 40.0%/60.5%, which translates to an overall system performance improvement of 14.7%/13.7%/13.8% (in 2-/4-/8-core systems, respectively) across a variety of workloads, while ensuring reliable operation.
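The profiling idea behind DIVA Profiling can be sketched as follows: sweep progressively lower timings and test only the design-induced vulnerable regions, keeping the lowest timing that produces no errors. The code below is a self-contained, simulated sketch; the SimulatedRegion thresholds and candidate tRCD values are placeholders, not measured hardware behavior or the authors' implementation.

```python
import random

DEFAULT_TRCD_NS = 13.75                              # standard timing (illustrative)
CANDIDATES_NS = [12.5, 11.25, 10.0, 8.75, 7.5]       # progressively lower timings

class SimulatedRegion:
    """One vulnerable region; it reads back correctly only at or above its
    minimum reliable tRCD, which is unknown to the profiler."""
    def __init__(self):
        self.min_reliable_ns = random.choice([8.75, 10.0, 11.25])

    def reads_correctly(self, trcd_ns):
        return trcd_ns >= self.min_reliable_ns

def diva_profile(vulnerable_regions):
    """Return the lowest candidate tRCD at which every vulnerable region
    still reads back correctly; fall back to the default if none do."""
    best = DEFAULT_TRCD_NS
    for trcd in CANDIDATES_NS:                        # high -> low
        if all(r.reads_correctly(trcd) for r in vulnerable_regions):
            best = trcd                               # still error-free, keep going
        else:
            break                                     # first failure: stop lowering
    return best

random.seed(1)
regions = [SimulatedRegion() for _ in range(64)]
print("profiled tRCD:", diva_profile(regions), "ns",
      "(default:", DEFAULT_TRCD_NS, "ns)")
```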
Citations: 112
Session details: Session 9: Accurate and Efficient Performance Measurement
G. Casale
{"title":"Session details: Session 9: Accurate and Efficient Performance Measurement","authors":"G. Casale","doi":"10.1145/3248546","DOIUrl":"https://doi.org/10.1145/3248546","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122204480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Characterizing 3D Floating Gate NAND Flash
Qin Xiong, Fei Wu, Zhonghai Lu, Yue Zhu, You Zhou, Yibing Chu, C. Xie, Ping Huang
In this paper, we characterize a state-of-the-art 3D floating gate NAND flash memory through comprehensive experiments on an FPGA platform. Then, we present distinct observations on performance and reliability, such as operation latencies and various error patterns. We believe that through our work, novel 3D NAND flash-oriented designs can be developed to achieve better performance and reliability.
Citations: 22
Session details: Session 2: Algorithms for Massive Processing Applications
Y. Tay
{"title":"Session details: Session 2: Algorithms for Massive Processing Applications","authors":"Y. Tay","doi":"10.1145/3248536","DOIUrl":"https://doi.org/10.1145/3248536","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121127161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Session details: SIGMETRICS Keynote Talk: Michael Jordan
Zhi-Li Zhang
{"title":"Session details: SIGMETRICS Keynote Talk: Michael Jordan","authors":"Zhi-Li Zhang","doi":"10.1145/3248542","DOIUrl":"https://doi.org/10.1145/3248542","url":null,"abstract":"","PeriodicalId":133673,"journal":{"name":"Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123157221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0