重新定义跨数据中心存储的数据位置

Kwangsung Oh, A. Raghavan, A. Chandra, J. Weissman
{"title":"重新定义跨数据中心存储的数据位置","authors":"Kwangsung Oh, A. Raghavan, A. Chandra, J. Weissman","doi":"10.1145/2756594.2756596","DOIUrl":null,"url":null,"abstract":"Many Cloud applications exploit the diversity of storage options in a data center to achieve desired cost, performance, and durability tradeoffs. It is common to see applications using a combination of memory, local disk, and archival storage tiers within a single data center to meet their needs. For example, hot data can be kept in memory using ElastiCache, and colder data in cheaper, slower storage such as S3, using Amazon as an example. For user-facing applications, a recent trend is to exploit multiple data centers for data placement to enable better latency of access from users to their data. The conventional wisdom is that co-location of computation and storage within the same data center is a key to application performance, so that applications running within a data center are often still limited to access local data. In this paper, using experiments on Amazon, Microsoft, and Google clouds, we show that this assumption is false, and that accessing data in nearby data centers may be faster than local access at different or even same points in the storage hierarchy. This can lead to not only better performance, but also reduced cost, simpler consistency policies and reconsidering data locality in multiple DCs environment. This argues for an expansion of cloud storage tiers to consider non-local storage options, and has interesting implications for the design of a distributed storage system.","PeriodicalId":283088,"journal":{"name":"Proceedings of the 2nd International Workshop on Software-Defined Ecosystems","volume":"105 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Redefining Data Locality for Cross-Data Center Storage\",\"authors\":\"Kwangsung Oh, A. Raghavan, A. Chandra, J. Weissman\",\"doi\":\"10.1145/2756594.2756596\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Many Cloud applications exploit the diversity of storage options in a data center to achieve desired cost, performance, and durability tradeoffs. It is common to see applications using a combination of memory, local disk, and archival storage tiers within a single data center to meet their needs. For example, hot data can be kept in memory using ElastiCache, and colder data in cheaper, slower storage such as S3, using Amazon as an example. For user-facing applications, a recent trend is to exploit multiple data centers for data placement to enable better latency of access from users to their data. The conventional wisdom is that co-location of computation and storage within the same data center is a key to application performance, so that applications running within a data center are often still limited to access local data. In this paper, using experiments on Amazon, Microsoft, and Google clouds, we show that this assumption is false, and that accessing data in nearby data centers may be faster than local access at different or even same points in the storage hierarchy. This can lead to not only better performance, but also reduced cost, simpler consistency policies and reconsidering data locality in multiple DCs environment. This argues for an expansion of cloud storage tiers to consider non-local storage options, and has interesting implications for the design of a distributed storage system.\",\"PeriodicalId\":283088,\"journal\":{\"name\":\"Proceedings of the 2nd International Workshop on Software-Defined Ecosystems\",\"volume\":\"105 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-06-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd International Workshop on Software-Defined Ecosystems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2756594.2756596\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on Software-Defined Ecosystems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2756594.2756596","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

许多云应用程序利用数据中心中存储选项的多样性来实现所需的成本、性能和持久性权衡。应用程序在单个数据中心内使用内存、本地磁盘和归档存储层的组合来满足其需求是很常见的。例如,热数据可以使用ElastiCache保存在内存中,冷数据可以保存在更便宜、更慢的存储(如S3)中,以Amazon为例。对于面向用户的应用程序,最近的一个趋势是利用多个数据中心进行数据放置,以提高用户对其数据的访问延迟。传统观点认为,计算和存储在同一数据中心内的共存位置是提高应用程序性能的关键,因此在数据中心内运行的应用程序通常仍然仅限于访问本地数据。在本文中,通过对Amazon、Microsoft和Google云的实验,我们证明了这个假设是错误的,并且访问附近数据中心的数据可能比访问存储层次结构中不同甚至相同点的本地数据更快。这不仅可以提高性能,还可以降低成本,简化一致性策略,并重新考虑多个数据中心环境中的数据位置。这就要求扩展云存储层以考虑非本地存储选项,并对分布式存储系统的设计产生有趣的影响。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Redefining Data Locality for Cross-Data Center Storage
Many Cloud applications exploit the diversity of storage options in a data center to achieve desired cost, performance, and durability tradeoffs. It is common to see applications using a combination of memory, local disk, and archival storage tiers within a single data center to meet their needs. For example, hot data can be kept in memory using ElastiCache, and colder data in cheaper, slower storage such as S3, using Amazon as an example. For user-facing applications, a recent trend is to exploit multiple data centers for data placement to enable better latency of access from users to their data. The conventional wisdom is that co-location of computation and storage within the same data center is a key to application performance, so that applications running within a data center are often still limited to access local data. In this paper, using experiments on Amazon, Microsoft, and Google clouds, we show that this assumption is false, and that accessing data in nearby data centers may be faster than local access at different or even same points in the storage hierarchy. This can lead to not only better performance, but also reduced cost, simpler consistency policies and reconsidering data locality in multiple DCs environment. This argues for an expansion of cloud storage tiers to consider non-local storage options, and has interesting implications for the design of a distributed storage system.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Feeding the Beast: Getting Data into Big Systems XoS: An Extensible Cloud Operating System Continuous Delivery of Composite Solutions: A Case for Collaborative Software Defined PaaS Environments Virtual Fabric-based Approach for Virtual Data Center Network A Framework for Realizing Software-Defined Federations for Scientific Workflows
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1