Cachalot: A network-aware, cooperative cache network for geo-distributed, data-intensive applications

Fan Jiang, C. Castillo, S. Ahalt
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1-9. Published 2018-07-06. DOI: 10.1109/NOMS.2018.8406273 (https://doi.org/10.1109/NOMS.2018.8406273)
Citations: 5

Abstract

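Collaborative and data-intensive applications are hosted on geo-distributed infrastructures to exploit computing resources at scale. However, these applications typically incur massive data transfers over bandwidth-constrained wide-area networks (WANs), which impose significant performance overhead. Conventional distributed computing platforms (e.g., Spark) leverage caching to avoid duplicate executions of common computations and thus reduce network traffic. However, these techniques were developed for data center environments and therefore lack advanced network-aware mechanisms to support high-performance, data-intensive applications over the WAN in geo-distributed environments. Hence, we develop Cachalot, a novel network-aware, cooperative cache network for caching datasets generated by common computations shared among geo-distributed, data-intensive applications. We perform a simulation-based deep evaluation using both synthetic and real traces. The experimental results indicate that Cachalot speeds up data-intensive applications by over 50%, reduces network traffic by up to 60%, and outperforms state-of-the-art baselines by over 20% in geo-distributed environments for various common user-driven performance metrics.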
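
To make the idea of a network-aware choice concrete, the following is a minimal sketch, not Cachalot's actual algorithm, which the abstract does not describe. It assumes a simple cost model in which a site either recomputes a shared dataset locally or fetches a cached copy from whichever remote site offers the lowest estimated transfer time. The names `CacheSite`, `transfer_time_s`, and `choose_source`, and the bandwidth and latency fields, are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class CacheSite:
    """A remote site assumed to hold a cached copy of the dataset (hypothetical)."""
    name: str
    bandwidth_mbps: float  # assumed available WAN bandwidth to the requester
    latency_s: float       # assumed fixed setup latency for the transfer


def transfer_time_s(size_mb: float, site: CacheSite) -> float:
    """Estimate seconds needed to fetch `size_mb` megabytes from `site`."""
    return site.latency_s + (size_mb * 8.0) / site.bandwidth_mbps


def choose_source(
    size_mb: float,
    recompute_s: float,
    holders: Iterable[CacheSite],
) -> Tuple[str, float]:
    """Return (source name, estimated cost in seconds).

    Starts from local recomputation and switches to a remote cache only
    when its estimated transfer time is strictly lower.
    """
    best_name, best_cost = "recompute", recompute_s
    for site in holders:
        cost = transfer_time_s(size_mb, site)
        if cost < best_cost:
            best_name, best_cost = site.name, cost
    return best_name, best_cost


if __name__ == "__main__":
    holders = [
        CacheSite("site-a", bandwidth_mbps=100.0, latency_s=0.1),
        CacheSite("site-b", bandwidth_mbps=20.0, latency_s=0.2),
    ]
    # A 500 MB dataset that would take 60 s to recompute locally.
    print(choose_source(500.0, 60.0, holders))
```

Under this assumed model, a cached copy behind a slow WAN link can lose to local recomputation, which is the kind of trade-off a cache designed only for data center networks would not weigh.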