Cachalot: A network-aware, cooperative cache network for geo-distributed, data-intensive applications
Fan Jiang, C. Castillo, S. Ahalt
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1-9
Published: 2018-07-06
DOI: 10.1109/NOMS.2018.8406273 (https://doi.org/10.1109/NOMS.2018.8406273)
Citations: 5
Abstract
Collaborative and data-intensive applications are hosted on geo-distributed infrastructures to exploit computing resources at scale. However, these applications typically incur massive data transfers over bandwidth-constrained wide-area networks (WANs), which impose significant performance overhead. Conventional distributed computing platforms (e.g., Spark) leverage caching to avoid duplicate executions of common computations and thus reduce network traffic. However, these techniques were developed for data center environments and therefore lack advanced network-aware mechanisms to support high-performance, data-intensive applications over the WAN in geo-distributed environments. Hence, we develop Cachalot, a novel network-aware, cooperative cache network for caching datasets generated by common computations shared among geo-distributed, data-intensive applications. We perform an in-depth, simulation-based evaluation using both synthetic and real traces. The experimental results indicate that Cachalot speeds up data-intensive applications by over 50%, reduces network traffic by up to 60%, and outperforms state-of-the-art baselines by over 20% in geo-distributed environments for various common user-driven performance metrics.