Cachalot: A network-aware, cooperative cache network for geo-distributed, data-intensive applications
Fan Jiang, C. Castillo, S. Ahalt
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, pp. 1-9. Published 2018-07-06. DOI: 10.1109/NOMS.2018.8406273
Collaborative and data-intensive applications are hosted on geo-distributed infrastructures to exploit computing resources at scale. However, these applications typically incur massive data transfers over bandwidth-constrained wide-area networks (WANs), which impose significant performance overhead. Conventional distributed computing platforms (e.g., Spark) leverage caching to avoid duplicate executions of common computations and thus reduce network traffic. However, these techniques were developed for data center environments and therefore lack advanced network-aware mechanisms to support high-performance, data-intensive applications over the WAN in geo-distributed environments. Hence, we develop Cachalot, a novel network-aware, cooperative cache network for caching datasets generated by common computations shared among geo-distributed, data-intensive applications. We perform an in-depth, simulation-based evaluation using both synthetic and real traces. The experimental results indicate that Cachalot speeds up data-intensive applications by over 50%, reduces network traffic by up to 60%, and outperforms state-of-the-art baselines by over 20% in geo-distributed environments for various common user-driven performance metrics.
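The abstract does not spell out how Cachalot makes its decisions, so the following is only an illustrative sketch of what "network-aware" cooperative caching could mean in general, not the authors' algorithm. It assumes a requester that knows which remote sites hold a cached dataset and has estimates of bandwidth and latency to each; it then picks the cheapest holder, or recomputes locally if every transfer would take longer. All names (CacheSite, transfer_time_s, plan_fetch) and the cost model are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class CacheSite:
    """A remote site holding a cached copy (hypothetical model)."""
    name: str
    bandwidth_mbps: float  # assumed available WAN bandwidth to the requester
    rtt_s: float           # assumed round-trip latency in seconds


def transfer_time_s(size_mb: float, site: CacheSite) -> float:
    """Estimate seconds needed to pull size_mb megabytes from site."""
    return site.rtt_s + (size_mb * 8.0) / site.bandwidth_mbps


def plan_fetch(
    size_mb: float,
    holders: Iterable[CacheSite],
    recompute_s: float,
) -> Tuple[str, Optional[str], float]:
    """Choose between fetching a cached dataset and recomputing it.

    Returns (action, site_name, estimated_seconds), where action is
    "fetch" or "recompute". This is an assumed cost model for
    illustration only.
    """
    # Pick the holder with the lowest estimated transfer time, if any.
    best = min(
        holders,
        key=lambda s: transfer_time_s(size_mb, s),
        default=None,
    )
    if best is not None:
        cost = transfer_time_s(size_mb, best)
        if cost < recompute_s:
            return ("fetch", best.name, cost)
    # No holder, or every transfer is slower than recomputing locally.
    return ("recompute", None, recompute_s)
```

For example, with one holder reachable at 100 Mbps and another at 20 Mbps, a 500 MB dataset would be fetched from the faster site whenever recomputation is estimated to take longer than that transfer. A cache that ignores the network would treat both holders as equivalent, which is the gap the abstract attributes to data-center-oriented caching.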