Improving reading performance by file prefetching mechanism in distributed cache systems
Authors: Jing Gui, Yongbin Wang, Wuyue Shuai
Journal: Concurrency and Computation: Practice and Experience, 36 (22), published 2024-07-17
DOI: 10.1002/cpe.8215
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.8215
Distributed cache systems are utilized to enhance I/O performance between computing applications and storage systems. However, the traditional file access predictors employed in these cache systems are only suitable for workloads with simple file access patterns, rendering them inadequate for the complex access patterns found in big data computing scenarios. In this article, we propose a file access predictor (DFAP) based on WaveNet, which has exhibited promising results in file access tasks when compared to other baseline models. Cache systems are often constrained by limited cache space due to cost, cluster size, and other factors. In big data scenarios, cached data and prefetched data often compete for limited space. To address this issue, we introduce a cache prefetching algorithm (CBAP) for cache systems, which is based on cost-benefit analysis to improve cache utilization. Furthermore, we implement a novel file prefetching framework on Alluxio, which accelerates computing jobs by up to 18%.
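The abstract does not spell out how CBAP weighs prefetched data against cached data, so the following is only a minimal sketch of the general idea of a cost-benefit prefetch decision, not the authors' algorithm. Every name here (`FileInfo`, `expected_saving`, `should_prefetch`), the bandwidth figures, and the assumption that a predictor supplies an access probability per file are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FileInfo:
    """Hypothetical record for one file known to the cache."""
    name: str
    size_mb: float       # file size in megabytes
    access_prob: float   # assumed predictor output: chance of a near-future read


def expected_saving(f: FileInfo, remote_mbps: float, cache_mbps: float) -> float:
    """Expected seconds saved if `f` is read from cache instead of remote storage."""
    remote_time = f.size_mb / remote_mbps
    cache_time = f.size_mb / cache_mbps
    return f.access_prob * (remote_time - cache_time)


def should_prefetch(
    candidate: FileInfo,
    victims: Iterable[FileInfo],
    remote_mbps: float = 100.0,   # assumed remote-storage throughput (MB/s)
    cache_mbps: float = 1000.0,   # assumed cache throughput (MB/s)
) -> bool:
    """Prefetch only if the candidate's expected saving exceeds the
    expected saving lost by evicting `victims` to make room for it."""
    benefit = expected_saving(candidate, remote_mbps, cache_mbps)
    cost = sum(expected_saving(v, remote_mbps, cache_mbps) for v in victims)
    return benefit > cost


if __name__ == "__main__":
    hot = FileInfo("part-0001", size_mb=100.0, access_prob=0.9)
    cold = FileInfo("part-0042", size_mb=100.0, access_prob=0.2)
    print(should_prefetch(hot, [cold]))   # likely-read file displaces a rarely-read one
    print(should_prefetch(cold, [hot]))   # the reverse trade is rejected
```

Under these assumptions, a prefetch is admitted only when it is expected to save more read time than the evicted data would have saved, which is one simple way to keep prefetched and cached data from competing blindly for the same limited space.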
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.