Xinyu Tang, X. Yao, Diyou Liu, Long Zhao, Li Li, Dehai Zhu, Guoqing Li
A Ceph-based storage strategy for big gridded remote sensing data
Journal: Big Earth Data (volume 16 1, pages 323-339)
Published: 2021-12-27
DOI: https://doi.org/10.1080/20964471.2021.1989792
Citations: 5
Abstract
When gridded remote sensing data are stored in large distributed clusters, most solutions rely on big table index storage strategies. In practice, however, the performance of these strategies degrades as scenarios become more complex, and this paper analyzes the reasons for that degradation. To improve the read and write performance of distributed gridded data storage, the paper proposes a storage strategy based on Ceph. The strategy encapsulates remote sensing images as objects and uses a metadata management strategy to support spatiotemporal retrieval of gridded data, locating each piece of gridded data in the cluster through hash-like calculations. The method supports spatial operations in the clustered database while enabling fast random reads and writes of the gridded data. Random write and spatial query experiments demonstrate the feasibility, effectiveness, and stability of the strategy: compared with the big table index storage strategy, the method is more stable and its average query time is 38% lower, which improves the storage and query efficiency of gridded images.
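The abstract does not spell out the hash-like calculation, so the following is only a minimal sketch of the general idea: a grid tile's identity is turned into a deterministic object name, and the name alone decides where the object lives, with no index lookup. Ceph itself uses its own CRUSH algorithm for placement. This sketch substitutes SHA-256 and rendezvous hashing as a simplified stand-in, and the naming scheme, function names, and node names are all hypothetical.

```python
import hashlib
from typing import List, Tuple


def object_name(product: str, date: str, row: int, col: int) -> str:
    """Build a deterministic object name for one grid tile (hypothetical scheme)."""
    return f"{product}/{date}/r{row:05d}_c{col:05d}"


def _hash64(text: str) -> int:
    """Stable 64-bit hash of a string (SHA-256 stand-in, not Ceph's own hash)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def placement_group(name: str, pg_num: int) -> int:
    """Map an object name to one of pg_num placement groups."""
    return _hash64(name) % pg_num


def pick_nodes(pg: int, nodes: List[str], replicas: int = 3) -> List[str]:
    """Rendezvous hashing: rank nodes by hash(pg, node) and keep the top ones.

    This is a simplified substitute for CRUSH, used here only for illustration.
    """
    ranked = sorted(nodes, key=lambda node: _hash64(f"{pg}:{node}"), reverse=True)
    return ranked[:replicas]


def locate(product: str, date: str, row: int, col: int,
           nodes: List[str], pg_num: int = 128,
           replicas: int = 3) -> Tuple[str, int, List[str]]:
    """Compute (object name, placement group, storage nodes) for a grid tile."""
    name = object_name(product, date, row, col)
    pg = placement_group(name, pg_num)
    return name, pg, pick_nodes(pg, nodes, replicas)


if __name__ == "__main__":
    cluster = [f"osd.{i}" for i in range(6)]  # hypothetical node names
    print(locate("landsat8", "2021-06-01", 12, 345, cluster))
```

Because the location is a pure function of the tile's identity, any client can compute it independently, which is what makes random reads and writes possible without consulting a central table.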