Improving Storage Efficiency for Raw Image Photo Repository by Exploiting Similarity
Authors: Binqi Zhang, Chen Wang, B. Zhou, Albert Y. Zomaya
Venue: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Published: 2016-12-01
DOI: 10.1109/PDCAT.2016.045 (https://doi.org/10.1109/PDCAT.2016.045)
Exploiting temporal and spatial locality is one way to improve the performance of data compression and deduplication in a storage system. Through our evaluation, we find that content-level similarity measures, such as shared photo tags, correlate with data compressibility: raw images with similar tags can be compressed together to save more storage space. Furthermore, storing similar raw images together reduces fragmentation, which enables faster sorting, searching, and retrieval when the images are stored in a large-scale distributed environment. In this paper, we present the correlation between content similarity and data compressibility, measured on a dataset built from Flickr. The system design we propose is based on this evaluation and optimizes storage efficiency for the Top-N relevant images that share a tag. On the one hand, it saves storage space; on the other hand, it may accelerate Top-N relevance search queries.
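The idea of compressing same-tag images together can be illustrated with a minimal sketch. It is not the authors' system: the abstract names neither a compressor nor a data layout, so `zlib`, the `(tag, payload)` input format, and every function name below are assumptions made for illustration. The sketch groups payloads by tag and compares the total size when each payload is compressed on its own against the size when all payloads in a group are concatenated and compressed as one stream.

```python
import random
import zlib
from collections import defaultdict


def group_by_tag(items):
    """Group (tag, payload) pairs into a dict mapping tag -> list of payloads."""
    groups = defaultdict(list)
    for tag, payload in items:
        groups[tag].append(payload)
    return dict(groups)


def separate_size(payloads, level=9):
    """Total compressed size when every payload is compressed on its own."""
    return sum(len(zlib.compress(p, level)) for p in payloads)


def joint_size(payloads, level=9):
    """Compressed size when the payloads are concatenated into one stream."""
    return len(zlib.compress(b"".join(payloads), level))


def savings_by_tag(items):
    """Return {tag: (separate_bytes, joint_bytes)} for each tag group."""
    return {
        tag: (separate_size(payloads), joint_size(payloads))
        for tag, payloads in group_by_tag(items).items()
    }


def make_synthetic_items(seed=0, size=2000):
    """Hypothetical stand-in data: same-tag payloads share most of their bytes."""
    rng = random.Random(seed)
    shared = bytes(rng.getrandbits(8) for _ in range(size))
    return [
        ("beach", shared + b"a"),
        ("beach", shared + b"b"),
        ("city", b"x" * 10),
    ]


if __name__ == "__main__":
    for tag, (sep, joint) in savings_by_tag(make_synthetic_items()).items():
        print(f"{tag}: separate={sep} bytes, joint={joint} bytes")
```

On the synthetic data, the two "beach" payloads share their content, so the joint stream comes out smaller than the two separate ones. Real raw images are far larger than DEFLATE's 32 KiB window, so `zlib` would not capture redundancy across whole files. The sketch shows only how the comparison is measured; it makes no claim about the savings reported in the paper.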