Single-Snapshot File System Analysis
Authors: Avani Wildani, I. Adams, E. L. Miller
Venue: 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems
Published: 2013-08-14
DOI: 10.1109/MASCOTS.2013.47 (https://doi.org/10.1109/MASCOTS.2013.47)
Citation count: 4
Abstract
Metadata snapshots are a common method for gaining insight into file systems due to their small size and relative ease of acquisition. Since they are static, most researchers have used them for relatively simple analyses such as file size distributions and age of files. We hypothesize that it is possible to gain much richer insights into file system and user behavior by clustering features in metadata snapshots and comparing the entropy within clusters to the entropy within natural partitions such as directory hierarchies. We discuss several different methods for gaining deeper insights into metadata snapshots, and show a small proof of concept using data from Los Alamos National Laboratories. In our initial work, we see evidence that it is possible to identify user locality information, traditionally the purview of dynamic traces, using a single static snapshot.
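The comparison the abstract proposes, entropy within clusters versus entropy within directory partitions, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the record fields (`path`, `uid`, `mtime`), the toy snapshot, the choice of owner `uid` as the attribute whose entropy is measured, and the simple gap-based grouping on modification time that stands in for whatever clustering the authors actually used.

```python
import math
from collections import Counter, defaultdict
from posixpath import dirname


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_partition_entropy(records, key, label):
    """Size-weighted mean entropy of label(r) within the groups induced by key(r)."""
    groups = defaultdict(list)
    for r in records:
        groups[key(r)].append(label(r))
    n = len(records)
    return sum(len(g) / n * shannon_entropy(g) for g in groups.values())


def gap_clusters(records, max_gap):
    """Stand-in clustering: sort by mtime and start a new cluster at each gap > max_gap."""
    assignment = {}
    cluster_id = 0
    prev = None
    for r in sorted(records, key=lambda r: r["mtime"]):
        if prev is not None and r["mtime"] - prev > max_gap:
            cluster_id += 1
        assignment[r["path"]] = cluster_id
        prev = r["mtime"]
    return assignment


# Hypothetical four-file snapshot; field names and values are illustrative only.
snapshot = [
    {"path": "/proj/a/x.dat", "uid": 1001, "mtime": 100},
    {"path": "/proj/a/y.dat", "uid": 1002, "mtime": 5000},
    {"path": "/proj/b/z.dat", "uid": 1001, "mtime": 130},
    {"path": "/proj/b/w.dat", "uid": 1002, "mtime": 5100},
]

clusters = gap_clusters(snapshot, max_gap=1000)
h_dir = mean_partition_entropy(
    snapshot, key=lambda r: dirname(r["path"]), label=lambda r: r["uid"]
)
h_clu = mean_partition_entropy(
    snapshot, key=lambda r: clusters[r["path"]], label=lambda r: r["uid"]
)
print(f"mean owner entropy by directory: {h_dir:.2f} bits")
print(f"mean owner entropy by cluster:   {h_clu:.2f} bits")
```

In this toy data each directory mixes both owners, while each modification-time cluster contains files from a single owner, so the clusters have lower owner entropy than the directories. A gap of that kind on a real snapshot would be a hint, not proof, that the clustered feature captures user locality the directory hierarchy does not.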