Effective whitelisting for filesystem forensics
S. Chawathe
2009 IEEE International Conference on Intelligence and Security Informatics, 2009-06-08
DOI: 10.1109/ISI.2009.5137284 (https://doi.org/10.1109/ISI.2009.5137284)
Citations: 15
Abstract
Forensic analysis of the large filesystems commonly found on current computers requires an effective method for categorizing and prioritizing files, so that the investigator is not overwhelmed. A key technique for this purpose is whitelisting, i.e., skipping the detailed analysis of files that match entries in a well-known reference collection. Effective use of this technique requires an efficient method for matching files that detects not only exact matches but also near or approximate matches. This paper outlines the requirements for such matching, formalizes them as the bounded best match and approximate bounded near-match problems, and describes methods to solve these problems. In particular, the approximate bounded near-match problem is mapped to the problem of finding near neighbors in a high-dimensional metric space and solved using locality-sensitive hashing.
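To make the near-match idea concrete, the following is a minimal sketch of locality-sensitive hashing applied to file whitelisting. The abstract does not specify the paper's feature representation, distance function, or hash family, so everything here is an assumption chosen for illustration: files are represented as sets of overlapping 8-byte shingles, similarity is Jaccard similarity, and the LSH scheme is MinHash with banding. The names `shingles`, `minhash_signature`, and `WhitelistIndex`, and all parameter values, are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import defaultdict


def shingles(data: bytes, k: int = 8) -> set:
    """Return the set of overlapping k-byte substrings of data."""
    if len(data) < k:
        return {data}
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def minhash_signature(data: bytes, num_hashes: int = 64, k: int = 8) -> tuple:
    """For each salted hash function, keep the minimum hash over all shingles."""
    sh = shingles(data, k)
    sig = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a distinct salt yields a distinct hash function
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "big")
            for s in sh))
    return tuple(sig)


class WhitelistIndex:
    """Assumed design: MinHash signatures split into bands that act as bucket keys."""

    def __init__(self, num_hashes: int = 64, bands: int = 16, k: int = 8):
        if num_hashes % bands != 0:
            raise ValueError("num_hashes must be divisible by bands")
        self.num_hashes = num_hashes
        self.bands = bands
        self.rows = num_hashes // bands
        self.k = k
        self.buckets = defaultdict(set)   # (band index, band values) -> file names
        self.signatures = {}              # file name -> full signature

    def _band_keys(self, sig):
        for b in range(self.bands):
            yield (b, sig[b * self.rows:(b + 1) * self.rows])

    def add(self, name: str, data: bytes) -> None:
        """Insert a reference (whitelisted) file into the index."""
        sig = minhash_signature(data, self.num_hashes, self.k)
        self.signatures[name] = sig
        for key in self._band_keys(sig):
            self.buckets[key].add(name)

    def near_matches(self, data: bytes, threshold: float = 0.8) -> list:
        """Return (name, estimated similarity) pairs at or above threshold."""
        sig = minhash_signature(data, self.num_hashes, self.k)
        # Candidates are reference files sharing at least one band with the query.
        candidates = set()
        for key in self._band_keys(sig):
            candidates |= self.buckets.get(key, set())
        results = []
        for name in candidates:
            ref = self.signatures[name]
            # Fraction of agreeing positions estimates Jaccard similarity.
            est = sum(a == b for a, b in zip(sig, ref)) / self.num_hashes
            if est >= threshold:
                results.append((name, est))
        return sorted(results, key=lambda r: -r[1])
```

In use, each file in the reference collection would be registered once with `add`, and each file under investigation would be passed to `near_matches`; a non-empty result marks the file as a candidate for skipping detailed analysis. The band count and threshold trade recall against the number of candidates examined, and the values above are illustrative defaults rather than recommendations.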