A Methodology for File Relationship Discovery
M. Ondrejcek, Jason Kastner, R. Kooper, P. Bajcsy
2009 Fifth IEEE International Conference on e-Science, published 2009-12-09
DOI: 10.1109/e-Science.2009.35 (https://doi.org/10.1109/e-Science.2009.35)
Abstract
This paper addresses the problem of discovering temporal and contextual relationships across the document, data, and software categories of electronic records. We designed a methodology that discovers unknown relationships through file system and file content analyses. The work also investigates the automation of metadata extraction from engineering drawings and the storage requirements for metadata extraction. The methodology has been applied to extracting information from a test collection of electronic records about a U.S. Navy ship (TWR 841) archived by the U.S. National Archives and Records Administration (NARA). This test collection poses a problem of unknown relationships among files that include 784 2D image drawings and 22 CAD models.
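
As a rough illustration of what a file system analysis for temporal and contextual relationships might look like, the Python sketch below groups files that share a name stem (a possible contextual link) and clusters files whose modification times fall close together (a possible temporal link). It is not the authors' method, which the abstract does not describe in detail. The function names `group_by_stem` and `group_by_time`, the time window, and the example file names are all hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_stem(paths):
    """Group paths whose file names share a stem (name without extension).

    Assumption for illustration: a shared stem hints at a contextual link,
    e.g. a drawing and a CAD model of the same part.
    """
    groups = defaultdict(list)
    for p in paths:
        groups[PurePosixPath(p).stem.lower()].append(p)
    # Keep only stems shared by two or more files.
    return {stem: sorted(files) for stem, files in groups.items() if len(files) > 1}


def group_by_time(records, window_seconds):
    """Cluster (path, mtime) records by temporal proximity.

    A record joins the current cluster when its timestamp is within
    window_seconds of the previous record's timestamp; otherwise it
    starts a new cluster.
    """
    clusters = []
    last_time = None
    for path, mtime in sorted(records, key=lambda r: r[1]):
        if last_time is not None and mtime - last_time <= window_seconds:
            clusters[-1].append(path)
        else:
            clusters.append([path])
        last_time = mtime
    return clusters


# Hypothetical file names and timestamps, not taken from the TWR 841 collection.
example_paths = ["drawings/hull_01.tif", "models/hull_01.prt", "docs/readme.txt"]
example_records = [("a.tif", 100), ("b.tif", 130), ("c.prt", 1000)]

stem_groups = group_by_stem(example_paths)
time_clusters = group_by_time(example_records, window_seconds=60)
```

In practice the timestamps could come from `os.stat(path).st_mtime`, and a file content analysis would add further evidence that name and time heuristics alone cannot provide.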