{"title":"Generating Encrypted Document Index Structure Using Tree Browser","authors":"Doaa N. Mhawi, Haider W. Oleiwi, Heba L. Al-Taie","doi":"10.51173/jt.v5i2.948","DOIUrl":null,"url":null,"abstract":"The document indexing process aims to store documents in a manner that facilitates the process of retrieving specific documents efficiently in terms of accuracy and time complexity. Many information retrieval systems encounter security issues and execution time to retrieve relevant documents. In addition, these systems lead to ample storage. Therefore, it requires combining confidentiality with the indexed document, and a separate process is performed to encrypt the documents. Hence, a new indexing structure named tree browser (TB) was proposed in this paper to be applied to index files of the large document set in an encrypted manner. This method represents the keywords in a variable-length binary format before being stored in the index. This binary format provides additional encryption to the information stored and reduces the index size. The proposed method (TB) is applied to the WebKB dataset. This dataset is related to web page documents (semi-structured documents). The experimental results demonstrated that the storage size is reduced by using TB-tree to 48.5 MB, while the traditional index is 307 MB.","PeriodicalId":39617,"journal":{"name":"Journal of Biomolecular Techniques","volume":"92 4 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Biomolecular Techniques","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.51173/jt.v5i2.948","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Biochemistry, Genetics and Molecular Biology","Score":null,"Total":0}
引用次数: 0
Abstract
The document indexing process aims to store documents in a manner that facilitates the process of retrieving specific documents efficiently in terms of accuracy and time complexity. Many information retrieval systems encounter security issues and execution time to retrieve relevant documents. In addition, these systems lead to ample storage. Therefore, it requires combining confidentiality with the indexed document, and a separate process is performed to encrypt the documents. Hence, a new indexing structure named tree browser (TB) was proposed in this paper to be applied to index files of the large document set in an encrypted manner. This method represents the keywords in a variable-length binary format before being stored in the index. This binary format provides additional encryption to the information stored and reduces the index size. The proposed method (TB) is applied to the WebKB dataset. This dataset is related to web page documents (semi-structured documents). The experimental results demonstrated that the storage size is reduced by using TB-tree to 48.5 MB, while the traditional index is 307 MB.
期刊介绍:
The Journal of Biomolecular Techniques is a peer-reviewed publication issued five times a year by the Association of Biomolecular Resource Facilities. The Journal was established to promote the central role biotechnology plays in contemporary research activities, to disseminate information among biomolecular resource facilities, and to communicate the biotechnology research conducted by the Association’s Research Groups and members, as well as other investigators.