A fast approximate algorithm for large-scale Latent Semantic Indexing
Dell Zhang, Zheng Zhu
2008 Third International Conference on Digital Information Management, 2008-11-01
DOI: 10.1109/ICDIM.2008.4746764 (https://doi.org/10.1109/ICDIM.2008.4746764)
Citation count: 4
Abstract
Latent semantic indexing (LSI) is an effective method for discovering the underlying semantic structure of data, with numerous applications in information retrieval and data mining. However, its computational cost may be prohibitively high on very large datasets. In this paper, we present a fast approximate algorithm for large-scale LSI that is conceptually simple and theoretically justified. Our main contribution is to show that the proposed algorithm has a provable error bound and linear computational complexity.
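
As background only, the following is the standard textbook formulation of LSI, not the approximate algorithm proposed in the paper (which the abstract does not describe). LSI is usually defined through a rank-k truncated singular value decomposition of the term-document matrix. The symbols A (an m-by-n term-document matrix), k (the target rank), and the singular triplets (sigma_i, u_i, v_i) are notation assumed here for illustration.

```latex
A = U \Sigma V^{\top}, \qquad
A_k = U_k \Sigma_k V_k^{\top} = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top}, \qquad
\| A - A_k \|_F
  = \min_{\operatorname{rank}(B) \le k} \| A - B \|_F
  = \sqrt{\sum_{i > k} \sigma_i^{2}}
```

The last equality is the Eckart-Young theorem: A_k is the best rank-k approximation of A in the Frobenius norm. Computing a full dense SVD costs on the order of m n min(m, n) operations, which grows faster than linearly in the size of the collection. This is presumably the kind of cost that the abstract calls prohibitive and that a linear-complexity approximation aims to avoid.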