Title: An adaptive Latent Semantic Analysis for text mining
Authors: H. T. Tu, T. Phan, K. P. Nguyen
Venue: 2017 International Conference on System Science and Engineering (ICSSE)
Publication date: 2017-07-01
DOI: 10.1109/ICSSE.2017.8030943
Citations: 13
Abstract
Latent Semantic Analysis (LSA) applies singular value decomposition to a document-term co-occurrence matrix to derive a latent class model. Despite its success, the technique has some shortcomings. Recent work has improved standard LSA by means of probability distributions, regularization, and sparseness constraints, but other deficiencies remain. This paper addresses them by proposing an adapted technique, called hk-LSA, based on reducing the dimension of the vector space and on probabilistic-like relationships between the document space and the latent-topic space. The adaptive technique overcomes some weak points of LSA, such as the processing of dense orthogonal matrices, the complexity of matrix decomposition, and having to deal with alternative iteration algorithms. The experiments show consistent and substantial improvements of hk-LSA over LSA.
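
For orientation, the standard LSA baseline described in the first sentence of the abstract can be sketched as a truncated SVD of a document-term matrix. This is only an illustration of plain LSA, not of hk-LSA, whose details the abstract does not give. The toy matrix, the term list, and the helper names `lsa` and `cosine` are assumptions made for this example.

```python
import numpy as np

def lsa(doc_term, k):
    """Rank-k LSA via truncated SVD of a document-term count matrix."""
    # Thin SVD: doc_term = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(np.asarray(doc_term, dtype=float), full_matrices=False)
    doc_topic = U[:, :k] * s[:k]   # documents embedded in the k-dim latent space
    topic_term = Vt[:k, :]         # latent directions expressed over the terms
    return doc_topic, s[:k], topic_term

def cosine(a, b):
    """Cosine similarity between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy corpus: rows are documents, columns are terms.
terms = ["matrix", "vector", "topic", "word"]
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
])

doc_topic, sing, topic_term = lsa(X, k=2)
X_hat = doc_topic @ topic_term     # best rank-2 approximation of X

print("singular values kept:", sing)
print("sim(doc0, doc1):", cosine(doc_topic[0], doc_topic[1]))
print("sim(doc0, doc2):", cosine(doc_topic[0], doc_topic[2]))
```

In this toy corpus, documents 0 and 1 share a vocabulary and land on the same latent direction, while documents 0 and 2 share no terms and stay orthogonal. The factors `U` and `Vt` are dense even when `X` is sparse, and computing the full decomposition is costly for large corpora. These appear to be the kinds of costs the abstract lists among LSA's weak points.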