{"title":"Entropy Regularization for Topic Modelling","authors":"Nagesh Bhattu Sristy, D. Somayajulu","doi":"10.1145/2662117.2662130","DOIUrl":null,"url":null,"abstract":"Supervised Latent Dirichlet based Topic Models are variants of Latent Dirichlet Topic Models with the additional capability to discriminate the samples. Light Supervision strategies are recently adopted to express the rich domain knowledge in the form of constraints. Posterior Regularization framework is developed for learning models from this weaker form of supervision expressing the set of constraints over the family of posteriors. Modelling arbitrary problem specific dependencies is a non-trivial task, increasing the complexity of already harder inference problem in the context of latent dirichlet based topic models. In the current work we propose posterior regularization method for topic models to capture wide variety of auxiliary supervision. This approach simplifies the computational challenges posed by additional compound terms. We have demonstrated the use of this framework in improving the utility of topic models in the presence of entropy constraints. We have experimented with real word datasets to test the above mentioned techniques.","PeriodicalId":286103,"journal":{"name":"I-CARE 2014","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"I-CARE 2014","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2662117.2662130","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Supervised latent Dirichlet based topic models are variants of latent Dirichlet topic models with the additional capability to discriminate among samples. Light-supervision strategies have recently been adopted to express rich domain knowledge in the form of constraints. The posterior regularization framework was developed for learning models from this weaker form of supervision, expressing the constraints as a set over the family of posteriors. Modelling arbitrary problem-specific dependencies is a non-trivial task and increases the complexity of the already difficult inference problem for latent Dirichlet based topic models. In the current work we propose a posterior regularization method for topic models that captures a wide variety of auxiliary supervision. This approach simplifies the computational challenges posed by the additional compound terms. We demonstrate the use of this framework in improving the utility of topic models in the presence of entropy constraints, and we experiment with real-world datasets to test the proposed techniques.
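As background, a minimal sketch of the standard posterior regularization objective this line of work builds on; the entropy-based constraint set $\mathcal{Q}$ written below is an illustrative assumption, since the abstract does not state the exact constraint form used in the paper:

$$\max_{\theta}\; \mathcal{L}(\theta) \;-\; \min_{q \in \mathcal{Q}} \mathrm{KL}\!\left(q(\mathbf{z}) \,\|\, p(\mathbf{z} \mid \mathbf{x}; \theta)\right), \qquad \mathcal{Q} = \{\, q : H(q(\mathbf{z})) \le \eta \,\}$$

Here $\mathcal{L}(\theta)$ is the marginal log-likelihood of the observed documents $\mathbf{x}$, $\mathbf{z}$ denotes the latent topic assignments, $H(\cdot)$ is Shannon entropy, and $\eta$ bounds the entropy of the auxiliary posterior $q$. Learning alternates between projecting the model posterior onto $\mathcal{Q}$ (the inner KL minimization) and updating the model parameters $\theta$ against the projected posterior.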