{"title":"LEOnto","authors":"Anis Tissaoui, S. Sassi, R. Chbeir","doi":"10.1145/3415958.3433076","DOIUrl":null,"url":null,"abstract":"The Latent Dirichlet Allocation (LDA) model [18] was originally developed and utilised for document modeling and topic extraction in Information Retrieval. To design high quality domain ontologies, effective and usable methodologies are needed to facilitate their building process. In this paper, we propose a new approach for semi-automatic ontology enriching from textual corpus based on LDA model. In our approach, LDA is adopted to provide efficient dimension reduction, able to capture semantic relationships between word-topic and topic-document in terms of probability distributions with minimum human intervention. We conducted several experiments with different model parameters and the corresponding behavior of the enriching technique was evaluated by domain experts. We also compared the results of our method with two existing learning methods using the same dataset. The study showed that our method outperforms the other methods in terms of recall and precision measures.","PeriodicalId":198419,"journal":{"name":"Proceedings of the 12th International Conference on Management of Digital EcoSystems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 12th International Conference on Management of Digital EcoSystems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3415958.3433076","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 1
Abstract
The Latent Dirichlet Allocation (LDA) model [18] was originally developed and used for document modeling and topic extraction in Information Retrieval. Designing high-quality domain ontologies requires effective and usable methodologies that facilitate the building process. In this paper, we propose a new approach for semi-automatic ontology enrichment from a textual corpus based on the LDA model. In our approach, LDA provides efficient dimension reduction and captures the semantic relationships between words and topics and between topics and documents as probability distributions, with minimal human intervention. We conducted several experiments with different model parameters, and domain experts evaluated the corresponding behavior of the enrichment technique. We also compared the results of our method with two existing learning methods on the same dataset. The study showed that our method outperforms both in terms of precision and recall.
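To make the word-topic and topic-document distributions mentioned in the abstract concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, followed by a simple filter that proposes high-probability topic words not yet present in an ontology. It is not the authors' pipeline, which the abstract does not describe: the function names `lda_gibbs` and `candidate_terms`, the hyperparameters `alpha` and `beta`, the toy corpus, and the `known_terms` set are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative, not optimized).

    Returns the vocabulary, phi (topic-word distributions) and
    theta (document-topic distributions).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    ndk = [[0] * num_topics for _ in docs]
    nkw = [[0] * V for _ in range(num_topics)]
    nk = [0] * num_topics

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(num_topics)
            zd.append(k)
            ndk[d][k] += 1
            nkw[k][word_id[w]] += 1
            nk[k] += 1
        z.append(zd)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                ndk[d][k] -= 1
                nkw[k][wi] -= 1
                nk[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][wi] + beta) / (nk[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][wi] += 1
                nk[k] += 1

    # Smoothed estimates of the two distributions.
    phi = [
        [(nkw[k][v] + beta) / (nk[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    theta = [
        [(ndk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def candidate_terms(vocab, phi, known_terms, top_n=3):
    """For each topic, keep the top_n most probable words that are not
    already in the ontology (hypothetical enrichment filter)."""
    result = []
    for row in phi:
        ranked = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)
        result.append([vocab[v] for v in ranked[:top_n]
                       if vocab[v] not in known_terms])
    return result


# Toy corpus and ontology terms, invented purely for illustration.
docs = [
    "ontology concept relation concept taxonomy".split(),
    "topic model corpus document topic".split(),
    "ontology taxonomy relation axiom".split(),
    "document corpus topic word distribution".split(),
]
known_terms = {"ontology", "concept"}

vocab, phi, theta = lda_gibbs(docs, num_topics=2)
candidates = candidate_terms(vocab, phi, known_terms)
print(candidates)
```

In a semi-automatic setting such as the one the abstract describes, a list like `candidates` would be handed to a domain expert for validation rather than added to the ontology directly.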