Classification of XML Documents Using Semantic Resources
Makhlouf Ledmi, Abdeldjalil Ledmi, Mohammed El Habib Souidi
2021 International Conference on Recent Advances in Mathematics and Informatics (ICRAMI)
Publication date: 2021-09-21
DOI: https://doi.org/10.1109/ICRAMI52622.2021.9585995
Citations: 1
Abstract
In this paper, we investigate the automatic classification of XML documents into predefined categories. We propose a classification model that combines the content and the structure of documents. We further propose to use semantic resources, specifically WordNet and an ontology linked to the terms of the corpus, to model the notion of semantic neighborhood through a similarity measure between terms. To validate the results, we used the INEX 2007 XML corpus.
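The abstract does not spell out the similarity measure or the feature representation, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a tiny hand-written hypernym table (`HYPERNYM`, hypothetical data standing in for WordNet), a simple path-based similarity, and features keyed by (tag path, term) so that structure and content are combined. All function names and the threshold value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy taxonomy standing in for WordNet: child -> parent (hypernym).
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "car": "vehicle",
    "vehicle": "artifact",
}


def ancestors(term):
    """Return the chain [term, parent, grandparent, ...] up to the root."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_similarity(a, b):
    """1 / (1 + path length via the lowest common ancestor); 0.0 if unrelated."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_b = {t: i for i, t in enumerate(chain_b)}
    for i, t in enumerate(chain_a):
        if t in depth_b:
            return 1.0 / (1 + i + depth_b[t])
    return 0.0


def semantic_neighborhood(term, vocabulary, threshold=0.3):
    """Vocabulary terms whose similarity to `term` reaches the (assumed) threshold."""
    return {v for v in vocabulary
            if v != term and path_similarity(term, v) >= threshold}


def features(xml_text, vocabulary, threshold=0.3):
    """Weights keyed by (tag path, term); neighbours get similarity-weighted mass."""
    feats = Counter()

    def walk(node, path):
        path = f"{path}/{node.tag}"
        for word in (node.text or "").lower().split():
            feats[(path, word)] += 1.0
            for n in semantic_neighborhood(word, vocabulary, threshold):
                feats[(path, n)] += path_similarity(word, n)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return feats


if __name__ == "__main__":
    vocab = {"dog", "wolf", "cat", "car"}
    doc = "<article><title>dog</title><body>car dog</body></article>"
    for key, weight in sorted(features(doc, vocab).items()):
        print(key, round(weight, 3))
```

The resulting weighted (path, term) features could then be passed to any standard classifier. In a fuller setup the toy table would be replaced by a real lexical resource such as WordNet, and the similarity measure by whichever one the paper actually uses.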