Food Ontology Enrichment Using Word Embeddings and Machine Learning Technologies
Melissa Oussaid, Farida Bouarab-Dahmani, N. Cullot
2022 5th International Symposium on Informatics and its Applications (ISIA), published 2022-11-29
DOI: 10.1109/ISIA55826.2022.9993591 (https://doi.org/10.1109/ISIA55826.2022.9993591)
Citation count: 1
Abstract
The growth of the Internet has made large amounts of food data available in many formats. As a result, manually extracting relevant data to populate and enrich a food ontology has become a complex, time-consuming, and resource-intensive process. Automating the knowledge extraction task offers a way to overcome these limitations of the manual process. In this paper, we propose a new approach for automatically extracting new ontological concepts from unstructured data in order to enrich a food ontology. For this purpose, an ontology and a corpus of food data have been built. This data is used to train a Word2Vec model. A similarity measure based on the resulting word embeddings is then computed. New entities are selected as candidates according to their similarity scores and are used to generate new concepts. The results show the effectiveness of our proposal, with a precision of 78%.
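
The candidate-selection step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: cosine similarity as the similarity measure, a fixed threshold of 0.9, and hand-written three-dimensional toy vectors standing in for embeddings that would in practice come from a Word2Vec model trained on the food corpus. The names `cosine` and `select_candidates` are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_candidates(embeddings, ontology_terms, threshold=0.9):
    """Map each out-of-ontology term to its closest ontology concept.

    A term is kept as a candidate only if its best similarity score
    reaches the threshold (an assumed value, not one from the paper).
    """
    candidates = {}
    for term, vec in embeddings.items():
        if term in ontology_terms:
            continue  # already a concept in the ontology
        best_concept, best_score = None, -1.0
        for concept in ontology_terms:
            if concept not in embeddings:
                continue  # concept never appeared in the corpus
            score = cosine(vec, embeddings[concept])
            if score > best_score:
                best_concept, best_score = concept, score
        if best_concept is not None and best_score >= threshold:
            candidates[term] = (best_concept, best_score)
    return candidates


# Toy vectors for illustration only; real vectors would come from Word2Vec.
toy_embeddings = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.85, 0.15, 0.05],
    "cheddar": [0.1, 0.9, 0.1],
    "gouda": [0.12, 0.88, 0.05],
    "keyboard": [0.0, 0.1, 0.95],
}
ontology_terms = {"apple", "cheddar"}

candidates = select_candidates(toy_embeddings, ontology_terms, threshold=0.9)
for term, (concept, score) in sorted(candidates.items()):
    print(f"{term} -> {concept} ({score:.3f})")
```

With these toy vectors, "pear" and "gouda" are proposed as candidates attached to "apple" and "cheddar" respectively, while "keyboard" falls below the threshold and is discarded. How the selected candidates are then turned into new ontology concepts is not detailed in the abstract and is left out of this sketch.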