Title: A Benchmark Study on Knowledge Graphs Enrichment and Pruning Methods in the Presence of Noisy Relationships
Authors: Stefano Faralli, Andrea Lenzi, Paola Velardi
Journal: Journal of Artificial Intelligence Research (Impact Factor 4.5; JCR Q2, Computer Science, Artificial Intelligence)
Publication date: 2023-09-13
Publication type: Journal Article
DOI: 10.1613/jair.1.14494 (https://doi.org/10.1613/jair.1.14494)
Citations: 0
Abstract
In the past few years, knowledge graphs (KGs), as a form of structured human intelligence, have attracted considerable research attention from academia and industry. In this very active field of study, a widely explored problem is link prediction: the task of predicting whether two nodes should be connected, based on node attributes and local or global graph connectivity properties. The state of the art in this area is represented by techniques based on graph embeddings. However, KGs, especially those acquired using automated or partly automated techniques, are often riddled with noise, e.g., wrong relationships, which makes the problem of link deletion as important as that of link prediction. In this paper, we address three main research questions. The first concerns the true effectiveness of different knowledge graph embedding models in the presence of an increasing number of wrong links. The second is to assess whether methods that predict unknown relationships effectively work equally well in recognizing incorrect relations. The third is to verify whether any systems are robust enough to maintain primacy in all experimental conditions. To answer these research questions, we performed a systematic benchmark study whose experimental setting includes ten state-of-the-art models, three common KG datasets with different structural properties, and three downstream tasks: the widely explored tasks of link prediction and triple classification, and the less popular task of link deletion. Comparative studies often yield contradictory results, where the same systems score better or worse depending on the experimental context. In our work, to facilitate the discovery of clear performance patterns and their interpretation, we select and/or aggregate performance data to highlight each specific comparison dimension: dataset complexity, type of task, category of models, and robustness against noise.
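To make the noise setting concrete, the following is a minimal sketch and not the authors' protocol. It assumes that wrong links are generated by replacing the tail of a true triple with a random entity, and that link deletion is evaluated by comparing a set of flagged triples against the injected ones. The function names `inject_noise` and `deletion_precision_recall`, the `noise_ratio` parameter, and the toy triples are all hypothetical.

```python
import random


def inject_noise(triples, noise_ratio, seed=0):
    """Add wrong triples to a KG (assumed scheme: random tail replacement).

    Returns (noisy_graph, injected), where `injected` is the set of wrong
    triples that were added on top of the original ones.
    """
    triples = list(triples)
    rng = random.Random(seed)
    true_set = set(triples)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    n_wrong = int(len(triples) * noise_ratio)

    injected = set()
    attempts = 0
    max_attempts = 100 * max(n_wrong, 1)  # guard against tiny graphs
    while len(injected) < n_wrong and attempts < max_attempts:
        attempts += 1
        h, r, _ = rng.choice(triples)
        candidate = (h, r, rng.choice(entities))
        # Keep only triples that are not already asserted and are not self-loops.
        if candidate not in true_set and candidate[0] != candidate[2]:
            injected.add(candidate)
    return triples + sorted(injected), injected


def deletion_precision_recall(flagged, injected):
    """Score a link-deletion output against the known injected triples."""
    flagged = set(flagged)
    hits = len(flagged & injected)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(injected) if injected else 0.0
    return precision, recall


# Toy graph, for illustration only.
toy_kg = [
    ("rome", "capital_of", "italy"),
    ("paris", "capital_of", "france"),
    ("italy", "part_of", "europe"),
    ("france", "part_of", "europe"),
]
noisy_kg, wrong = inject_noise(toy_kg, noise_ratio=0.5, seed=0)
```

In a benchmark of this kind, a model would be trained on `noisy_kg` at increasing values of `noise_ratio`, and the triples it scores as least plausible would be passed as `flagged` to `deletion_precision_recall`.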
About the journal:
JAIR (ISSN 1076-9757) covers all areas of artificial intelligence (AI), publishing refereed research articles, survey articles, and technical notes. Established in 1993 as one of the first electronic scientific journals, JAIR is indexed by INSPEC, Science Citation Index, and MathSciNet. JAIR reviews papers within approximately three months of submission and publishes accepted articles on the internet immediately upon receiving the final versions. JAIR articles are published for free distribution on the internet by the AI Access Foundation, and for purchase in bound volumes by AAAI Press.