Editorial: Special Issue on Data Linking
A. Ferrara, A. Nikolov, F. Scharffe
Journal of Web Semantics, vol. 23, 2013. DOI: 10.2139/ssrn.3199075
In this special issue of the Journal of Web Semantics, we present two papers that both address one of the most important problems in the field of web data management: data interlinking. The field has attracted significant interest in recent years, as the evolution of web technologies has enabled the emergence of a web of data. The exponentially increasing number of data sources published as linked data, or embedded in web pages through dedicated schemas, requires techniques that can efficiently identify common entities appearing across these sources.
Over the last years, many systems have been developed using a wide range of techniques that exploit various kinds of information about the data sets involved in order to find the most accurate links between them. Vocabularies, existing links, data ranges, ontology alignments, and user input are combined for the best results. The most efficient systems are semi-automated: they require the user to supply a linkage specification that indicates what to link with what and thus guides the tool through the process. For web-scale data interlinking, however, the amount of user input required by a link specification is still too high. The most recent research therefore focuses on minimizing user input.
The two papers in this special issue present research results in this direction, each following its own path toward a similar goal. In the first paper, Active Learning of Expressive Linkage Rules using Genetic Programming, the authors of the interlinking tool Silk present a technique that automates the construction of linkage specifications through active learning and genetic algorithms. The resulting system only requires the user to validate a few links until an acceptable specification is reached. In the second paper, An Automatic Key Discovery Approach for Data Linking, Fatiha SAIS, Nathalie Pernelle, and Danai Symeonidou propose a technique that automates the selection of the predicates to be compared during the interlinking process. The method discovers sets of properties that uniquely identify data resources in a given data set, similar to the notion of keys in relational databases.
Both articles went through a very rigorous selection process and were improved since their first submission. It was an editorial choice to retain only articles meeting a very high standard, which resulted in only two articles being published. We believe this will ensure a stronger field of research. Enjoy reading!
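To make the notion of a key more concrete, the following is a minimal sketch of how sets of properties that uniquely identify resources might be found by brute force. It is not the algorithm of the second paper, whose details are not given here; the function name, the sample records, and the treatment of a missing property as an ordinary value are all assumptions made for illustration.

```python
from itertools import combinations


def discover_minimal_keys(records, properties):
    """Return minimal property sets whose combined values are unique per record.

    Brute-force illustration only: every candidate set is checked, smallest first.
    A missing property is read as None and compared like any other value
    (an assumption of this sketch).
    """
    keys = []
    for size in range(1, len(properties) + 1):
        for candidate in combinations(properties, size):
            # A superset of an already found key is not minimal, so skip it.
            if any(set(key) <= set(candidate) for key in keys):
                continue
            seen = set()
            unique = True
            for record in records:
                value = tuple(record.get(p) for p in candidate)
                if value in seen:
                    unique = False
                    break
                seen.add(value)
            if unique:
                keys.append(candidate)
    return keys


# Hypothetical sample data, invented for this example.
records = [
    {"name": "Alice", "city": "Paris", "email": "alice@example.org"},
    {"name": "Alice", "city": "Milan", "email": "alice.m@example.org"},
    {"name": "Bob", "city": "Paris", "email": "bob@example.org"},
]
keys = discover_minimal_keys(records, ["name", "city", "email"])
print(keys)  # [('email',), ('name', 'city')]
```

In this toy data set neither the name nor the city identifies a record on its own, but the email does, and so does the pair of name and city. An interlinking tool could then restrict its comparisons to such property sets. The exhaustive search grows exponentially with the number of properties, so it is suitable only for small illustrations.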
About the journal:
The Journal of Web Semantics is an interdisciplinary journal that draws on research and applications from various subject areas contributing to the development of a knowledge-intensive and intelligent service Web. These areas include knowledge technologies, ontology, agents, databases, and the semantic grid; disciplines such as information retrieval, language technology, human-computer interaction, and knowledge discovery are of major relevance as well. All aspects of Semantic Web development are covered. The publication of large-scale experiments and their analysis is also encouraged, in order to clearly illustrate scenarios and methods that introduce semantics into existing Web interfaces, contents, and services. The journal emphasizes the publication of papers that combine theories, methods, and experiments from different subject areas in order to deliver innovative semantic methods and applications.