
International Workshop on Open Data: Latest Articles

Publish-time data integration for open data platforms
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500413
Julian Eberius, Patrick Damme, Katrin Braunschweig, Maik Thiele, Wolfgang Lehner
Platforms for publication and collaborative management of data, such as Data.gov or Google Fusion Tables, are a new trend on the web. They manage very large corpora of datasets, but often lack an integrated schema, ontology, or even just common publication standards. This results in inconsistent names for attributes of the same meaning, which constrains the discovery of relationships between datasets as well as their reusability. Existing data integration techniques focus on reuse-time, i.e., they are applied when a user wants to combine a specific set of datasets or integrate them with an existing database. In contrast, this paper investigates a novel method of data integration at publish-time, where the publisher is provided with suggestions on how to integrate the new dataset with the corpus as a whole, without resorting to a manually created mediated schema or ontology for the platform. We propose data-driven algorithms that suggest alternative attribute names for a newly published dataset based on attribute and instance statistics maintained on the corpus. We evaluate the proposed algorithms using real-world corpora based on the Open Data Platform opendata.socrata.com and relational data extracted from Wikipedia. We report on the system's response time and on the results of an extensive crowdsourcing-based evaluation of the quality of the generated alternative attribute names.
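The idea of suggesting attribute names from instance statistics can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a toy corpus held in memory, uses plain Jaccard overlap of instance values as the only statistic, and the helper names (`build_corpus_stats`, `suggest_names`) and the sample data are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap between two sets of instance values."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_corpus_stats(datasets):
    """Collect, per attribute name, every instance value seen in the corpus."""
    stats = defaultdict(set)
    for dataset in datasets:
        for name, values in dataset.items():
            stats[name.lower()].update(values)
    return stats


def suggest_names(new_values, stats, top_n=3, min_score=0.1):
    """Rank existing attribute names by value overlap with a new column."""
    new_set = set(new_values)
    scored = [(jaccard(new_set, values), name) for name, values in stats.items()]
    scored = [(score, name) for score, name in scored if score >= min_score]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [name for _, name in scored[:top_n]]


# Hypothetical corpus: each dataset maps attribute names to column values.
corpus = [
    {"country": ["France", "Greece", "Germany"], "population": ["67", "10", "83"]},
    {"nation": ["France", "Italy"], "capital": ["Paris", "Rome"]},
]
stats = build_corpus_stats(corpus)

# A publisher uploads a column with an uninformative header.
suggestions = suggest_names(["Greece", "Germany", "Spain"], stats)
print(suggestions)
```

A real platform would presumably maintain such statistics incrementally and combine several signals; the sketch only shows the basic lookup at publish time.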
Citations: 6
On the enrichment of a RDF repository of city points of interest based on social data
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500411
Zied Sellami, Gianluca Quercini, C. Reynaud
Points of interest (POIs) in a city are specific locations that present some significance to people; examples include restaurants, museums, hotels, theatres and landmarks, just to name a few. Due to their role in our social and economic life, POIs have been increasingly gaining the attention of location-based applications, such as online maps and social networking sites. While it is relatively easy to find basic information about a POI on the Web, such as its geographic location, telephone number and opening hours, it is more challenging to gain deeper knowledge of what other people say about it. What if a person wants to know all the restaurants in Paris that serve good seafood and offer friendly service? Typically, the answer to this question has to be sought on websites that let people leave comments and opinions on POIs, a time-consuming manual task that few are willing to do. This search would be better supported by search engines if information mined from opinions were available in a structured form, such as RDF. In this position paper, we describe a general approach to enrich an existing RDF repository about POIs with data obtained from social networking sites.
Citations: 0
The 'intellectual network': linking writers in the data.bnf.fr project
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500418
Romain Wenz
This paper describes one of the specific functionalities of the data.bnf.fr library discovery service: the use of semantic Web technologies to create Web pages around "named entities" from the authority files.
Citations: 0
Publishing census as linked open data: a case study
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500412
Irene Petrou, George Papastefanatos, Theodore Dalamagas
In this paper we present a case study on publishing statistical data as Linked Open Data. Statistical or fact-based data are maintained by statistical agencies and organizations, harvested via surveys or aggregated from other sources, and mainly concern observations of socioeconomic indicators. In this case study, we present the publication as LOD of the preliminary results of Greece's resident population census, conducted in 2011. We employed the Data Cube vocabulary and the Google Refine tool for modelling and publishing the census results.
Citations: 15
WebSmatch: a tool for open data
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500420
E. Castanier, Rémi Coletta, P. Valduriez, Christian Frisch
Working with open data sources can yield high value information but raises major problems in terms of metadata extraction, data source integration and visualization. In this paper we describe a demonstration of WebSmatch, a flexible environment for Web data integration, based on a real, end-to-end data integration scenario over public data from Data Publica. The demonstration focuses on poorly structured input data sources (XLS files).
Citations: 3
Linked open GeoData management in the cloud
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500414
K. Kritikos, Yannis Rousakis, D. Kotzinos
The need to better integrate and link various isolated data sources on the web has been widely recognized and is tackled by the Linked Open Data (LOD) initiative. One of the problems to address is the issue of publishing and subsequently exploiting the data as LOD, due to reasons of data size and performance of the respective queries and to the publication complexity. This work addresses the size and performance issues by adapting the cloud as a hosting platform for LOD publication services so as to exploit its scalability and elasticity capabilities. The publication complexity issue is addressed by proposing a Linked Open Data-as-a-Service approach offering an integrated service based API for (semi)automatic publication of relational data as LOD and subsequent querying and updating capabilities.
Citations: 8
Visualizing a large collection of open datasets: an experiment with proximity graphs
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500417
Tianyang Liu, D. Ahmed, F. Bouali, G. Venturini
This paper addresses the problem of creating an interactive visual map for a large collection of open datasets. We first describe how to define a representation space for such data, using text mining techniques to create features. Then, with a similarity measure between open datasets, we use the K-nearest neighbors method to build a proximity graph between datasets. We use a force-directed layout method to visualize the graph (Tulip software). We present results on a collection of 300,000 datasets from the French open data web site, in which the display of the graph is limited to 150,000 datasets. We study the discovered clusters and show how they can be used to browse this large collection.
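The pipeline outlined in this abstract (text features, a similarity measure, then a K-nearest-neighbors proximity graph) can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses raw bag-of-words counts and cosine similarity, which may differ from the features and measure used in the paper, and the dataset names and descriptions are invented. The layout step (Tulip) is not covered.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a dataset description."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_graph(descriptions, k=2):
    """Map each dataset to its k most similar datasets (directed edges)."""
    vectors = {name: vectorize(text) for name, text in descriptions.items()}
    graph = {}
    for name, vec in vectors.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in vectors.items()
            if other != name
        ]
        # Highest similarity first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        graph[name] = [other for _, other in scored[:k]]
    return graph


# Hypothetical dataset descriptions.
descriptions = {
    "bus_stops": "public transport bus stops city",
    "tram_lines": "public transport tram lines city",
    "school_list": "primary schools education city",
    "budget_2012": "municipal budget spending",
}
graph = knn_graph(descriptions, k=1)
print(graph["bus_stops"], graph["tram_lines"])
```

This brute-force version compares every pair of datasets, which is quadratic in the collection size; a collection of hundreds of thousands of datasets would likely need an approximate or indexed neighbor search.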
Citations: 2
Agronomic taxon
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500415
C. Roussey, J. Chanet, V. Cellier, Fabien Amarger
In this paper, we describe the development of the first ontology module for observation of pest attacks in crop production. We applied the NeOn methodology and more particularly the ontology engineering method based on Ontology Design Pattern.
Citations: 16
Discovering keys in RDF/OWL dataset with KD2R
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500419
Danai Symeonidou, N. Pernelle, Fatiha Saïs
KD2R allows the automatic discovery of composite key constraints in RDF data sources that conform to a given ontology. We consider data sources for which the Unique Name Assumption is fulfilled. KD2R allows this discovery without having to scan all the data. Indeed, the proposed system looks for maximal non-keys and derives minimal keys from this set of non-keys. KD2R has been tested on several datasets available on the web of data and it has obtained promising results when the discovered keys are used to link data. In the demo, we will demonstrate the functionality of our tool and we will show on several datasets that the keys can be used in a data linking tool.
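The step of deriving minimal keys from maximal non-keys can be made concrete with a small sketch. It rests on one observation: a set of properties is a key exactly when it is not contained in any maximal non-key. The code below is a brute-force illustration of that derivation, not the KD2R implementation; it assumes the list of maximal non-keys is complete, and the property names are hypothetical.

```python
from itertools import combinations


def minimal_keys(properties, maximal_non_keys):
    """Enumerate minimal keys given the complete set of maximal non-keys.

    A candidate is a key if it is not a subset of any maximal non-key.
    Candidates are tried by increasing size, so any candidate containing
    an already found key is skipped as non-minimal.
    """
    props = sorted(properties)
    non_keys = [frozenset(nk) for nk in maximal_non_keys]
    keys = []
    for size in range(1, len(props) + 1):
        for combo in combinations(props, size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if all(not candidate <= nk for nk in non_keys):
                keys.append(candidate)
    return keys


# Hypothetical properties of a Person class and its maximal non-keys.
properties = {"name", "birth_date", "city", "phone"}
maximal_non_keys = [{"name", "city"}, {"birth_date", "city"}]

keys = minimal_keys(properties, maximal_non_keys)
print(sorted(sorted(key) for key in keys))
```

The enumeration is exponential in the number of properties, so it only suits small schemas; it is meant to show the relationship between non-keys and keys, not an efficient search.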
Citations: 0
N2R-part: identity link discovery using partially aligned ontologies
Pub Date : 2013-06-03 DOI: 10.1145/2500410.2500416
N. Pernelle, Fatiha Saïs, B. Safar, Maria Koutraki, Tushar Ghosh
Thanks to the Linked Open Data initiative, more and more RDF datasets are published on the Web. One active research field concerns the problem of finding links between entities. In this paper we focus on ontology-based data linking approaches, which use linking rules based on the available schemas (or ontologies). Such systems assume that a set of mappings between ontology elements is available beforehand. However, this set of mappings could be incomplete. We propose a data linking approach called N2R-Part. It computes similarity scores by exploiting both the properties for which a mapping exists and those for which there is none. We illustrate through an example how exploiting the unmapped properties improves the data linking results.
Citations: 8