Using Element Clustering to Increase the Efficiency of XML Schema Matching

22nd International Conference on Data Engineering Workshops (ICDEW'06). Pub Date: 2006-04-03. DOI: 10.1109/ICDEW.2006.159

M. Smiljanic, M. V. Keulen, W. Jonker


Citations: 33

Abstract

Schema matching attempts to discover semantic mappings between the elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms prohibitively inefficient for large schemas. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas, reduces the overall matching load, and makes it possible to trade off efficiency against effectiveness. The technique can be used in addition to other optimization techniques. We describe the technique, validate the performance of one implementation of it, and outline directions for future research.
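The general idea of inserting clustering as an intermediate step can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify the clustering criterion or the similarity heuristics, so the first-letter `cluster_key`, the `difflib`-based `name_similarity`, the 0.6 threshold, and the sample element names are all assumptions chosen only to make the effect visible.

```python
from difflib import SequenceMatcher
from itertools import product


def name_similarity(a: str, b: str) -> float:
    """Assumed name heuristic: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_match(source, target, threshold=0.6):
    """Cross-compare every source element with every target element."""
    comparisons = 0
    mappings = []
    for s, t in product(source, target):
        comparisons += 1
        if name_similarity(s, t) >= threshold:
            mappings.append((s, t))
    return mappings, comparisons


def cluster_key(name: str) -> str:
    """Hypothetical, deliberately crude clustering criterion: first letter."""
    return name[0].lower()


def clustered_match(source, target, threshold=0.6):
    """Partition the target elements first, then compare each source element
    only with the elements of the cluster it falls into."""
    clusters = {}
    for t in target:
        clusters.setdefault(cluster_key(t), []).append(t)

    comparisons = 0
    mappings = []
    for s in source:
        for t in clusters.get(cluster_key(s), []):
            comparisons += 1
            if name_similarity(s, t) >= threshold:
                mappings.append((s, t))
    return mappings, comparisons


# Made-up element names, for illustration only.
source = ["author", "title", "price"]
target = ["authorName", "bookTitle", "cost", "publisher", "pages", "address"]

naive_mappings, naive_cost = naive_match(source, target)
clustered_mappings, clustered_cost = clustered_match(source, target)

print("naive:    ", naive_cost, "comparisons,", naive_mappings)
print("clustered:", clustered_cost, "comparisons,", clustered_mappings)
```

On this toy input the naive matcher performs 18 comparisons and the clustered one performs 4. The clustered run still finds the `author`/`authorName` pair but never compares `title` with `bookTitle`, because the two names land in different clusters. That lost mapping is a small instance of the efficiency-versus-effectiveness trade-off the abstract mentions; a less crude clustering criterion would shift that balance.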
