The service-oriented architecture (SOA) proposes a new software development paradigm based on loosely coupled software components deployed and located across the web. This key property of SOA allows building workflows that cross organizational boundaries. One of the major challenges in this field of research is the automatic discovery and composition of services. In this paper we propose a novel ontology-based discovery and composition approach. In a first step, standard web service descriptions based on WSDL are semantically enriched by pre-processing steps that draw on general-purpose as well as domain-specific ontologies. The resulting semantic-aware service profiles are stored in a registry component. In a second step, we describe the matchmaking algorithms underlying the IRIS discovery component. Finally, we evaluate the performance of our algorithms analytically and measure their quality in an experimental setup.
"Automatic Discovery and Composition of Services with IRIS", U. Radetzki and A. Cremers. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.34
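Matchmaking between a service request and semantically annotated service profiles is commonly ranked by degrees of match over a concept hierarchy. The following is a minimal sketch of that idea; the tiny taxonomy and the exact/plug-in/subsumes scheme are illustrative assumptions, not the IRIS algorithms themselves.

```python
# Illustrative degree-of-match ranking over a tiny concept hierarchy.
# Taxonomy and concept names are invented for this sketch.

TAXONOMY = {  # child -> parent
    "Sedan": "Car",
    "Car": "Vehicle",
    "Vehicle": "Thing",
}

def ancestors(concept):
    """Return the chain of ancestors of a concept, nearest first."""
    chain = []
    while concept in TAXONOMY:
        concept = TAXONOMY[concept]
        chain.append(concept)
    return chain

def degree_of_match(requested, offered):
    """Rank how well an offered output concept matches a requested one."""
    if requested == offered:
        return "exact"
    if requested in ancestors(offered):
        return "plug-in"    # offer is more specific than the request
    if offered in ancestors(requested):
        return "subsumes"   # offer is more general than the request
    return "fail"

print(degree_of_match("Car", "Car"))        # exact
print(degree_of_match("Vehicle", "Sedan"))  # plug-in
print(degree_of_match("Sedan", "Vehicle"))  # subsumes
```

A discovery component can then sort candidate services by this degree, preferring exact over plug-in over subsumes matches.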
In this paper, we introduce a new privacy protection property called p-sensitive k-anonymity. The existing k-anonymity property protects against identity disclosure, but it fails to protect against attribute disclosure. The newly introduced privacy model avoids this shortcoming. Two necessary conditions for achieving the p-sensitive k-anonymity property are presented and used in developing algorithms that create masked microdata satisfying p-sensitive k-anonymity through generalization and suppression.
"Privacy Protection: p-Sensitive k-Anonymity Property", T. Truta and B. Vinay. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.116
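The property can be stated operationally: every group of records sharing the same quasi-identifier values must contain at least k records and at least p distinct sensitive values. A minimal check of that condition, with invented sample data:

```python
# Minimal check of p-sensitive k-anonymity: each quasi-identifier group
# needs >= k records and >= p distinct sensitive values. Sample data invented.

from collections import defaultdict

def is_p_sensitive_k_anonymous(records, qi_attrs, sensitive_attr, k, p):
    groups = defaultdict(list)
    for r in records:
        key = tuple(r[a] for a in qi_attrs)
        groups[key].append(r[sensitive_attr])
    return all(len(g) >= k and len(set(g)) >= p for g in groups.values())

data = [
    {"zip": "479**", "age": "2*", "disease": "flu"},
    {"zip": "479**", "age": "2*", "disease": "cold"},
    {"zip": "479**", "age": "2*", "disease": "flu"},
]

print(is_p_sensitive_k_anonymous(data, ["zip", "age"], "disease", k=3, p=2))  # True
print(is_p_sensitive_k_anonymous(data, ["zip", "age"], "disease", k=3, p=3))  # False
```

The second call fails because the single group offers only two distinct disease values, illustrating how p strengthens plain k-anonymity against attribute disclosure.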
Document clustering has been used as a core technique for managing vast amounts of data and providing needed information. In on-line environments, new information generally attracts more interest than old information. Traditional clustering focuses on grouping similar documents into clusters, treating each document with equal weight. We propose a novelty-based incremental clustering method for on-line documents that is biased toward recent documents. The notion of 'novelty' is incorporated into a similarity function, and a clustering method, a variant of the K-means method, is proposed. We examine the efficiency and behavior of the method through experiments.
"Novelty-based Incremental Document Clustering for On-line Documents", Sophoin Khy, Y. Ishikawa, and H. Kitagawa. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.100
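One way to bias a similarity function toward recent documents is to multiply an ordinary content similarity by a time-decay factor. The sketch below uses an exponential decay; the decay form and rate are illustrative assumptions, not the paper's exact novelty function.

```python
# Sketch of a novelty-weighted similarity: a document's score decays
# exponentially with its age, so recent documents dominate cluster assignment.

import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    terms = set(u) | set(v)
    dot = sum(u.get(t, 0.0) * v.get(t, 0.0) for t in terms)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def novelty_similarity(doc_vec, doc_time, centroid_vec, now, decay=0.1):
    """Down-weight content similarity by the document's age (now - doc_time)."""
    return math.exp(-decay * (now - doc_time)) * cosine(doc_vec, centroid_vec)

fresh = novelty_similarity({"stream": 1.0}, doc_time=9, centroid_vec={"stream": 1.0}, now=10)
stale = novelty_similarity({"stream": 1.0}, doc_time=0, centroid_vec={"stream": 1.0}, now=10)
print(fresh > stale)  # True: the newer of two identical documents scores higher
```

An incremental K-means variant can then assign each arriving document to the centroid maximizing this decayed similarity.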
Current applications are often forced to filter the richness of data sources in order to reduce the information noise the user is exposed to. We consider this a critical concern of applications that should be factored out at the data management level. The Context-ADDICT system, leveraging ontology-based context and domain models, personalizes the data made available to the user through "context-aware tailoring". In this paper we present a formal approach to defining the relationship between context (represented by an appropriate context model) and application domain (modeled by a domain ontology). Once this relationship has been defined, we can work out the boundary of the portion of the domain relevant to a user in a certain context. We also sketch the implementation of a visual tool supporting the application designer in this modeling task.
"Ontology-Based Information Tailoring", C. Curino, E. Quintarelli, and L. Tanca. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.104
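The idea of cutting out the context-relevant portion of a domain ontology can be sketched as a neighborhood search: the context selects seed concepts, and the visible portion is everything within a few hops. Concepts, edges, and radius below are invented for illustration; Context-ADDICT's actual context and domain models are richer.

```python
# Sketch of context-aware tailoring: take the sub-ontology within `radius`
# hops of the concepts a context selects. Ontology graph is invented.

from collections import deque

EDGES = {  # undirected domain-ontology adjacency
    "restaurant": ["menu", "location"],
    "menu": ["restaurant", "dish"],
    "dish": ["menu"],
    "location": ["restaurant", "city"],
    "city": ["location"],
}

def relevant_portion(seed_concepts, radius=1):
    """Concepts within `radius` hops of any concept the context selects."""
    seen = set(seed_concepts)
    frontier = deque((c, 0) for c in seed_concepts)
    while frontier:
        concept, d = frontier.popleft()
        if d == radius:
            continue
        for nxt in EDGES.get(concept, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

tourist_context = {"restaurant"}
print(sorted(relevant_portion(tourist_context, radius=1)))
# ['location', 'menu', 'restaurant']
```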
The window query model is widely used in data stream management systems, where the focus of a continuous query is limited to a set of the most recent tuples. In this dissertation, we show that an interesting and important class of continuous queries cannot be answered by the existing sliding-window query models. Thus, we introduce a new model for continuous queries, termed the predicate-window query model, which limits the focus of a continuous query to the stream tuples that qualify a certain predicate. Predicate windows are characterized by the following: (1) the window predicate can be defined over any attribute in the stream tuple (ordered or unordered); (2) stream tuples qualify and disqualify the window predicate in an out-of-order manner. The goal of this dissertation is to develop an efficient framework to realize predicate windows inside data stream management systems. The predicate-window query framework enables the system to efficiently support a wide variety of streaming applications through an expressive query language and efficient query evaluation mechanisms (i.e., query execution and query optimization). As a test bed for our research, the predicate-window framework is being developed inside Nile, a prototype data stream management system developed at Purdue University.
"Supporting Predicate-Window Queries in Data Stream Management Systems", T. Ghanem. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.140
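The behavioral difference from a sliding window can be sketched in a few lines: a predicate window holds exactly the tuples that currently satisfy the predicate, so a later update to a tuple may evict it at any time. The tuple shape and predicate below are invented for illustration.

```python
# Sketch of predicate-window maintenance: membership is driven by the
# predicate, not by recency, so tuples enter and leave out of order.

def apply_update(window, tup, predicate):
    """Insert, retain, or evict `tup` (keyed by tup['id']) based on the predicate."""
    if predicate(tup):
        window[tup["id"]] = tup      # tuple qualifies: enters/updates the window
    else:
        window.pop(tup["id"], None)  # tuple disqualifies: leaves the window

window = {}
hot = lambda t: t["temp"] > 30       # predicate over an unordered attribute

apply_update(window, {"id": 1, "temp": 35}, hot)
apply_update(window, {"id": 2, "temp": 20}, hot)
apply_update(window, {"id": 1, "temp": 25}, hot)  # same tuple later disqualifies

print(sorted(window))  # [] -- tuple 1 was evicted when its update failed the predicate
```

A sliding window, by contrast, would still retain tuple 1 until enough newer tuples arrived to push it out.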
Schema matching attempts to discover semantic mappings between the elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms for large schemas prohibitively inefficient. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas, reduces the overall matching load, and creates the possibility of trading efficiency against effectiveness. The technique can be used alongside other optimization techniques. In the paper we describe the technique, validate the performance of one implementation of it, and open directions for future research.
"Using Element Clustering to Increase the Efficiency of XML Schema Matching", M. Smiljanic, M. V. Keulen, and W. Jonker. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.159
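The efficiency gain of the intermediate clustering step comes from comparing only elements that land in corresponding clusters, shrinking the candidate-pair count from |S1| x |S2|. A toy sketch, where the cheap clustering key (first letter of the element name) and the schemas are invented assumptions standing in for a real clustering criterion:

```python
# Sketch of clustering as a pre-step to schema matching: partition elements
# by a cheap key, then run the expensive matcher only within matching clusters.

from collections import defaultdict

def cluster(elements, key=lambda name: name[0].lower()):
    buckets = defaultdict(list)
    for e in elements:
        buckets[key(e)].append(e)
    return buckets

def candidate_pairs(schema1, schema2):
    c1, c2 = cluster(schema1), cluster(schema2)
    return [(a, b) for k in c1 if k in c2 for a in c1[k] for b in c2[k]]

s1 = ["author", "address", "title"]
s2 = ["authorName", "addr", "price"]
pairs = candidate_pairs(s1, s2)
print(len(pairs), "of", len(s1) * len(s2), "comparisons")  # 4 of 9 comparisons
```

The trade-off the paper mentions is visible here: "title"/"price" is never compared, so a coarse key can miss true matches in exchange for fewer comparisons.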
Schema matching has historically been difficult to automate. Most previous studies have tried to find matches by exploiting information about schemas and data instances. However, schemas and data instances cannot fully capture the semantics of the databases, so some attributes may be matched to improper counterparts. To address this problem, we propose a schema matching framework that supports identification of correct matches by extracting semantics from ontologies. In an ontology, two concepts share similar semantics through their common parent, and that parent can further be used to quantify the similarity between them. By combining this idea with effective contemporary mapping algorithms, we perform ontology-driven semantic matching across multiple data sources. Experimental results indicate that the proposed method identifies more accurate matches than previous approaches.
"Ontology-Driven Semantic Matches between Database Schemas", Sangsoo Sung and D. McLeod. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.105
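Quantifying similarity through a common parent can be done in the spirit of the classic Wu-Palmer measure: concepts whose lowest shared ancestor sits deeper in the ontology score higher. The small taxonomy below is invented for illustration and is not the paper's ontology.

```python
# Sketch of common-parent similarity: deeper shared ancestors yield higher
# scores (Wu-Palmer style). Taxonomy is an invented example.

TAXONOMY = {"car": "vehicle", "truck": "vehicle", "vehicle": "thing",
            "apple": "fruit", "fruit": "thing"}

def path_to_root(c):
    path = [c]
    while c in TAXONOMY:
        c = TAXONOMY[c]
        path.append(c)
    return path

def depth(c):
    return len(path_to_root(c)) - 1

def lca(a, b):
    """Lowest common ancestor of two concepts."""
    ancestors_a = set(path_to_root(a))
    for c in path_to_root(b):
        if c in ancestors_a:
            return c
    return None

def wu_palmer(a, b):
    parent = lca(a, b)
    if parent is None:
        return 0.0
    return 2.0 * depth(parent) / (depth(a) + depth(b))

print(wu_palmer("car", "truck"))  # 0.5: shared parent "vehicle"
print(wu_palmer("car", "apple"))  # 0.0: shared ancestor is only the root
```

Attributes whose concepts score above a threshold under such a measure would be proposed as matches.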
Web spamming refers to actions intended to mislead search engines into ranking certain pages higher than they deserve. Recently, the amount of web spam has increased dramatically, leading to a degradation of search results. One of the most effective spamming techniques is link spamming. This is done by setting up an interconnected structure of pages for deceiving link-based ranking methods, such as PageRank. In this paper, we analyze distributions of link spam in our archive of Japanese web pages using link analysis techniques.
"Identifying Web Spam by Densely Connected Sites and its Statistics in a Japanese Web Snapshot", Hiroshi Ono, Masashi Toyoda, and M. Kitsuregawa. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.64
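A simple heuristic behind spotting such interconnected structures is to measure how self-referential a group of sites is: link farms built to inflate PageRank tend to point almost exclusively at each other. The graph, group, and threshold below are invented examples, not the paper's actual detection method.

```python
# Sketch of a link-spam heuristic: a group whose outgoing links mostly stay
# inside the group looks like a densely connected (spam-like) structure.

def internal_link_ratio(group, links):
    """Fraction of the group's outgoing links that stay inside the group."""
    out = [(s, t) for (s, t) in links if s in group]
    if not out:
        return 0.0
    return sum(1 for (s, t) in out if t in group) / len(out)

links = [("a", "b"), ("b", "c"), ("c", "a"),   # a tight spam-like triangle
         ("a", "news"), ("news", "blog")]

farm = {"a", "b", "c"}
print(internal_link_ratio(farm, links) > 0.7)  # True: mostly self-referential
```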
Despite all the efforts to build a Semantic Web, in which each machine can understand and interpret the data it processes, information is usually still stored in ordinary relational databases. Semantic Web applications that need access to such semantically unexploited data have to create their own manual mappings from relational databases to the Semantic Web. In this paper we analyze whether the combination of Relational.OWL, as a Semantic Web representation of relational databases, and a semantic query language like SPARQL could be an alternative. The benefits of such an approach are clear, since it enables Semantic Web applications to access and query data actually stored in relational databases using their own built-in functionality.
"Bringing Relational Data into the Semantic Web using SPARQL and Relational.OWL", C. Laborda and Stefan Conrad. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.37
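The core move is to expose relational rows as RDF-style triples and then answer triple-pattern queries over them. The sketch below illustrates that shape in plain Python; the predicate names are invented stand-ins, and a real deployment would use the Relational.OWL vocabulary with an actual SPARQL engine rather than this toy matcher.

```python
# Sketch: rows become (subject, predicate, object) triples; a single
# triple pattern with None as a wildcard stands in for a SPARQL query.

def rows_to_triples(table, rows, key):
    triples = []
    for row in rows:
        subject = f"{table}/{row[key]}"
        for column, value in row.items():
            triples.append((subject, f"{table}#{column}", value))
    return triples

def match(triples, s=None, p=None, o=None):
    """Answer a single triple pattern; None acts like a SPARQL variable."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
triples = rows_to_triples("person", rows, key="id")
hits = match(triples, p="person#name", o="Alice")
print(hits)  # [('person/1', 'person#name', 'Alice')]
```

The corresponding SPARQL would be roughly `SELECT ?s WHERE { ?s person:name "Alice" }` against the triple view of the table.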
Recent research on data streaming algorithms has provided powerful tools to efficiently monitor various characteristics of traffic passing through a single network link or node. However, it is often desirable to perform data streaming analysis on traffic aggregated over hundreds or even thousands of links/nodes, which provides network operators with a holistic view of network operation. Shipping raw traffic data to a centralized location (i.e., "raw aggregation") for streaming analysis is clearly not feasible for a large network. In this paper, we propose a set of novel distributed data streaming algorithms that allow scalable and efficient monitoring of aggregated traffic without the need for raw aggregation. Our algorithms target the specific network monitoring problem of finding common content in the Internet traffic traversing several nodes/links, which has applications in network-wide intrusion detection, early warning for fast-propagating worms, and detection of hot objects and spam traffic. We evaluate our algorithms through extensive simulations and experiments on traffic traces collected from a tier-1 ISP. The experimental results demonstrate that our algorithms can effectively detect common content in traffic traversing a large network.
"Scalable and Efficient Data Streaming Algorithms for Detecting Common Content in Internet Traffic", Minho Sung, Abhishek Kumar, Erran L. Li, Jia Wang, and Jun Xu. 22nd International Conference on Data Engineering Workshops (ICDEW'06), April 2006. doi:10.1109/ICDEW.2006.130
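The distributed setting can be sketched simply: each monitor fingerprints payload chunks locally and ships only compact digests, and a coordinator counts how many nodes saw each digest. The hashing scheme, chunk size, and threshold below are illustrative; the paper's algorithms use more refined streaming sketches than this exact-counting toy.

```python
# Sketch of distributed common-content detection: per-node chunk digests,
# intersected at a coordinator, surface payloads seen at many nodes.

import hashlib
from collections import Counter

def fingerprints(payloads, chunk=8):
    """Digest fixed-size chunks of each payload into a compact set."""
    digests = set()
    for p in payloads:
        for i in range(0, len(p) - chunk + 1, chunk):
            digests.add(hashlib.sha1(p[i:i + chunk].encode()).hexdigest()[:8])
    return digests

def common_content(per_node_digests, min_nodes=2):
    """Digests observed at >= min_nodes monitors."""
    counts = Counter(d for node in per_node_digests for d in node)
    return {d for d, n in counts.items() if n >= min_nodes}

worm = "GET /exploit.bin"        # the same payload crossing several links
node1 = fingerprints([worm, "normal mail traffic"])
node2 = fingerprints([worm, "unrelated web page"])
node3 = fingerprints(["plain old browsing"])

suspicious = common_content([node1, node2, node3])
print(len(suspicious) > 0)  # True: the repeated payload surfaces
```

Only the digest sets cross the network, which is what makes the scheme feasible where raw aggregation is not.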