Proceedings. 20th International Conference on Data Engineering: Latest Articles

ContextMetrics™: semantic and syntactic interoperability in cross-border trading systems

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-08-09 DOI: 10.1109/ICDE.2004.1320053

Chito Jovellanos

We describe a method and system for quantifying the variances in the semantics and syntax of electronic transactions exchanged between business counterparties. ContextMetrics™ enables (a) dynamic transformations of outbound and inbound transactions needed to effect 'straight-through-processing' (STP); (b) unbiased assessments of counterparty systems' capabilities to support STP; and (c) modeling of operational risks and financial exposures stemming from an enterprise's transactional systems.

Citations: 1

Stream query processing for healthcare bio-sensor applications

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320048

Chung-Min Chen, H. Agrawal, M. Cochinwala, D. Rosenbluth

The need for a data stream management system (DSMS), with the capability of querying continuous data streams, has been well understood by the database research community. We provide an overview of a DSMS prototype called T2. T2 inherits some of the concepts of an early prototype, Tribeca [M. Sullivan et al. (1998)], also developed at Telcordia, but has a completely new design and implementation in Java with an SQL-like query language. Our goal is to build a framework that provides a programming infrastructure as well as useful operators to support stream processing in different applications. Our first target application is healthcare biosensor networks, where we applied T2 to monitoring and analyzing electrocardiogram (ECG) data streams arriving via wireless networks from mobile subjects wearing ECG sensors. Monitoring remote patients via wireless sensors not only provides convenience and safety assurance to the patients, but also reduces health care costs in many respects.

Citations: 40

Approximate temporal aggregation

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1319996

Yufei Tao, D. Papadias, C. Faloutsos

Temporal aggregate queries retrieve summarized information about records with time-evolving attributes. Existing approaches have at least one of the following shortcomings: (i) they incur large space requirements, (ii) they have high processing cost, and (iii) they are based on complex structures, which are not available in commercial systems. We solve these problems using approximation techniques with bounded error. We propose two methods: the first is based on multiversion B-trees and has logarithmic worst-case query cost, while the second uses off-the-shelf B- and R-trees and achieves the same performance in the expected case. We experimentally demonstrate that the proposed methods consume an order of magnitude less space than their competitors and are significantly faster, even when the permissible error bound is very small.

Citations: 31

Data mining for intrusion detection: techniques, applications and systems

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320103

J. Pei, S. Upadhyaya, F. Farooq, V. Govindaraju

An intrusion is defined as any set of actions that compromise the integrity, confidentiality or availability of a resource. Intrusion detection is an important task for information infrastructure security. One major challenge in intrusion detection is that we have to identify camouflaged intrusions among a huge volume of normal communication activities. Data mining aims to identify valid, novel, potentially useful, and ultimately understandable patterns in massive data. It is demanding to apply data mining techniques to detect various intrusions. In the last several years, some exciting and important advances have been made in intrusion detection using data mining techniques. Research results have been published and some prototype systems have been established. Inspired by the huge demands from applications, the interactions and collaborations between the security and data mining communities have been boosted substantially. This seminar will present an interdisciplinary survey of data mining techniques for intrusion detection so that researchers from the computer security and data mining communities can share experiences and learn from each other. Some data mining based intrusion detection systems will also be reviewed briefly. Moreover, research challenges and problems will be discussed so that future collaborations may be stimulated. For data mining/database researchers and practitioners, the seminar will provide background knowledge and opportunities for applying data mining techniques to intrusion detection and computer security. For computer security researchers and practitioners, it provides knowledge on how data mining can benefit and enhance computer security. We will try to understand and appreciate the following technical issues.

Citations: 45

RACCOON: a peer-based system for data integration and sharing

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320081

Chen Li, Jia Li, Qi Zhong

Recent database applications see the emerging need to support data integration in distributed, peer-to-peer environments, in which autonomous peers (sources) connected by a network are willing to exchange data and services with each other. To address related research challenges, we are developing a system called "RACCOON", which allows different sources to integrate and share their data. We use an application to show several important features of the RACCOON system. The system also suggests semantic mappings for the user to choose. We show the two different querying modes, particularly how a query is expanded using the semantic mappings in the extended querying mode to compute as many answers to the query as possible.

Citations: 7

wmdb.*: rights protection for numeric relational data

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320091

R. Sion, M. Atallah, Sunil Prabhakar

We introduce wmdb.*, a solution for numeric relational data rights protection through watermarking. Rights protection for relational data is important in areas where sensitive, valuable content is to be outsourced. A good example is a data mining application, where data is sold in pieces to parties specialized in mining it. We show how various higher-level semantic constraints, such as classification preservation and maximum absolute change bounds, are naturally handled and how well the watermark survives random alteration attacks.

Citations: 6

Improved file synchronization techniques for maintaining large replicated collections over slow networks

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1319992

Torsten Suel, P. Noel, Dimitre Trendafilov

We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of important applications, such as synchronization of data between accounts or devices, content distribution and Web caching networks, Web site mirroring, storage networks, and large scale Web search and mining. At the core of the problem lies the following challenge, called the file synchronization problem: given two versions of a file on different machines, say an outdated and a current one, how can we update the outdated version with minimum communication cost, by exploiting the significant similarity between the versions? While a popular open source tool for this problem called rsync is used in hundreds of thousands of installations, there have been only very few attempts to improve upon this tool in practice. We propose a framework for remote file synchronization and describe several new techniques that result in significant bandwidth savings. Our focus is on applications where very large collections have to be maintained over slow connections. We show that a prototype implementation of our framework and techniques achieves significant improvements over rsync. As an example application, we focus on the efficient synchronization of very large Web page collections for the purpose of search, mining, and content distribution.
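
The file synchronization problem stated in this abstract can be illustrated with a simplified sketch of the rsync-style baseline it refers to: the receiver hashes fixed-size blocks of its outdated file, and the sender slides a rolling checksum over the current file to find blocks the receiver already has. This is not the authors' improved framework. The function names, the 16-bit modulus, the use of SHA-256 as the strong hash, and the delta format (a list mixing block numbers and literal byte strings) are all assumptions made for illustration.

```python
import hashlib

M = 1 << 16  # modulus for the weak checksum (assumed for this sketch)


def weak_checksum(block: bytes):
    """Adler-style weak checksum (a, b) of one block."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide the window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - size * out_byte + a) % M
    return a, b


def make_signature(old: bytes, size: int):
    """Receiver side: map weak checksum -> [(strong hash, block number)]."""
    sig = {}
    for start in range(0, len(old) - size + 1, size):
        block = old[start:start + size]
        strong = hashlib.sha256(block).digest()
        sig.setdefault(weak_checksum(block), []).append((strong, start // size))
    return sig


def make_delta(new: bytes, sig, size: int):
    """Sender side: encode `new` as block numbers (int) and literals (bytes)."""
    delta, literal = [], bytearray()
    pos, weak = 0, None
    while pos + size <= len(new):
        if weak is None:
            weak = weak_checksum(new[pos:pos + size])
        match = None
        candidates = sig.get(weak)
        if candidates:
            # Confirm a weak-checksum hit with the strong hash.
            strong = hashlib.sha256(new[pos:pos + size]).digest()
            for cand_strong, block_no in candidates:
                if cand_strong == strong:
                    match = block_no
                    break
        if match is not None:
            if literal:
                delta.append(bytes(literal))
                literal.clear()
            delta.append(match)
            pos += size
            weak = None
        else:
            literal.append(new[pos])
            if pos + size < len(new):
                weak = roll(*weak, new[pos], new[pos + size], size)
            pos += 1
    literal.extend(new[pos:])  # trailing bytes shorter than one block
    if literal:
        delta.append(bytes(literal))
    return delta


def apply_delta(old: bytes, delta, size: int) -> bytes:
    """Receiver side: rebuild the current file from old blocks and literals."""
    out = bytearray()
    for item in delta:
        if isinstance(item, int):
            out += old[item * size:(item + 1) * size]
        else:
            out += item
    return bytes(out)
```

Only the literal byte strings and the small block numbers in the delta need to cross the network, which is where the bandwidth saving over sending the whole file comes from when the two versions are similar.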

Citations: 71

PRIX: indexing and querying XML using Prüfer sequences

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320005

P. Rao, Bongki Moon

We propose a new way of indexing XML documents and processing twig patterns in an XML database. Every XML document in the database can be transformed into a sequence of labels by Prüfer's method, which constructs a one-to-one correspondence between trees and sequences. During query processing, a twig pattern is also transformed into its Prüfer sequence. By performing subsequence matching on the set of sequences in the database, and performing a series of refinement phases that we have developed, we can find all the occurrences of a twig pattern in the database. Our approach allows holistic processing of a twig pattern without breaking the twig into root-to-leaf paths and processing these paths individually. Furthermore, we show that all correct answers are found without any false dismissals or false alarms. Experimental results demonstrate the performance benefits of our proposed techniques.
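
The one-to-one correspondence between trees and sequences mentioned in this abstract can be illustrated with the classic Prüfer construction on a tree whose nodes are numbered 1..n: repeatedly delete the smallest-numbered leaf and record its neighbor. The sketch below shows only that textbook construction and its inverse. How PRIX numbers XML nodes, attaches element labels, and performs subsequence matching is not described in the abstract and is not modeled here; the function names are hypothetical.

```python
import heapq


def prufer_sequence(edges, n):
    """Classic Prüfer sequence of a tree on nodes 1..n given as an edge list."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)      # smallest-numbered leaf
        (neighbor,) = adj[leaf]           # a leaf has exactly one neighbor
        seq.append(neighbor)
        adj[neighbor].remove(leaf)
        if len(adj[neighbor]) == 1:       # the neighbor has become a leaf
            heapq.heappush(leaves, neighbor)
    return seq


def tree_from_prufer(seq):
    """Inverse mapping: rebuild the edge list from a Prüfer sequence."""
    n = len(seq) + 2
    degree = [1] * (n + 1)
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```

Because the mapping is invertible, encoding a tree and decoding the result returns the same set of edges, which is the property that lets a sequence stand in for the tree during matching.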

Citations: 230

Extending XML database to support open XML

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320054

Jinyu Wang, Kongyi Zhou, K. Karun, Mark Scardina

XML is a widely accepted standard for exchanging business data. To optimize the management of XML and help companies build up their business partner networks over the Internet, database servers have introduced new XML storage and query features. However, each enterprise defines its own data elements in XML and modifies the XML documents to handle evolving business needs. This makes XML data conform to heterogeneous schemas or schemas that evolve over time, which is not suitable for XML database storage. We provide an overview of the current XML database strategies and present a streaming metadata-processing approach, enabling databases to handle multiple XML formats seamlessly.

Citations: 1

Dynamic extensible query processing in super-peer based P2P systems

Proceedings. 20th International Conference on Data Engineering

Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320077

C. Wiesner, A. Kemper, Stefan Brandl

To enable dynamic, extensible, and distributed query processing in super-peer based P2P networks, where standard query operators and user-defined code can be executed near the data, we distribute query processing to (super-) peers. Therefore, super-peers provide functionality for the management of the indices, query optimization, and query processing. Additionally, we expect peers to provide query processing capabilities in order to be full members of the P2P network. To enable this, super-peers have to provide an optimizer for generating efficient query plans from the queries they receive. The distribution process is guided by the routing index, which is dynamic and corresponds to the data allocation schema in traditional distributed DBMSs.

Citations: 5
