Proceedings. International Database Engineering and Applications Symposium: Latest Articles

Continuous queries on trajectories of moving objects
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351495
Philip Schmiegelt, B. Seeger, Andreas Behrend, W. Koch
Since navigation systems and tracking devices are becoming ubiquitous in our daily life, the development of efficient methods for processing massive sets of mobile objects is of utmost importance. Although the future routes of mobile objects are often known in advance in many applications, most methods so far do not fully utilize this information. In this paper, we reveal the beneficial effects of exploiting future routes for the early generation of the expected results of spatio-temporal queries. Such probable results are important for operative analytics in many applications like smart fleet management or intelligent logistics. To efficiently compute the high number of future trajectory points, a new index structure is presented which allows for fast maintenance of query results under continuous changes of mobile objects. Our methods make use of specific update patterns, which incur substantially lower maintenance costs than the most general case of an update. A set of experiments based on a commonly used simulation environment shows the efficiency of our approach.
Citations: 9
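The idea of generating expected query results early from routes that are known in advance can be illustrated with a minimal sketch. It assumes each object publishes its planned route as timestamped waypoints and that the query is a rectangular region with a time window; the names `Waypoint` and `expected_hits` are hypothetical, and the linear scan stands in for the index structure that the abstract mentions but does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    t: float  # planned timestamp
    x: float
    y: float


def expected_hits(routes, region, t_from, t_to):
    """Return {object_id: first planned time inside the region}.

    Assumption: a plain scan over planned waypoints, not the paper's index.
    """
    x_min, y_min, x_max, y_max = region
    hits = {}
    for obj_id, waypoints in routes.items():
        for wp in sorted(waypoints, key=lambda w: w.t):
            in_window = t_from <= wp.t <= t_to
            in_region = x_min <= wp.x <= x_max and y_min <= wp.y <= y_max
            if in_window and in_region:
                hits[obj_id] = wp.t
                break
    return hits


# Illustrative data: two trucks with known future routes.
routes = {
    "truck-1": [Waypoint(0, 0, 0), Waypoint(10, 5, 5), Waypoint(20, 9, 9)],
    "truck-2": [Waypoint(0, 20, 20), Waypoint(10, 30, 30)],
}
result = expected_hits(routes, region=(4, 4, 10, 10), t_from=0, t_to=30)
```

When an object changes its route, only that object's entry needs to be recomputed, which is the kind of restricted update the abstract contrasts with the general case.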
Evaluation of data reduction techniques for vehicle to infrastructure communication saving purposes
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351484
Luca Carafoli, F. Mandreoli, R. Martoglia, W. Penzo
In this paper we investigate the use of different data reduction techniques to minimize V2I communication in an Intelligent Transportation System (ITS). We consider the context of the PEGASUS Project, where vehicles are equipped with sensor-based devices able to compute information such as the vehicle's position and speed and communicate it to a Control Centre (CC). The CC relies on a general-purpose data management module that supports the execution of continuous queries as well as standard SQL one-time queries on the collected data to provide various infomobility services. The paper explores two categories of data reduction techniques: independent techniques, where vehicles autonomously send data to the CC, and information-need techniques, where data is sent by taking into account additional data received from the CC. The paper discusses and implements the technical changes needed in the CC to support the required infomobility services under the reduced availability of data. All the investigated techniques have been extensively evaluated in a variety of traffic scenarios.
Citations: 9
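As a rough illustration of what an independent technique could look like, the sketch below has a vehicle decide on its own whether a new position sample is worth sending. The abstract does not name the specific techniques evaluated, so the distance-threshold rule, the function name `reduce_reports`, and the sample data are all assumptions.

```python
import math


def reduce_reports(samples, min_distance):
    """Keep a (t, x, y) sample only if it lies at least min_distance
    away from the last sample that was kept (i.e. sent to the CC)."""
    kept = []
    for t, x, y in samples:
        if not kept:
            kept.append((t, x, y))
            continue
        _, last_x, last_y = kept[-1]
        if math.hypot(x - last_x, y - last_y) >= min_distance:
            kept.append((t, x, y))
    return kept


# Five position samples; only two are far enough apart to be transmitted.
samples = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0), (3, 6.0, 0.0), (4, 6.5, 0.0)]
sent = reduce_reports(samples, min_distance=5.0)
```

An information-need technique would differ in that the threshold, or the decision itself, depends on data received from the CC, for example the regions that active continuous queries currently cover.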
XML query processing: efficiency and optimality
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351478
Radim Bača, M. Krátký
XML (Extensible Mark-up Language) is a well-established format which is often used for modeling semi-structured data. XPath and XQuery are de facto standards among XML query languages, and searching for occurrences of a twig pattern query (TPQ) in an XML document is one of their core tasks. There are a large number of different approaches addressing the TPQ matching problem. The aim of this article is to compare the state-of-the-art techniques and give an overview which can help to understand the relationships between the different methodologies used in this area. We distinguish three main areas of TPQ processing: (1) index data structures and XML document partitioning, (2) join algorithms, and (3) cost-based optimizations. We cover the most important techniques in each area and explain their relationships and possible combinations.
Citations: 4
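To make the twig pattern matching task concrete, here is a naive matcher written against Python's standard `xml.etree.ElementTree`. It is only a baseline sketch: the tuple representation of a twig and the names `matches` and `find_twig` are assumptions, it returns the elements matching the twig root rather than all full embeddings, and it uses none of the indexes, join algorithms, or cost-based optimizations the survey covers.

```python
import xml.etree.ElementTree as ET


def matches(node, pattern):
    """True if node has the pattern's tag and satisfies every branch.

    A pattern is (tag, [(axis, subpattern), ...]) with axis being
    "child" (XPath /) or "descendant" (XPath //).
    """
    tag, branches = pattern
    if node.tag != tag:
        return False
    for axis, sub in branches:
        if axis == "child":
            candidates = list(node)
        else:
            # node.iter() yields node itself first, so filter it out
            candidates = [d for d in node.iter() if d is not node]
        if not any(matches(c, sub) for c in candidates):
            return False
    return True


def find_twig(root, pattern):
    """Return all elements in document order that match the twig root."""
    return [n for n in root.iter() if matches(n, pattern)]


doc = ET.fromstring(
    "<lib>"
    "<book><title>A</title><author><name>X</name></author></book>"
    "<book><title>B</title></book>"
    "</lib>"
)
# Twig equivalent to the XPath expression //book[title][.//name]
twig = ("book", [("child", ("title", [])), ("descendant", ("name", []))])
hits = [b.find("title").text for b in find_twig(doc, twig)]
```

This recursive check revisits subtrees repeatedly, which is the inefficiency that structural indexes and holistic join algorithms are designed to avoid.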
ECTree: an extended tree index for attributed subgraph queries
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351503
Jun Luo, G. Butler
Graphs are popular data structures for modeling complex data types. There is a need for managing such graph data and providing efficient querying tools. In the graph mining realm, the problem lies in indexing a large number of graphs for fast retrieval. Indexing attributed graphs and using attributed queries can provide faster response time and results that are more refined. Our index technique ECTree focuses on extending an existing index to support attributed graph indexing and providing subgraph querying access to the extended index. The aim is to find a way such that the labels of the graphs as well as the attributes of the graphs are indexed at the same time. A query format is provided to query the extended index with flexibility on the attributes. In addition, regular expressions are used as query labels to provide flexibility. We also introduce a label-irrelevant vertex degree-attribute pruning method. All the techniques presented in our work are validated through experiments on both real and synthetic datasets.
Citations: 0
ORDB holistic design
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351507
Patricia Roberts
Object-relational databases (ORDB) can be created with a mixture of relational and object-oriented features, giving database designers a wide range of options. However, a poor combination of choices can result in problems with storing, retrieving or updating data. This paper shows that decisions about the features to use in ORDB design require a holistic approach, considering the database design as a whole and as part of an information system. We highlight problems that occur with some combinations of features that can cause poor functionality.
Citations: 0
Transaction processing using thread-to-metadata
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351505
Hoda M. O. Mokhtar, Nariman Adel
In distributed transactional database systems, there are a large number of concurrent transactions. Each transaction executes independently on a separate thread; this is known as thread-to-transaction. The lock manager is hence responsible for maintaining isolation between concurrently executing transactions, which consumes a considerable part of the total execution time. In this paper we present a system that improves the response time by using a thread-to-metadata policy instead of a thread-to-transaction policy. The system consists of two modules: (a) Transaction Module, and (b) Data Module. The system minimizes the interaction with the lock manager and maintains all the ACID properties of the transactions. We also present experimental results that show how the thread-to-metadata policy improves the response time.
Citations: 2
Autonomous database partitioning using data mining on single computers and cluster computers
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351481
Liangzhe Li, L. Gruenwald
One of the most important metrics in measuring the performance of a database system is query response time, which is composed of I/O time and CPU time. I/O time is determined by the amount of data read from or written to disks and by how the data is located on disks. CPU time is determined by how the database system performs the query operations. So if we want to reduce the query response time, we can reduce I/O time, CPU time, or both. Retrieving data from disks is much slower than retrieving data from main memory. Hence, one of the common ways to reduce I/O time is clustering data on disks so that queries access only relevant data. This paper introduces an efficient algorithm, called AutoClust, for automatic database attribute clustering (also called automatic database vertical partitioning) for single computers as well as cluster computers. It is based on closed item sets mined from queries and their attributes using association rule mining. The paper then presents experimental results comparing the performance of AutoClust with that of a baseline algorithm on both single computers and cluster computers using the TPC-H benchmark running on major commercial database systems. The experiments show that AutoClust achieves lower query costs for both types of computers.
Citations: 8
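The notion of closed item sets mined from query attributes can be shown with a small brute-force sketch. This is not AutoClust itself: the function name `closed_attribute_sets`, the support threshold, and the three example queries (using TPC-H `lineitem` column names) are assumptions chosen for illustration. An attribute set is frequent if at least `min_support` queries reference all of its attributes, and closed if no proper superset has the same support.

```python
from itertools import combinations


def closed_attribute_sets(queries, min_support):
    """Return {frozenset(attrs): support} for closed frequent attribute sets.

    Brute force over all attribute combinations: exponential, so only
    suitable for tiny illustrative workloads.
    """
    attrs = sorted(set().union(*queries))
    support = {}
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = frozenset(combo)
            count = sum(1 for q in queries if candidate <= q)
            if count >= min_support:
                support[candidate] = count
    # A set is closed if no frequent proper superset has equal support.
    return {
        s: c
        for s, c in support.items()
        if not any(s < t and c == support[t] for t in support)
    }


# Each query is represented by the set of attributes it accesses.
queries = [
    frozenset({"l_orderkey", "l_quantity"}),
    frozenset({"l_orderkey", "l_quantity", "l_discount"}),
    frozenset({"l_orderkey", "l_shipdate"}),
]
closed = closed_attribute_sets(queries, min_support=2)
```

In this example `{l_quantity}` is frequent but not closed, because every query that touches it also touches `l_orderkey`; such co-accessed attributes are natural candidates for storage in the same vertical partition.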
Dealing with inconsistencies in linked data mashups
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351496
Eveline R. Sacramento, M. Casanova, K. Breitman, A. Furtado, J. Macêdo, V. Vidal
Data mashups constructed from independent sources may contain inconsistencies, puzzling the user who observes the data. This paper formalizes the notion of consistent data mashups and introduces a heuristic procedure to compute such mashups.
Citations: 5
XML class outlier detection
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351494
G. Manco, E. Masciari
In recent years, XML (eXtensible Markup Language) has become the new standard for data representation and exchange on the WWW. This has resulted in a great need for data cleaning techniques in order to identify outlying data. In this paper, we present a technique for outlier detection that singles out anomalies with respect to a relevant group of objects. We exploit a suitable encoding of XML documents as signals of fixed frequency, which can be transformed using Fourier Transforms. Outliers are identified by simply looking at the signal spectra. The results show the effectiveness of our approach.
Citations: 1
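The general shape of spectrum-based outlier detection can be sketched as follows. The abstract does not specify the encoding, so everything concrete here is an assumption: the tag-to-number map `tag_codes`, the zero-padding to a fixed length, the naive discrete Fourier transform, and the scoring by Euclidean distance from the group's mean spectrum.

```python
import cmath
import xml.etree.ElementTree as ET


def encode(xml_text, tag_codes, length):
    """Hypothetical encoding: one numeric code per element in document
    order, truncated or zero-padded to a fixed length."""
    tags = [el.tag for el in ET.fromstring(xml_text).iter()]
    signal = [float(tag_codes[t]) for t in tags][:length]
    return signal + [0.0] * (length - len(signal))


def spectrum(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2))."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(signal)))
        for k in range(n)
    ]


def outlier_scores(docs, tag_codes, length=8):
    """Distance of each document's spectrum from the group's mean spectrum."""
    spectra = [spectrum(encode(d, tag_codes, length)) for d in docs]
    mean = [sum(col) / len(spectra) for col in zip(*spectra)]
    return [sum((a - b) ** 2 for a, b in zip(s, mean)) ** 0.5 for s in spectra]


tag_codes = {"order": 1, "item": 2, "note": 5}
docs = [
    "<order><item/><item/></order>",
    "<order><item/><item/></order>",
    "<order><item/><item/></order>",
    "<order><note/><note/><note/><note/></order>",
]
scores = outlier_scores(docs, tag_codes)
most_anomalous = max(range(len(docs)), key=scores.__getitem__)
```

Here the fourth document, whose structure differs from the rest of its group, receives the largest score.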
Schematron schema inference
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351482
M. Kozák, J. Stárka, I. Holubová
In this paper we introduce a method to infer a Schematron schema from a set of XML documents. We analyze different aspects of Schematron schema generation. Since automatic schema inference for XML documents is not a new problem, we introduce only a single method, which we use in our experimental implementation. In the implementation we generate a grammar using the introduced inference method and allow the user to modify the grammar. The grammar is then transformed into a Schematron schema using our algorithm. Experimental results are part of the paper.
Citations: 2
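A much-simplified sketch of the inference pipeline is given below. The paper first infers a grammar and then transforms it into Schematron; this sketch skips the grammar and infers only one kind of constraint, namely which child elements appear in every instance of a parent element. The function names `infer_required_children` and `to_schematron` are hypothetical; the namespace URI is the ISO Schematron one.

```python
import xml.etree.ElementTree as ET


def infer_required_children(docs):
    """Map each element name to the child names that occur in every
    one of its instances across the sample documents."""
    required = {}
    for text in docs:
        for el in ET.fromstring(text).iter():
            children = {c.tag for c in el}
            if el.tag in required:
                required[el.tag] &= children
            else:
                required[el.tag] = set(children)
    return required


def to_schematron(required):
    """Emit one rule per parent with an assert for each required child."""
    lines = [
        '<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">',
        "  <sch:pattern>",
    ]
    for parent in sorted(required):
        if not required[parent]:
            continue  # nothing to assert for this element
        lines.append(f'    <sch:rule context="{parent}">')
        for child in sorted(required[parent]):
            lines.append(
                f'      <sch:assert test="{child}">'
                f"{parent} must contain {child}</sch:assert>"
            )
        lines.append("    </sch:rule>")
    lines += ["  </sch:pattern>", "</sch:schema>"]
    return "\n".join(lines)


docs = [
    "<book><title/><author/></book>",
    "<book><title/><year/></book>",
]
required = infer_required_children(docs)
schema = to_schematron(required)
```

For the two sample documents, only `title` is present in every `book`, so the generated schema contains a single assertion; a user could then edit the inferred constraints before the schema is emitted, in the spirit of the modifiable grammar the abstract describes.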