Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243): Latest Articles

Information technology implementation for a distributed data system serving Earth scientists: seasonal to interannual ESIP

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688126

M. Kafatos, X. Wang, Zuotao Li, Ruixin Yang, D. Ziskin

We address the implementation of a distributed data system designed to serve Earth system scientists. A consortium led by George Mason University (GMU) has been funded by NASA's Working Prototype Earth Science Information Partner (WP-ESIP) program to develop, implement, and operate a distributed data and information system. The system will address the research needs of seasonal-to-interannual scientists whose research focus includes phenomena such as El Niño, monsoons, and associated climate studies. The system implementation involves several institutions using a multitiered client-server architecture. Specifically, the consortium involves an information system of three physical sites: GMU, the Center for Ocean-Land-Atmosphere Studies (COLA), and the Goddard Distributed Active Archive Center. These sites distribute tasks in the areas of user services, access to data, archiving, and other aspects enabled by a low-cost, scalable information technology implementation. The project can serve as a model for a larger WP-ESIP Federation to assist in the overall data information system associated with future large Earth Observing System data sets and their distribution. The consortium has developed innovative information technology techniques such as content-based browsing, data mining, and associated component working prototypes; analysis tools, particularly GrADS, developed by COLA and the preferred analysis tool of the working seasonal-to-interannual communities; and a Java front-end query engine working prototype.

Citations: 21

Computational issues connected with the protection of sensitive statistics by auditing sum-queries

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688118

F. M. Malvestuto, M. Moscarini

An implementation of the auditing strategy is presented to avoid both exact and approximate disclosure. The key data structure is a query map, which is a graphical summary of answered queries. Since the size of a query map may be exponential in the number of answered queries, a query-restriction criterion is introduced to make every query map a graph. An auditing procedure on such a graph is presented and the computational issues connected with its implementation are discussed. All the computational tasks can be carried out efficiently but one, which is a provably intractable problem.

Citations: 14
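
A minimal sketch of exact-disclosure auditing for sum-queries, as a companion to the Malvestuto and Moscarini abstract above. It does not implement the paper's query map. It swaps in the classical linear-algebra check instead: a record's value is exactly disclosed when its unit vector lies in the row space of the answered queries. The names `rref` and `disclosed_records` and the example query sets are made up for illustration, and approximate disclosure is not covered.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix, using exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        found = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if found is None:
            continue
        m[pivot_row], m[found] = m[found], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m


def disclosed_records(query_sets, n_records):
    """Indices of records whose value follows exactly from the answered sum-queries.

    Each query set lists the record indices summed by one answered query.
    """
    rows = [[1 if i in q else 0 for i in range(n_records)] for q in query_sets]
    reduced = rref(rows)
    # A reduced row with a single nonzero entry is a unit vector: that record is exposed.
    return sorted(row.index(1) for row in reduced if sum(v != 0 for v in row) == 1)


print(disclosed_records([{0, 1, 2}, {1, 2}], 3))  # [0]
print(disclosed_records([{0, 1}, {2, 3}], 4))     # []
```

An auditor built this way would refuse a new query if adding its row makes `disclosed_records` non-empty.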

Attribute uncertainty propagation in vector geographic information systems: sensitivity analysis

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688134

O. Bonin

This paper presents a geographical sensitivity analysis on a vector road database. It consists of introducing controlled noise into the database and studying the effects of this noise on the results of a chosen application. The objective is to give users the means to evaluate the accuracy of their application results for given quality parameters.

Citations: 6
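
The noise-injection idea in the Bonin abstract above can be sketched as a small Monte Carlo experiment. This is an illustration under stated assumptions, not the paper's method: the road segments, the choice of speed as the noisy attribute, the Gaussian noise model, and travel time as the "chosen application" are all hypothetical.

```python
import random
import statistics


def travel_time(segments):
    """Application under study: total travel time in hours over (length_km, speed_kmh) segments."""
    return sum(length / speed for length, speed in segments)


def perturb(segments, sigma, rng):
    """Add Gaussian noise (standard deviation sigma, in km/h) to each speed attribute.

    Speeds are clamped to at least 1 km/h so the travel time stays finite.
    """
    return [(length, max(1.0, speed + rng.gauss(0.0, sigma))) for length, speed in segments]


def sensitivity(segments, sigma, trials=1000, seed=0):
    """Return (mean, standard deviation) of the application result under attribute noise."""
    rng = random.Random(seed)
    results = [travel_time(perturb(segments, sigma, rng)) for _ in range(trials)]
    return statistics.mean(results), statistics.pstdev(results)


# Made-up route: three road segments with length (km) and speed (km/h) attributes.
route = [(10.0, 50.0), (30.0, 90.0), (5.0, 30.0)]

for sigma in (0.0, 2.0, 5.0):
    mean, spread = sensitivity(route, sigma)
    print(f"sigma={sigma}: mean={mean:.3f} h, spread={spread:.4f} h")
```

Reading the spread against the noise level gives a user a rough sense of how accurate the application result is for a given attribute quality.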

Modeling multidimensional databases, cubes and cube operations

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688111

Panos Vassiliadis

Online analytical processing (OLAP) is a trend in database technology that has attracted considerable research interest. OLAP is based on the multidimensional view of data, supported either by multidimensional databases (MOLAP) or relational engines (ROLAP). We propose a model for multidimensional databases. Dimensions, dimension hierarchies, and cubes are formally introduced. We also introduce cube operations (changing levels in the dimension hierarchy, function application, navigation, etc.). The approach is based on the notion of the base cube, which is used to calculate the results of cube operations. We focus on supporting a series of operations on cubes (i.e., preserving the results of previous operations and keeping aggregate functions applicable across a series of operations). Furthermore, we provide a mapping of the multidimensional model to the relational model and to multidimensional arrays.

Citations: 216
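
The base-cube idea in the Vassiliadis abstract above can be illustrated with a toy roll-up. This is a sketch, not the paper's formal model: the sales data, the two hierarchies, and the names `BASE_CUBE`, `TIME_LEVELS`, `GEO_LEVELS`, and `roll_up` are all hypothetical. The one point it tries to show is that every result is computed from the most detailed cube, not from a previous result.

```python
from collections import defaultdict

# Base cube: the most detailed level, here (day, city) -> sales.
BASE_CUBE = {
    ("1998-07-01", "Athens"): 10,
    ("1998-07-02", "Athens"): 5,
    ("1998-07-01", "Rome"): 7,
    ("1998-08-03", "Rome"): 4,
}

# Dimension hierarchies: each level maps a base-level value to a coarser value.
TIME_LEVELS = {
    "day": lambda d: d,
    "month": lambda d: d[:7],
    "all": lambda d: "ALL",
}
GEO_LEVELS = {
    "city": lambda c: c,
    "country": lambda c: {"Athens": "Greece", "Rome": "Italy"}[c],
    "all": lambda c: "ALL",
}


def roll_up(base_cube, time_level, geo_level, aggregate=sum):
    """Compute a cube at the requested levels directly from the base cube."""
    groups = defaultdict(list)
    for (day, city), value in base_cube.items():
        key = (TIME_LEVELS[time_level](day), GEO_LEVELS[geo_level](city))
        groups[key].append(value)
    return {key: aggregate(values) for key, values in groups.items()}


print(roll_up(BASE_CUBE, "month", "country"))
print(roll_up(BASE_CUBE, "all", "country", aggregate=max))
```

Because each call starts from `BASE_CUBE`, an aggregate such as an average or a maximum stays correct after any sequence of level changes, which would not hold in general if it were recomputed from already aggregated cells.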

An extensible framework for spatio-temporal database applications

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688124

Glaucia Faria, C. B. Medeiros, M. Nascimento

There is a wide range of scientific applications requiring sophisticated management of spatio-temporal data. However, existing database management systems offer very limited support for managing such data. Thus, it is left to the researchers themselves to repeatedly code this management into each application. We present an extensible framework, based on extending an object-oriented database system with kernel spatio-temporal classes, data structures, and functions, to provide support for the development of spatio-temporal applications.

Citations: 31

ConIstat: a system to manage, record and present cyclical data

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688129

A. Sorce, F. Rizzo

The paper discusses ConIstat, a system to manage and present cyclical data organised in historical series to traditional and non-traditional users. The data bank contains about 6300 time series and is organised into the following domains: external trade, invoiced, consistencies and ordered, production prices, index of work of the great enterprises, industrial production, contractual wages and salaries.

Citations: 0

Benchmarking spatial joins a la carte

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688109

O. Günther, Vincent Oria, P. Picouet, J. Saglio, M. Scholl

Spatial joins are join operations that involve spatial data types and operators. Spatial access methods are often used to speed up the computation of spatial joins. We address the issue of benchmarking spatial join operations. For this purpose, we first present a WWW-based benchmark generator to produce sets of rectangles. Using a Web browser, experimenters can specify the number of rectangles in a sample, as well as the statistical distributions of their sizes, shapes, and locations. Second, using the generator and a well-defined set of statistical models, we define several tests to compare the performance of three spatial join algorithms: nested loop, scan-and-index, and synchronized tree traversal. We also added a real-life data set from the Sequoia 2000 storage benchmark. Our results show that the relative performance of the different techniques mainly depends on two parameters: sample size and selectivity of the join predicate. All of the statistical models and algorithms are available on the Web, which allows for easy verification and modification of our experiments.

Citations: 50
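
As a rough illustration of the Günther et al. abstract above, the sketch below generates random rectangles and joins two samples with the simplest of the three algorithms named there, the nested loop. The uniform distributions, the square extent, and the function names are assumptions for illustration; they are not the benchmark generator's actual statistical models or interface.

```python
import random


def intersects(a, b):
    """True if axis-aligned rectangles a and b, given as (xmin, ymin, xmax, ymax), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def nested_loop_join(left, right):
    """Return index pairs (i, j) such that left[i] intersects right[j]."""
    return [
        (i, j)
        for i, a in enumerate(left)
        for j, b in enumerate(right)
        if intersects(a, b)
    ]


def random_rectangles(n, max_side, seed, extent=1000.0):
    """Generate n rectangles with uniform side lengths and positions inside a square extent."""
    rng = random.Random(seed)
    rects = []
    for _ in range(n):
        width, height = rng.uniform(0, max_side), rng.uniform(0, max_side)
        x, y = rng.uniform(0, extent - width), rng.uniform(0, extent - height)
        rects.append((x, y, x + width, y + height))
    return rects


left = random_rectangles(200, 20.0, seed=1)
right = random_rectangles(200, 20.0, seed=2)
pairs = nested_loop_join(left, right)
selectivity = len(pairs) / (len(left) * len(right))
print(f"{len(pairs)} intersecting pairs, selectivity {selectivity:.5f}")
```

Varying `n` and `max_side` changes the two parameters the abstract singles out, sample size and join selectivity; the nested loop always performs `len(left) * len(right)` intersection tests, which is the baseline the index-based methods try to beat.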

Metabolic pathway interface to molecular biology databases

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688132

Barry Zeeberg, Kevin Watanabe, S. Goto, R. Overbeek, L. Kerschberg, George Michaels

We present results of providing database support to biomedicine via federation of SDB Cooperation/Integration based upon the KEGG GUI for molecular biology. The federation provides a common link to three molecular biology databases. The added value of the federation is freedom from consulting multiple references to ascertain the full set of enzymatic reactions in a metabolic pathway, and the option of selecting multiple queries to submit to the federated SDBs. Each of the SDBs is extensive, but incomplete. The union of the SDBs, implemented transparently by the federation, is more complete. Each SDB provides a different approach to the options available for data presentation and a different set of Web server tools for data analysis. Thus, an important part of the added value of the federation is the cross-fertilization available in the union of the molecular biological content, the presentation of data, and the tools available for analysis.

Citations: 0

A pyramid data model for supporting content-based browsing and knowledge discovery

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688121

Zuotao Li, X. Wang, M. Kafatos, Ruixin Yang

Remote sensing from space can provide global and continuous observations. The associated measurement data need to be stored and studied to understand Earth system processes. Interactive content-based browsing, i.e., browsing or searching the content to narrow down the interesting portions of data sets before actually accessing or ordering full data sets, is highly desirable for any Earth science data information system. However, the large volumes of archived and future Earth science remote sensing data are clearly a serious challenge for an interactive browsing process. In this paper, a pyramid data model is introduced to support interactive content-based browsing and knowledge discovery for a wide variety of Earth science remote sensing data sets. By using multi-level precomputation and robust nonparametric approximation procedures, the interactive browsing performance can be enhanced greatly. An initial implementation and testing of this data model has been carried out through our research prototype system, Virtual Domain Application Data Center (VDADC). Future implementations are planned for our Seasonal to Interannual Earth Science Information Partner (SIESIP) project.

Citations: 14
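
The multi-level precomputation mentioned in the Li et al. abstract above can be pictured with a small resolution pyramid over a gridded field. This is only a sketch: the abstract does not specify the approximation procedures, so the 2x2 block median used here is a stand-in for a "robust" summary, and `coarsen`, `build_pyramid`, and the sample grid are hypothetical.

```python
import statistics


def coarsen(grid):
    """Build the next pyramid level: each cell is the median of a 2x2 block of the finer grid.

    Blocks on the right and bottom edges are smaller when a dimension is odd.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            statistics.median(
                grid[r + dr][c + dc]
                for dr in (0, 1)
                for dc in (0, 1)
                if r + dr < rows and c + dc < cols
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_pyramid(grid):
    """Precompute all levels, from the full-resolution grid down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels


# Made-up 4x4 field with one outlier (100) in the bottom-right corner.
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 100],
]

for level, data in enumerate(build_pyramid(grid)):
    print(level, data)
```

A browser would answer a first, coarse request from the top levels and descend only into regions the user finds interesting. The median keeps the outlier from dominating the coarse cells, which a block mean would not.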

Determining the optimal file size on tertiary storage systems based on the distribution of query sizes

Proceedings. Tenth International Conference on Scientific and Statistical Database Management (Cat. No.98TB100243)

Pub Date : 1998-07-01 DOI: 10.1109/SSDM.1998.688108

L. Bernardo, H. Nordberg, D. Rotem, A. Shoshani

In tertiary storage systems, the data is stored on multiple tape volumes, where each tape is further divided into files. Since in many such systems the minimum unit of data transfer is a file, it is an important problem to match file sizes with the access patterns to the data. In general, a file size that is large relative to the query size leads to the transfer of large amounts of irrelevant data, whereas small file sizes incur an overhead penalty associated with reading each new file. In this work, we analyze the relationship between file sizes and query response times and provide a methodology to compute the optimal file size given information about the distribution of query sizes. Exact closed-form solutions for the cost function are given for two common distributions.

Citations: 12
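
The trade-off described in the Bernardo et al. abstract above can be made concrete with a toy cost model. The model is an assumption for illustration and is not the paper's cost function or its closed-form solution: each file read costs a fixed `overhead` plus `file_size / bandwidth`, a query of size `q` is assumed to start at a file boundary and so touches `ceil(q / file_size)` files, and the best size is found by brute force over a candidate list.

```python
import math


def mean_response_time(file_size, query_sizes, overhead, bandwidth):
    """Average time to answer the queries when data can only be read in whole files."""
    per_file = overhead + file_size / bandwidth
    total = sum(math.ceil(q / file_size) * per_file for q in query_sizes)
    return total / len(query_sizes)


def best_file_size(candidates, query_sizes, overhead, bandwidth):
    """Candidate file size with the lowest average response time under the toy model."""
    return min(
        candidates,
        key=lambda f: mean_response_time(f, query_sizes, overhead, bandwidth),
    )


# Hypothetical workload: query sizes in MB, 5 s overhead per file, 10 MB/s transfer rate.
queries = [20, 40, 100, 400]
candidates = [10, 50, 100, 200]

for size in candidates:
    print(size, mean_response_time(size, queries, overhead=5.0, bandwidth=10.0))
print("best:", best_file_size(candidates, queries, overhead=5.0, bandwidth=10.0))  # best: 100
```

With this workload the small files lose to per-file overhead and the largest files lose to irrelevant data transferred for the small queries, so an intermediate size wins.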
