Integration of information needs and seeking
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583077
S. Al-Fedaghi
This paper investigates the problem of how to model the need for information, and the process of seeking information that satisfies that need. It utilizes a proposed information flow model that integrates information needs, information seeking, and information-based activities. The model includes the phases of need generation and propagation that are transformed into information seeking, which in turn triggers information flow to enable fulfillment of needs.
{"title":"Integration of information needs and seeking","authors":"S. Al-Fedaghi","doi":"10.1109/IRI.2008.4583077","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583077","url":null,"abstract":"This paper investigates the problem of how to model the need for information, and the process of seeking information that satisfies that need. It utilizes a proposed information flow model that integrates information needs, information seeking, and information-based activities. The model includes the phases of need generation and propagation that are transformed into information seeking, which in turn triggers information flow to enable fulfillment of needs.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122741807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Crawling programs for wrapper-based applications
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583023
Claudio Bertoli, Valter Crescenzi, P. Merialdo
Many large web sites provide pages containing highly valuable data. In order to extract data from these pages, several methods and techniques have been developed to generate web wrappers, that is, programs that convert the data embedded in HTML pages into a structured format. These techniques ease the burden of writing applications that reuse data from the web. However, wrapper generation is just one of the ingredients needed for the development of such applications. A necessary yet underestimated task is that of developing programs for driving a crawler towards the pages that contain the target data. We present a method and an associated tool to support this activity. Our method relies on a data model whose constructs allow a designer to define an intensional description of the organization of data in a web site. Based on the model, we introduce the concepts of (i) intensional navigation, which represents an abstract description of the navigation to be performed to reach pages of interest, and (ii) extensional navigation, which represents the actual set of navigation paths (i.e., sequences of links to be followed) that lead to the target pages. The method is supported by a tool that infers an intensional navigation, i.e., the crawling program, from one sample extensional navigation. The tool, which has been developed as a Firefox plug-in, supports the designer in the task of defining and verifying the sample navigation and the inferred crawling program.
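To make the distinction concrete, the following sketch unfolds an intensional navigation description (page classes plus the links to follow from each class) into the extensional navigation paths at crawl time. It is a minimal illustration of the idea under assumed names, not the paper's tool: NavigationStep, crawl, and the CSS-selector link descriptions are all hypothetical.

    # Minimal sketch: an "intensional" description (which links to follow from which
    # page class) is unfolded into concrete ("extensional") navigation paths.
    # NavigationStep/crawl and the CSS selectors are illustrative assumptions.
    from dataclasses import dataclass
    from urllib.parse import urljoin
    import requests
    from bs4 import BeautifulSoup

    @dataclass
    class NavigationStep:
        page_class: str      # page class this step starts from
        link_selector: str   # CSS selector matching the links to follow
        next_class: str      # page class reached by following those links

    def crawl(url, steps, current_class="home"):
        """Yield the URLs of target pages reached by unfolding the description."""
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        applicable = [s for s in steps if s.page_class == current_class]
        if not applicable:          # leaf page class: this is a target page
            yield url
            return
        for step in applicable:
            for a in soup.select(step.link_selector):
                if a.get("href"):
                    yield from crawl(urljoin(url, a["href"]), steps, step.next_class)

    # Example intensional navigation: home -> category pages -> item (target) pages.
    steps = [
        NavigationStep("home", "a.category", "category"),
        NavigationStep("category", "a.item", "item"),
    ]
    # for target_url in crawl("https://example.com/", steps): print(target_url)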
{"title":"Crawling programs for wrapper-based applications","authors":"Claudio Bertoli, Valter Crescenzi, P. Merialdo","doi":"10.1109/IRI.2008.4583023","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583023","url":null,"abstract":"Many large web sites provide pages containing highly valuable data. In order to extract data from these pages several methods and techniques have been developed to generate web wrappers, that is, programs that convert into a structured format the data embedded into HTML pages. These techniques easy the burden of writing applications that make reuse of data from the web. However the generation of wrappers is just one of the ingredients needed to the development of such applications. A necessary yet underestimated task is that of developing programs for driving a crawler towards the pages that contain the target data. We present a method and an associated tool to support this activity. Our method relies on a data model whose constructs allows a designer to define an intensional description of the organization of data in a web site. Based on the model, we introduce the concepts of (i) intensional navigation, which represents an abstract description of the navigation to be performed to reach pages of interest, and (ii) extensional navigation, which represents the actual set of navigation paths (i.e. sequences of links to be followed) that lead the target pages. The method is supported by a tool that infers an intensional navigation, i.e. the crawling program, from one sample extensional navigation. The tool, which has been developed as a Firefox plug-in, supports the designer in the task of defining and verifying the sample navigation and the inferred crawling program.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"427 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115657312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Empirical-based design — quality-driven assembly of components
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583063
M. Kunz, S. Mencke, D. Rud, R. Dumke
The importance of providing integration architectures in every field of application is beyond controversy these days. Unfortunately, existing solutions focus mainly on functionality. For the long-term success of systems integration, however, the quality of the developed architectures is of substantial interest. Existing quality-related information can be reused to optimize the assembly of components so that the best possible combination is always provided. For this purpose, a framework for the quality-driven creation of architectures is proposed in this paper. Beyond this quality-oriented characteristic, the use of semantic knowledge and structured process descriptions enables an automatic procedure; the combination of both is a particularly promising approach.
{"title":"Empirical-based design — quality-driven assembly of components","authors":"M. Kunz, S. Mencke, D. Rud, R. Dumke","doi":"10.1109/IRI.2008.4583063","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583063","url":null,"abstract":"The importance of providing integration architectures in every field of application is beyond controversy these days. Unfortunately existing solutions are mainly focusing on functionality. But for the success of Systems Integration in the long run, the quality of developed architectures is of substantial interest. Existing quality-related information can be reused to optimize this assembly of components to thereby always provide the best possible combination. For this purpose a framework for the quality-driven creation of architectures is proposed in this paper. Besides this quality-oriented characteristic, the usage of semantic knowledge and structured process descriptions enable an automatic procedure. Especially the combination of both is a promising approach.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124455862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coupling data understanding with software reuse
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583014
G. S. Novak
Reuse of information requires an ability to understand data gathered from the web and to integrate that data with knowledge and reusable programs. We describe systems that allow a user to capture and understand data from the web and rapidly and easily write programs to analyze the data and combine it with other data. A data grokker parses data, inferring the data types of its fields both from field names and from values of the data itself; this produces both a local set of usable data and a set of data type descriptions that link the data to known types. The known types have knowledge and reusable procedures that can be inherited and used with the data. Web pages that perform calculations or data lookup can be treated as remote procedure calls, allowing calculations, proprietary data and real-time data to be used. We have developed a graphical programming system that can specialize reusable programs for use with data from the web, allowing rapid and easy construction of programs for custom analysis of web data. These systems are illustrated with examples.
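As a rough illustration of the kind of inference described above, the sketch below guesses a field's type from its values and falls back to hints in the field name when the values alone are ambiguous. The rule set and the function names are assumptions for illustration, not the system described in the paper.

    # Illustrative sketch of field-type inference from values plus field-name hints.
    # The tiny rule set here is an assumption, not the paper's data grokker.
    import re
    from datetime import datetime

    def _parses_as_date(v):
        for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
            try:
                datetime.strptime(v, fmt)
                return True
            except ValueError:
                pass
        return False

    def infer_type(field_name, values):
        if all(re.fullmatch(r"-?\d+", v) for v in values):
            return "integer"
        if all(re.fullmatch(r"-?\d+\.\d+", v) for v in values):
            return "float"
        if all(_parses_as_date(v) for v in values):
            return "date"
        # Fall back to the field name when the values themselves are ambiguous.
        if any(hint in field_name.lower() for hint in ("price", "cost", "amount")):
            return "currency"
        return "string"

    print(infer_type("year", ["2006", "2007"]))       # -> integer (from the values)
    print(infer_type("price", ["$12.99", "$4.50"]))   # -> currency (from the name)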
{"title":"Coupling data understanding with software reuse","authors":"G. S. Novak","doi":"10.1109/IRI.2008.4583014","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583014","url":null,"abstract":"Reuse of information requires an ability to understand data gathered from the web and to integrate that data with knowledge and reusable programs. We describe systems that allow a user to capture and understand data from the web and rapidly and easily write programs to analyze the data and combine it with other data. A data grokker parses data, inferring the data types of its fields both from field names and from values of the data itself; this produces both a local set of usable data and a set of data type descriptions that link the data to known types. The known types have knowledge and reusable procedures that can be inherited and used with the data. Web pages that perform calculations or data lookup can be treated as remote procedure calls, allowing calculations, proprietary data and real-time data to be used. We have developed a graphical programming system that can specialize reusable programs for use with data from the web, allowing rapid and easy construction of programs for custom analysis of web data. These systems are illustrated with examples.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124810219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using default logic to enhance default logic: preliminary report
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583053
É. Grégoire
This paper is about the fusion of multiple knowledge sources represented using default logic. More precisely, the focus is on solving the problem that occurs when the standard-logic knowledge parts of the sources are contradictory, as default theories trivialize in this case. To overcome this problem, several candidate policies are discussed. Among them, it is shown that replacing each formula belonging to a minimally unsatisfiable subset of formulas by a corresponding supernormal default exhibits appealing features.
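A small worked example of this policy, in standard default-logic notation; the formulas are illustrative and not taken from the paper.

    Suppose the merged standard-logic part is W = {p, ¬p, q}. Its only minimally
    unsatisfiable subset is {p, ¬p}, so each of these two formulas is replaced by a
    supernormal default, i.e., a default with no prerequisite whose justification
    equals its consequent:

        : p / p        and        : ¬p / ¬p

    The remaining base W' = {q} is consistent, so the default theory no longer
    trivializes, while p and ¬p survive as defeasible conclusions: the resulting
    theory has two extensions, Th({q, p}) and Th({q, ¬p}), instead of every formula
    becoming derivable.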
{"title":"Using default logic to enhance default logic: preliminary report","authors":"É. Grégoire","doi":"10.1109/IRI.2008.4583053","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583053","url":null,"abstract":"This paper is about the fusion of multiple knowledge sources represented using default logic. More precisely, the focus is on solving the problem that occurs when the standard-logic knowledge parts of the sources are contradictory, as default theories trivialize in this case. To overcome this problem, several candidate policies are discussed. Among them, it is shown that replacing each formula belonging to minimally unsatisfiable subformulas by a corresponding supernormal default exhibits appealing features.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"86 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125717367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Result identification for biomedical abstracts using Conditional Random Fields
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583016
Ryan T. K. Lin, Hong-Jie Dai, Yue-Yang Bow, Min-Yuh Day, Richard Tzong-Han Tsai, W. Hsu
For biomedical research, the most important parts of an abstract are the result and conclusion sections. Some journals divide an abstract into several sections so that readers can easily identify those parts, but others do not. We propose a method that can automatically identify the result and conclusion sections of any biomedical abstract by formulating this identification problem as a sequence labeling task. Three feature sets (Position, Named Entity, and Word Frequency) are employed with Conditional Random Fields (CRFs) as the underlying machine learning model. Experimental results show that the combination of our proposed feature sets can achieve F-measure, precision, and recall scores of 92.50%, 95.32% and 89.85%, respectively.
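The sketch below shows how this sentence-labeling formulation can be set up with a CRF, using relative position and simple cue words as features. It relies on sklearn-crfsuite as a stand-in CRF implementation; the cue-word features and the toy data are simplified assumptions, not the paper's Position, Named Entity, and Word Frequency feature sets.

    # Sketch: label each sentence of an abstract as OTHER/RESULT/CONCLUSION with a CRF.
    # Feature set and training data are toy assumptions for illustration.
    import sklearn_crfsuite

    def sentence_features(sentences, i):
        sent = sentences[i].lower()
        return {
            "position": i / len(sentences),   # relative position within the abstract
            "has_result_cue": any(w in sent for w in ("show", "achieve", "results")),
            "has_conclusion_cue": any(w in sent for w in ("conclude", "suggest")),
            "length": len(sent.split()),
        }

    def abstract_to_features(sentences):
        return [sentence_features(sentences, i) for i in range(len(sentences))]

    # Toy training data: one abstract, one label per sentence.
    train_sents = [["We study X.", "Results show a 10% gain.", "We conclude X helps."]]
    train_labels = [["OTHER", "RESULT", "CONCLUSION"]]

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
    crf.fit([abstract_to_features(s) for s in train_sents], train_labels)
    print(crf.predict([abstract_to_features(train_sents[0])]))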
{"title":"Result identification for biomedical abstracts using Conditional Random Fields","authors":"Ryan T. K. Lin, Hong-Jie Dai, Yue-Yang Bow, Min-Yuh Day, Richard Tzong-Han Tsai, W. Hsu","doi":"10.1109/IRI.2008.4583016","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583016","url":null,"abstract":"For biomedical research, the most important parts of an abstract are the result and conclusion sections. Some journals divide an abstract into several sections so that readers can easily identify those parts, but others do not. We propose a method that can automatically identify the result and conclusion sections of any biomedical abstracts by formulating this identification problem as a sequence labeling task. Three feature sets (Position, Named Entity, and Word Frequency) are employed with Conditional Random Fields (CRFs) as the underlying machine learning model. Experimental results show that the combination of our proposed feature sets can achieve F-measure, precision, and recall scores of 92.50%, 95.32% and 89.85%, respectively.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123088700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Managing application domains in P2P systems
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583073
Deise de Brum Saccol, Nina Edelweiss, R. Galante, Marcio Roberto de Mello
Peer-to-peer (P2P) systems provide shared access to resources that are spread over the network. In such a scenario, files from the same domain can be found on different peers. When a user poses a query, processing relies mainly on the flooding technique, which is quite inefficient from an optimization standpoint. To address this issue, our work proposes clustering documents from the same domain into super peers. Files related to the same universe of discourse are thus grouped, and query processing is restricted to a subset of the network. The clustering task involves ontology generation, document and ontology matching, and metadata management. This paper details the ontology generation task. The proposed mechanism implements the ontology manager in DetVX, a framework for detecting, managing and querying replicas and versions in a P2P context.
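The sketch below illustrates the routing consequence of this clustering: once documents are grouped by domain under super peers, a query is forwarded only to the super peer responsible for the matching domain instead of being flooded to the whole network. The keyword matching stands in for the paper's ontology-based matching, and all names are illustrative.

    # Sketch: publish documents under the super peer of their domain, then answer
    # queries by consulting only that super peer rather than flooding the network.
    # Domain assignment here is given directly; the paper derives it via ontologies.
    from collections import defaultdict

    super_peers = defaultdict(list)        # domain -> list of (peer, document)

    def publish(peer, document, domain):
        super_peers[domain].append((peer, document))

    def query(terms, domain):
        return [(peer, doc) for peer, doc in super_peers[domain]
                if all(t.lower() in doc.lower() for t in terms)]

    publish("peer-1", "Seismic survey of reservoir X", "geology")
    publish("peer-2", "Annual financial report 2008", "finance")
    print(query(["reservoir"], "geology"))   # only the geology super peer is consulted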
{"title":"Managing application domains in P2P systems","authors":"Deise de Brum Saccol, Nina Edelweiss, R. Galante, Marcio Roberto de Mello","doi":"10.1109/IRI.2008.4583073","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583073","url":null,"abstract":"Peer-to-peer (P2P) systems provide shared access to resources that are spread over the network. In such scenario, files from the same domain can be found in different peers. When the user poses a query, the processing relies mainly on the flooding technique, which is quite inefficient for optimization purposes. To solve this issue, our work proposes to cluster documents from the same domain into super peers. Thus, files related to the same universe of discourse are grouped and the query processing is restricted to a subset of the network. The clustering task involves: ontology generation, document and ontology matching, and metadata management. This paper details the ontology generation task. The proposed mechanism implements the ontology manager in DetVX, a framework for detecting, managing and querying replicas and versions in a P2P context.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116691198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AMSQM: Adaptive multiple super-page queue management
Pub Date: 2008-07-13, DOI: 10.1504/ijids.2009.027658
Moshe Itshak, Y. Wiseman
Super-pages have been around for more than a decade. Some operating systems support super-paging, and recent research papers present interesting ideas on how to integrate super-pages intelligently; however, today's operating system page replacement mechanisms still use the old Clock algorithm, which gives the same priority to small and large pages. In this paper we present a technique that extends the page replacement mechanism into an algorithm that is based on more parameters and is suitable for a super-paging environment.
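A minimal sketch of the kind of policy argued for above: a second-chance scan that, unlike the classic Clock algorithm, also weighs page size when choosing a victim. The scoring rule below is an assumption for illustration and is not the AMSQM algorithm itself.

    # Sketch: second-chance victim selection that prefers reclaiming large pages.
    # The size-based tie-breaking is an illustrative assumption, not AMSQM.
    from dataclasses import dataclass

    @dataclass
    class Frame:
        page_id: int
        size_kb: int        # 4 KB base pages vs. multi-megabyte super-pages
        referenced: bool

    def pick_victim(frames):
        candidates = []
        for f in frames:
            if f.referenced:
                f.referenced = False       # second chance: clear the reference bit
            else:
                candidates.append(f)       # already unreferenced: eviction candidate
        if not candidates:                 # everything was recently used
            candidates = frames
        # Unlike plain Clock, prefer the candidate that frees the most memory.
        return max(candidates, key=lambda f: f.size_kb)

    frames = [Frame(1, 4, False), Frame(2, 2048, False), Frame(3, 4, True)]
    print(pick_victim(frames).page_id)     # -> 2: the 2 MB super-page is reclaimed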
{"title":"AMSQM: Adaptive multiple super-page queue management","authors":"Moshe Itshak, Y. Wiseman","doi":"10.1504/ijids.2009.027658","DOIUrl":"https://doi.org/10.1504/ijids.2009.027658","url":null,"abstract":"Super-Pages have been wandering around for more than a decade. There are some particular operating systems that support Super-Paging and there are some recent research papers that show interesting ideas how to intelligently integrate them; however, nowadays Operating System’s page replacement mechanism still uses the old Clock algorithm which gives the same priority to small and large pages. In this paper we show a technique that enhances the page replacement mechanism to an algorithm based on more parameters and is suitable for a Super-Paging environment.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115591098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data component based management of reservoir simulation models
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583062
Cong Zhang, A. Bakshi, V. Prasanna
The management of reservoir simulation models has been an important need of engineers in the petroleum industry. However, because data is shared among reservoir simulation models, data replication is common and poses many challenges to model management, including management efficiency and data consistency. In this paper, we propose a data-component-based methodology for managing reservoir simulation models. It not only improves management efficiency by removing data replicas, but also facilitates information reuse among multiple models. We first identify the underlying structure of the simulation models and decompose them into three types of components: reservoir realization, design, and simulator configuration. Our methodology then identifies the duplicate components and guarantees that each component has exactly one physical copy in the data repository. By separating the logical connections between the models and the components from the physical data files, our methodology provides a clean and efficient way to manage data sharing relationships among the models.
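The sketch below illustrates the separation of logical references from physical copies: components are stored once in a content-addressed repository and models refer to them by key, so a realization or configuration shared by several models is never duplicated. The class and method names are illustrative assumptions, not the paper's implementation.

    # Sketch: content-addressed storage of model components; duplicate components
    # collapse to a single physical copy. Names are illustrative assumptions.
    import hashlib

    class ComponentRepository:
        def __init__(self):
            self._store = {}                       # content hash -> component payload

        def add(self, payload: bytes) -> str:
            key = hashlib.sha256(payload).hexdigest()
            self._store.setdefault(key, payload)   # duplicates map to the same key
            return key

    class SimulationModel:
        def __init__(self, repo, realization, design, simulator_config):
            # The model keeps logical references (hashes), not physical file copies.
            self.components = {
                "realization": repo.add(realization),
                "design": repo.add(design),
                "simulator_config": repo.add(simulator_config),
            }

    repo = ComponentRepository()
    m1 = SimulationModel(repo, b"realization-A", b"design-1", b"config-x")
    m2 = SimulationModel(repo, b"realization-A", b"design-2", b"config-x")
    print(len(repo._store))   # -> 4 physical copies backing 6 logical references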
{"title":"Data component based management of reservoir simulation models","authors":"Cong Zhang, A. Bakshi, V. Prasanna","doi":"10.1109/IRI.2008.4583062","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583062","url":null,"abstract":"The management of reservoir simulation models has been an important need of engineers in petroleum industry. However, due to data sharing among reservoir simulation models, data replication is common and poses many challenges to model management, including management efficiency and data consistency. In this paper, we propose a data component based methodology to manage reservoir simulation models. It not only improves management efficiency by removing data replicas, but also facilitates information reuse among multiple models. We first identify the underlying structure of the simulation models and decompose them into three types of components: reservoir realization, design, and simulator configuration. Our methodology then identifies the duplicate components and guarantees that each component has one physical copy in the data repository. By separating the logical connections between the models and the components from the physical data files, our methodology provides a clean and efficient way to manage data sharing relationships among the models.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114620682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis methodology for project design utilizing UML
Pub Date: 2008-07-13, DOI: 10.1109/IRI.2008.4583044
Citlalih Gutierrez Estrada, S. D. Zagal, M. N. Perez, Itzel Abundez Barrera, Rocio Elizabeth Pulido Alba, Mauro Sanchez Sanchez, René Arnulfo García-Hernández
This work focuses on the implementation of a methodology for analysis in the early stages of system development. It starts from system specifications expressed in natural language and ends with models of the specified system expressed as different diagrams of the Unified Modeling Language (UML), in order to facilitate information reuse and cooperation. The ultimate goal is to guarantee that a given set of needs is fulfilled, ensuring that the system meets its design requirements and functions correctly and that the system conception process succeeds from the first stages of development.
{"title":"Analysis methodology for project design utilizing UML","authors":"Citlalih Gutierrez Estrada, S. D. Zagal, M. N. Perez, Itzel Abundez Barrera, Rocio Elizabeth Pulido Alba, Mauro Sanchez Sanchez, René Arnulfo García-Hernández","doi":"10.1109/IRI.2008.4583044","DOIUrl":"https://doi.org/10.1109/IRI.2008.4583044","url":null,"abstract":"This work focuses on the implementation of a methodology as a strategy for system development analysis in the early stages, starting from system specifications in natural language and ending with modeling of the specified system using different diagrams, with the help of the Unified Modeling Language (UML) to facilitate information reusability and cooperation. The ultimate goal is to guarantee the fulfillment of a given set of needs, ensuring that the system fulfills its design requirements and functionality correctly, making sure that the system conception process is successful from the first stages of development.","PeriodicalId":169554,"journal":{"name":"2008 IEEE International Conference on Information Reuse and Integration","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115110781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}