The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05): Latest Papers

A novel way of computing similarities between nodes of a graph, with application to collaborative recommendation
François Fouss, A. Pirotte, M. Saerens
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted, undirected graph. It is based on a Markov-chain model of random walk through the database. The suggested quantities, representing dissimilarities (or similarities) between any two elements, have the nice property of decreasing (increasing) when the number of paths connecting those elements increases and when the "length" of any path decreases. The model is evaluated on a collaborative recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. The model, which fits nicely into the so-called "statistical relational learning" framework as well as the "link analysis" paradigm, could also be used to compute document or word similarities and, more generally, could be applied to other database or Web mining tasks.
DOI: https://doi.org/10.1109/WI.2005.9 (published 2005-09-19)
Citations: 90
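
The property stated in the abstract above (a dissimilarity that shrinks as more paths connect two nodes) can be illustrated with the average commute time of a random walk. This is a minimal sketch, assuming commute time as the dissimilarity; the abstract itself does not name the exact quantities, and the names `hitting_times` and `commute_time` as well as the toy graphs are hypothetical.

```python
from fractions import Fraction


def hitting_times(w, target):
    """Expected number of steps for a random walk to first reach `target`.

    `w` is a symmetric matrix (list of lists) of non-negative integer edge
    weights for a connected graph. The walk moves from i to j with
    probability w[i][j] / sum(w[i]). Returns one exact Fraction per node.
    """
    n = len(w)
    others = [i for i in range(n) if i != target]
    idx = {node: r for r, node in enumerate(others)}

    # Augmented linear system (I - P) m = 1, restricted to non-target nodes.
    rows = []
    for i in others:
        degree = sum(w[i])
        row = [Fraction(0)] * (len(others) + 1)
        row[idx[i]] += 1
        for j in others:
            row[idx[j]] -= Fraction(w[i][j], degree)
        row[-1] = Fraction(1)
        rows.append(row)

    # Gauss-Jordan elimination with exact rational arithmetic.
    m = len(rows)
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [x / pv for x in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]

    result = [Fraction(0)] * n
    for i in others:
        result[i] = rows[idx[i]][-1]
    return result


def commute_time(w, a, b):
    """Expected steps to go from a to b and back to a (symmetric in a, b)."""
    return hitting_times(w, b)[a] + hitting_times(w, a)[b]


# Toy graphs (illustrative only): a 3-node path, then the same path with an
# extra direct edge between the two end nodes.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(commute_time(path, 0, 2), commute_time(triangle, 0, 2))
```

On the 3-node path the commute time between the end nodes is 8 steps; adding a direct edge between them gives a second, shorter path and the commute time drops to 4.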
Measuring the relative performance of schema matchers
S. Berkovsky, Yaniv Eytani, A. Gal
Schema matching is a complex process focusing on matching between concepts describing the data in heterogeneous data sources. There is a shift from manual schema matching, done by human experts, to automatic matching, using various heuristics (schema matchers). In this work, we consider the problem of linearly combining the results of a set of schema matchers. We propose the use of machine learning algorithms to learn the optimal weight assignments, given a set of schema matchers. We also suggest the use of genetic algorithms to improve the process efficiency.
DOI: https://doi.org/10.1109/WI.2005.94 (published 2005-09-19)
Citations: 7
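
The idea of linearly combining matcher results with learned weights can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which learning algorithm is used, so a plain exhaustive grid search stands in for it here (it is not the genetic algorithm the authors mention), and `combine`, `learn_weights`, the 0.5 decision threshold, and the labelled examples are all hypothetical.

```python
from itertools import product


def combine(scores, weights):
    """Weighted linear combination of the scores from several matchers."""
    return sum(w * s for w, s in zip(weights, scores))


def learn_weights(examples, n_matchers, steps=10, threshold=0.5):
    """Grid-search weights (summing to 1) that classify the most examples.

    `examples` is a list of (scores, is_match) pairs, where `scores` holds one
    similarity score per matcher for a candidate pair of schema elements.
    Returns (best_weights, number_of_correctly_classified_examples).
    """
    best_w, best_correct = None, -1
    for combo in product(range(steps + 1), repeat=n_matchers):
        if sum(combo) != steps:
            continue  # keep only weight vectors on the simplex
        w = [c / steps for c in combo]
        correct = sum(
            (combine(scores, w) >= threshold) == label
            for scores, label in examples
        )
        if correct > best_correct:
            best_w, best_correct = w, correct
    return best_w, best_correct


# Hypothetical labelled data: matcher 0 is informative, matcher 1 is noisy.
examples = [
    ([0.9, 0.1], True),
    ([0.8, 0.9], True),
    ([0.2, 0.9], False),
    ([0.1, 0.2], False),
]
weights, correct = learn_weights(examples, n_matchers=2)
print(weights, correct)
```

The search assigns the larger weight to the informative matcher. An exhaustive grid grows exponentially with the number of matchers, which is one reason a heuristic search would be attractive for larger matcher sets.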
STAMP: adaptable templates for synchronized multimedia presentations
Ioan Marius Bilasco, J. Gensel, M. Villanova-Oliver
This paper addresses the adaptation of dynamic and synchronized multimedia presentations built by querying XML-compatible data sources. We provide WIS designers with facilities for describing presentations whose content is not known at design time in terms of quantity, but only after the execution of queries. Our approach relies on the definition of a template. A template consists of a model that aims at automatically adapting the multimedia content of a presentation to both the user's profile and the characteristics of her/his access device. We show here how a template is built and how adaptations of the presentation are performed when the quantity of information and/or the material capabilities of the access devices (e.g. display size) do not match the template's spatiotemporal specifications.
DOI: https://doi.org/10.1109/WI.2005.137 (published 2005-09-19)
Citations: 0
DSM-TKP: mining top-k path traversal patterns over Web click-streams
Hua-Fu Li, Suh-Yin Lee, M. Shan
Online, single-pass mining of Web click-streams poses some interesting computational issues, such as the unbounded length of streaming data, a possibly very fast arrival rate, and just one scan over previously arrived click-sequences. In this paper, we propose a new, single-pass algorithm, called DSM-TKP (data stream mining for top-k path traversal patterns), for mining top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure called TKP-forest (top-k path forest) is used to maintain the essential information about the top-k path traversal patterns of the click-stream so far. Experimental studies show that the DSM-TKP algorithm has stable memory usage and makes only one pass over the streaming data.
DOI: https://doi.org/10.1109/WI.2005.56 (published 2005-09-19)
Citations: 16
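
To make the task concrete, the sketch below counts contiguous path traversal patterns in a single pass over click sessions and reports the k most frequent. It is a naive exact counter, not the DSM-TKP algorithm or its TKP-forest structure (whose details the abstract does not give), and its memory use is not bounded. The function name `top_k_paths`, the `max_len` cap, and the sample sessions are hypothetical.

```python
import heapq
from collections import Counter


def top_k_paths(sessions, k, max_len=3):
    """Return the k most frequent contiguous paths of length 2..max_len.

    `sessions` is an iterable of page-visit sequences; each session is read
    exactly once. A path is counted at most once per session.
    """
    counts = Counter()
    for session in sessions:  # single pass over the stream
        seen = set()
        for start in range(len(session)):
            last = min(start + max_len, len(session))
            for end in range(start + 2, last + 1):
                seen.add(tuple(session[start:end]))
        counts.update(seen)
    # Sort by count, breaking ties by the path itself for determinism.
    return heapq.nlargest(k, counts.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical click sessions.
sessions = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["b", "c"],
    ["a", "b", "c"],
]
for path, support in top_k_paths(sessions, k=3):
    print(" -> ".join(path), support)
```

Here the paths a -> b and b -> c each occur in three sessions and a -> b -> c in two, so those are the top three patterns. A streaming algorithm would replace the unbounded `Counter` with a compact summary structure.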
User navigational behavior in e-learning virtual environments
J. M. Carbó, Enric Mor, J. Minguillón
In this paper, we describe the navigational behavior of the students of an e-learning virtual environment, in order to determine whether such navigational patterns are related to the academic performance achieved by the students, and which behaviors can be identified as more successful. As an example, a subset of students taking a degree in computer science in a completely virtual online university is selected as the subject of study. Three levels of analysis are described: a session level, where students perform a few actions in a single session logged to the virtual campus; a course level, where all single sessions are joined to form a course navigational pattern; and a lifelong learning level, where students enroll in several subjects each academic semester. A simple experiment is outlined for the course level to demonstrate the possibilities of such analysis in a virtual e-learning environment. This experiment shows that the information collected at this level is useful for understanding user behavior and its relationship with his or her academic achievements, and that some intuitive ideas about the relevance of specific user actions or particularities can also be better explained.
DOI: https://doi.org/10.1109/WI.2005.155 (published 2005-09-19)
Citations: 33
Semantic based collaborative P2P in ubiquitous computing
M. Ruta, T. D. Noia, E. Sciascio, F. Donini
We propose a collaborative environment for semantic-enabled mobile devices (e.g. PDAs, cell phones, laptops) in peer-to-peer scenarios. Within the environment, resource discovery is performed by exploiting technologies and techniques for knowledge representation developed for the semantic Web, which have been adapted to cope with the highly flexible structure of ad-hoc networks in ubiquitous computing. The approach exploits the standard Bluetooth stack, using the original UUID payload, to carry semantically annotated data. The environment is motivated and presented in a museum case study.
DOI: https://doi.org/10.1109/WI.2005.130 (published 2005-09-19)
Citations: 29
Proposal and verification of flexible interface mapping technique for automatic system cooperation based on semantics
M. Nakatsuji, Y. Miyoshi, Tatsuyuki Kimura
These days, many companies pursue their business aims through decentralized cooperation of software components that work on various systems over a network. However, messages and processes between systems are designed individually in each operations division. As a result, developing systems to adjust interfaces is expensive, so companies cannot introduce their services in a dynamic business environment. To resolve such problems, we propose an interface modeling technique and a message mapping technique, which model the relationship between message formats and the semantics of those formats using the Web Ontology Language (OWL) and perform the mapping between message formats using those semantics. We developed an interactive message mapping tool and evaluated our proposed methods on the interface specifications of real network management systems.
DOI: https://doi.org/10.1109/WI.2005.120 (published 2005-09-19)
Citations: 4
An interactive hybrid system for identifying and filtering unsolicited email
M. D. D. Castillo, J. I. Serrano
This paper presents a system for automatically detecting and filtering unsolicited electronic messages. The underlying filtering method is based on email origin and content. A heuristic knowledge base formed by spam words is extracted from labelled emails by a finite-state automaton. Processing three parts of every email with a single Bayesian filter and integrating the classification of each part makes it possible to achieve a maximum performance goal. The system is dynamic and interactive, and it adapts to the evolution of spam through incremental machine learning.
DOI: https://doi.org/10.1109/WI.2005.31 (published 2005-09-19)
Citations: 5
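
The combination of a single Bayesian filter applied to several parts of an email can be sketched as below. This is a hedged illustration, not the authors' system: the abstract does not say which three parts are used or how their classifications are integrated, so the part names (`sender`, `subject`, `body`), the summing of per-part log-odds, and the class and function names are all assumptions. The incremental `train` method only gestures at the incremental learning mentioned above.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny multinomial naive Bayes filter with Laplace smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, words, is_spam):
        """Incrementally add one labelled token list to the model."""
        self.word_counts[is_spam].update(words)
        self.doc_counts[is_spam] += 1

    def spam_log_odds(self, words):
        """log P(spam | words) - log P(ham | words); positive means spam."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        log_odds = math.log(
            (self.doc_counts[True] + 1) / (self.doc_counts[False] + 1)
        )
        for label, sign in ((True, 1), (False, -1)):
            total = sum(self.word_counts[label].values()) + len(vocab)
            for word in words:
                if word in vocab:  # words never seen in training are skipped
                    smoothed = (self.word_counts[label][word] + 1) / total
                    log_odds += sign * math.log(smoothed)
        return log_odds


def classify(spam_filter, email):
    """Score each part with the same filter, then sum the per-part log-odds."""
    part_scores = {
        part: spam_filter.spam_log_odds(tokens) for part, tokens in email.items()
    }
    return sum(part_scores.values()) > 0, part_scores


# Hypothetical training data and messages.
spam_filter = NaiveBayesFilter()
spam_filter.train(["free", "cash", "winner"], is_spam=True)
spam_filter.train(["free", "offer"], is_spam=True)
spam_filter.train(["meeting", "report"], is_spam=False)
spam_filter.train(["project", "report", "meeting"], is_spam=False)

suspect = {"sender": ["promo"], "subject": ["free", "cash"], "body": ["winner", "offer"]}
normal = {"sender": ["colleague"], "subject": ["meeting"], "body": ["project", "report"]}
print(classify(spam_filter, suspect)[0], classify(spam_filter, normal)[0])
```

Summing log-odds treats the parts as independent evidence; a weighted sum or a vote across parts would be equally plausible integration schemes.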
Progressive analysis scheme for Web document classification
Li-Chun Sung, Chin-Hwa Kuo, M. Chen, Yeali S. Sun
In this paper, a Web document classification scheme, the progressive analysis scheme (PAS), is proposed to classify HTML Web documents efficiently and effectively. When an author writes a Web document, HTML tags are used to visually emphasize the text related to the main concepts. PAS is designed to capture this authoring convention in terms of the contributions of nested HTML tags to document classification. During the learning phase, PAS provides an enhanced tag sequence model to resolve the problem of insufficient samples when learning the classification contributions of HTML tag sequences. In the classification phase, PAS decomposes a Web document into regions based on the DOM tag-tree and analyzes the regions in descending order of their classification contributions. PAS also provides a mechanism called emphasis degree adjustment to defer the processing of noisy regions during classification. The simulation results show that PAS performs better than full-text (e.g. SVM) and sequential classifiers.
DOI: https://doi.org/10.1109/WI.2005.119 (published 2005-09-19)
Citations: 3
Multi-source knowledge bases and ontologies with multiple individual and social viewpoints
Matthias Nickles, Ruth Cobos Pérez, Gerhard Weiss, Tina Froehner
In open environments like the Web, and open multiagent and Peer2Peer systems, consent among the autonomous, self-interested knowledge sources and users very often cannot be established, and the estimation of trustability and truthfulness of knowledge sources may not be possible. Moreover, competing viewpoints and their communicative contexts even provide valuable meta-knowledge about the intentions of the participants and their social relationships. As a foundational approach to semantically heterogeneous knowledge perspectives, we introduce a formal framework for the computational representation and integration of multi-source knowledge, which makes explicit heterogeneous viewpoints, and conflicting opinions and their social contexts, and allows for the rating, generalization and optional fusion of knowledge by social choice.
DOI: https://doi.org/10.1109/WI.2005.104 (published 2005-09-19)
Citations: 3