
Latest papers: 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology

Modeling and Validation of Biased Human Trust
M. Hoogendoorn, S. W. Jaffry, P. V. Maanen, Jan Treur
When considering intelligent agents that interact with humans, having an idea of the trust levels of the human, for example in other agents or services, can be of great importance. Most existing models of human trust are based on some rationality assumption and do not represent biased behavior, whereas a vast literature in the Cognitive and Social Sciences indicates that humans often exhibit non-rational, biased behavior with respect to trust. This paper reports how some variations of biased human trust models have been designed, analyzed and validated against empirical data. The results show that such biased trust models are able to predict human trust significantly better.
DOI: https://doi.org/10.1109/WI-IAT.2011.198; published 2011-08-22
Citations: 13
Strategic Behavior in Interaction Selection and Contact Selection
Björn-Oliver Hartmann, Klemens Böhm, Christian Hütter
Social search platforms like Aardvark or Yahoo Answers have attracted a lot of attention lately. In principle, participants have two strategic dimensions in social search systems: (1) interaction selection, i.e., forwarding/processing incoming requests (or not), and (2) contact selection, i.e., adding or dropping contacts. In systems with these strategic dimensions, it is unclear whether nodes cooperate and whether they form efficient network structures. To shed light on this fundamental question, we have conducted a study to investigate human behavior in interaction selection and the ability of humans to form efficient networks. To limit how much of the problem the study participants needed to understand, we introduced it as an online game. 193 subjects joined the study, which was online for 67 days. One result is that subjects choose contacts strategically and use strategies that lead to cooperative and almost efficient systems. Surprisingly, subjects tend to overestimate the value of cooperative contacts and keep cooperative but costly contacts. This observation is important: assisting agents that help subjects to avoid this behavior might yield more efficiency.
DOI: https://doi.org/10.1109/WI-IAT.2011.23; published 2011-08-22
Citations: 0
Towards Collaborative Metadata Enrichment for Adaptive Web-Based Learning
Róbert Móro, Ivan Srba, Maros Uncík, M. Bieliková, Marián Simko
In recent years we have witnessed the expansion of Web 2.0. Its main feature is allowing users to collaborate in content creation through various means, e.g. annotations, discussions, wikis, blogs or tags. This approach has also influenced web-based learning, for which the term "Learning 2.0" has been introduced. In this paper we explore the use of tags in such systems. Tags can be used to improve search, categorize web documents, create folksonomies and ontologies, or enhance the user model. Another aspect of tags is that they act as a bridge between resources and users to create a social network. We integrated tags into the learning framework ALEF and experimentally evaluated their usage in the education process.
DOI: https://doi.org/10.1109/WI-IAT.2011.220; published 2011-08-22
Citations: 7
Leveraging Network Properties for Trust Evaluation in Multi-agent Systems
Xi Wang, M. Maghami, G. Sukthankar
In this paper, we present a collective classification approach for identifying untrustworthy individuals in multi-agent communities from a combination of observable features and network connections. Under the assumption that data are organized as independent and identically distributed (i.i.d.) samples, traditional classification is typically performed on each object independently, without considering the underlying network connecting the instances. In collective classification, a set of relational features, based on the connections between instances, is used to augment the feature vector used in classification. This approach can perform particularly well when the underlying data exhibit homophily, a propensity for similar items to be connected. We suggest that in many cases human communities exhibit homophily in trust levels since shared attitudes toward trust can facilitate the formation and maintenance of bonds, in the same way that other types of shared beliefs and value systems do. Hence, knowledge of an agent's connections provides a valuable cue that can assist in the identification of untrustworthy individuals who are misrepresenting themselves by modifying their observable information. This paper presents results that demonstrate that our proposed trust evaluation method is robust in cases where a large percentage of the individuals present misleading information.
DOI: https://doi.org/10.1109/WI-IAT.2011.217; published 2011-08-22
Citations: 16
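The feature augmentation step mentioned in the abstract of "Leveraging Network Properties for Trust Evaluation in Multi-agent Systems" can be illustrated with a minimal sketch. The abstract does not say which relational features the authors use, so the single feature below (the fraction of a node's labeled neighbors currently marked untrustworthy) is an assumption, as are the function name `augment_with_neighbor_labels` and the dictionary-based data layout.

```python
from collections import defaultdict


def augment_with_neighbor_labels(features, edges, labels):
    """Append one relational feature to each node's observable feature vector.

    features: dict mapping node -> list of observable feature values
    edges:    iterable of undirected (node, node) pairs
    labels:   dict mapping node -> 1 (untrustworthy) or 0 (trustworthy),
              covering only the nodes whose label is currently known

    The relational feature is a hypothetical choice for illustration:
    the fraction of labeled neighbors that are untrustworthy (0.0 if
    no neighbor has a known label).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    augmented = {}
    for node, vector in features.items():
        known = [labels[m] for m in neighbors[node] if m in labels]
        fraction = sum(known) / len(known) if known else 0.0
        augmented[node] = list(vector) + [fraction]
    return augmented
```

Under homophily, a node surrounded by untrustworthy neighbors gets a high value for the added feature even if its own observable features have been manipulated, which is the cue the abstract describes. Any ordinary classifier can then be trained on the augmented vectors.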
A New Language Model Combining Single and Compound Terms
Arezki Hammache, R. Ahmed-Ouamer, M. Boughanem
Most traditional information retrieval systems are based on single-term indexing. However, it is widely acknowledged that the semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although several works have incorporated phrases or other syntactic information in IR, such attempts have shown slight benefit at best. In language modeling approaches in particular, this is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach that considers and weights only certain types of n-grams, "compound terms". Experimental results on three test collections showed an improvement.
DOI: https://doi.org/10.1109/WI-IAT.2011.52; published 2011-08-22
Citations: 1
FAST: Friends Augmented Search Techniques - System Design & Data-Management Issues
C. Weth, Anwitaman Datta
Improving web search solely based on algorithmic refinements has reached a plateau. The emerging generation of searching techniques tries to harness the "wisdom of crowds", using inputs from users in the spirit of Web 2.0. In this paper, we introduce a framework facilitating friends augmented search techniques (FAST). To that end, we present a browser add-on as front end for collaborative browsing and searching, supporting synchronous and asynchronous collaboration between users. We then describe the back end, a distributed key-value store for efficient information retrieval in the presence of an evolving knowledge base. The mechanisms we explore in supporting efficient query processing for FAST are applicable to many other recent Web 2.0 applications that rely on similar key-value stores. The specific collaborative search tool we present is expected to be a useful utility in its own right and to spur further research on friends augmented search techniques, while the data-management techniques we developed are of general interest and applicability.
DOI: https://doi.org/10.1109/WI-IAT.2011.239; published 2011-08-22
Citations: 2
Recognizing Textual Entailment by Generality Using Informative Asymmetric Measures and Multiword Unit Identification to Summarize Ephemeral Clusters
G. Dias, Sebastião Pais, K. Wegrzyn-Wolska, R. Mahl
In the context of Ephemeral Clustering of Web pages, it can be interesting to label each cluster with a small summary instead of just a label. Within this scope, we introduce the paradigm of Textual Entailment by Generality, which can be defined as the entailment from a specific web snippet towards a more general web snippet. The underlying idea is to find the best web snippet, which summarizes and subsumes all the other web snippets within an ephemeral cluster. To reach this objective, we first propose a new informative asymmetric similarity measure called the Simplified Asymmetric InfoSimba (AISs), which can be combined with different asymmetric association measures. In particular, the AISs offers an unsupervised, language-independent solution to infer Textual Entailment by Generality and as such can help to find the web snippet with maximum semantic coverage. This new methodology is tested against the first Recognizing Textual Entailment data set (RTE-1) for an exhaustive number of asymmetric association measures, with and without the identification of Multiword Units. The comparative experiments with existing state-of-the-art methodologies show promising results.
DOI: https://doi.org/10.1109/WI-IAT.2011.122; published 2011-08-22
Citations: 4
Classification Based on Specific Vocabulary
J. Savoy, Olena Zubaryeva
Assuming a binomial distribution for word occurrence, we propose computing a standardized Z score to define the specific vocabulary of a subset compared to that of the entire corpus. This approach is applied to weight terms characterizing a document (or a sample of texts). We then show how these Z score values can be used to derive an efficient categorization scheme. To evaluate this proposition we categorize speeches given by B. Obama as either electoral or presidential. The results tend to show that the suggested classification scheme performs better than a Support Vector Machine scheme, and a Naive Bayes classifier (10-fold cross validation).
DOI: https://doi.org/10.1109/WI-IAT.2011.19; published 2011-08-22
Citations: 7
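The standardized Z score described in the abstract of "Classification Based on Specific Vocabulary" can be sketched as follows. The abstract gives no formula, so this assumes the usual standardization of a binomial count: for a term seen tf times in a subset of n tokens, with occurrence probability p estimated from the entire corpus, Z = (tf - n*p) / sqrt(n*p*(1-p)). The function name `z_scores` and the token-list inputs are hypothetical, and the classification step built on top of the scores is not shown.

```python
import math
from collections import Counter


def z_scores(subset_tokens, corpus_tokens):
    """Return a Z score for every term that occurs in the subset.

    Assumes a binomial model of word occurrence: p is the term's
    relative frequency in the whole corpus (which should include the
    subset), n is the subset length, and the observed subset count is
    standardized by the binomial mean n*p and variance n*p*(1-p).
    """
    subset_counts = Counter(subset_tokens)
    corpus_counts = Counter(corpus_tokens)
    n = len(subset_tokens)
    total = len(corpus_tokens)

    scores = {}
    for term, tf in subset_counts.items():
        p = corpus_counts[term] / total
        variance = n * p * (1 - p)
        if variance == 0:
            # Term absent from the corpus or making up all of it: Z is undefined.
            continue
        scores[term] = (tf - n * p) / math.sqrt(variance)
    return scores
```

A large positive score marks a term that is over-used in the subset relative to the corpus, a large negative score an under-used one, and a score near zero a term used at roughly the corpus rate.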
Interest and Evaluation of Aggregated Search
A. Kopliku, Firas Damak, K. Pinel-Sauvagnat, M. Boughanem
Major search engines perform what is known as Aggregated Search (AS). They integrate results coming from different vertical search engines (images, videos, news, etc.) with typical Web search results. Aggregated search is relatively new and its advantages need to be evaluated. Some existing works have already tried to evaluate the interest (usefulness) of aggregated search as well as the effectiveness of the existing approaches. However, most evaluation methodologies were based on (i) what we call relevance by intent (i.e. search results were not shown to real users), and (ii) short text queries. In this paper, we conducted a user study designed to revisit and compare the interest of aggregated search by exploiting both relevance by intent and relevance by content, and by using both short text and fixed need queries. This user study allowed us to analyze the distribution of relevant results across different verticals, and to show that AS helps to identify complementary relevant sources for the same information need. Comparison between relevance by intent and relevance by content showed that relevance by intent introduces a bias in evaluation. Discussion of the results also allowed us to identify some useful thoughts concerning the evaluation of AS approaches.
DOI: https://doi.org/10.1109/WI-IAT.2011.99; published 2011-08-22
Citations: 14
Ontology-Based Feature Extraction
C. Vicient, D. Sánchez, Antonio Moreno
Knowledge-based data mining and classification algorithms require systems that are able to extract textual attributes contained in raw text documents and map them to structured knowledge sources (e.g. ontologies) so that they can be semantically analyzed. The system presented in this paper performs these tasks automatically, relying on a predefined ontology which states the concepts on which the subsequent data analysis will focus. As features, our system focuses on extracting relevant Named Entities from textual resources describing a particular entity. Those are evaluated by means of linguistic and Web-based co-occurrence analyses to map them to ontological concepts, thereby discovering relevant features of the object. The system has been preliminarily tested with tourist destinations and Wikipedia textual resources, showing promising results.
DOI: https://doi.org/10.1109/WI-IAT.2011.199; published 2011-08-22
Citations: 10