Cross-lingual projection encounters two major challenges: noise from word-alignment errors and syntactic divergences between the two languages. To address these two problems, a semi-supervised learning framework for cross-lingual projection is proposed to obtain better annotations from parallel data. A projection model is introduced to model the process of projecting labels from the resource-rich language to the resource-scarce language. The projection model, together with the traditional target model of cross-lingual projection, can be seen as two views of the parallel data. Using these two views, an extension of the co-training algorithm to structured prediction is designed to boost the results of both models. Experiments show that the proposed cross-lingual projection method improves accuracy on the task of POS-tagging projection. Moreover, using only one-to-one alignments leads to more accurate results than using all kinds of alignment information.
{"title":"Semi-supervised Learning Framework for Cross-Lingual Projection","authors":"PengLong Hu, Mo Yu, Jing Li, Conghui Zhu, T. Zhao","doi":"10.1109/WI-IAT.2011.58","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.58","url":null,"abstract":"Cross-lingual projection encounters two major challenges, the noise from word-alignment error and the syntactic divergences between two languages. To solve these two problems, a semi-supervised learning framework of cross-lingual projection is proposed to get better annotations using parallel data. Moreover, a projection model is introduced to model the projection process of labeling from the resource-rich language to the resource-scarce language. The projection model, together with the traditional target model of cross-lingual projection, can be seen as two views of parallel data. Utilizing these two views, an extension of co-training algorithm to structured predictions is designed to boost the result of the two models. Experiments show that the proposed cross-lingual projection method improves the accuracy in the task of POS-tagging projection. And using only one-to-one alignments proves to lead to more accurate results than using all kinds of alignment information.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125995203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving web search solely through algorithmic refinements has reached a plateau. The emerging generation of search techniques tries to harness the "wisdom of crowds", using input from users in the spirit of Web 2.0. In this paper, we introduce a framework facilitating friends augmented search techniques (FAST). To that end, we present a browser add-on as a front end for collaborative browsing and searching, supporting synchronous and asynchronous collaboration between users. We then describe the back end, a distributed key-value store for efficient information retrieval in the presence of an evolving knowledge base. The mechanisms we explore in supporting efficient query processing for FAST are applicable to many other recent Web 2.0 applications that rely on similar key-value stores. The specific collaborative search tool we present is expected to be a useful utility in its own right and to spur further research on friends augmented search techniques, while the data-management techniques we developed are of general interest and applicability.
{"title":"FAST: Friends Augmented Search Techniques - System Design & Data-Management Issues","authors":"C. Weth, Anwitaman Datta","doi":"10.1109/WI-IAT.2011.239","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.239","url":null,"abstract":"Improving web search solely based on algorithmic refinements has reached a plateau. The emerging generation of searching techniques tries to harness the ``wisdom of crowds'', using inputs from users in the spirit of Web 2.0. In this paper, we introduce a framework facilitating friends augmented search techniques (FAST). To that end, we present a browser add-on as front end for collaborative browsing and searching, supporting synchronous and asynchronous collaboration between users. We then describe the back end, a distributed key-value store for efficient information retrieval in the presence of an evolving knowledge base. The mechanisms we explore in supporting efficient query processing for FAST are applicable for many other recent Web 2.0 applications that rely on similar key-value stores. The specific collaborative search tool we present is expected to be an useful utility in its own right and spur further research on friends augmented search techniques, while the data-management techniques we developed are of general interest and applicability.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125505242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We are interested in dynamically setting up communication between agents that aim to minimize the communication load, based on formal ontologies and on the proper use of interaction mechanisms. To overcome the reasoning complexity, we create a simplified version of the ontology by removing OWL constructs such as inverse roles, nominals, and number restrictions.
{"title":"Simplifying Ontologies for Smoother Interaction in Heterogeneous Environments","authors":"I. A. Letia, O. Pop","doi":"10.1109/WI-IAT.2011.148","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.148","url":null,"abstract":"We are interested in dynamically setting up communication between agents which aim to minimize the communication load, based on formal ontologies and on the proper use of interaction mechanisms. To overcome the reasoning complexity we create a simplified ontology version by removing OWL constructs such as inverse roles, nominals, number restrictions.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114904211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Knowledge-based data mining and classification algorithms require systems that are able to extract textual attributes contained in raw text documents and map them to structured knowledge sources (e.g. ontologies) so that they can be semantically analyzed. The system presented in this paper performs these tasks automatically, relying on a predefined ontology that states the concepts on which the subsequent data analysis will focus. As features, our system extracts relevant Named Entities from textual resources describing a particular entity. These are evaluated by means of linguistic and Web-based co-occurrence analyses to map them to ontological concepts, thereby discovering relevant features of the object. The system has been preliminarily tested with tourist destinations and Wikipedia textual resources, showing promising results.
{"title":"Ontology-Based Feature Extraction","authors":"C. Vicient, D. Sánchez, Antonio Moreno","doi":"10.1109/WI-IAT.2011.199","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.199","url":null,"abstract":"Knowledge-based data mining and classification algorithms require of systems that are able to extract textual attributes contained in raw text documents, and map them to structured knowledge sources (e.g. ontologies) so that they can be semantically analyzed. The system presented in this paper performs this tasks in an automatic way, relying on a predefined ontology which states the concepts in this the posterior data analysis will be focused. As features, our system focuses on extracting relevant Named Entities from textual resources describing a particular entity. Those are evaluated by means of linguistic and Web-based co-occurrence analyses to map them to ontological concepts, thereby discovering relevant features of the object. The system has been preliminary tested with tourist destinations and Wikipedia textual resources, showing promising results.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133761226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present an agent architecture for an enhanced theory of intent prediction with affective evaluation of expectations. The architecture combines models from psychology and robotics to create an online situation appraisal mechanism for preferential evaluation of predicted future states, which determines action selection in the current state. The architecture is implemented in a vehicular agent situated in a motorway driving scenario, requiring more than goal-directed planning: the agents must model the behaviour of others, and predict and evaluate future states. Simulations are described, showing that global properties emerge in the system that are improved and more stable with the new agent architecture.
{"title":"An Affective Anticipatory Agent Architecture","authors":"D. Sanderson, J. Pitt","doi":"10.1109/WI-IAT.2011.178","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.178","url":null,"abstract":"We present an agent architecture for an enhanced theory of intent prediction with affective evaluation of expectations. The architecture combines models from psychology and robotics to create an online situation appraisal mechanism for preferential evaluation of predicted future states, which determines action selection in the current state. The architecture is implemented in a vehicular agent situated in a motorway driving scenario, requiring more than goal-directed planning: the agents must model the behaviour of others, and predict and evaluate future states. Simulations are described, showing that global properties emerge in the system that are improved and more stable with the new agent architecture.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116882745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Dias, Sebastião Pais, K. Wegrzyn-Wolska, R. Mahl
In the context of Ephemeral Clustering of Web Pages, it can be interesting to label each cluster with a small summary instead of just a label. Within this scope, we introduce the paradigm of Textual Entailment by Generality, which can be defined as the entailment from a specific web snippet towards a more general web snippet. The underlying idea is to find the best web snippet, one that summarizes and subsumes all the other web snippets within an ephemeral cluster. To reach this objective, we first propose a new informative asymmetric similarity measure called the Simplified Asymmetric InfoSimba (AISs), which can be combined with different asymmetric association measures. In particular, the AISs offers an unsupervised, language-independent solution for inferring Textual Entailment by Generality and as such can help to find the web snippet with maximum semantic coverage. This new methodology is tested against the first Recognizing Textual Entailment data set (RTE-1) for an exhaustive number of asymmetric association measures, with and without the identification of Multiword Units. The comparative experiments with existing state-of-the-art methodologies show promising results.
{"title":"Recognizing Textual Entailment by Generality Using Informative Asymmetric Measures and Multiword Unit Identification to Summarize Ephemeral Clusters","authors":"G. Dias, Sebastião Pais, K. Wegrzyn-Wolska, R. Mahl","doi":"10.1109/WI-IAT.2011.122","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.122","url":null,"abstract":"In the context of Ephemeral Clustering of web Pages, it can be interesting to label each cluster with a small summary instead of just a label. Within this scope, we introduce the paradigm of Textual Entailment by Generality, which can be defined as the entailment from a specific web snippet towards a more general web snippet. The subjacent idea is to find the best web snippet, which summarizes and subsumes all the other web snippets within an ephemeral cluster. To reach this objective, we first propose a new informative asymmetric similarity measure called the Simplified Asymmetric InfoSimba(AISs), which can be combined with different asymmetric association measures. In particular, the AISs proposes an unsupervised language-independent solution to infer Textual Entailment by Generality and as such can help to encounter the web snippet with maximum semantic coverage. This new methodology is tested against the first Recognizing Textual Entailment data set (RTE-1)1 for an exhaustive number of asymmetric association measures with and without the identification of Multiword Units. The comparative experiments with existing state-of-the-art methodologies show promising results.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132542547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we study user modeling on Twitter and investigate the interplay between personal interests and public trends. To generate semantically meaningful user profiles, we present a framework that allows us to enrich the semantics of individual Twitter messages and features user modeling as well as trend modeling strategies. These profiles can be re-used in other applications for (trend-aware) personalization. Given a large Twitter dataset, we analyze the characteristics of user and trend profiles and evaluate the quality of the profiles in the context of a personalized news recommendation system. We show that personal interests are more important for the recommendation process than public trends and that by combining both types of profiles we can further improve recommendation quality.
{"title":"Interweaving Trend and User Modeling for Personalized News Recommendation","authors":"Qinghong Gao, F. Abel, G. Houben, Ke Tao","doi":"10.1109/WI-IAT.2011.74","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.74","url":null,"abstract":"In this paper, we study user modeling on Twitter and investigate the interplay between personal interests and public trends. To generate semantically meaningful user profiles, we present a framework that allows us to enrich the semantics of individual Twitter messages and features user modeling as well as trend modeling strategies. These profiles can be re-used in other applications for (trend-aware) personalization. Given a large Twitter dataset, we analyze the characteristics of user and trend profiles and evaluate the quality of the profiles in the context of a personalized news recommendation system. We show that personal interests are more important for the recommendation process than public trends and that by combining both types of profiles we can further improve recommendation quality.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"64 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132738561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a meaning-based method to distinguish text with little or no semantic content from text that has meaning which can be processed. The basic method assumes that a semantic analyzer will be able to produce less output from semantically less grammatical input text. The method was pilot-tested on a corpus of blog spam. Future improvements, including a method to distinguish semantically unified from semantically disparate text, are sketched. The tested method, and even more so the projected improvements, open the way to taking the spam filtering arms race to a new level that is very costly to spam producers.
{"title":"Baseline Semantic Spam Filtering","authors":"Christian F. Hempelmann, Vikas Mehra","doi":"10.1109/WI-IAT.2011.133","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.133","url":null,"abstract":"This paper presents a meaning-based method to distinguish text without or with little semantic content from text that has meaning which can be processed. The basic method assumes that a semantic analyzer will be able to produce less output from semantically less grammatical input text. The method was pilot-tested on a corpus of blog spam. Future improvements, including a method to distinguish semantically unified from semantically disparate text are sketched. The tested method, but even more the projected improvements, open up the way to taking the spam filtering arms race to a new level that is very costly to spam producers.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"818 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132794601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Kopliku, Firas Damak, K. Pinel-Sauvagnat, M. Boughanem
Major search engines perform what is known as Aggregated Search (AS). They integrate results coming from different vertical search engines (images, videos, news, etc.) with typical Web search results. Aggregated search is relatively new and its advantages need to be evaluated. Some existing works have already tried to evaluate the interest (usefulness) of aggregated search as well as the effectiveness of the existing approaches. However, most evaluation methodologies were based on (i) what we call relevance by intent (i.e. search results were not shown to real users) and (ii) short text queries. In this paper, we conducted a user study designed to revisit and compare the interest of aggregated search, by exploiting both relevance by intent and relevance by content, and using both short text and fixed-need queries. This user study allowed us to analyze the distribution of relevant results across different verticals, and to show that AS helps to identify complementary relevant sources for the same information need. Comparison between relevance by intent and relevance by content showed that relevance by intent introduces a bias in evaluation. Discussion of the results also allowed us to identify some useful considerations concerning the evaluation of AS approaches.
{"title":"Interest and Evaluation of Aggregated Search","authors":"A. Kopliku, Firas Damak, K. Pinel-Sauvagnat, M. Boughanem","doi":"10.1109/WI-IAT.2011.99","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.99","url":null,"abstract":"Major search engines perform what is known as Aggregated Search (AS). They integrate results coming from different vertical search engines (images, videos, news, etc.) with typical Web search results. Aggregated search is relatively new and its advantages need to be evaluated. Some existing works have already tried to evaluate the interest (usefulness) of aggregated search as well as the effectiveness of the existing approaches. However, most of evaluation methodologies were based (i) on what we call relevance by intent (i.e. search results were not shown to real users), and (ii) short text queries. In this paper, we conducted a user study which was designed to revisit and compare the interest of aggregated search, by exploiting both relevance by intent and content, and using both short text and fixed need queries. This user study allowed us to analyze the distribution of relevant results across different verticals, and to show that AS helps to identify complementary relevant sources for the same information need. Comparison between relevance by intent and relevance by content showed that relevance by intent introduces a bias in evaluation. Discussion about the results also allowed us to identify some useful thoughts concerning the evaluation of AS approaches.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132256717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Norms are a way to specify acceptable behaviour in a context. In the literature there is much work on norm theories, models and specifications of how agents might take norms into account when reasoning, but few practical implementations. In this paper we present a framework and an implementation for norm-oriented planning. Unlike most frameworks, our approach takes into consideration the operationalisation of norms during the plan generation phase. In our framework, norms can be obligations or prohibitions, which can be violated and are accompanied by repair norms in case they are breached. The operational semantics of norms is expressed as an extension on top of STRIPS semantics, acting as a form of temporal restriction over the trajectories (plans) computed by the planner. In combination with the agent's utility functions over the actions, the norm-aware planner computes the most profitable trajectory ending in a state of the world where no pending obligations exist and any (obligation/prohibition) violation has been handled. An implementation of the framework in PDDL is provided.
{"title":"Norm-Aware Planning: Semantics and Implementation","authors":"Sofia Panagiotidi, Javier Vázquez-Salceda","doi":"10.1109/WI-IAT.2011.249","DOIUrl":"https://doi.org/10.1109/WI-IAT.2011.249","url":null,"abstract":"Norms are a way to specify acceptable behaviour in a context. In literature there is a lot of work on norm theories, models and specifications on how agents might take norms into account when reasoning but few practical implementations. In this paper we present a framework and an implementation for norm-oriented planning. Unlike most frameworks, our approach takes into consideration the operationalisation of norms during the plan generation phase. In our framework norms can be obligations or prohibitions which can be violated, and are accompanied by repair norms in case they are breached. Norm operational semantics is expressed as an extension/on top of STRIPS semantics, acting as a form of temporal restrictions over the trajectories (plans) computed by the planner. In combination with the agent's utility functions over the actions, the norm-aware planner computes the most profitable trajectory concluding to a state of the world where no pending obligations exist and any (obligation/prohibition) violation has been handled. An implementation of the framework in PDDL is provided.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133071332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}