Irony is something most people can tell is there when they see it, but it is not so easy to define, let alone detect automatically. In this paper we describe the construction of a balanced corpus of ironic vs. serious watch reviews and show the promising results achieved by classifiers trained on this corpus in predicting the presence of irony or lack thereof in product reviews from a manually labeled corpus. We try to find common features in the two corpora and outline our next steps towards a model which would detect ironic utterances in more general contexts.
{"title":"Ridiculously Expensive Watches and Surprisingly Many Reviewers: A Study of Irony","authors":"Pavel Savov, R. Nielek","doi":"10.1109/WI.2016.0131","DOIUrl":"https://doi.org/10.1109/WI.2016.0131","url":null,"abstract":"Irony is something most people can tell is there when they see it, but it is not so easy to define, let alone detect automatically. In this paper we describe the construction of a balanced corpus of ironic vs. serious watch reviews and show the promising results achieved by classifiers trained on this corpus in predicting the presence of irony or lack thereof in product reviews from a manually labeled corpus. We try to find common features in the two corpora and outline our next steps towards a model which would detect ironic utterances in more general contexts.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"151 1","pages":"725-729"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77060378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, the use of service robots has increased considerably, and they are expected to contribute to society. It is desirable that a robot, as a provider of operational information, can answer questions about both its intended operations and the open domain in a manner that satisfies users. This paper proposes a question answering system that can respond to both kinds of questions by linking a domain ontology with an ontology built semi-automatically from Wikipedia (the Wikipedia-based ontology).
{"title":"Development and Evaluation of an Operational Service Robot Using Wikipedia-Based and Domain Ontologies","authors":"Hiroshi Asano, Takeshi Morita, Takahira Yamaguchi","doi":"10.1109/WI.2016.0086","DOIUrl":"https://doi.org/10.1109/WI.2016.0086","url":null,"abstract":"Recently, the use of service robots has increased considerably and their social contribution is expected. It is desirable that a robot, as a provider of operational information, can answer questions in both the open domain and intended operations, to respond to questions in a manner that satisfies users. This paper proposes a question answering system that can respond to questions in both intended operations and open domain by linking an ontology, which is semi-automatically built from Wikipedia (Wikipedia-based ontology), with a domain ontology.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"5 1","pages":"511-514"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87670134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New digital technologies are developing swiftly, and cloud computing is attracting the attention of the information technology community. In this context, diverse security issues are amplified. In particular, access control is of prime importance because it ensures diverse security services, such as authentication, identification, confidentiality, and integrity. Several works have been devoted to designing access control models. In this paper, we are particularly interested in distributed access control approaches. To address identified drawbacks of the Multi-OrBAC model, we introduce a new distributed access control model for cloud computing based on mobile agents.
{"title":"Multi-organizational Access Control Model Based on Mobile Agents for Cloud Computing","authors":"Zeineb Ben Yahya, F. Ktata, K. Ghédira","doi":"10.1109/WI.2016.0116","DOIUrl":"https://doi.org/10.1109/WI.2016.0116","url":null,"abstract":"The development of new digital technologies is swiftly rising. Thus, the cloud computing is grabbing-attention of information technology communities. In this context, diverse security issues are amplified. Particularly, access control seems of main importance because it ensures diverse security services, such as, authentication, identification, confidentiality and integrity. Several works are devoted for designing access control models. In this paper, we are particularly interested on distributed access control approaches. According to identified drawbacks of Multi-OrBAC model, we introduce a new distributed access control model for cloud computing based on Mobile Agent.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"104 1","pages":"656-659"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85865215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Pacheco, Diego Pinheiro, Fernando Buarque de Lima-Neto, Eraldo Ribeiro, R. Menezes
Football (also known as soccer) is the most popular sport in the world. Its popularity gives rise to many stories (some perhaps anecdotal) about supporters' behavior and to rivalries such as the famous one between Barcelona and Real Madrid in Spain. Little, however, has been done to profile online users' behavior as football supporters and to use it as an aggregate measure for characterizing clubs. Today, the availability of data enables us to examine at a much greater scale whether rivalries exist and whether there are signatures that characterize supporting behavior. In this paper we use techniques from data science to characterize football supporters according to their activity on Twitter and to characterize clubs according to the behavior of their supporters. We show that it is possible to: (i) rank football clubs by their popularity and by fans' dislike, (ii) identify the rivalries that exist between clubs and their supporters, and (iii) find specific signatures that repeat themselves across different clubs and in different countries. The results are evaluated on a large dataset of tweets relevant to major football leagues in Brazil and in the United Kingdom.
{"title":"Characterization of Football Supporters from Twitter Conversations","authors":"D. Pacheco, Diego Pinheiro, Fernando Buarque de Lima-Neto, Eraldo Ribeiro, R. Menezes","doi":"10.1109/WI.2016.0033","DOIUrl":"https://doi.org/10.1109/WI.2016.0033","url":null,"abstract":"Football (aka Soccer) is the most popular sport in the world. The popularity of the sport leads to several stories (some perhaps anecdotal) about supporters behaviors and to the emergence of rivalries such as the famous Barcelona-Real Madrid (in Spain). Little however has been done to characterize/profile online users' behaviors as football supporters and use them as an aggregate measure to club characterization. Today, the availability of data enable us to understand at a much greater scale if rivalries exist and if there are signatures that can be used to characterize supporting behavior. In this paper we use techniques from Data Science to characterize football supporters according to their activity on Twitter and to characterize clubs according to the behavior of their supporters. We show that it is possible to: (i) rank football clubs by their popularity and fans' dislike, (ii) identify the rivalries that exist between clubs and their supporters, and (iii) find specific signatures that repeat themselves across different clubs and in different countries. 
The results are evaluated on a large dataset of tweets relevant to major football leagues in Brazil and in the United Kingdom.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"52 1","pages":"169-176"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89601827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the advent of the Internet of Things (IoT) and smart sensor technologies, the data-driven paradigm has been found promising for supporting human behavioral analysis in a smart home, with the aim of better healthcare and well-being for senior adults. This work focuses on discovering daily activity routines from sensor data collected in a smart home. By representing the sensor data as a matrix, daily activity routines can be identified using matrix factorization methods. The key challenge is that the matrix contains discrete labels as its elements, and decomposing a nominal data matrix into basis vectors of the labels is nontrivial. We propose a novel principled methodology to tackle the nominal matrix factorization problem. Assuming that the similarity matrix of the labels is known, the discrete labels are first projected onto a continuous space such that the inter-label distances preserve the given similarity matrix as far as possible. Then, we extend a hierarchical probabilistic model for ordinal matrix factorization with the Bayesian Lasso so that the factorization is more robust to noise and sparser, which eases human interpretation. Our experimental results on a synthetic data set show that the factorization results obtained using the proposed methodology outperform those obtained using a number of state-of-the-art factorization methods in terms of basis vector reconstruction accuracy. We also applied our model to a publicly available smart home data set to illustrate how the proposed methodology can be used to support daily activity routine analysis.
{"title":"Bayesian Nominal Matrix Factorization for Mining Daily Activity Patterns","authors":"Chen Li, W. K. Cheung, Jiming Liu, J. Ng","doi":"10.1109/WI.2016.0054","DOIUrl":"https://doi.org/10.1109/WI.2016.0054","url":null,"abstract":"With the advent of the Internet of things (IoT) and smart sensor technologies, the data-driven paradigm has been found promising to support human behavioral analysis in a smart home for better healthcare and well-being of senior adults. This work focuses on discovering daily activity routines from sensor data collected in a smart home. By representing the sensor data as a matrix, daily activity routines can be identified using matrix factorization methods. The key challenge rests on the fact that the matrix contains discrete labels as its elements, and decomposing the nominal data matrix into basis vectors of the labels is nontrivial. We propose a novel principled methodology to tackle the nominal matrix factorization problem. Assuming that the similarity matrix of the labels is known, the discrete labels are first projected onto a continuous space with the interlabel distance preserving the given similarity matrix of the labels as far as possible. Then, we extend a hierarchical probabilistic model for ordinal matrix factorization with Bayesian Lasso that the factorization can be more robust to noise and more sparse to ease human interpretation. Our experimental results based on a synthetic data set shows that the factorization results obtained using the proposed methodology outperform those obtained using a number of the state-of-the-art factorization methods in terms of the basis vector reconstruction accuracy. 
We also applied our model to a publicly available smart home data set to illustrate how the proposed methodology can be used to support daily activity routine analysis.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"1 1","pages":"335-342"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88655029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
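The label-projection step described in the abstract above (mapping discrete labels to a continuous space so that inter-label distances preserve a given similarity matrix) can be read, purely as an illustrative assumption and not as the authors' formulation, as a multidimensional-scaling-style objective. Here $s_{ij}$ is the given similarity between labels $i$ and $j$, $x_i \in \mathbb{R}^d$ is the continuous embedding of label $i$, and the conversion $d_{ij} = 1 - s_{ij}$ from similarity to target distance is a hypothetical choice that assumes similarities lie in $[0, 1]$.

```latex
\min_{x_1, \dots, x_m \in \mathbb{R}^d}
  \sum_{i < j} \left( \lVert x_i - x_j \rVert_2 - d_{ij} \right)^2,
\qquad d_{ij} = 1 - s_{ij}
```

Under this reading, labels with high similarity are placed close together, and the embedded values can then be passed to a factorization model that expects continuous or ordered inputs.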
Hung-Min Hsu, Wei-Sheng Zeng, Chen-Shuo Hung, Dung-Sheng Chen, R. Chang, Shian-Hua Lin, Jan-Ming Ho
Framing is a phenomenon that is studied and debated widely in sociology and political science. It refers to the manner in which audiences interpret information and justify their claims or activities. The subconscious influence of framing might lead to opinion changes and social movements. However, multi-frame classification on microblogging data has not yet been investigated. In this study, we aim to classify a large number of posts into frames. We describe in detail the implementation of a new algorithm for multi-frame classification tasks called Frame Dispatcher, which aims to classify microblogging data into frames. In our experiments, we extracted over 15,000 posts from approximately 200 Facebook fan pages concerning an anti-curriculum student movement. The experimental results show that Frame Dispatcher can classify microblogging data into frames efficiently and effectively.
{"title":"Frame Dispatcher: A Multi-frame Classification System for Social Movement by Using Microblogging Data","authors":"Hung-Min Hsu, Wei-Sheng Zeng, Chen-Shuo Hung, Dung-Sheng Chen, R. Chang, Shian-Hua Lin, Jan-Ming Ho","doi":"10.1109/WI.2016.0101","DOIUrl":"https://doi.org/10.1109/WI.2016.0101","url":null,"abstract":"Framing is a phenomenon that is studied and debated widely in sociology and political science. It refers to the manner in which audiences interpret information and justify their claims or activities. The subconscious influence of framing might lead to opinion changes and social movements. However, multi-frame classification on microblogging data has not yet been investigated. In this study, we aim to classify a large number of posts into frames. We describe in detail the implementation of a new algorithm for multi-frame classification tasks called Frame Dispatcher, which aims to classify microblogging data into frames. In our experiments, we extracted over 15,000 posts from approximately 200 Facebook fan pages concerning an anti-curriculum student movement. The experimental results show that Frame Dispatcher can classify microblogging data into frames efficiently and effectively.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"54 1","pages":"588-591"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90821064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a new version of ExATO, a term extractor originally designed to extract relevant terms from Portuguese corpora. The new version handles English texts as well as Portuguese corpora. This extension is likely to offer the same level of quality already achieved for Portuguese. We analyze results on parallel corpora with respect to the intrinsic differences between Portuguese and English, and we describe the usage environment of ExATO for Portuguese and English corpora. A brief comparison of ExATO with another similar tool is presented to illustrate the higher quality of ExATO's extraction from English corpora.
{"title":"ExATO - High Quality Term Extraction for Portuguese and English","authors":"Lucelene Lopes, Paulo Fernandes, R. Vieira","doi":"10.1109/WI.2016.0092","DOIUrl":"https://doi.org/10.1109/WI.2016.0092","url":null,"abstract":"This paper presents a novel version of ExATO, a term extractor originally designed to extract relevant terms from corpora in Portuguese. In this new version not only corpora in Portuguese can be handled, but also texts in English are accepted. This extension is likely to offer the same quality pattern already achieved for Portuguese. In this paper, we draw the analysis of results in parallel corpora with respect to the intrinsic differences between Portuguese and English languages, and also the environment of usage for ExATO for Portuguese and English corpora. A brief comparison of ExATO and other similar tool is presented to illustrate the higher quality of ExATO extraction from English corpora.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"6 1","pages":"540-545"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81040984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Early detection of depression is important for improving human well-being. This paper proposes a new method to detect depression through time-frequency analysis of Internet behaviors. We recruited 728 postgraduate students and obtained their scores on a depression questionnaire (the Zung Self-rating Depression Scale, SDS) together with digital records of their Internet behaviors. Using time-frequency analysis, we built classification models for differentiating the higher-SDS group from the lower-SDS group, and prediction models for identifying the mental status of the depressed group more precisely. Experimental results show that the classification and prediction models work well, and that time-frequency features are effective in capturing changes in mental health status. The results of this paper might be useful for improving the performance of public mental health services.
{"title":"Predicting Depression from Internet Behaviors by Time-Frequency Features","authors":"Changye Zhu, Baobin Li, Ang Li, T. Zhu","doi":"10.1109/WI.2016.0060","DOIUrl":"https://doi.org/10.1109/WI.2016.0060","url":null,"abstract":"Early detection of depression is important to improve human well-being. This paper proposes a new method to detect depression through time-frequency analysis of Internet behaviors. We recruited 728 postgraduate students and obtained their scores on a depression questionnaire (Zung Self-rating Depression Scale, SDS) and digital records of Internet behaviors. By time-frequency analysis, we built classification models for differentiating higher SDS group from lower group and prediction models for identifying mental status of depressed group more precisely. Experimental results show classification and prediction models work well, and time-frequency features are effective in capturing the changes of mental health status. Results of this paper might be useful to improve the performance of public mental health services.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"44 1","pages":"383-390"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79955766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Existing studies on the evolution of social networks largely focus on the addition of new nodes and links. However, as a network evolves, existing relationships degrade and break down, and some nodes go into hibernation or stop participating in any activity in the network to which they belong. Such nodes and links, which we refer to as "dull", may affect analysis and prediction tasks in networks. This paper formally defines the problem of predicting dull nodes and links at an early stage, and proposes a novel time-aware method to solve it. Pruning such nodes and links is framed as a "network data cleaning" task. As the definitions of dull nodes and links are non-trivial and subjective, a novel scheme to label such nodes and links is also proposed. Experimental results on two real network datasets demonstrate that the proposed method accurately predicts potential dull nodes and links. This paper further experimentally validates the need for data cleaning by investigating its effect on the well-known "link prediction" problem.
{"title":"A Time Aware Method for Predicting Dull Nodes and Links in Evolving Networks for Data Cleaning","authors":"Niladri Sett, Subhrendu Chattopadhyay, Sanasam Ranbir Singh, Sukumar Nandi","doi":"10.1109/WI.2016.0050","DOIUrl":"https://doi.org/10.1109/WI.2016.0050","url":null,"abstract":"Existing studies on evolution of social network largely focus on addition of new nodes and links in the network. However, as network evolves, existing relationships degrade and break down, and some nodes go to hibernation or decide not to participate in any kind of activities in the network where it belongs. Such nodes and links, which we refer as \"dull\", may affect analysis and prediction tasks in networks. This paper formally defines the problem of predicting dull nodes and links at an early stage, and proposes a novel time aware method to solve it. Pruning of such nodes and links is framed as \"network data cleaning\" task. As the definitions of dull node and link are non-trivial and subjective, a novel scheme to label such nodes and links is also proposed here. Experimental results on two real network datasets demonstrate that the proposed method accurately predicts potential dull nodes and links. This paper further experimentally validates the need for data cleaning by investigating its effect on the well-known \"link prediction\" problem.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"9 1","pages":"304-310"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78632537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Classical recommender systems provide users with ranked lists of recommendations that are relevant to their preferences. Each recommendation consists of a single item, e.g., a movie or a book. However, these ranked lists are not suitable for applications such as travel planning, which deal with heterogeneous items. In such applications, there is a need to recommend packages the user can choose from, each package being a set of Points of Interest (POIs), e.g., museums, parks, and monuments. In this paper, we focus on the problem of recommending a set of packages to the user, where each package consists of a set of POIs that may form a tour. Given a collection of POIs, each with an associated cost and time, and user-specified maximum totals for both cost and time (budgets), our goal is to recommend the most interesting packages for the user, where each package satisfies the budget constraints. We formally define the problem and present a novel composite recommendation system inspired by composite retrieval. We introduce a scoring function and propose a ranking algorithm that takes into account the preferences of the user, the diversity of the POIs included in the package, and the popularity of those POIs. Extensive experimental evaluation of our proposed system on a real dataset demonstrates its quality and its ability to improve both the diversity and the relevance of recommendations.
{"title":"A Composite Recommendation System for Planning Tourist Visits","authors":"Idir Benouaret, D. Lenne","doi":"10.1109/WI.2016.0110","DOIUrl":"https://doi.org/10.1109/WI.2016.0110","url":null,"abstract":"Classical recommender systems provide users with ranked lists of recommendations that are relevant to their preferences. Each recommendation consists of a single item, e.g., a movie or a book. However, these ranked lists are not suitable for applications such as travel planning, which deal with heterogeneous items. In fact, in such applications, there is a need to recommend packages the user can choose from, each package being a set of Points of Interest (POIs), e.g., museums, parks, monuments, etc. In this paper, we focus on the problem of recommending a set of packages to the user, where each package is constituted with a set of POIs that may constitute a tour. Given a collection of POIs, where each POI has a cost and a time associated with it, and the user specifying a maximum total value for both the cost and the time (budgets), our goal is to recommend the most interesting packages for the user, where each package satisfies the budget constraints. We formally define the problem and we present a novel composite recommendation system, inspired from composite retrieval. We introduce a scoring function and propose a ranking algorithm that takes into account the preferences of the user, the diversity of POIs included in the package, as well as the popularity of POIs in the package. 
Extensive experimental evaluation of our proposed system, using a real dataset demonstrates its quality and its ability to improve both diversity and relevance of recommendations.","PeriodicalId":6513,"journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"30 1","pages":"626-631"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72977567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
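The budget-constrained package recommendation problem stated in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `POI` fields `relevance` and `popularity`, the weights, the `max_size` cap, and the particular scoring formula are all hypothetical, and the exhaustive enumeration stands in for whatever ranking algorithm the paper proposes. The sketch only shows the shape of the problem: keep packages whose total cost and total time fit the budgets, then rank them by a score that mixes user preference, diversity, and popularity.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class POI:
    name: str
    category: str
    cost: float
    time: float
    relevance: float   # hypothetical user-preference score in [0, 1]
    popularity: float  # hypothetical popularity score in [0, 1]


def package_score(package, w_rel=1.0, w_div=1.0, w_pop=1.0):
    """Illustrative score: mean relevance + category diversity + mean popularity."""
    n = len(package)
    relevance = sum(p.relevance for p in package) / n
    popularity = sum(p.popularity for p in package) / n
    # Diversity as the fraction of distinct categories in the package (assumption).
    diversity = len({p.category for p in package}) / n
    return w_rel * relevance + w_div * diversity + w_pop * popularity


def recommend_packages(pois, max_cost, max_time, k=3, max_size=3):
    """Enumerate packages of up to max_size POIs, keep those within both
    budgets, and return the k highest-scoring ones (brute force, small inputs only)."""
    feasible = []
    for size in range(1, max_size + 1):
        for combo in combinations(pois, size):
            within_cost = sum(p.cost for p in combo) <= max_cost
            within_time = sum(p.time for p in combo) <= max_time
            if within_cost and within_time:
                feasible.append(combo)
    feasible.sort(key=package_score, reverse=True)
    return feasible[:k]


# Made-up example data, for illustration only.
pois = [
    POI("Museum", "museum", 12.0, 2.0, 0.9, 0.8),
    POI("Park", "park", 0.0, 1.0, 0.6, 0.7),
    POI("Monument", "monument", 5.0, 0.5, 0.7, 0.9),
    POI("Gallery", "museum", 10.0, 1.5, 0.8, 0.5),
]
top = recommend_packages(pois, max_cost=20.0, max_time=4.0, k=2)
for package in top:
    print([p.name for p in package], round(package_score(package), 3))
```

Exhaustive enumeration grows combinatorially with the number of POIs, so a realistic system would need a greedy or otherwise pruned search; the sketch keeps brute force only to make the budget constraints and the scoring trade-off easy to see.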