P. A. S. Duarte, Maria J. P. Peixoto, Yuri S. F. Frota, Windson Viana
The development of context-aware mobile applications poses significant challenges, such as the complexity of sensor-access code and the heterogeneity of devices. The Google Awareness API is an initiative to mitigate this complexity. This paper presents an analysis of GitHub projects that use the Awareness API; the results show that the adoption of this API in the developer community is still incipient. We propose extending a tool to allow high-level modeling of context acquisition and the generation of code compatible with the Awareness API, which reduces the complexity of acquiring contextual information and managing contextual rules.
{"title":"Generating Context Acquisition Code using Awareness API","authors":"P. A. S. Duarte, Maria J. P. Peixoto, Yuri S. F. Frota, Windson Viana","doi":"10.1145/3126858.3131586","DOIUrl":"https://doi.org/10.1145/3126858.3131586","url":null,"abstract":"Development of Context-Aware and Mobile applications have significant challenges, such as the complexity of sensor access code and the heterogeneity of devices. The Google Awareness API is an initiative to mitigate this complexity. This paper presents an analysis of GitHub projects involving Awareness API. However, the results showed that the spread of this API among the developers community is still incipient. We propose to extend a tool to allow high-level modeling of context acquisition and code generation compatible with Awareness API. It reduces the complexity of acquiring contextual information and managing contextual rules.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128625505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel P. T. Dantas, Antonio A. Rocha, Marcos Lage
The efficiency of urban mobility is a major concern for urban populations around the world. For this reason, city planners spend much of their time monitoring transportation systems and designing solutions to improve their quality. Among these solutions, the most successful are computational tools called Intelligent Transportation Systems (ITS). The success of ITS has encouraged public agencies, owners of public transportation system information, to share their datasets with the population, aiming to stimulate the development of new research and solutions that could help improve urban mobility. Taking advantage of this trend, this work uses the GPS logs of Rio de Janeiro buses to extract some of the main operational information about the city's bus system. More specifically, garage locations, the start and end points of a route, and the route itself (the complete sequence of streets) of a bus line are inferred. This information is important for both city planners and the population: administrators can use it to better plan the transportation system, and the population becomes better informed about the system, which improves its reliability and overall usage satisfaction.
{"title":"Extracting Bus Lines Services Information from GPS Registries","authors":"Daniel P. T. Dantas, Antonio A. Rocha, Marcos Lage","doi":"10.1145/3126858.3126881","DOIUrl":"https://doi.org/10.1145/3126858.3126881","url":null,"abstract":"The efficiency of urban mobility is a huge concern of urban population around the world. Because of this reason, city planners spend much of their time monitoring transportations systems and designing solutions in order to improve the system's quality. Among these solutions, the most successful ones are computational tools called Intelligent Transportation Systems (ITS). The success of ITS has encouraged public agencies, owners of the public transportation system information, to share their datasets with the population aiming to stimulate the development of new research and solutions that could help to improve urban mobility. Taking advantage of this trend, this work uses the Rio de Janeiro buses GPS logs dataset to extract some of the main operational information about the city bus system. More specifically, garage locations, start and end points of a route, and the route (the complete sequence of streets) of a bus line are inferred.This information is extremely important for both city planners and population since administrators can benefit from it to better plan the transportation system and the population can become more informed about the system, what improves its reliability and overall usage satisfaction.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130020860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nowadays, people can provide feedback on products and services on the web. Site owners can use this kind of information to better understand their public's preferences. Sentiment analysis can help in this task by providing methods to infer the polarity of reviews. In these methods, the classifier can use hints about the polarity of the words and the subject being discussed to infer the polarity of the text. However, many of these texts are short and, because of that, the classifier can have difficulty inferring these hints. We propose a new sentiment analysis method that uses topic models to infer the polarity of short texts. The intuition behind this approach is that, by using topics, the classifier can better understand the context and improve its performance on this task. In this approach, we first use topic inference methods such as LDA, BTM, and MedLDA to represent the review and then apply a classifier (e.g., Linear SVM, Random Forest, or Logistic Regression). We combine the results of classifiers and text representations in two ways: (1) using a single topic representation and multiple classifiers; and (2) using multiple topic representations and a single classifier. We also analyze the impact of expanding these texts, since topic model methods can have difficulty dealing with the data sparsity present in these reviews. The proposed approach achieved gains of up to 8.5% over our baseline. Moreover, we were able to determine the best classifier (Random Forest) and the best topic detection method (MedLDA).
{"title":"A Majority Voting Approach for Sentiment Analysis in Short Texts using Topic Models","authors":"R. Carmo, A. Lacerda, D. H. Dalip","doi":"10.1145/3126858.3126861","DOIUrl":"https://doi.org/10.1145/3126858.3126861","url":null,"abstract":"Nowadays people can provide feedback on products and services on the web. Site owners can use this kind of information in order to understand more their public preferences. Sentiment Analysis can help in this task, providing methods to infer the polarity of the reviews. In these methods, the classifier can use hints about the polarity of the words and the subject being discussed in order to infer the polarity of the text. However, many of these texts are short and, because of that, the classifier can have difficulties to infer these hints. We here propose a new sentiment analysis method that uses topic models to infer the polarity of short texts. The intuition of this approach is that, by using topics, the classifier is able to better understand the context and improve the performance in this task. In this approach, we first use methods to infer topics such as LDA, BTM and MedLDA in order to represent the review and, then, we apply a classifier (e.g. Linear SVM, Random Forest or Logistic Regression). In this method, we combine the results of classifiers and text representations in two ways: (1) by using single topic representation and multiple classifiers; (2) and using multiple topic representations and a single classifier. We also analyzed the impact of expanding these texts since the topic model methods can have difficulties to deal with the data sparsity present in these reviews. The proposed approach could achieve gains of up to 8.5% compared to our baseline. Moreover, we were able to determine the best classifier (Random Forest) and the best topic detection method (MedLDA).","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130990881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the Internet of Things (IoT) paradigm, a context management system (CMS) should support the context life cycle, including acquisition, modeling, reasoning, and distribution of context information. The context modeling phase is responsible for representing context information in a format that meets IoT requirements such as expressiveness, reuse, extension, and interoperability. In this paper, we present the Hermes Widget IoT service, which represents context information using the semantics of IoT-oriented ontologies such as IoT-Lite and the Semantic Sensor Network (SSN) ontology. As a contribution, Hermes Widget IoT allows any context-provider object (e.g., a sensor) to be located, used, and have its corresponding context information represented and made available for querying through the Internet.
{"title":"An Ontology-based Representation Service of Context Information for the Internet of Things","authors":"E. Veiga, M. F. Arruda, J. Neto, R. B. Neto","doi":"10.1145/3126858.3126894","DOIUrl":"https://doi.org/10.1145/3126858.3126894","url":null,"abstract":"In the Internet of Things (IoT) paradigm, a context management system (CMS) should support the context life cycle including acquisition, modeling, reasoning and distribution of context information. The context modeling phase is responsible for the representation of context information in a format which should meet IoT requirements such as expressiveness, reuse, extension and interoperability. In this paper, we present the Hermes Widget IoT service, which represents context information using the semantics of IoT-oriented ontologies such as IoT-Lite and Semantic Sensor Network (SSN). As contribution, Hermes Widget IoT allows any context provider object (e.g. a sensor) to be located, used, and have its corresponding context information represented and made available for querying through the Internet.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"620 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123209311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
C. V. Araujo, Rayol Mendonca-Neto, F. Nakamura, E. Nakamura
DJ Khaled is a popular musician known for having many collaborators on his songs. In this paper, we model the evolution of DJ Khaled's collaboration network as nine different networks that incrementally consider the albums of his discography: the network of each album includes the collaborations from the previous albums and adds the collaborations from the new one. Artists are represented as nodes, and edges are weighted by the number of songs on which two artists appear together. Our focus is to answer whether: (i) we can identify meaningful communities in this network; and (ii) there is an artist whose influence grows as the network evolves. Using the network's average clustering coefficient, we found that the artists in the network tend to cluster naturally in a logical manner. As a result, we identified nine communities, six of them meaningful, and we identified the rapper Rick Ross as the most influential artist in the network.
{"title":"Using Complex Networks to Assess Collaboration in Rap Music: A Study Case of DJ Khaled","authors":"C. V. Araujo, Rayol Mendonca-Neto, F. Nakamura, E. Nakamura","doi":"10.1145/3126858.3131605","DOIUrl":"https://doi.org/10.1145/3126858.3131605","url":null,"abstract":"DJ Khaled is a popular musician that is known for having many collaborators in his songs. Hence, in this paper, we model the evolution of DJ Khaled's collaboration network as nine different networks that incrementally consider the albums of his discography. The network of each album includes the collaborations from previous ones and adds the collaborations from the new album. The artists are represented as nodes and the edges are the number of songs they appear together. Our focus is to answer whether or not: (i) we can identify meaningful communities in this network; and (2) there is an artist who has greater influence as networks emerges. By using the network average clustering coefficient, we found that the artists in the the network tend to naturally cluster in a logical manner. As a result, we identified nine communities, six of them are meaningful, and we identified the rapper Rick Ross as the most influential artist of the network.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"308 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122731808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thais G. Almeida, Bruno A. Souza, F. Nakamura, E. Nakamura
The freedom of expression provided by the Internet also favors malicious groups that propagate hateful content, recruit new members, and threaten users. In this context, we propose a new approach for hate speech identification based on Information Theory quantifiers (entropy and divergence) to represent documents. As a differential of our approach, we capture weighted information about words rather than just their frequency in documents. The results show that our approach outperforms techniques that combine data representations such as TF-IDF and unigrams with text classifiers, achieving F1-scores of 86%, 84%, and 96% for the hate, offensive, and regular speech classes, respectively. Compared to the baselines, our proposal is a win-win solution that improves both efficacy (F1-score) and efficiency (by reducing the dimension of the feature vector). The proposed solution is up to 2.27 times faster than the baseline.
{"title":"Detecting Hate, Offensive, and Regular Speech in Short Comments","authors":"Thais G. Almeida, Bruno A. Souza, F. Nakamura, E. Nakamura","doi":"10.1145/3126858.3131576","DOIUrl":"https://doi.org/10.1145/3126858.3131576","url":null,"abstract":"The freedom of expression provided by the Internet also favors malicious groups that propagate contents of hate, recruit new members, and threaten users. In this context, we propose a new approach for hate speech identification based on Information Theory quantifiers (entropy and divergence) to represent documents. As a differential of our approach, we capture weighted information of words, rather than just their frequency in documents. The results show that our approach overperforms techniques that use data representation, such as TF-IDF and unigrams combined to text classifiers, achieving an F1-score of 86%, 84% e 96% for classifying hate, offensive, and regular speech classes, respectively. Compared to the baselines, our proposal is a win-win solution that improves efficacy (F1-score) and efficiency (by reducing the dimension of the feature vector). The proposed solution is up to 2.27 times faster than the baseline.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"274 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115274905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cristiano Santos, R. Conceição, L. Agostini, G. Corrêa, B. Zatt, M. Porto
This paper presents a rate- and complexity-aware coding scheme for fixed-camera videos that is designed to improve image quality in Regions of Interest (ROIs) by prioritizing the encoding of such regions through a modified mode decision equation. ROIs are defined in this work as faces, detected with a face detection algorithm. Background Images (BGI) are also detected, with the aim of reducing the bitrate of coding blocks belonging to these areas. Finally, the proposed scheme applies an early decision method to reduce coding time. Experimental results show that the proposed scheme improves image quality in ROIs by 0.99 dB, reaching an improvement of 1.16 dB in the best case. The scheme also achieves an encoding time reduction of up to 55% (about 5.5% on average), with negligible variations in the required bitrate.
{"title":"Rate and Complexity-Aware Coding Scheme for Fixed-Camera Videos Based on Region-of-Interest Detection","authors":"Cristiano Santos, R. Conceição, L. Agostini, G. Corrêa, B. Zatt, M. Porto","doi":"10.1145/3126858.3131599","DOIUrl":"https://doi.org/10.1145/3126858.3131599","url":null,"abstract":"This paper presents a rate and complexity-aware coding scheme for fixed-camera videos that are designed to improve image quality in Regions of Interest (ROI) by prioritizing the encoding of such regions through the use of a modified mode decision equation. ROIs are defined in this work as faces, with the application of a face detection algorithm. Background Images (BGI) are also detected with the aim of reducing bitrate in coding blocks belonging to these areas. Finally, the proposed scheme also applies an early decision method intending to reduce coding time. Experimental results show that the proposed scheme is capable of improving the image quality in 0.99 dB in ROIs, reaching an improvement of 1.16 dB in the best case. Also, the scheme achieves an encoding time reduction of up to 55% (about 5.5%, on average) with unexpressive variations in the required bitrate.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129620717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hungry academic attending the WebMedia conference in Gramado uses a mobile phone app to obtain a recommendation for a place to eat. The restaurant that the app recommends is not within walking distance and serves a fusion-style cuisine with which the academic is unfamiliar. Should she accept the recommendation? Her confidence in the recommendation might be improved by an explanation of the recommender system's decision-making. But the explanations that recommender systems currently provide are often post hoc: fidelity to the system's decision-making is sometimes sacrificed for interpretability. Are fidelity and interpretability always in conflict? Or can they be reconciled without damaging the quality of the recommendations themselves? This talk will review the kinds of explanations given by recommender systems. It will describe a new generation of recommender systems in which recommendation and explanation are more intimately connected, and which seeks to maintain the quality of the recommendations while providing explanations that are both intelligible and reasonably faithful to the system's operation.
{"title":"Explaining Recommendations: Fidelity versus Interpretability","authors":"D. Bridge","doi":"10.1145/3126858.3133312","DOIUrl":"https://doi.org/10.1145/3126858.3133312","url":null,"abstract":"A hungry academic who is attending the WebMedia conference in Gramado uses a mobile phone app to obtain a recommendation for a place-to-eat. The restaurant that the app recommends is not within walking distance and serves a fusion-style cuisine with which the academic is unfamiliar. Should she accept the recommendation? Her confidence in the recommendation might be improved by an explanation of the recommender system's decision-making. But the explanations that recommender systems provide at present are often post hoc: fidelity to the system's decision-making is sometimes sacrificed for interpretability. Are fidelity and interpretability always in conflict? Or can they can be reconciled without damaging the quality of the recommendations themselves? This talk will review the kinds of explanations given by recommender systems. It will describe a new generation of recommender systems in which recommendation and explanation are more intimately connected and which seeks to maintain the quality of the recommendations while providing explanations that are both intelligible and reasonably faithful to the system's operation.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128847308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nilber Vittorazzi de Almeida, Silas Louzada Campos, V. Souza
In the field of Web Engineering, several methods have been proposed for the development of Web-based Information Systems (WISs). FrameWeb is a method aimed at developing WISs that use certain types of frameworks in their architecture, proposing models that incorporate concepts of these frameworks during system design. The method's modeling language is based on Model-Driven Development techniques, making it extensible to support different frameworks and development platforms. In this paper, we present a code generation tool for FrameWeb that harnesses the method's extensibility by building on its language's meta-models. The tool works with an associated visual editor for FrameWeb models and showed promising results in initial evaluation efforts.
{"title":"A Model-Driven Approach for Code Generation for Web-based Information Systems Built with Frameworks","authors":"Nilber Vittorazzi de Almeida, Silas Louzada Campos, V. Souza","doi":"10.1145/3126858.3126863","DOIUrl":"https://doi.org/10.1145/3126858.3126863","url":null,"abstract":"In the field of Web Engineering, there are several methods proposed for the development of Web-based information systems (WISs). FrameWeb is a method that aims to develop WISs that use certain types of frameworks in their architecture, proposing models that incorporate concepts of these frameworks during system design. The method's modeling language is based on Model-Driven Development techniques, making it extensible to support different frameworks and development platforms. In this paper, we present a code generation tool for FrameWeb which harnesses the method's extensibility by being based on its language's meta-models. The tool works with an associated visual editor for FrameWeb models and showed promising results in initial evaluation efforts.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116416836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
T. C. Loures, Pedro O. S. Vaz de Melo, Adriano Veloso
Because of the ubiquitous use of the Internet in today's society, it is easy to find groups or communities of people discussing the most varied subjects. Learning about these subjects (or entities) from such discussions is of great interest to companies, organizations, public figures (e.g., politicians), and researchers alike. In this paper, we explore the problem of learning entity representations using online discussions about them as the only source of information. While such discussions may reveal relevant and surprising information about the corresponding subjects, they may also be completely irrelevant. As another challenge, while regular text documents usually contain well-structured language, online discussions often contain informal and misspelled words. Here we formally define the problem, propose a new benchmark for evaluating vector representation methods, and perform a deep evaluation of well-known techniques using three proposed evaluation scenarios: (i) clustering, (ii) ordering, and (iii) recommendation. Results show that each method is better than at least one other in some evaluation.
{"title":"Generating Entity Representation from Online Discussions: Challenges and an Evaluation Framework","authors":"T. C. Loures, Pedro O. S. Vaz de Melo, Adriano Veloso","doi":"10.1145/3126858.3126882","DOIUrl":"https://doi.org/10.1145/3126858.3126882","url":null,"abstract":"Because of the ubiquitous use of the Internet in current society, it is easy to find groups or communities of people discussing about the most varied subjects. Learning about these subjects (or entities) from such discussions is of great interest for companies, organizations, public figures (e.g. politicians) and researchers alike. In this paper, we explore the problem of learning entity representations using online discussions about them as the only source of information. While such discussions may reveal relevant and surprising information about the corresponding subjects, they may also be completely irrelevant. As another challenge, while regular text documents usually contain a well structured language, online discussions often contain informal and mispelled words. Here we formally define the problem, propose a new benchmark for evaluating vector representation methods, and perform a deep evaluation of well-known techniques using three proposed evaluation scenarios: (i) clustering, (ii) ordering and (iii) recommendation. Results show that each method is better than at least one other in some evaluation.","PeriodicalId":338362,"journal":{"name":"Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web","volume":"91 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116531794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}