2017 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP): Latest Papers

Visual pollution localization through crowdsourcing and visual similarity clustering
Zuzana Kucharikova, Jakub Simko
Nowadays, many cities and communes suffer from advertisements appearing in aesthetically inappropriate or illegal places. This contamination of public space is called visual pollution. The first step in the fight against visual pollution is to localize physical advertising media (e.g., billboards) as accurately as possible. One way is to use volunteer effort through outdoor crowdsourcing. Smart mobile devices can support this process through localization sensors. However, these sensors are not accurate enough on their own, and the media are not located exactly where the volunteers capture them. Therefore, media localization is presently inaccurate. This paper presents a work-in-progress method to improve the localization of physical advertisement media. As input, the method takes captured media images along with spatial information about the device. The images are then clustered based on their locations to form sets corresponding to the true physical media. Then, using visual analysis of the images and the spatial orientation of the devices, the method computes the expected location of the physical media.
DOI: https://doi.org/10.1109/SMAP.2017.8022662 | Published: 2017-07-09
Citations: 3
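The location-based clustering step mentioned in the abstract above could be sketched roughly as follows. This is a minimal illustration, not the authors' method: it assumes each capture is a `(lat, lon)` pair in degrees, uses simple single-linkage grouping, and the 30 m `radius_m` threshold and both function names are hypothetical. The visual analysis and device orientation steps are not covered.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def cluster_by_location(points, radius_m=30.0):
    """Single-linkage grouping of capture positions.

    Two captures end up in the same cluster if a chain of captures, each
    within radius_m of the next, connects them. radius_m is an assumed
    threshold for illustration, not a value taken from the paper.
    Returns a list of clusters, each a sorted list of indices into points.
    """
    clusters = []
    for idx, p in enumerate(points):
        # Existing clusters that contain at least one capture close to p.
        near = [c for c in clusters
                if any(haversine_m(p, points[j]) <= radius_m for j in c)]
        merged = [idx]
        for c in near:
            merged.extend(c)
            clusters.remove(c)  # clusters are disjoint, so this removes exactly c
        clusters.append(sorted(merged))
    return clusters
```

Each resulting cluster is a candidate set of images showing the same physical medium; a later step would still have to estimate where that medium actually stands.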
Sentiment analysis of social network posts in Slovak language
Rastislav Krchnavy, Marián Simko
In this paper, we tackle the issue of sentiment analysis of social network posts in a language that has received little attention: Slovak. There is a significant lack of research in this area for minor languages, as they often introduce additional language-specific issues for text processing. In the case of Slovak, common issues are rich inflection and complex morphology and syntax. User-generated content of social networks introduces additional challenges (variability of diacritics, inconsistent style, high error rate) that make the task even harder. We propose a method for sentiment analysis of social network posts on Facebook. The proposed method is based on machine learning and incorporates multilevel text pre-processing that aims to deal with the specifics of user-generated social content. The evaluation in a real-world setting, employing data from the Facebook pages of multiple well-known companies, shows that the accuracy of our method is comparable with that of approaches for major world languages.
DOI: https://doi.org/10.1109/SMAP.2017.8022661 | Published: 2017-07-09
Citations: 13
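One of the challenges named in the abstract above, variability of diacritics, can be illustrated with a small normalization sketch. The abstract does not describe the authors' pre-processing levels, so everything here is an assumption: the function names are hypothetical, and stripping diacritics, lowercasing, and squeezing repeated characters are just common choices for noisy user-generated text.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Remove combining marks so 'Ďakujem' and 'Dakujem' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_post(text):
    """Lowercase, strip diacritics, squeeze runs of 3+ identical characters
    down to two, and split into ASCII word tokens."""
    text = strip_diacritics(text.lower())
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    return re.findall(r"[a-z0-9]+", text)


print(normalize_post("Suuuuper služby!!!"))  # ['suuper', 'sluzby']
```

Stripping diacritics trades precision for robustness: it merges inconsistently written forms, but it can also collapse distinct Slovak words into one token.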
High-performance and lightweight real-time deep face emotion recognition
Justus Schwan, E. Ghaleb, E. Hortal, S. Asteriadis
Deep learning is used for all kinds of tasks which require human-like performance, such as voice and image recognition in smartphones, smart home technology, and self-driving cars. While great advances have been made in the field, results are often not satisfactory when compared to human performance. In the field of facial emotion recognition, especially in the wild, Convolutional Neural Networks (CNN) are employed because of their excellent generalization properties. However, while CNNs can learn a representation for certain object classes, an amount of (annotated) training data roughly proportional to the class's complexity is needed and seldom available. This work describes an advanced pre-processing algorithm for facial images and a transfer learning mechanism, two potential candidates for relaxing this requirement. Using these algorithms, a lightweight face emotion recognition application for Human-Computer Interaction with TurtleBot units was developed.
DOI: https://doi.org/10.1109/SMAP.2017.8022671 | Published: 2017-07-09
Citations: 12
A survey on political event analysis in Twitter
Michalis Korakakis, E. Spyrou, Phivos Mylonas
This short survey paper attempts to provide an overview of the most recent research on the popular politics domain within the framework of the Twitter social network. Given both the political turmoil that arose at the end of 2016 and in early 2017, and the increasing popularity of social networks in general and Twitter in particular, we feel that this topic, which came into sight over the last few months, is an attractive candidate for fellow data mining researchers. Herein, we start with a brief overview of our motivation and continue with basic information on the Twitter platform, which plays two clearly identifiable roles: an online news source and one of the most popular social networking sites. Focus is then given to research dealing with sentiment analysis in political topics and opinion polls. We continue by reviewing the Twittersphere from the computational social science point of view, covering behavior analysis, social interaction and social influence identification methods, and by discerning and discriminating its useful types within the social network, thus envisioning possible further utilization scenarios for the collected information. A short discussion of the identified conclusions and a couple of future research directions concludes the survey.
DOI: https://doi.org/10.1109/SMAP.2017.8022660 | Published: 2017-07-09
Citations: 12
Using social networks to predict changes in health: Extended abstract
Karen S. Jung, O. Tonguz
Social networking sites not only have billions of users but also hold detailed content about each individual's daily life. This detailed information about a person's life could be exploited to allow individuals to learn more about themselves. In this paper, we introduce the concept of using social networks to foresee changes in an individual's health. We develop a new model that can predict whether a person has recently undergone weight loss by analyzing the text of the person's tweets. Sentiment analysis, part-of-speech (POS) tagging, and categorization are used in this model. The model is tested on Twitter users and good statistical accuracy is observed. The success of this model suggests that this idea could be further explored to identify other patterns and create new models for a variety of health changes and health problems, particularly those that are of great interest to individuals and businesses.
DOI: https://doi.org/10.1109/SMAP.2017.8022659 | Published: 2017-07-09
Citations: 0
Towards adaptive brain-computer interfaces: Improving accuracy of detection of event-related potentials
Róbert Móro, Patrik Berger, M. Bieliková
Electroencephalography (EEG) has a wide range of applications in human-computer interaction and in the adaptation and personalization of interfaces. It can be used either as a sensor, e.g., for emotion detection, or as an input device that allows actions to be taken based on the brain's response to the presented stimuli. For the latter, it is crucial to be able to reliably detect event-related potentials (ERPs), which can be a hard task because of the noise in the signal, especially when using affordable consumer-oriented devices such as Emotiv Epoc. In this paper, we present a method of EEG signal processing and classification for detection of the P300 ERP wave. We particularly focus on adaptive channel selection and propose to use a genetic algorithm combined with linear discriminant analysis to determine the optimal subset of electrodes for signal processing for each individual user. We evaluated our proposed method on a standard data set, where it outperformed the existing approaches even as the size of the training set decreased. In addition, we conducted a user study with the Emotiv Epoc device on a standard P300 Speller task in order to compare the results of our method and to find out whether this device is suitable for P300 detection.
DOI: https://doi.org/10.1109/SMAP.2017.8022664 | Published: 2017-07-09
Citations: 5
Feature extraction for tweet classification: Do the humans perform better?
N. Tsapatsoulis, Constantinos Djouvas
Sentiment analysis of Twitter data has become a research trend over the last decade. Thanks to the Twitter API, massive amounts of tweets relating to a topic of interest can be collected in real time. Sentiment analysis of these tweets can be used to conduct social sensing and opinion mining. For instance, forecasting elections is a primary area in which sentiment analysis of tweets has been extensively applied in the last few years. Sentiment analysis of Twitter data presents important challenges compared to the similar task of text classification. Tweets are limited to 140 characters; thus, the conveyed message is compressed and often context-dependent. Tweets are informal and unstructured, usually lacking grammatical soundness and the use of a standard lexicon. On the other hand, tweets are usually annotated by their authors regarding their topic and sentiment with the aid of hashtags and emoticons. Identifying appropriate features for sentiment analysis of tweets remains an open research area, since text indexing methods face the sparseness problem while POS tagging methods fail due to the lack of grammatical structure in tweets. Character-based features, i.e., n-grams of characters, are gaining popularity because they are language-independent. However, their effectiveness remains quite low. In this paper, we argue that tokens used by humans for sentiment analysis of tweets are probably the best feature set one can use for that purpose. We compare several automatically extracted features with the features (tokens) used by humans for tweet classification, under a machine learning framework. The results show that the manually indicated tokens combined with a Decision Tree classifier outperform any other combination of feature set and classification algorithm. The manually annotated dataset that was used in our experiments is publicly available for anyone who wishes to use it.
DOI: https://doi.org/10.1109/SMAP.2017.8022667 | Published: 2017-07-01
Citations: 13
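The character-based features mentioned in the abstract above (n-grams of characters) are simple to extract, which is part of why they are language-independent. The sketch below is only an illustration: the function name `char_ngrams` and the default `n=3` are assumptions, and the abstract does not say which n the authors used.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count all overlapping character n-grams in text.

    Texts shorter than n yield an empty Counter.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


features = char_ngrams("voted", n=3)
print(sorted(features))  # ['ote', 'ted', 'vot']
```

No tokenizer, lexicon, or grammar is needed, so the same function applies to any language; the cost is a large, sparse feature space.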
Efficient big data analysis on a single machine using apache spark and self-organizing map libraries
David Andresic, Petr Šaloun, Ioannis Anagnostopoulos
Apache Spark is commonly used as a big data analytical platform on powerful computer clusters, as it primarily employs the main computer memory for the evaluation. Our attempt adds self-organizing map software libraries onto a single big data analytical stack and is efficient and fast enough even on a standard single computer. This innovative approach brings big data analysis to researchers with limited resources. Our original idea was experimentally confirmed and is described here. As a case study for our method, we used the available #Brexit data, sentiment analysis of the corresponding tweets, and their correlation with stock exchange data.
DOI: https://doi.org/10.1109/SMAP.2017.8022657 | Published: 2017-07-01
Citations: 1
Customer language processing: Extended abstract
A. Metzmacher, V. Heinrichs, B. Falk, R. Schmitt
The research presented is the first working step towards the goal of developing a domain-independent method for sentiment analysis of German customer feedback in social media. The approach proposes to apply the concept of natural language processing (NLP) to customer language processing (CLP). In this context, we hypothesize that annotators' ability to label customer reviews does not differ between tangible and intangible goods, and that customers' writing styles do not differ when they evaluate these goods. To test these hypotheses, a study was conducted in which participants had to assign the sentiment as well as the subject of customer reviews and its evaluative attribute. The results reveal that the inter-rater reliability of annotators does not differ significantly with respect to product groups. However, a slight difference with respect to product categories could be observed. Moreover, inter-rater reliability varies with the emotional commitment towards products.
这项研究是朝着开发一种独立于领域的方法来分析德国社交媒体客户反馈情绪的目标迈出的第一步。该方法提出将自然语言处理(NLP)的概念应用于客户语言处理(CLP)。在这种情况下,我们假设注释者在分配客户对有形商品和无形商品的评论时的能力无关,以及客户在评估这些商品时的写作风格无关。为了验证这些假设,进行了一项研究,参与者必须分配情绪以及客户评论的主题及其评估属性。结果表明,评价者之间的可靠性没有显著差异相对于产品组。然而,在产品类别方面可以观察到细微的差异。此外,根据对产品的情感承诺,评价者之间的能力也会发生变化。
{"title":"Customer language processing: Extended abstract","authors":"A. Metzmacher, V. Heinrichs, B. Falk, R. Schmitt","doi":"10.1109/SMAP.2017.8022663","DOIUrl":"https://doi.org/10.1109/SMAP.2017.8022663","url":null,"abstract":"The research presented is the first working step towards the goal of developing a domain-independent method for sentiment analysis of German customer feedback in social media. The approach proposes to apply the concept of natural language processing (NLP) to customer language processing (CLP). In this context we hypothesize an indifference in annotator ability in assigning customer reviews of tangible vs. intangible goods and an indifferences within customers' writing styles within their evaluation of these goods. To test these hypotheses, a study was conducted where participants had to assign the sentiment as well as the subject of customer reviews and its evaluative attribute. The results reveal that the inter-rater reliability of annotators does not differ significantly with respect to product groups. However a slight difference with respect to product categories could be observed. Moreover, there occur variations within the inter-rater ability according to the emotional commitment towards products.","PeriodicalId":441461,"journal":{"name":"2017 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123801415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Exploiting relevant dates to promote serendipity and situational curiosity in cultural heritage experiences
Ahmed Dahroug, Martín López Nores, J. Pazos-Arias, Silvia González-Soutelo, S. Reboreda-Morillo, Angeliki Antoniou
Cultural heritage is typically not at the top of people's lists when they search for entertainment activities. Publicized temporary exhibitions can draw many visitors to a museum, urged by the opportunity to see certain items, but in general there are no particular stimuli and visits are postponed time and again, if not forever. In this paper, we argue that it is possible to instigate curiosity in relation to dates, periods and events that are relevant to potential visitors and connected to cultural heritage items in direct or subtle ways. Likewise, proper reasoning about dates can improve the experiences of actual visitors by promoting serendipitous learning, increasing retention and revealing that subsequent visits may lead them around the venue following new, appealing narratives. We present the outline of a recommender system grounded on rich semantic modeling of relevant dates in the user profiles, in the cultural heritage knowledge bases and in an almanac of important days, connected to keywords and/or historical events and characters. We also explain how this system will be used in the pilot experiments of the CROSSCULT EU project, starting in September 2017.
Ahmed Dahroug, Martín López Nores, J. Pazos-Arias, Silvia González-Soutelo, S. Reboreda-Morillo, Angeliki Antoniou. "Exploiting relevant dates to promote serendipity and situational curiosity in cultural heritage experiences." 2017 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), published 2017-07-01. DOI: https://doi.org/10.1109/SMAP.2017.8022673
Citations: 2