SMUC '11: Latest Papers

ThemeCrowds: multiresolution summaries of Twitter usage
Pub Date : 2011-10-28 DOI: 10.1145/2065023.2065041
D. Archambault, Derek Greene, P. Cunningham, N. Hurley
Users of social media sites, such as Twitter, rapidly generate large volumes of text content on a daily basis. Visual summaries are needed to understand what groups of people are saying collectively in this unstructured text data. Users will typically discuss a wide variety of topics, where the number of authors talking about a specific topic can quickly grow or diminish over time, and what the collective is saying about the subject can shift as a situation develops. In this paper, we present a technique that summarises what collections of Twitter users are saying about certain topics over time. As the correct resolution for inspecting the data is unknown in advance, the users are clustered hierarchically over a fixed time interval based on the similarity of their posts. The visualisation technique takes this data structure as its input. Given a topic, it finds the correct resolution of users at each time interval and provides tags to summarise what the collective is discussing. The technique is tested on a large microblogging corpus, consisting of millions of tweets and over a million users.
Citations: 49
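The ThemeCrowds abstract describes clustering users hierarchically by the similarity of their posts within a time interval. The sketch below illustrates that general idea only. The abstract does not name the clustering algorithm or the similarity measure, so the choices here (bag-of-words counts, cosine similarity, greedy agglomerative merging on summed term counts) and the names `cosine` and `cluster_users` are assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def cluster_users(posts_by_user):
    """Greedy agglomerative clustering of users for one time interval.

    posts_by_user maps a user name to a list of post strings.
    Returns a nested tuple (a dendrogram) whose leaves are user names.
    Assumption: similarity is cosine over summed word counts.
    """
    # Each cluster is a pair: (subtree, summed term counts of its users).
    clusters = [
        (user, Counter(w for post in posts for w in post.lower().split()))
        for user, posts in sorted(posts_by_user.items())
    ]
    while len(clusters) > 1:
        # Pick the most similar pair of clusters (first pair wins on ties).
        _, i, j = max(
            ((cosine(clusters[i][1], clusters[j][1]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]
```

Cutting the resulting tree at different depths gives coarser or finer groups of users, which is one way to read the "resolution" the abstract refers to. The pairwise search is quadratic per merge, so a corpus of the size mentioned in the abstract would need a more scalable method than this sketch.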
The challenge of understanding the flow of sentiments in social media documents
Pub Date : 2011-10-28 DOI: 10.1145/2065023.2065025
D. Losada
This talk is focused on a key task in the area of Opinion Mining and Sentiment Analysis: polarity classification of social media documents (e.g. blog posts). Estimating polarity is much more demanding than estimating topicality. As a matter of fact, the effectiveness of polarity classification is still modest and does not compare with the effectiveness of standard retrieval tasks. Polarity estimation is severely affected by parts of the text that are off-topic or that simply do not express any opinion. In fact, the key sentiments in a document often appear in specific locations of the text. Furthermore, there are usually conflicting opinions in a given document and this mixed set of opinions harms the performance of automatic methods designed to estimate the overall orientation of the text. In this talk, I will argue that understanding the flow of sentiments in a text is a major challenge for effectively predicting the document's orientation towards a given topic. I will briefly outline some possible avenues to address this challenging issue and review some recent papers that take steps in this direction.
Citations: 2
Mining tag similarity in folksonomies
Pub Date : 2011-10-28 DOI: 10.1145/2065023.2065037
Geir Solskinnsbakk, J. Gulla
Folksonomies are becoming increasingly popular, both among users who find them simple and intuitive to use, and scientists as interesting research objects. Folksonomies can be viewed as large informal sources of semantics. Harnessing the semantics for search or concept extraction requires us to be able to recognize linguistic similarity between tags. In this paper we propose an approach that uses a combination of morpho-syntactic and semantic similarity measures without using any external linguistic resources to mine tag pairs that can be reduced to base tags. Our approach is based on the Levenshtein distance for morpho-syntactic similarity and tag signatures for semantic similarity. The evaluation of our approach, based on a data set crawled from Delicious, shows that we are able to recognize a wide range of linguistic variations with high quality.
Citations: 21
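The folksonomy abstract above combines Levenshtein distance (surface similarity) with tag signatures (semantic similarity) to find tag pairs that reduce to a common base tag. A minimal sketch of that combination follows. The abstract does not define a tag signature, so treating it as a vector of co-occurring tags compared by cosine similarity is an assumption, as are the thresholds `max_edit` and `min_cosine` and every function name.

```python
from collections import Counter
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]


def tag_signatures(posts):
    """Map each tag to counts of the tags it co-occurs with (assumed signature)."""
    sigs = {}
    for tags in posts:
        unique = set(tags)
        for t in unique:
            sigs.setdefault(t, Counter()).update(unique - {t})
    return sigs


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def similar_pairs(posts, max_edit=2, min_cosine=0.5):
    """Return tag pairs that are close in spelling and in co-occurrence context."""
    sigs = tag_signatures(posts)
    tags = sorted(sigs)
    return [
        (a, b)
        for i, a in enumerate(tags)
        for b in tags[i + 1:]
        if levenshtein(a, b) <= max_edit and cosine(sigs[a], sigs[b]) >= min_cosine
    ]
```

Requiring both tests to pass is the point of the combination: edit distance alone would pair unrelated short tags, while co-occurrence alone would pair related tags that are different words.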