
Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing: latest papers

Mirroring the real world in social media: twitter, geolocation, and sentiment analysis
Eric Baucom, Azade Sanjari, Xiaozhong Liu, Miao Chen
In recent years social media has been used to characterize and predict real-world events, and in this research we investigate how closely Twitter mirrors the real world. Specifically, we characterize the relationship between the language used on Twitter and the results of the 2011 NBA Playoff games. We hypothesize that the language used by Twitter users will be useful in classifying the users' locations combined with which team is in the lead at that point in the game. This is based on the common assumption that "fans" of a team have more positive sentiment and will accordingly use different language when their team is doing well. We investigate this hypothesis by labeling each tweet according to the location of the user along with the team that is in the lead at the time of the tweet. The hypothesized difference in language (as measured by tf-idf) should then have predictive power over the tweet labels. We find that it does, and we experiment further by adding semantic orientation (SO) information to the feature set. The SO does not offer much improvement over tf-idf alone. We discuss the relative strengths of the two types of features for our data.
DOI: 10.1145/2513549.2513559 (https://doi.org/10.1145/2513549.2513559). Published 2013-10-28.
Citations: 38
Says who?: automatic text-based content analysis of television news
Carlos Castillo, G. D. F. Morales, Marcelo Mendoza, Nasir Khan
We perform an automatic analysis of television news programs, based on the closed captions that accompany them. Specifically, we collect all the news broadcast on over 140 television channels in the US during a period of six months. We start by segmenting, processing, and annotating the closed captions automatically. Next, we use NLP methods to analyze their linguistic style and their mentions of people. We present a series of key insights about news providers and people in the news, and we discuss the biases that can be uncovered by automatic means. These insights are contrasted by looking at the data from multiple points of view, including qualitative assessment.
DOI: 10.1145/2513549.2513558 (https://doi.org/10.1145/2513549.2513558). Published 2013-07-18.
Citations: 8
Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing
DOI: 10.1145/2513549 (https://doi.org/10.1145/2513549).
Citations: 1