使用自定义词汇和NLP识别Twitter中的暴力行为

Jonathan Adkins
{"title":"使用自定义词汇和NLP识别Twitter中的暴力行为","authors":"Jonathan Adkins","doi":"10.34190/eccws.21.1.340","DOIUrl":null,"url":null,"abstract":"Information warfare is no longer a denizen purely of the political domain. It is a phenomenon that permeates other domains, especially those of mass communications and cybersecurity. Deepfakes, sock puppets, and microtargeted political advertising on social media are some examples of techniques that have been employed by threat actors to exert influence over consumers of mass media. Social Network Analysis (SNA) is an aggregation of tools and techniques used to research and analyze the nature of relationships between entities. SNA makes use of such tools as text mining, sentiment analysis, and machine learning algorithms to identify and measure aspects of human behavior in certain defined conditions. One area of interest in SNA is the ability to identify and measure levels of strong emotions in groups of people. In particular, we have developed a technique in which the potential for increased violence within a community can be identified and measured using a combination of text mining, sentiment analysis, and graph theory. We have compiled a custom lexicon of terms used commonly in discussions relating to acts of violence. Each term in the lexicon has a numerical weight associated with it, indicating how violent the term is. We will take samples of online community discussions from Twitter and use the R and Python programming languages to cross-reference the samples with our lexicon. The results will be displayed in a Twitter discussion graph where the user nodes are color-coded according to the overall level of violence that is inherent in the Tweet. This methodology will demonstrate which communities within an online social network discussion are more at risk for potentially violent behavior. We assert that when this approach is used in association with other NLP techniques such as word embeddings and sentiment analysis, it will provide cybersecurity and homeland security analysts with actionable threat intelligence.","PeriodicalId":258360,"journal":{"name":"European Conference on Cyber Warfare and Security","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Identification of Violence in Twitter Using a Custom Lexicon and NLP\",\"authors\":\"Jonathan Adkins\",\"doi\":\"10.34190/eccws.21.1.340\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Information warfare is no longer a denizen purely of the political domain. It is a phenomenon that permeates other domains, especially those of mass communications and cybersecurity. Deepfakes, sock puppets, and microtargeted political advertising on social media are some examples of techniques that have been employed by threat actors to exert influence over consumers of mass media. Social Network Analysis (SNA) is an aggregation of tools and techniques used to research and analyze the nature of relationships between entities. SNA makes use of such tools as text mining, sentiment analysis, and machine learning algorithms to identify and measure aspects of human behavior in certain defined conditions. One area of interest in SNA is the ability to identify and measure levels of strong emotions in groups of people. In particular, we have developed a technique in which the potential for increased violence within a community can be identified and measured using a combination of text mining, sentiment analysis, and graph theory. We have compiled a custom lexicon of terms used commonly in discussions relating to acts of violence. Each term in the lexicon has a numerical weight associated with it, indicating how violent the term is. We will take samples of online community discussions from Twitter and use the R and Python programming languages to cross-reference the samples with our lexicon. The results will be displayed in a Twitter discussion graph where the user nodes are color-coded according to the overall level of violence that is inherent in the Tweet. This methodology will demonstrate which communities within an online social network discussion are more at risk for potentially violent behavior. We assert that when this approach is used in association with other NLP techniques such as word embeddings and sentiment analysis, it will provide cybersecurity and homeland security analysts with actionable threat intelligence.\",\"PeriodicalId\":258360,\"journal\":{\"name\":\"European Conference on Cyber Warfare and Security\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"European Conference on Cyber Warfare and Security\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34190/eccws.21.1.340\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Conference on Cyber Warfare and Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34190/eccws.21.1.340","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

信息战不再是纯粹的政治领域。这种现象也渗透到其他领域,尤其是大众传播和网络安全领域。深度造假、袜子木偶和社交媒体上的微目标政治广告是威胁行为者用来对大众媒体消费者施加影响的一些技术例子。社会网络分析(SNA)是用于研究和分析实体之间关系本质的工具和技术的集合。SNA使用文本挖掘、情感分析和机器学习算法等工具来识别和测量特定条件下人类行为的各个方面。对SNA感兴趣的一个领域是识别和测量人群中强烈情绪水平的能力。特别是,我们开发了一种技术,可以使用文本挖掘、情感分析和图论的组合来识别和测量社区内暴力增加的可能性。我们编制了一个关于暴力行为的讨论中常用术语的自定义词汇。词典中的每个术语都有一个与之相关的数值权重,表示该术语的暴力程度。我们将从Twitter上获取在线社区讨论的示例,并使用R和Python编程语言将示例与我们的词典交叉引用。结果将显示在Twitter讨论图中,其中用户节点根据Tweet中固有的整体暴力程度进行颜色编码。该方法将展示在线社交网络讨论中的哪些社区更容易出现潜在的暴力行为。我们断言,当这种方法与其他NLP技术(如词嵌入和情感分析)结合使用时,它将为网络安全和国土安全分析师提供可操作的威胁情报。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Identification of Violence in Twitter Using a Custom Lexicon and NLP
Information warfare is no longer a denizen purely of the political domain. It is a phenomenon that permeates other domains, especially those of mass communications and cybersecurity. Deepfakes, sock puppets, and microtargeted political advertising on social media are some examples of techniques that have been employed by threat actors to exert influence over consumers of mass media. Social Network Analysis (SNA) is an aggregation of tools and techniques used to research and analyze the nature of relationships between entities. SNA makes use of such tools as text mining, sentiment analysis, and machine learning algorithms to identify and measure aspects of human behavior in certain defined conditions. One area of interest in SNA is the ability to identify and measure levels of strong emotions in groups of people. In particular, we have developed a technique in which the potential for increased violence within a community can be identified and measured using a combination of text mining, sentiment analysis, and graph theory. We have compiled a custom lexicon of terms used commonly in discussions relating to acts of violence. Each term in the lexicon has a numerical weight associated with it, indicating how violent the term is. We will take samples of online community discussions from Twitter and use the R and Python programming languages to cross-reference the samples with our lexicon. The results will be displayed in a Twitter discussion graph where the user nodes are color-coded according to the overall level of violence that is inherent in the Tweet. This methodology will demonstrate which communities within an online social network discussion are more at risk for potentially violent behavior. We assert that when this approach is used in association with other NLP techniques such as word embeddings and sentiment analysis, it will provide cybersecurity and homeland security analysts with actionable threat intelligence.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
From Provoking Emotions to fake Images: The Recurring Signs of fake news and Phishing Scams Spreading on Social Media in Hungary, Romania and Slovakia A Commentary and Exploration of Maritime Applications of Biosecurity and Cybersecurity Intersections Cultural Influences on Information Security Processing Model and Classification of Cybercognitive Attacks: Based on Cognitive Psychology Role of Techno-Economic Coalitions in Future Cyberspace Governance: 'Backcasting' as a Method for Strategic Foresight
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1