Using common-sense knowledge-base for detecting word obfuscation in adversarial communication

Swati Agarwal, A. Sureka

Published in: 2015 7th International Conference on Communication Systems and Networks (COMSNETS), 2015
DOI: 10.1109/COMSNETS.2015.7098738 (https://doi.org/10.1109/COMSNETS.2015.7098738)
Citations: 7

Abstract

Word obfuscation, or substitution, means replacing one word in a sentence with another to conceal the content of a text or communication. Terrorists and criminals use word obfuscation in adversarial communication to convey messages without being red-flagged by security and intelligence agencies that intercept or scan messages (such as emails and telephone conversations). ConceptNet is a freely available semantic network represented as a directed graph, in which nodes are concepts and edges are common-sense assertions about those concepts. We present an approach that exploits the vast amount of semantic knowledge in ConceptNet to address the technically challenging problem of detecting word substitution in adversarial communication. We frame the problem as a textual reasoning and context inference task and use ConceptNet's natural-language-processing toolkit to detect word substitution. We use ConceptNet to compute the conceptual similarity between any two given terms and define a Mean Average Conceptual Similarity (MACS) metric to identify out-of-context terms. The test bed for evaluating our approach consists of the Enron email dataset (over 600,000 emails generated by 158 employees of Enron Corporation) and the Brown corpus (about a million words drawn from a wide variety of sources). We implement word substitution techniques used by previous researchers to generate a test dataset and conduct a series of experiments on it to evaluate our approach. Experimental results reveal that the proposed approach is effective.
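The abstract names the MACS metric but does not give its formula. The following is a minimal sketch of one plausible reading, in which each term is scored by its mean similarity to the other terms of the sentence and the lowest-scoring term is treated as the out-of-context candidate. The similarity table, its scores, and the function names (`similarity`, `per_term_mean_similarity`, `most_out_of_context`) are all hypothetical; in the paper the pairwise scores come from ConceptNet, which is not called here.

```python
from statistics import mean

# Hypothetical, hand-made pairwise scores in [0, 1].
# These are illustrative values, NOT real ConceptNet output.
TOY_SIMILARITY = {
    frozenset(("flower", "explode")): 0.05,
    frozenset(("flower", "airport")): 0.05,
    frozenset(("flower", "damage")): 0.10,
    frozenset(("explode", "airport")): 0.40,
    frozenset(("explode", "damage")): 0.70,
    frozenset(("airport", "damage")): 0.20,
}


def similarity(a, b, table=TOY_SIMILARITY):
    """Symmetric lookup; unknown pairs default to 0.0 (an assumption)."""
    if a == b:
        return 1.0
    return table.get(frozenset((a, b)), 0.0)


def per_term_mean_similarity(terms, sim=similarity):
    """Map each term to its mean similarity with all other terms."""
    scores = {}
    for term in terms:
        others = [other for other in terms if other != term]
        scores[term] = mean(sim(term, o) for o in others) if others else 0.0
    return scores


def most_out_of_context(terms, sim=similarity):
    """Return the term with the lowest mean similarity to the rest."""
    scores = per_term_mean_similarity(terms, sim)
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Content terms of a sentence where "bomb" was replaced by "flower".
    terms = ["flower", "explode", "airport", "damage"]
    for term, score in per_term_mean_similarity(terms).items():
        print(f"{term}: {score:.3f}")
    print("candidate substitution:", most_out_of_context(terms))
```

With these toy scores, the substituted word "flower" has the lowest mean similarity to the remaining terms and is flagged. A real pipeline would replace `similarity` with scores derived from ConceptNet and would likely need a threshold, since a sentence with no substitution still has a lowest-scoring term.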