Finding Critical Files from a Packet

Junnyung Hur, Hahoon Jeon, Hyeon Gy Shon, Young Jae Kim, Myungkeun Yoon
IEEE INFOCOM 2021 - IEEE Conference on Computer Communications
Published: 2021-05-10
DOI: 10.1109/INFOCOM42981.2021.9488914 (https://doi.org/10.1109/INFOCOM42981.2021.9488914)
Citation count: 1

Abstract

Network-based intrusion detection and data leakage prevention systems inspect packets to detect whether critical files, such as malware or confidential documents, are being transferred. However, this kind of detection requires heavy computing resources to reassemble packets, and only well-known protocols can be interpreted. Moreover, finding similar files in storage requires pairwise comparisons. In this paper, we present a new network-based file identification scheme that inspects packets independently, without reassembly, and finds similar files through inverted indexing instead of pairwise comparison. We use a content-based chunking algorithm to consistently divide both files and packets into multiple byte sequences, called chunks. If a packet is part of a file, the two share common chunks. The challenge is twofold: packet chunking and inverted-index search must be fast and scalable enough for packet processing, and file identification must remain accurate even though many chunks are noise. We address both problems with a small Bloom filter and a delayed query strategy. To the best of our knowledge, this is the first scheme that identifies a specific critical file from a packet over unknown protocols. Experimental results show that the proposed scheme can successfully identify a critical file from a packet.
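The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the chunking function, the Bloom filter layout, or the delayed query strategy, so everything below is an assumption made for illustration. The names `chunks`, `BloomFilter`, and `FileIndex`, and the values of `WINDOW`, `MASK`, and `MIN_LEN`, are hypothetical. The sketch hashes every window with BLAKE2b for clarity, where a fast system would likely use a rolling hash, and it does not model the delayed query strategy at all.

```python
import hashlib
import random
from collections import Counter, defaultdict

WINDOW = 16   # bytes of context that decide a boundary (assumed value)
MASK = 0x3F   # cut when the low 6 hash bits are zero, ~64-byte chunks (assumed)
MIN_LEN = 8   # drop very short chunks as likely noise (assumed)


def chunks(data: bytes) -> list[bytes]:
    """Cut data wherever the hash of the preceding WINDOW bytes matches MASK.

    Boundaries depend only on local content, so a packet payload that is a
    slice of a file yields the same interior chunks as the file itself.
    """
    out, start = [], 0
    for end in range(WINDOW, len(data) + 1):
        digest = hashlib.blake2b(data[end - WINDOW:end], digest_size=4).digest()
        if int.from_bytes(digest, "big") & MASK == 0:
            out.append(data[start:end])
            start = end
    if start < len(data):
        out.append(data[start:])
    return [c for c in out if len(c) >= MIN_LEN]


def chunk_key(chunk: bytes) -> bytes:
    """Fixed-size fingerprint used as the inverted-index key."""
    return hashlib.blake2b(chunk, digest_size=8).digest()


class BloomFilter:
    """Small Bloom filter over chunk keys (sizes are assumed, not from the paper)."""

    def __init__(self, bits: int = 1 << 16, hashes: int = 3):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key: bytes):
        for seed in range(self.hashes):
            d = hashlib.blake2b(key, digest_size=8, salt=bytes([seed])).digest()
            yield int.from_bytes(d, "big") % self.bits

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.array[p >> 3] |= 1 << (p & 7)

    def __contains__(self, key: bytes) -> bool:
        return all(self.array[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class FileIndex:
    """Inverted index: chunk key -> ids of the critical files containing it."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.postings = defaultdict(set)

    def add_file(self, file_id: str, content: bytes) -> None:
        for c in chunks(content):
            k = chunk_key(c)
            self.bloom.add(k)
            self.postings[k].add(file_id)

    def identify(self, packet: bytes):
        """Return the file sharing the most chunks with the packet, or None."""
        votes = Counter()
        for c in chunks(packet):
            k = chunk_key(c)
            if k in self.bloom:  # cheap pre-filter before the index lookup
                for file_id in self.postings.get(k, ()):
                    votes[file_id] += 1
        return votes.most_common(1)[0][0] if votes else None


# Demo with synthetic data: one packet carries a slice of an indexed file
# behind a header of an unknown protocol, the other is unrelated traffic.
rng = random.Random(0)
secret = bytes(rng.getrandbits(8) for _ in range(20000))
other = bytes(rng.getrandbits(8) for _ in range(20000))
index = FileIndex()
index.add_file("secret.bin", secret)
index.add_file("other.bin", other)
packet = b"UNKNOWN-PROTO-HDR" + secret[5000:6400]
unrelated = bytes(rng.getrandbits(8) for _ in range(1400))
print(index.identify(packet), index.identify(unrelated))
```

The first and last chunks of a packet usually straddle the header or the payload edge and match nothing in the index. These are the noise chunks the abstract refers to, and the sketch tolerates them by counting votes per file rather than requiring every chunk to match.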