More Realistic Website Fingerprinting Using Deep Learning

2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS) Pub Date : 2020-11-01 DOI:10.1109/ICDCS47774.2020.00058

Weiqi Cui, Tao Chen, Chan-Tin Eric


Citations: 9

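Abstract

Website fingerprinting (WF) allows a passive local eavesdropper to monitor the encrypted channel that users use to browse the Internet and to determine, from the recorded traffic, which website a user is visiting. Recent work has explored the effectiveness of deep learning (DL) in WF attacks. However, these attacks have all been built and evaluated on one-page traces. Our goal is to explore whether deep learning can handle situations in which the captured traces are not best-case for an adversary, such as partial traces and two-page traces. We aim to reduce the gap between lab experiments and realistic conditions. We evaluate our proposed method in both closed-world and open-world settings and find that a Convolutional Neural Network (CNN) outperforms a Long Short-Term Memory (LSTM) network in all scenarios. The CNN also shows great potential for predicting from a smaller number of packets. For partial traces missing the first 20% of packets, adding head detection improves accuracy from 8.28% to 86.93% compared with the original DL model. We then report the accuracy of prediction on two-page traces. With an overlap of 80% between two websites, our simulation achieves accuracies of 89.25% and 74.2% for the first and second website in the closed-world evaluation, and 95.5% and 75% in the open-world evaluation. To verify our simulation results, we set up a crawler to collect both training and testing data and gathered the largest two-page trace testing dataset used to date. The results of the real-world experiment are consistent with the simulation.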
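The two non-ideal conditions named in the abstract, partial traces and overlapping two-page traces, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' method: the trace representation (a list of `(timestamp, direction)` pairs with direction +1 for outgoing and -1 for incoming), the synthetic generator `make_trace`, the helper names `drop_head` and `merge_two_pages`, and the reading of "overlap" as the fraction of the first page's packets still in flight when the second page starts loading.

```python
import random


def make_trace(n_packets, seed):
    """Hypothetical stand-in for a captured trace (synthetic data only).

    Returns a list of (timestamp, direction) pairs, where direction is
    +1 for an outgoing packet and -1 for an incoming packet.
    """
    rng = random.Random(seed)
    t = 0.0
    trace = []
    for _ in range(n_packets):
        t += rng.expovariate(100.0)  # synthetic inter-arrival gap
        trace.append((t, rng.choice((1, -1))))
    return trace


def drop_head(trace, fraction):
    """Simulate a partial trace by removing the first `fraction` of packets."""
    cut = round(len(trace) * fraction)
    return trace[cut:]


def merge_two_pages(first, second, overlap):
    """Simulate a two-page trace (assumed definition of overlap).

    The second page is shifted in time so that it starts when the last
    `overlap` fraction of the first page's packets begins, and the two
    packet sequences are then interleaved by timestamp.
    """
    if not first or not second:
        raise ValueError("both traces must be non-empty")
    start_index = min(round(len(first) * (1.0 - overlap)), len(first) - 1)
    shift = first[start_index][0] - second[0][0]
    shifted = [(t + shift, d) for t, d in second]
    # sorted() is stable, so packets with equal timestamps keep first-page order.
    return sorted(first + shifted, key=lambda packet: packet[0])


if __name__ == "__main__":
    page_a = make_trace(1000, seed=1)
    page_b = make_trace(500, seed=2)
    partial = drop_head(page_a, 0.2)
    two_page = merge_two_pages(page_a, page_b, 0.8)
    print(len(partial), len(two_page))
```

A classifier trained only on complete one-page traces would see inputs like `partial` and `two_page` shifted or interleaved relative to its training data, which is the kind of mismatch the abstract's head detection and two-page evaluation are meant to address.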


