Improved website fingerprinting on Tor
Tao Wang, I. Goldberg
Proceedings of the 12th ACM Workshop on Privacy in the Electronic Society, 2013-11-04
DOI: 10.1145/2517840.2517851 (https://doi.org/10.1145/2517840.2517851)

In this paper, we propose new website fingerprinting techniques that achieve higher classification accuracy on Tor than previous work. We describe our novel methodology for gathering data on Tor; this methodology is essential for accurate classifier comparison and analysis. We offer new ways to interpret the data by using the more fundamental Tor cells, rather than TCP/IP packets, as the unit of data. We demonstrate an experimental method to remove Tor SENDMEs, which are control cells that provide no useful data, in order to improve accuracy. We also propose a new set of metrics to describe the similarity between two traffic instances; they are derived from observations of how a site is loaded. Using our new metrics, we achieve a higher success rate than previous authors. We conduct a thorough analysis and comparison between our new algorithms and the previous best algorithm. To identify the potential power of website fingerprinting on Tor, we perform open-world experiments; we achieve a recall rate over 95% and a false positive rate under 0.2% for several potentially monitored sites, which far exceeds previously reported recall rates. In the closed-world experiments, our accuracy is 91%, as compared to 86-87% for the best previous classifier on the same data.
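
The open-world figures above are stated as a recall rate and a false positive rate. As a reading aid, the following is a minimal sketch of how these two quantities are conventionally computed for a single monitored site, assuming a binary monitored/non-monitored labelling. The function name `recall_and_fpr` and the toy labels are hypothetical and purely illustrative; the abstract does not describe the authors' exact evaluation procedure, and this is not their code or data.

```python
from typing import Sequence, Tuple


def recall_and_fpr(actual: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (recall, false positive rate) for a binary monitored/non-monitored task.

    actual[i] is True when trace i really came from the monitored site;
    predicted[i] is True when the classifier flagged trace i as monitored.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)

    # Recall: fraction of monitored traces that were correctly flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: fraction of non-monitored traces wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Toy, made-up labels purely for illustration (not data from the paper).
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, True, False, False, False]
recall, fpr = recall_and_fpr(actual, predicted)
print(f"recall={recall:.2f}, fpr={fpr:.2f}")  # recall=0.75, fpr=0.17
```

In an open-world setting the non-monitored traces typically far outnumber the monitored ones, so even a small false positive rate can translate into many wrongly flagged page loads; this is why the two rates are reported together.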