{"title":"网络匿名化对网络搜索隐私保护的有效性研究","authors":"Sai Teja Peddinti, Nitesh Saxena","doi":"10.1145/1966913.1966984","DOIUrl":null,"url":null,"abstract":"Web search has emerged as one of the most important applications on the internet, with several search engines available to the users. There is a common practice among these search engines to log and analyse the user queries, which leads to serious privacy implications. One well known solution to search privacy involves issuing the queries via an anonymizing network, such as Tor, thereby hiding one's identity from the search engine. A fundamental problem with this solution, however, is that user queries are still obviously revealed to the search engine, although they are \"mixed\" among the queries issued by other users of the same anonymization service.\n In this paper, we consider the problem of identifying the queries of a user of interest (UOI) within a pool of queries received by a search engine over an anonymizing network. We demonstrate that an adversarial search engine can extract the UOI's queries, when it is equipped with only a short-term user search query history, by utilizing only the query content information and off-the-shelf machine learning classifiers. More specifically, by treating a selected set of 60 users --- from the publicly-available AOL search logs --- as the users of interest performing web search over an anonymizing network, we show that each user's queries can be identified with 25.95% average accuracy, when mixed with queries of 99 other users of the anonymization service. This average accuracy drops to 18.95% when queries of 999 other users of the anonymization service are mixed together. Though the average accuracies are not so high, our results indicate that few users of interest could be identified with accuracies as high as 80--98%, even when their queries are mixed among queries of 999 other users. Our results cast serious doubts on the effectiveness of anonymizing web search queries by means of anonymizing networks.","PeriodicalId":72308,"journal":{"name":"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"On the effectiveness of anonymizing networks for web search privacy\",\"authors\":\"Sai Teja Peddinti, Nitesh Saxena\",\"doi\":\"10.1145/1966913.1966984\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Web search has emerged as one of the most important applications on the internet, with several search engines available to the users. There is a common practice among these search engines to log and analyse the user queries, which leads to serious privacy implications. One well known solution to search privacy involves issuing the queries via an anonymizing network, such as Tor, thereby hiding one's identity from the search engine. A fundamental problem with this solution, however, is that user queries are still obviously revealed to the search engine, although they are \\\"mixed\\\" among the queries issued by other users of the same anonymization service.\\n In this paper, we consider the problem of identifying the queries of a user of interest (UOI) within a pool of queries received by a search engine over an anonymizing network. We demonstrate that an adversarial search engine can extract the UOI's queries, when it is equipped with only a short-term user search query history, by utilizing only the query content information and off-the-shelf machine learning classifiers. More specifically, by treating a selected set of 60 users --- from the publicly-available AOL search logs --- as the users of interest performing web search over an anonymizing network, we show that each user's queries can be identified with 25.95% average accuracy, when mixed with queries of 99 other users of the anonymization service. This average accuracy drops to 18.95% when queries of 999 other users of the anonymization service are mixed together. Though the average accuracies are not so high, our results indicate that few users of interest could be identified with accuracies as high as 80--98%, even when their queries are mixed among queries of 999 other users. Our results cast serious doubts on the effectiveness of anonymizing web search queries by means of anonymizing networks.\",\"PeriodicalId\":72308,\"journal\":{\"name\":\"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-03-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1966913.1966984\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Asia CCS '22 : proceedings of the 2022 ACM Asia Conference on Computer and Communications Security : May 30-June 3, 2022, Nagasaki, Japan. ACM Asia Conference on Computer and Communications Security (17th : 2022 : Nagasaki-shi, Japan ; ...","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1966913.1966984","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12

摘要

网络搜索已经成为互联网上最重要的应用之一,有几个搜索引擎可供用户使用。在这些搜索引擎中,有一种常见的做法是记录和分析用户查询,这会导致严重的隐私问题。搜索隐私的一个众所周知的解决方案是通过匿名网络(如Tor)发出查询,从而对搜索引擎隐藏自己的身份。然而,这种解决方案的一个基本问题是,用户的查询仍然明显地显示给搜索引擎,尽管它们“混合”在同一匿名化服务的其他用户发出的查询中。在本文中,我们考虑了在匿名网络上搜索引擎接收的查询池中识别感兴趣用户(UOI)查询的问题。我们证明了对抗性搜索引擎可以通过仅利用查询内容信息和现成的机器学习分类器,在仅配备短期用户搜索查询历史的情况下提取UOI的查询。更具体地说,通过从公开的AOL搜索日志中选择60个用户作为在匿名网络上执行网络搜索的用户,我们表明,当与匿名服务的99个其他用户的查询混合在一起时,每个用户的查询可以以25.95%的平均准确率识别。当999个其他匿名化服务用户的查询混合在一起时,平均准确率下降到18.95%。虽然平均准确率不是很高,但我们的结果表明,即使他们的查询与999个其他用户的查询混合在一起,也很少有感兴趣的用户可以被识别出准确率高达80% -98%的用户。我们的研究结果对通过匿名化网络来匿名化网络搜索查询的有效性提出了严重的质疑。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
On the effectiveness of anonymizing networks for web search privacy
Web search has emerged as one of the most important applications on the internet, with several search engines available to the users. There is a common practice among these search engines to log and analyse the user queries, which leads to serious privacy implications. One well known solution to search privacy involves issuing the queries via an anonymizing network, such as Tor, thereby hiding one's identity from the search engine. A fundamental problem with this solution, however, is that user queries are still obviously revealed to the search engine, although they are "mixed" among the queries issued by other users of the same anonymization service. In this paper, we consider the problem of identifying the queries of a user of interest (UOI) within a pool of queries received by a search engine over an anonymizing network. We demonstrate that an adversarial search engine can extract the UOI's queries, when it is equipped with only a short-term user search query history, by utilizing only the query content information and off-the-shelf machine learning classifiers. More specifically, by treating a selected set of 60 users --- from the publicly-available AOL search logs --- as the users of interest performing web search over an anonymizing network, we show that each user's queries can be identified with 25.95% average accuracy, when mixed with queries of 99 other users of the anonymization service. This average accuracy drops to 18.95% when queries of 999 other users of the anonymization service are mixed together. Though the average accuracies are not so high, our results indicate that few users of interest could be identified with accuracies as high as 80--98%, even when their queries are mixed among queries of 999 other users. Our results cast serious doubts on the effectiveness of anonymizing web search queries by means of anonymizing networks.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Enabling Attribute-Based Access Control in Linux Kernel. Verbal, visual, and verbal-visual puns in translation: cognitive multimodal analysis Impoliteness in parliamentary discourse: a cognitive-pragmatic and sociocultural approach The functions of heraldic symbols in the English fiction Possible worlds of a literary text character: a cognitive and quantitative linguistic approach
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1