Attacking HTTPS Secure Search Service through Correlation Analysis of HTTP Webpages Accessed

Qian Liping, Wang Lidong
Journal: International Journal of Security and Its Applications
Published: 2017-07-31
DOI: 10.14257/IJSIA.2017.11.7.03 (https://doi.org/10.14257/IJSIA.2017.11.7.03)
Citations: 10

Abstract

Internet users commonly query a search engine to retrieve web information. Sensitive data about a search engine user's intentions or behavior can be inferred from their query phrases and the webpages they subsequently visit. To protect communications from eavesdropping, a search engine can adopt HTTPS-by-default, providing bidirectional encryption that protects its users' privacy. However, the majority of webpages indexed in a search engine's results pages are still hosted on HTTP websites, so attackers can observe the contents of these webpages once the user clicks on the indexed links. We propose a novel approach for attacking secure search through correlation analysis of the encrypted search with the unencrypted webpages the user subsequently visits. We show that a simple weighted TF-DF mechanism is sufficient for selecting candidate query phrases to guess. By imitating search engine users, querying these candidates, and enumerating the webpages indexed in the results pages, we can recover the exact query phrases and at the same time reconstruct the user's web-surfing trails through DNS-based URL comparison and network traffic analysis based on flow feature statistics. In an experiment with 180 Chinese and English search phrases, we achieved a 67.78% hit rate on the first guess and a 96.11% hit rate within three guesses. Our empirical research shows that HTTPS traffic can be correlated and de-anonymized through HTTP traffic, and that secure search is not always secure unless HTTPS-by-default is enabled everywhere.
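The abstract does not define the weighted TF-DF score, so the following Python sketch only illustrates one plausible reading of the candidate-selection step: terms that are frequent within the observed HTTP pages and that also appear across many of those pages are ranked highest. The scoring formula, the `title_weight` parameter, the tokenizer, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (hypothetical, English-only tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_df_scores(pages, title_weight=2.0):
    """Score terms across the HTTP pages an observer has seen.

    pages: list of (title, body) strings.
    Assumed scoring (not from the paper):
        score(t) = (df(t) / N) * sum over pages of normalized weighted tf(t, page)
    where each title occurrence counts title_weight times.
    """
    n = len(pages)
    df = Counter()  # number of pages containing each term
    tf = Counter()  # summed per-page normalized term frequency
    for title, body in pages:
        counts = Counter()
        for tok in tokenize(title):
            counts[tok] += title_weight
        for tok in tokenize(body):
            counts[tok] += 1.0
        total = sum(counts.values()) or 1.0
        for tok, c in counts.items():
            tf[tok] += c / total
            df[tok] += 1
    return {t: tf[t] * df[t] / n for t in tf}


def top_candidates(pages, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    scores = tf_df_scores(pages)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


pages = [
    ("Python tutorial", "learn python basics"),
    ("Python docs", "python reference guide"),
    ("Cooking", "recipes for pasta"),
]
print(top_candidates(pages, k=1))
```

In this toy input, a term shared by two of the three observed pages outranks terms confined to one page, which is the intuition behind using document frequency as a positive signal rather than the inverse weighting used in TF-IDF. Handling Chinese phrases, as in the paper's experiment, would need a different tokenizer.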
Source journal
International Journal of Security and Its Applications (Computer Science, Information Systems)
About the journal: IJSIA aims to facilitate and support research related to security technology and its applications. The journal gives academic and industry professionals a venue to discuss recent progress in the area of security technology and its applications.

Journal topics:
- Access Control
- Ad Hoc & Sensor Network Security
- Applied Cryptography
- Authentication and Non-repudiation
- Cryptographic Protocols
- Denial of Service
- E-Commerce Security
- Identity and Trust Management
- Information Hiding
- Insider Threats and Countermeasures
- Intrusion Detection & Prevention
- Network & Wireless Security
- Peer-to-Peer Security
- Privacy and Anonymity
- Secure installation, generation and operation
- Security Analysis Methodologies
- Security assurance
- Security in Software Outsourcing
- Security products or systems
- Security technology
- Systems and Data Security