
Proceedings of the 25th International Conference on World Wide Web: Latest Articles

Characterizing Long-tail SEO Spam on Cloud Web Hosting Services
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883008
Xiaojing Liao, Chang Liu, Damon McCoy, E. Shi, S. Hao, R. Beyah
The popularity of long-tail search engine optimization (SEO) brings new security challenges: incidents of long-tail keyword poisoning to lower competition and increase revenue have been reported. The emergence of cloud web hosting services provides a new and effective platform for long-tail SEO spam attacks. There is growing evidence that large-scale long-tail SEO campaigns are being carried out on cloud hosting platforms because they offer low-cost, high-speed hosting services. In this paper, we take the first step toward understanding how long-tail SEO spam is implemented on cloud hosting platforms. After identifying 3,186 cloud directories and 318,470 doorway pages on the leading cloud platforms for long-tail SEO spam, we characterize their abusive behavior. One highlight of our findings is the effectiveness of the cloud-based long-tail SEO spam, with 6% of the doorway pages successfully appearing in the top 10 search results of the poisoned long-tail keywords. Examples of other important discoveries include how such doorway pages monetize traffic and their ability to manage cloud platforms' countermeasures. These findings bring such abuse into the spotlight and provide some insights into eliminating this practice.
Citations: 24
An In-depth Study of Mobile Browser Performance
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883014
Javad Nejati, A. Balasubramanian
Mobile page load times are an order of magnitude slower compared to non-mobile pages. It is not clear what causes the poor performance: the slower network, the slower computational speeds, or other reasons. Further, most Web optimizations are designed for non-mobile browsers and do not translate well to the mobile browser. Towards understanding mobile Web page load times, in this paper we: (1) perform an in-depth pairwise comparison of loading a page on a mobile versus a non-mobile browser, and (2) characterize the bottlenecks in the mobile browser vis-a-vis non-mobile browsers. To this end, we build a testbed that allows us to directly compare the low-level page load activities and bottlenecks when loading a page on a mobile versus a non-mobile browser. We find that computation is the main bottleneck when loading a page on mobile browsers. This is in contrast to non-mobile browsers where networking is the main bottleneck. We also find that the composition of the critical path during page load is different when loading pages on the mobile versus the non-mobile browser. A key takeaway of our work is that we need to fundamentally rethink optimizations for mobile browsers.
Citations: 89
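The mobile browser abstract above refers to the composition of the critical path during page load. As a rough illustration of that idea only, the sketch below computes the longest dependency chain in a small page-load graph and splits its time into network and compute shares. The activity names, durations, and dependencies are hypothetical and are not taken from the paper or its testbed.

```python
from functools import lru_cache

# Hypothetical page-load activities: name -> (duration in ms, kind, dependencies).
ACTIVITIES = {
    "fetch_html": (120, "network", []),
    "parse_html": (80, "compute", ["fetch_html"]),
    "fetch_js": (100, "network", ["parse_html"]),
    "eval_js": (300, "compute", ["fetch_js"]),
    "fetch_css": (60, "network", ["parse_html"]),
    "layout": (150, "compute", ["eval_js", "fetch_css"]),
}


def critical_path(activities):
    """Return (total_ms, path) for the longest dependency chain in the DAG."""

    @lru_cache(maxsize=None)
    def longest_to(name):
        duration, _kind, deps = activities[name]
        best_time, best_path = 0, ()
        # The slowest prerequisite chain determines when this activity can start.
        for dep in deps:
            dep_time, dep_path = longest_to(dep)
            if dep_time > best_time:
                best_time, best_path = dep_time, dep_path
        return best_time + duration, best_path + (name,)

    return max((longest_to(name) for name in activities), key=lambda r: r[0])


def time_by_kind(activities, path):
    """Sum the durations along a path, grouped by activity kind."""
    totals = {}
    for name in path:
        duration, kind, _deps = activities[name]
        totals[kind] = totals.get(kind, 0) + duration
    return totals


total_ms, path = critical_path(ACTIVITIES)
print(total_ms, path, time_by_kind(ACTIVITIES, path))
```

In this made-up graph the critical path is dominated by compute activities, which mirrors the kind of breakdown the abstract describes for mobile browsers; the numbers themselves carry no empirical meaning.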
Mechanism Design for Mixed Bidders
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2882983
Y. Bachrach, S. Ceppi, Ian A. Kash, P. Key, M. Khani
The Generalized Second Price (GSP) auction has appealing properties when ads are simple (text based and identical in size), but does not generalize to richer ad settings, whereas truthful mechanisms such as VCG do. However, a straight switch from GSP to VCG incurs significant revenue loss for the search engine. We introduce a transitional mechanism which encourages advertisers to update their bids to their valuations, while mitigating revenue loss. In this setting, it is easier to propose first a payment function rather than an allocation function, so we give a general framework which guarantees incentive compatibility by requiring that the payment functions satisfy two specific properties. Finally, we analyze the revenue impacts of our mechanism on a sample of Bing data.
Citations: 9
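The mechanism design abstract above notes that a straight switch from GSP to VCG loses revenue. A minimal sketch of why, under textbook assumptions that are not taken from the paper: slots have fixed click-through rates in decreasing order, bids are per click, and the bids stay the same under both rules. The paper's transitional mechanism is not reproduced here, and the function names and numbers are illustrative only.

```python
def gsp_payments(bids, slot_ctrs):
    """Expected payment per slot under GSP: each winner pays the next-highest bid per click."""
    ranked = sorted(bids, reverse=True)
    payments = []
    for i, ctr in enumerate(slot_ctrs[:len(ranked)]):
        next_bid = ranked[i + 1] if i + 1 < len(ranked) else 0.0
        payments.append(ctr * next_bid)
    return payments


def vcg_payments(bids, slot_ctrs):
    """Expected payment per slot under VCG: the click value lost by lower-ranked bidders."""
    ranked = sorted(bids, reverse=True)
    k = min(len(slot_ctrs), len(ranked))
    ctrs = list(slot_ctrs[:k]) + [0.0]  # a bidder below the last slot gets no clicks
    payments = []
    for i in range(k):
        total = 0.0
        # If the bidder in slot i left, each bidder j below would move up one slot.
        for j in range(i + 1, min(len(ranked), k + 1)):
            total += (ctrs[j - 1] - ctrs[j]) * ranked[j]
        payments.append(total)
    return payments


bids = [8.0, 4.0, 2.0]      # hypothetical per-click bids
slot_ctrs = [0.5, 0.25]     # hypothetical click-through rates of two ad slots
print(gsp_payments(bids, slot_ctrs))  # [2.0, 0.5]
print(vcg_payments(bids, slot_ctrs))  # [1.5, 0.5]
```

With identical bids, the VCG total (2.0) is below the GSP total (2.5). In practice bidders adjust their bids to the mechanism, which is the transition the abstract is concerned with.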
User Fatigue in Online News Recommendation
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2874813
Hao Ma, Xueqing Liu, Zhihong Shen
Many aspects and properties of Recommender Systems have been well studied in the past decade; however, the impact of User Fatigue has been mostly ignored in the literature. User fatigue represents the phenomenon that a user quickly loses interest in a recommended item if the same item has been presented to this user multiple times before. The direct impact of user fatigue is a dramatic decrease in the Click Through Rate (CTR, i.e., the ratio of clicks to impressions). In this paper, we present a comprehensive study of user fatigue in online recommender systems. By analyzing user behavioral logs from Bing Now news recommendation, we find that user fatigue is a severe problem that greatly affects the user experience. We also notice that different users engage differently with repeated recommendations. Depending on users' previous interaction with repeated recommendations, we illustrate that under certain conditions previously seen items should be demoted, while at other times they should be promoted. We demonstrate how statistics from the analysis of user fatigue can be incorporated into ranking algorithms for personalized recommendations. Our experimental results indicate that significant gains can be achieved by introducing features that reflect users' interaction with previously seen recommendations (up to 15% enhancement on all users and 34% improvement on heavy users).
Citations: 41
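The user fatigue abstract above defines CTR as the ratio of clicks to impressions and describes it dropping as the same item is shown repeatedly. The sketch below shows one simple way such a statistic could be computed from an impression log: CTR grouped by how many times the user had already seen the item. The log format and its contents are hypothetical, and this is not the feature set or ranking method used in the paper.

```python
from collections import defaultdict

# Hypothetical impression log in time order: (user, item, clicked).
LOG = [
    ("u1", "a", True),
    ("u1", "a", False),
    ("u1", "a", False),
    ("u2", "a", True),
    ("u2", "a", True),
    ("u2", "b", False),
    ("u1", "b", True),
    ("u2", "a", False),
]


def ctr_by_prior_exposures(log):
    """Return {number of earlier impressions of the item to the user: CTR}."""
    seen = defaultdict(int)         # (user, item) -> impressions so far
    clicks = defaultdict(int)       # prior exposures -> clicks
    impressions = defaultdict(int)  # prior exposures -> impressions
    for user, item, clicked in log:
        prior = seen[(user, item)]
        impressions[prior] += 1
        clicks[prior] += int(clicked)
        seen[(user, item)] += 1
    return {n: clicks[n] / impressions[n] for n in sorted(impressions)}


print(ctr_by_prior_exposures(LOG))  # {0: 0.75, 1: 0.5, 2: 0.0}
```

A per-user or per-item version of the same table could serve as an input feature to a ranker, which is the general direction the abstract points to.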
Non-Linear Mining of Competing Local Activities
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883010
Yasuko Matsubara, Yasushi Sakurai, C. Faloutsos
Given a large collection of time-evolving activities, such as Google search queries, which consist of d keywords/activities for m locations of duration n, how can we analyze temporal patterns and relationships among all these activities and find location-specific trends? How do we go about capturing non-linear evolutions of local activities and forecasting future patterns? For example, assume that we have the online search volume for multiple keywords, e.g., "Nokia/Nexus/Kindle" or "CNN/BBC" for 236 countries/territories, from 2004 to 2015. Our goal is to analyze a large collection of multi-evolving activities, and specifically, to answer the following questions: (a) Is there any sign of interaction/competition between two different keywords? If so, who competes with whom? (b) In which country is the competition strong? (c) Are there any seasonal/annual activities? (d) How can we automatically detect important world-wide (or local) events? We present COMPCUBE, a unifying non-linear model, which provides a compact and powerful representation of co-evolving activities; and also a novel fitting algorithm, COMPCUBE-FIT, which is parameter-free and scalable. Our method captures the following important patterns: (B)asic trends, i.e., non-linear dynamics of co-evolving activities, signs of (C)ompetition and latent interaction, e.g., Nokia vs. Nexus, (S)easonality, e.g., a Christmas spike for iPod in the U.S. and Europe, and (D)eltas, e.g., unrepeated local events such as the U.S. election in 2008. Thanks to its concise but effective summarization, COMPCUBE can also forecast long-range future activities. Extensive experiments on real datasets demonstrate that COMPCUBE consistently outperforms the best state-of-the-art methods in terms of both accuracy and execution speed.
Citations: 22
Recommendations in Signed Social Networks
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2882971
Jiliang Tang, C. Aggarwal, Huan Liu
Recommender systems play a crucial role in mitigating the information overload problem in social media by suggesting relevant information to users. The popularity of pervasively available social activities for social media users has encouraged a large body of literature on exploiting social networks for recommendation. The vast majority of these systems focus on unsigned social networks (or social networks with only positive links), while little work exists for signed social networks (or social networks with positive and negative links). The availability of negative links in signed social networks presents both challenges and opportunities in the recommendation process. We provide a principled and mathematical approach to exploit signed social networks for recommendation, and propose a model, RecSSN, to leverage positive and negative links in signed social networks. Empirical results on real-world datasets demonstrate the effectiveness of the proposed framework. We also perform further experiments to explicitly understand the effect of signed networks in RecSSN.
Citations: 101
Remedying Web Hijacking: Notification Effectiveness and Webmaster Comprehension
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883039
Frank H. Li, Grant Ho, Eric Kuan, Yuan Niu, L. Ballard, Kurt Thomas, Elie Bursztein, V. Paxson
As miscreants routinely hijack thousands of vulnerable web servers weekly for cheap hosting and traffic acquisition, security services have turned to notifications both to alert webmasters to ongoing incidents and to expedite recovery. In this work we present the first large-scale measurement study on the effectiveness of combinations of browser, search, and direct webmaster notifications at reducing the duration a site remains compromised. Our study captures the life cycle of 760,935 hijacking incidents from July 2014 to June 2015, as identified by Google Safe Browsing and Search Quality. We observe that direct communication with webmasters increases the likelihood of cleanup by over 50% and reduces infection lengths by at least 62%. Absent this open channel for communication, we find browser interstitials, while intended to alert visitors to potentially harmful content, correlate with faster remediation. As part of our study, we also explore whether webmasters exhibit the necessary technical expertise to address hijacking incidents. Based on appeal logs where webmasters alert Google that their site is no longer compromised, we find 80% of operators successfully clean up symptoms on their first appeal. However, a sizeable fraction of site owners do not address the root cause of compromise, with over 12% of sites falling victim to a new attack within 30 days. We distill these findings into a set of recommendations for improving web security and best practices for webmasters.
Citations: 56
From Freebase to Wikidata: The Great Migration
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2874809
Thomas Pellissier Tanon, Denny Vrandečić, Sebastian Schaffert, T. Steiner, Lydia Pintscher
Collaborative knowledge bases that make their data freely available in a machine-readable form are central for the data strategy of many projects and organizations. The two major collaborative knowledge bases are Wikimedia's Wikidata and Google's Freebase. Due to the success of Wikidata, Google decided in 2014 to offer the content of Freebase to the Wikidata community. In this paper, we report on the ongoing transfer efforts and data mapping challenges, and provide an analysis of the effort so far. We describe the Primary Sources Tool, which aims to facilitate this and future data migrations. Throughout the migration, we have gained deep insights into both Wikidata and Freebase, and share and discuss detailed statistics on both knowledge bases.
Citations: 193
GoCAD: GPU-Assisted Online Content-Adaptive Display Power Saving for Mobile Devices in Internet Streaming
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883064
Yao Liu, Mengbai Xiao, Ming Zhang, Xin Li, Mian Dong, Zhan Ma, Zhenhua Li, Songqing Chen
During Internet streaming, a significant portion of the battery power is always consumed by the display panel on mobile devices. To reduce the display power consumption, backlight scaling, a scheme that intelligently dims the backlight, has been proposed. To maintain perceived video appearance in backlight scaling, a computationally intensive luminance compensation process is required. However, this step, if performed by the CPU as existing schemes suggest, could easily offset the power savings gained from backlight scaling. Furthermore, computing the optimal backlight scaling values requires per-frame luminance information, which is typically too energy intensive for mobile devices to compute. Thus, existing schemes require such information to be available in advance, and such an offline approach makes these schemes impractical. To address these challenges, in this paper, we design and implement GoCAD, a GPU-assisted Online Content-Adaptive Display power saving scheme for mobile devices in Internet streaming sessions. In GoCAD, we employ the mobile device's GPU rather than the CPU to reduce power consumption during the luminance compensation phase. Furthermore, we compute the optimal backlight scaling values for small batches of video frames in an online fashion using a dynamic programming algorithm. Lastly, we make novel use of the widely available video storyboard, a pre-computed set of thumbnails associated with a video, to intelligently decide whether or not to apply our backlight scaling scheme for a given video. For example, when the GPU power consumption would offset the savings from dimming the backlight, no backlight scaling is conducted. To evaluate the performance of GoCAD, we implement a prototype within an Android application and use a Monsoon power monitor to measure the real power consumption. Experiments are conducted on more than 460 randomly selected YouTube videos. Results show that GoCAD can effectively produce power savings without affecting rendered video quality.
Citations: 15
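The skip rule described in the GoCAD abstract above (no backlight scaling when the GPU cost would offset the dimming savings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the linear backlight power model, the normalized luminance values, and the milliwatt figures are all assumptions made for illustration, and the paper's dynamic programming step is not reproduced here.

```python
from typing import List, Sequence


def backlight_level(frame: Sequence[float], floor: float = 0.1) -> float:
    """Lowest backlight level in (0, 1] that shows the frame without clipping.

    `frame` holds pixel luminance values normalized to [0, 1] (assumption).
    """
    return max(floor, max(frame))


def compensate(frame: Sequence[float], level: float) -> List[float]:
    """Brighten pixels by 1/level so that backlight * pixel stays unchanged."""
    return [min(1.0, p / level) for p in frame]


def should_scale(
    thumbnails: Sequence[Sequence[float]],
    full_backlight_mw: float,
    gpu_cost_mw: float,
) -> bool:
    """Decide from storyboard thumbnails whether dimming is worthwhile.

    Assumes backlight power is linear in the backlight level (hypothetical
    model) and that `gpu_cost_mw` is a known estimate of the compensation cost.
    """
    levels = [backlight_level(t) for t in thumbnails]
    mean_level = sum(levels) / len(levels)
    saving_mw = (1.0 - mean_level) * full_backlight_mw
    return saving_mw > gpu_cost_mw


if __name__ == "__main__":
    dark_video = [[0.2, 0.4], [0.1, 0.5]]
    bright_video = [[0.9, 1.0]]
    print(should_scale(dark_video, full_backlight_mw=400.0, gpu_cost_mw=150.0))
    print(should_scale(bright_video, full_backlight_mw=400.0, gpu_cost_mw=150.0))
```

Under these assumptions a dark video is dimmed because the estimated saving exceeds the GPU cost, while a bright video is left untouched because its backlight cannot be lowered without clipping.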
Exploiting Green Energy to Reduce the Operational Costs of Multi-Center Web Search Engines
Pub Date : 2016-04-11 DOI: 10.1145/2872427.2883021
Roi Blanco, Matteo Catena, N. Tonellotto
Carbon dioxide emissions resulting from the combustion of fossil fuels (brown energy) are the main cause of global warming due to the greenhouse effect. Large IT companies have recently increased their efforts to reduce the carbon dioxide footprint originating from their data center electricity consumption. On the one hand, better infrastructure and modern hardware allow for a more efficient usage of electric resources. On the other hand, data centers can be powered by renewable sources (green energy) that are both environmentally friendly and economically convenient. In this paper, we tackle the problem of targeting the usage of green energy to minimize the expenditure of running multi-center Web search engines, i.e., systems composed of multiple, geographically remote computing facilities. We propose a mathematical model to minimize the operational costs of multi-center Web search engines by exploiting renewable energies whenever available at different locations. Using this model, we design an algorithm which decides what fraction of the incoming query load arriving at one processing facility must be forwarded to be processed at different sites to use green energy sources. We experiment using real traffic from a large search engine and we compare our model against state-of-the-art baselines for query forwarding. Our experimental results show that the proposed solution maintains a high query throughput, while reducing the energy operational costs of multi-center search engines by up to ~25%. Additionally, our algorithm can reduce the brown energy consumption by almost 6% when energy-proportional servers are employed.
Citations: 12
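The query-forwarding idea in the abstract above (deciding what fraction of one site's incoming load to send to other sites so that green energy is used) can be sketched with a simple greedy rule. This is a stand-in, not the paper's mathematical model: the capacity units, the site names, and the largest-spare-capacity-first ordering are hypothetical, and latency, throughput constraints, and electricity prices are ignored.

```python
from typing import Dict, Tuple


def forwarding_fractions(
    load: float,
    local_green: float,
    remote_spare_green: Dict[str, float],
) -> Tuple[float, Dict[str, float]]:
    """Split one site's incoming query load between itself and remote sites.

    `load` and all capacities are in queries per second (assumption).
    Queries the local green supply cannot cover are forwarded to remote
    sites with spare green capacity; whatever is left stays local.
    Returns (local_fraction, {site: forwarded_fraction}).
    """
    if load <= 0:
        return 1.0, {}

    excess = max(0.0, load - local_green)
    forwarded: Dict[str, float] = {}

    # Greedy choice: fill the sites with the most spare green capacity first.
    for site, spare in sorted(remote_spare_green.items(), key=lambda kv: -kv[1]):
        if excess <= 0:
            break
        share = min(excess, spare)
        if share > 0:
            forwarded[site] = share / load
            excess -= share

    local_fraction = 1.0 - sum(forwarded.values())
    return local_fraction, forwarded


if __name__ == "__main__":
    local, remote = forwarding_fractions(
        load=100.0, local_green=40.0, remote_spare_green={"A": 30.0, "B": 50.0}
    )
    print(local, remote)
```

A real deployment would more plausibly solve an optimization problem over all sites at once, as the abstract's mention of a mathematical model suggests; the greedy pass only shows the shape of the decision.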